A typical PHP cURL routine for fetching a page looks like this:
$url = 'http://www.baidu.com';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 20);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$ret = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);
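To see which status codes a request like this passed through, the header text in $ret can be parsed. The sketch below is only an illustration: statusChain() is a hypothetical helper, not part of the cURL extension, and the sample string is made up to stand in for $ret. It assumes CURLOPT_HEADER and CURLOPT_RETURNTRANSFER are enabled, in which case the returned text contains one header block per response in the redirect chain.

```php
<?php
// Hypothetical helper (not a cURL function): extract the HTTP status code of
// every response found in the raw header text returned by curl_exec().
function statusChain(string $rawHeaders): array
{
    // Each response starts with a status line such as "HTTP/1.1 302 Found".
    preg_match_all('/^HTTP\/\d+(?:\.\d+)?\s+(\d{3})/m', $rawHeaders, $matches);
    return array_map('intval', $matches[1]);
}

// Made-up sample standing in for $ret: one redirect followed by the final page.
$sample = "HTTP/1.1 302 Found\r\nLocation: http://example.com/next\r\n\r\n"
        . "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n";

print_r(statusChain($sample));
```

If the last code in the list is still 302, the redirect was probably not followed to the end. $info['http_code'] reports only the last code received, so the list can help show where the chain stopped. The pattern match is a simplification and could also pick up status-like lines in a response body if CURLOPT_NOBODY is turned off.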
If the response you get back has a 302 status, it is because some redirects in the chain need to pass parameters on to the next link, and that next link treats the request as illegal access when it does not receive the parameters it expects. Because CURLOPT_NOBODY switches the request method to HEAD, setting the method back to GET explicitly can help. The key line is:
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
With this option set, the page should come back normally.
The routine above should work for almost any page you need to fetch. For background, look up the documentation on CURLOPT_CUSTOMREQUEST, which describes the option as follows.
A custom request method to use instead of "GET" or "HEAD" when making an HTTP request. This is useful for performing "DELETE" or other, more obscure HTTP requests. Valid values are things like "GET", "POST", "CONNECT" and so on. In other words, do not enter a whole HTTP request line here. For example, entering "GET /index.html HTTP/1.0\r\n\r\n" is incorrect.
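The rule that only a bare method name belongs in CURLOPT_CUSTOMREQUEST can be checked before the value is passed to curl_setopt(). The following is a minimal sketch: isBareRequestMethod() is a hypothetical helper, and restricting the value to ASCII letters is a deliberately conservative assumption, not a rule taken from cURL itself.

```php
<?php
// Hypothetical helper: accept only a bare method name such as "GET" or "DELETE".
// Assumption: letters only. This is stricter than HTTP requires, but it rejects
// whole request lines, which contain spaces, slashes and line breaks.
function isBareRequestMethod(string $method): bool
{
    // \A and \z anchor the whole string, so a trailing newline is rejected too.
    return preg_match('/\A[A-Za-z]+\z/', $method) === 1;
}

var_dump(isBareRequestMethod('GET'));
var_dump(isBareRequestMethod("GET /index.html HTTP/1.0\r\n\r\n"));
```

A caller could run this check first and pass the value to curl_setopt($ch, CURLOPT_CUSTOMREQUEST, $method) only when it returns true.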