A typical PHP cURL routine for fetching a page looks like this:
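$url = 'http://www.baidu.com';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_VERBOSE, true);          // log the exchange to STDERR
curl_setopt($ch, CURLOPT_HEADER, true);           // include response headers in the output
curl_setopt($ch, CURLOPT_NOBODY, true);           // do not download the body
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');   // send GET as the request method
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return the result instead of printing it
curl_setopt($ch, CURLOPT_TIMEOUT, 20);            // give up after 20 seconds
curl_setopt($ch, CURLOPT_AUTOREFERER, true);      // set Referer automatically on redirects
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);   // follow Location headers
$ret = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);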
If the result is a 302 status, it is because some redirects in the chain must pass parameters on to the next link, and that next link is configured to treat any request arriving without those parameters as illegal access. Setting the request method explicitly resolves this:
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
After that, the page should be returned normally.
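To confirm where a 302 comes from, it can help to look at each hop in the redirect chain. Because CURLOPT_HEADER is enabled above, the string returned by curl_exec() contains the header block of every response cURL followed. The following is only a sketch: listRedirectHops() is a hypothetical helper name, and the sample header text is made up for illustration rather than captured from a real server.

```php
<?php
// Hypothetical helper: list each hop (status code and Location target) found
// in the raw header text returned by curl_exec() when CURLOPT_HEADER is on.
function listRedirectHops(string $rawHeaders): array
{
    $hops = [];
    // Header blocks of successive responses are separated by a blank line.
    foreach (preg_split('/\r?\n\r?\n/', trim($rawHeaders)) as $block) {
        // Skip anything that does not start with a status line.
        if (!preg_match('#^HTTP/\S+\s+(\d{3})#', $block, $status)) {
            continue;
        }
        $location = null;
        if (preg_match('/^Location:\s*(.+)$/mi', $block, $m)) {
            $location = trim($m[1]);
        }
        $hops[] = ['status' => (int) $status[1], 'location' => $location];
    }
    return $hops;
}

// Made-up sample standing in for $ret from the routine above.
$sample = "HTTP/1.1 302 Found\r\nLocation: http://example.com/next?token=abc\r\n\r\n"
        . "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n";

foreach (listRedirectHops($sample) as $hop) {
    echo $hop['status'], ' ', $hop['location'] ?? '(final response)', "\n";
}
```

With a real request, pass $ret instead of $sample. The final status code is also available as $info['http_code'], and the number of redirects followed as $info['redirect_count'].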
With this option added, the fetch routine above should work with almost no problems. For background, the documentation for CURLOPT_CUSTOMREQUEST says:
A custom request method to use instead of "GET" or "HEAD" for the HTTP request. This is useful for performing "DELETE" or other, more obscure HTTP requests. Valid values are things like "GET", "POST", "CONNECT", and so on. In other words, do not enter an entire HTTP request line here. For example, entering "GET /index.html HTTP/1.0\r\n\r\n" would be incorrect.
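As an illustration of the rule above, here is a minimal sketch that sends a method other than GET and rejects a full request line before it reaches cURL. customRequestOptions() is a hypothetical helper, the URL is a placeholder, and the whitespace check is an assumption about what counts as "only the method name", not something cURL itself enforces.

```php
<?php
// Hypothetical helper: build cURL options for a request with a custom method.
function customRequestOptions(string $url, string $method): array
{
    // A method is a single token; whitespace suggests a whole request line
    // such as "GET /index.html HTTP/1.0\r\n\r\n" was passed by mistake.
    if ($method === '' || preg_match('/\s/', $method)) {
        throw new InvalidArgumentException('Pass only the method name, e.g. "DELETE"');
    }
    return [
        CURLOPT_URL            => $url,
        CURLOPT_CUSTOMREQUEST  => $method,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 20,
    ];
}

$ch = curl_init();
// Placeholder URL, used only for illustration.
curl_setopt_array($ch, customRequestOptions('http://example.com/items/42', 'DELETE'));
// $response = curl_exec($ch); // left commented so the sketch sends nothing
```

Keeping the options in an array makes it easy to inspect what will be sent before calling curl_exec().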