When you fetch web page content with curl, you often need the response headers returned by the server, along with other details about the request. This is especially true when the request is redirected along the way, because the response headers are very helpful for working out what was actually requested.
The following example requests a URL that redirects. The goal is to obtain the final URL that is actually requested.
$url = 'http://www.appchina.com/market/r/489267/com.appshare.android.ilisten.vapk?c=aplus.direct&uid=gAJ9cQEu1TlyZxsXN-aB4RaanvFL6t6Bj-vj0rIBs&p=aplus.detail&m=redirect';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
//curl_setopt($ch, CURLOPT_POST, 1);
//curl_setopt($ch, CURLOPT_POSTFIELDS, $params);
curl_setopt($ch, CURLOPT_HEADER, 1); // include the response headers in the output
curl_setopt($ch, CURLOPT_NOBODY, 1); // do not return the response body
//curl_setopt($ch, CURLOPT_MAXREDIRS, 1); // maximum number of redirects to follow
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of printing it directly
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // if the response headers contain a Location value, request it in turn
$content = curl_exec($ch);
$rinfo = curl_getinfo($ch);
echo $content, "<br>";
echo "<hr>";
print_r($rinfo);
This produces the following output:
HTTP/1.1 200 OK
Server: nginx
Date: Sat, 22 Dec 2012 06:17:44 GMT
Content-Type: application/vnd.android.package-archive
Connection: close
Last-Modified: Mon, 03 Dec 2012 16:00:00 GMT
Expires: Tue, 03 Dec 2013 16:00:00 GMT
Cache-Control: max-age=31536000
Content-Length: 2142149

Array
(
    [url] => http://www.d.appchina.com/McDonald/r/489267/com.appshare.android.ilisten.vapk?c=aplus.direct&uid=gAJ9cQEu1TlyZxsXN-aB4RaanvFL6t6Bj-vj0rIBs&p=aplus.detail&m=redirect
    [content_type] => application/vnd.android.package-archive
    [http_code] => 200
    [header_size] => 289
    [request_size] => 196
    [filetime] => -1
    [ssl_verify_result] => 0
    [redirect_count] => 0
    [total_time] => 0.171621
    [namelookup_time] => 0.135256
    [connect_time] => 0.152913
    [pretransfer_time] => 0.152916
    [size_upload] => 0
    [size_download] => 0
    [speed_download] => 0
    [speed_upload] => 0
    [download_content_length] => 2142149
    [upload_content_length] => 0
    [starttransfer_time] => 0.171582
    [redirect_time] => 0
    [certinfo] => Array
        (
        )
)
As you can see, after following the redirects the request ends with a 200 response, but this method does not give you the URL of the last request, that is, the final URL that was actually requested. To get that URL, you need to analyze the response returned by each request, one redirect at a time.
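Analyzing a single response comes down to reading its status line and its Location header. Here is a minimal sketch of that step, assuming the raw header text has already been captured with CURLOPT_HEADER. The helper name `parseRedirect` and the sample header text are made up for illustration and are not part of the original code.

```php
<?php
// Hypothetical helper (not from the original post): extracts the status
// code and the Location header from one raw HTTP response header block.
function parseRedirect(string $rawHeaders): array
{
    $status = null;
    $location = null;

    // Status line, e.g. "HTTP/1.1 302 Found".
    if (preg_match('/^HTTP\/\S+\s+(\d{3})/m', $rawHeaders, $m)) {
        $status = (int) $m[1];
    }

    // Header names are case-insensitive. Anchoring at the start of a line
    // avoids matching "Content-Location:" by accident.
    if (preg_match('/^Location:\s*(\S+)/mi', $rawHeaders, $m)) {
        $location = $m[1];
    }

    return ['status' => $status, 'location' => $location];
}

// Made-up sample response, for illustration only.
$sample = "HTTP/1.1 302 Found\r\nServer: nginx\r\nlocation: http://example.com/final.apk\r\n\r\n";
$parsed = parseRedirect($sample);
echo $parsed['status'], ' ', $parsed['location'], "\n";
```

A response without a Location header yields `null` for `location`, which is the signal that the chain of redirects has ended.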
The following is a recursive function I wrote to get the last request URL
$url = 'http://www.appchina.com/market/r/489267/com.appshare.android.ilisten.vapk?c=aplus.direct&uid=gAJ9cQEu1TlyZxsXN-aB4RaanvFL6t6Bj-vj0rIBs&p=aplus.detail&m=redirect';
$realUrl = getRedirectLocation($url);
echo "<br>--->", $realUrl;

function getRedirectLocation($url) {
    $realUrl = $url;
    echo $url, "<br>";
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 3); // limit curl execution time to 3 seconds
    //curl_setopt($ch, CURLOPT_NOBODY, 1); // do not enable this line: with it, the real request URL is lost when a 302 redirect occurs
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $content = curl_exec($ch);
    //echo $content;
    $rinfo = curl_getinfo($ch);
    // Search only the header part of the response, not the body
    $headers = substr((string) $content, 0, $rinfo['header_size']);
    unset($content);
    $matches = array();
    // Match the Location header at the start of a line, case-insensitively
    if (preg_match('/^Location:\s*(\S+)/mi', $headers, $matches)) {
        //echo $matches[1], "<br>";
        $realUrl = getRedirectLocation($matches[1]);
    }
    return $realUrl;
}
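The recursive function above has no upper bound on the number of redirects, so a redirect loop would keep recursing. Below is a hedged sketch of an iterative variant with a hop limit. The names `resolveFinalUrl` and `fetchHeadersWithCurl` and the `$maxHops` parameter are assumptions introduced here, not part of the original code, and relative Location values are not handled. The header fetcher is passed in as a callable so the loop can be tried without network access.

```php
<?php
// Hypothetical fetcher: returns the raw response headers for one URL
// without following redirects. Uses only standard curl options.
function fetchHeadersWithCurl(string $url): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 3);
    $content = curl_exec($ch);
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
    // Keep only the header part; a failed request yields an empty string.
    return $content === false ? '' : substr($content, 0, $headerSize);
}

// Hypothetical resolver: follows Location headers one hop at a time and
// gives up once $maxHops responses in a row were redirects.
function resolveFinalUrl(string $url, callable $fetchHeaders, int $maxHops = 10): string
{
    for ($hop = 0; $hop < $maxHops; $hop++) {
        $headers = $fetchHeaders($url);
        if (!preg_match('/^Location:\s*(\S+)/mi', $headers, $m)) {
            return $url; // no redirect: this is the final URL
        }
        $url = $m[1];
    }
    throw new RuntimeException("Stopped after $maxHops redirects");
}

// Offline usage example with a made-up redirect chain.
$fake = [
    'http://example.com/a' => "HTTP/1.1 302 Found\r\nLocation: http://example.com/b\r\n\r\n",
    'http://example.com/b' => "HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n",
];
$final = resolveFinalUrl('http://example.com/a', fn($u) => $fake[$u] ?? '');
echo $final, "\n";
```

For a real request, the call would be `resolveFinalUrl($url, 'fetchHeadersWithCurl')`. Keeping the loop separate from the fetcher is a design choice that makes the redirect logic easy to check with canned responses.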