Expert help wanted: how to fetch a web page by simulating a browser

WBOY
Release: 2016-06-13 11:46:28

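Looking for expert help with fetching a web page by simulating a browser.
For example, when I fetch http://map.sogou.com/api/ with my program, I get nothing back if the URL is missing the trailing "/". But the webmaster tool at http://tool.chinaz.com/Tools/PageCode.aspx can fetch the page even without the trailing "/" (that is, http://map.sogou.com/api). How does it do that? My code is below; please help me improve it.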

function file_get($url){
    ob_start();
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_COOKIEJAR, "./cookie.txt");
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; InfoPath.1; CIBA)");
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, FALSE);
    curl_setopt($ch, CURLOPT_COOKIESESSION, TRUE);
    curl_setopt($ch, CURLOPT_NOBODY, FALSE);

    curl_exec($ch);
    curl_close($ch);
    $content = ob_get_clean();

    return $content;
}

------Solution--------------------
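Set CURLOPT_FOLLOWLOCATION. The URL without the trailing "/" presumably answers with a redirect to the URL that has it, and cURL does not follow redirects unless this option is enabled.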
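
A minimal sketch of how the asker's function might look with that option enabled. The helper names `build_fetch_options` and `fetch_page`, the redirect cap, and the timeout values are assumptions for illustration, not part of the original post. It also swaps the output-buffering approach for `CURLOPT_RETURNTRANSFER`, so `curl_exec` returns the body directly.

```php
<?php
// Hypothetical helper: collects the cURL options in one place.
function build_fetch_options(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow 3xx "Location:" redirects
        CURLOPT_MAXREDIRS      => 5,     // assumed cap to avoid redirect loops
        CURLOPT_HEADER         => false, // keep response headers out of the body
        CURLOPT_USERAGENT      => "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)",
        CURLOPT_COOKIEJAR      => "./cookie.txt", // write cookies here when the handle closes
        CURLOPT_COOKIEFILE     => "./cookie.txt", // read previously saved cookies
        CURLOPT_CONNECTTIMEOUT => 10,    // assumed value, in seconds
        CURLOPT_TIMEOUT        => 30,    // assumed value, in seconds
    ];
}

// Hypothetical replacement for file_get(): returns the page body as a string,
// or false if the request failed.
function fetch_page(string $url)
{
    $ch = curl_init();
    curl_setopt_array($ch, build_fetch_options($url));
    $content = curl_exec($ch);
    curl_close($ch);
    return $content;
}
```

Calling `fetch_page("http://map.sogou.com/api")` should then behave like the online tool, provided the server really does redirect to the trailing-slash URL. If the result is still empty, `curl_getinfo($ch, CURLINFO_HTTP_CODE)` and `curl_error($ch)`, called before closing the handle, can show what the server answered.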

source:php.cn