Using PHP's cURL functions to obtain cookies and simulate a login.
While trying to extract some data from Google search results, I found that Google guards heavily against software scraping its data. Forging the USER-AGENT header used to be enough to capture the data, but that no longer works. Packet captures showed that Google checks for cookies. A request without cookies immediately gets a 302 redirect, followed by dozens more 302 redirects in a row, so no data can be captured at all.
Therefore, before sending the search request, you first need to obtain and save the cookies, and then send the search request again with the saved cookies so that the data can be captured normally. This is essentially the same as simulating a login to a forum: first POST the login form, save the cookies from the response, and then use those cookies for subsequent requests.
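Before relying on saved cookies, it can help to confirm that a request without cookies really is being redirected. The following is a minimal sketch of such a check, not part of the original code; the helper names `is_redirect_status` and `probe_redirect` are made up for illustration, and the check simply turns off redirect following and reads the status code.

```php
<?php
// Hypothetical helper (not from the original article):
// true for the usual HTTP redirect status codes.
function is_redirect_status(int $status): bool
{
    return in_array($status, [301, 302, 303, 307, 308], true);
}

// Hypothetical helper: request $url once, without cookies and without
// following redirects, and report the status code and redirect target.
function probe_redirect(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); // stop at the first response
    curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $target = curl_getinfo($ch, CURLINFO_REDIRECT_URL);
    curl_close($ch);

    return [
        'status'      => $status,
        'is_redirect' => is_redirect_status($status),
        'target'      => $target ?: null, // null when there is no redirect target
    ];
}
```

Calling `print_r(probe_redirect('http://www.google.com.hk/search?q=qq'));` should show whether the first response is a redirect; what the server actually returns may differ from what is described here.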
The PHP code for the two-step request is as follows:
<?php
header('Content-Type: text/html; charset=utf-8');

$cookie_file = dirname(__FILE__) . '/cookie.txt';
//$cookie_file = tempnam("tmp", "cookie");

// Step 1: request the home page to obtain the cookies and save them
$url = "http://www.google.com.hk";
$ch = curl_init($url); // initialize
curl_setopt($ch, CURLOPT_HEADER, 0); // do not include the response header in the output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response as a string instead of printing it
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file); // store the cookies in this file
curl_exec($ch);
curl_close($ch);

// Step 2: request the search page using the cookies saved above
$url = "http://www.google.com.hk/search?oe=utf8&ie=utf8&source=uds&hl=zh-CN&q=qq";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file); // send the cookies obtained in step 1
$response = curl_exec($ch);
curl_close($ch);

echo $response;
?>
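The forum login mentioned earlier follows the same pattern, except that the first request is a POST carrying the login form. Below is a minimal sketch under stated assumptions: the helper names `login_post_options` and `login_and_fetch`, the URLs, and the form field names are all hypothetical, and a real forum may require extra fields (for example a hidden token) that this sketch does not handle.

```php
<?php
// Hypothetical helper (not from the original article):
// cURL options for the login POST.
function login_post_options(array $fields, string $cookieFile): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields), // form-encoded body
        CURLOPT_COOKIEJAR      => $cookieFile,               // cookies are saved to this file
    ];
}

// Hypothetical helper: log in, then fetch a page that needs the session cookies.
// Returns the page body, or false if either request fails.
function login_and_fetch(string $loginUrl, array $fields, string $pageUrl, string $cookieFile)
{
    // Step 1: POST the login form and save the cookies
    $ch = curl_init($loginUrl);
    curl_setopt_array($ch, login_post_options($fields, $cookieFile));
    $loggedIn = curl_exec($ch);
    curl_close($ch);
    unset($ch); // on PHP 8+ the cookie jar is written when the handle is released
    if ($loggedIn === false) {
        return false;
    }

    // Step 2: request the protected page with the saved cookies
    $ch = curl_init($pageUrl);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);
    $body = curl_exec($ch);
    curl_close($ch);

    return $body;
}
```

A call might look like `login_and_fetch('https://forum.example.com/login.php', ['username' => 'alice', 'password' => 'secret'], 'https://forum.example.com/profile.php', __DIR__ . '/cookie.txt');` where every value is a placeholder to be replaced with the target site's real login URL and field names.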