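There are many functions in PHP that can fetch remote pages, for example file_get_contents(), fopen(), and file(). All of them can retrieve data from a remote server, but cURL generally offers the best performance. Strictly speaking it does not use threads; through the curl_multi_* functions it can run several transfers concurrently from one PHP process.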
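The concurrency mentioned above is usually reached through the curl_multi_* functions. The following is a minimal sketch rather than a definitive implementation: the helper name fetch_all() and its return shape are assumptions made for illustration, and it assumes the curl extension is loaded.

```php
<?php
// Hypothetical helper (not part of PHP): fetches several URLs concurrently.
// Returns an array indexed like the input: body string, or false on failure.
function fetch_all(array $urls, int $timeoutSeconds = 30): array
{
    $multi = curl_multi_init();
    $handles = array();
    foreach (array_values($urls) as $i => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // keep the body instead of printing it
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
        curl_multi_add_handle($multi, $ch);
        $handles[$i] = $ch;
    }

    // Drive all transfers until none are still running.
    $running = 0;
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0 && curl_multi_select($multi, 1.0) === -1) {
            usleep(1000); // select could not wait; avoid a busy loop
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Record which transfers ended with an error code.
    $failed = array();
    while (($info = curl_multi_info_read($multi)) !== false) {
        if ($info['result'] !== CURLE_OK) {
            $failed[array_search($info['handle'], $handles, true)] = true;
        }
    }

    $results = array();
    foreach ($handles as $i => $ch) {
        $results[$i] = isset($failed[$i]) ? false : curl_multi_getcontent($ch);
        curl_multi_remove_handle($multi, $ch);
        curl_close($ch);
    }
    curl_multi_close($multi);
    return $results;
}
```

Calling fetch_all(array('http://www.bkjia.com', 'http://www.bkjia.com/PHPjc/')) would start both requests at once and return when both have finished or timed out.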
Example
$curlPost = 'a=1&b=2'; // Simulated POST data
$ch = curl_init();
curl_setopt($ch, CURLOPT_HTTPHEADER, array('X-FORWARDED-FOR:0.0.0.0', 'CLIENT-IP:0.0.0.0')); // Forge the client IP headers
curl_setopt($ch, CURLOPT_REFERER, "http://www.bkjia.com/"); // Forge the referrer
curl_setopt($ch, CURLOPT_URL, 'http://www.bkjia.com'); // URL of the page to fetch
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // Return the response as a string instead of printing it
curl_setopt($ch, CURLOPT_TIMEOUT, 30); // Give up after 30 seconds
curl_setopt($ch, CURLOPT_POSTFIELDS, $curlPost); // Request body; setting this makes the request a POST
$file_contents = curl_exec($ch); // Fetched content, or false on failure
curl_close($ch);
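The example above does not check whether the request succeeded. A minimal sketch of one way to do that follows; the helper name post_page() and the shape of its return array are assumptions made for illustration, not part of the original code.

```php
<?php
// Hypothetical helper (not part of PHP): POSTs $fields to $url and
// reports the body, the HTTP status code, and any transport error.
function post_page(string $url, array $fields, int $timeoutSeconds = 30): array
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
    // http_build_query() turns array('a' => 1, 'b' => 2) into "a=1&b=2"
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($fields));

    $body = curl_exec($ch); // string on success, false on failure
    $error = ($body === false) ? curl_error($ch) : null;
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE); // 0 when no response arrived
    curl_close($ch);

    return array('body' => $body, 'status' => $status, 'error' => $error);
}
```

A caller could run post_page('http://www.bkjia.com', array('a' => 1, 'b' => 2)) and inspect the 'error' entry before using 'body'. Note that a 4xx or 5xx response still counts as a successful transfer, so the 'status' entry needs its own check.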
Another way is to use file_get_contents() to fetch the content of a remote page. It needs no extension, but it requires allow_url_fopen to be enabled in php.ini.
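As a rough sketch of the file_get_contents() approach, the same POST can be expressed with a stream context. The helper names make_post_context() and fetch_with_fgc() are hypothetical, and the sketch assumes allow_url_fopen is enabled.

```php
<?php
// Hypothetical helper: builds a stream context describing an HTTP POST.
function make_post_context(array $fields, int $timeoutSeconds = 30)
{
    return stream_context_create(array(
        'http' => array(
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n"
                       . "Referer: http://www.bkjia.com/\r\n",
            'content' => http_build_query($fields), // e.g. "a=1&b=2"
            'timeout' => $timeoutSeconds,
        ),
    ));
}

// Hypothetical helper: returns the page body as a string, or false on failure.
// The @ operator suppresses the warning PHP raises when the request fails.
function fetch_with_fgc(string $url, array $fields)
{
    return @file_get_contents($url, false, make_post_context($fields));
}
```

For example, fetch_with_fgc('http://www.bkjia.com', array('a' => 1, 'b' => 2)) would send the same form data as the cURL example, without requiring the curl extension.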
Notes
cURL is efficient and can run multiple transfers concurrently, but it requires the curl extension to be enabled. The following steps enable it on Windows:
1. Copy the three files php_curl.dll, libeay32.dll, and ssleay32.dll from the PHP folder to system32;
2. Remove the leading semicolon from the line ;extension=php_curl.dll in php.ini (in the c:\WINDOWS directory);
3. Restart Apache or IIS.
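After the restart, a short script can confirm whether the extension was actually loaded. This is only a sketch; the helper name curl_is_available() is an assumption made for illustration.

```php
<?php
// Hypothetical helper: reports whether the curl extension is usable.
function curl_is_available(): bool
{
    return extension_loaded('curl') && function_exists('curl_init');
}

if (curl_is_available()) {
    $info = curl_version(); // array describing the linked libcurl
    echo 'cURL is enabled, libcurl version ' . $info['version'] . PHP_EOL;
} else {
    echo 'cURL is not enabled; check the extension line in php.ini' . PHP_EOL;
}
```

The output of phpinfo() also lists a curl section when the extension is active.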