Fetching the content of a web page with PHP is very useful in real-world development, for example when building a simple content collector or extracting part of a page.
The fetched content can then be filtered with regular expressions to pull out the parts you want. How to write those regular expressions is not covered here. Below are several commonly used ways to fetch web page content with PHP.
1.file_get_contents
<?php
$url = "http://www.jb51.net";
$contents = file_get_contents($url);
// If Chinese characters come out garbled, convert the encoding with the line below
// $contents = iconv("gb2312", "utf-8", $contents);
echo $contents;
?>
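The basic call above has no timeout and emits a warning when the request fails. As a minimal sketch, assuming you want a reusable helper, the function below passes a stream context with a timeout and returns null on failure. The name `fetch_page`, its parameters, and the user agent string are hypothetical and not part of the original article.

```php
<?php
// Hypothetical helper (not from the original article): fetch a URL with
// file_get_contents, a timeout, and a null return value on failure.
function fetch_page(string $url, int $timeout = 5, ?string $sourceCharset = null): ?string
{
    // HTTP context options; these only apply to http:// and https:// URLs.
    $context = stream_context_create([
        'http' => [
            'method'     => 'GET',
            'timeout'    => $timeout,              // read timeout in seconds
            'user_agent' => 'example-fetcher/1.0', // assumed value, change as needed
        ],
    ]);

    // Suppress the warning and check the return value instead.
    $contents = @file_get_contents($url, false, $context);
    if ($contents === false) {
        return null;
    }

    // Optionally convert the page to UTF-8, e.g. from "GB2312".
    if ($sourceCharset !== null) {
        $converted = @iconv($sourceCharset, 'UTF-8', $contents);
        if ($converted !== false) {
            $contents = $converted;
        }
    }

    return $contents;
}
```

A call such as `fetch_page("http://www.jb51.net", 5, "GB2312")` would then return the page as a UTF-8 string, or null if the request failed.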
2.curl
<?php
$url = "http://www.jb51.net";
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
// For pages that require user authentication, add the following two lines
// (US_NAME and US_PWD are constants holding the user name and password)
// curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
// curl_setopt($ch, CURLOPT_USERPWD, US_NAME.":".US_PWD);
$contents = curl_exec($ch);
curl_close($ch);
echo $contents;
?>
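The curl example above echoes whatever curl_exec returns, which is false when the request fails. A minimal sketch of the same request with error reporting could look like the following. The function name `curl_fetch`, the shape of the returned array, and the extra options (overall timeout, redirect following) are assumptions added for illustration.

```php
<?php
// Hypothetical helper (not from the original article): fetch a URL with curl
// and report the HTTP status code and any transport error.
function curl_fetch(string $url, int $timeout = 5): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,         // return the body instead of printing it
        CURLOPT_CONNECTTIMEOUT => $timeout,     // seconds allowed for connecting
        CURLOPT_TIMEOUT        => $timeout * 2, // assumed overall limit for the request
        CURLOPT_FOLLOWLOCATION => true,         // follow redirects
        CURLOPT_MAXREDIRS      => 5,
    ]);

    $body = curl_exec($ch);

    $result = [
        'ok'     => $body !== false,
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE), // 0 if no HTTP response
        'body'   => $body === false ? null : $body,
        'error'  => $body === false ? curl_error($ch) : null,
    ];

    // The handle is released when $ch goes out of scope at the end of the function.
    return $result;
}
```

The caller can check `$result['ok']` and `$result['status']` before using `$result['body']`, rather than echoing the return value directly.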
3.fopen->fread->fclose
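<?php
$handle = fopen("http://www.jb51.net", "rb");
// fopen returns false if the URL cannot be opened
if ($handle === false) {
    exit("Unable to open the URL");
}
$contents = "";
// Read the response in chunks of up to 1024 bytes until nothing is left
do {
    $data = fread($handle, 1024);
    if ($data === false || strlen($data) == 0) {
        break;
    }
    $contents .= $data;
} while (true);
fclose($handle);
echo $contents;
?>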
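Because this method reads the response piece by piece, the loop can stop early instead of loading the entire page. The sketch below illustrates that idea by capping the number of bytes read. The function name `read_up_to` and its parameters are hypothetical and not part of the original article.

```php
<?php
// Hypothetical helper (not from the original article): read at most $maxBytes
// from a URL or file with fopen/fread, in chunks of $chunkSize bytes.
function read_up_to(string $url, int $maxBytes, int $chunkSize = 1024): ?string
{
    $handle = @fopen($url, 'rb');
    if ($handle === false) {
        return null; // could not open the URL or file
    }

    $contents = '';
    while (!feof($handle) && strlen($contents) < $maxBytes) {
        // Never request more than the remaining budget.
        $data = fread($handle, min($chunkSize, $maxBytes - strlen($contents)));
        if ($data === false) {
            break;
        }
        $contents .= $data;
    }

    fclose($handle);
    return $contents;
}
```

For instance, `read_up_to("http://www.jb51.net", 4096)` would return at most the first 4096 bytes of the page.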
Note:
1. Using file_get_contents or fopen on a URL requires allow_url_fopen to be enabled. To enable it, edit php.ini and set allow_url_fopen = On. When allow_url_fopen is off, neither fopen nor file_get_contents can open remote files.
2. To use curl, your hosting environment must have the curl extension enabled. On Windows, edit php.ini and remove the semicolon in front of extension=php_curl.dll, then copy ssleay32.dll and libeay32.dll to C:\WINDOWS\system32; on Linux, install the curl extension.
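Both requirements can also be checked at runtime before choosing a method. The following is a minimal sketch, assuming a hypothetical helper named `available_fetch_methods` that is not part of the original article; it relies only on the standard `ini_get` and `extension_loaded` functions.

```php
<?php
// Hypothetical helper (not from the original article): list which of the
// fetching methods described above are usable in the current PHP setup.
function available_fetch_methods(): array
{
    $methods = [];

    // file_get_contents and fopen need allow_url_fopen for remote URLs.
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        $methods[] = 'file_get_contents';
        $methods[] = 'fopen';
    }

    // The curl functions need the curl extension to be loaded.
    if (extension_loaded('curl')) {
        $methods[] = 'curl';
    }

    return $methods;
}
```

Printing the result with `print_r(available_fetch_methods());` shows at a glance which settings still need to be changed in php.ini.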