Sample code 1: Use file_get_contents to fetch content with a GET request
The code is as follows:
<?php
// ec(), printhr() and printarr() are the author's output helpers;
// printarr() is defined at the end of this article.
$url = 'http://www.baidu.com/';
$html = file_get_contents($url);
//print_r($http_response_header);
ec($html);
printhr();
printarr($http_response_header);
printhr();
?>
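file_get_contents also accepts a stream context, which among other things lets you set a read timeout on the GET request. A minimal hedged sketch using the same URL:
<?php
// Sketch: 'timeout' and 'user_agent' are standard http stream-context options.
$ctx = stream_context_create(array(
    'http' => array(
        'timeout'    => 5,             // give up after 5 seconds
        'user_agent' => 'Mozilla/5.0', // some servers reject an empty agent
    ),
));
$html = file_get_contents('http://www.baidu.com/', false, $ctx);
?>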
Sample code 2: Use fopen to open the URL and fetch the content with a GET request
The code is as follows:
<?php
$url = 'http://www.baidu.com/';
$fp = fopen($url, 'r');
printarr(stream_get_meta_data($fp));
printhr();
$result = '';
while (!feof($fp)) {
    $result .= fgets($fp, 1024);
}
echo "url body: $result";
printhr();
fclose($fp);
?>
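Both samples above rely on allow_url_fopen being enabled in php.ini. A quick runtime check (a sketch, not part of the original samples):
<?php
if (!ini_get('allow_url_fopen')) {
    die("allow_url_fopen is disabled in php.ini; use curl or fsockopen instead.");
}
?>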
Sample code 3: Use file_get_contents to fetch a URL with a POST request
The code is as follows:
<?php
$data = array('foo' => 'bar');
$data = http_build_query($data);
$opts = array(
    'http' => array(
        'method'  => 'POST',
        'header'  => "Content-type: application/x-www-form-urlencoded\r\n" .
                     "Content-Length: " . strlen($data) . "\r\n",
        'content' => $data,
    ),
);
$context = stream_context_create($opts);
$html = file_get_contents('http://localhost/e/admin/test.html', false, $context);
echo $html;
?>
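A small hedged addition to sample 3: when the http:// wrapper is used, PHP fills in the magic $http_response_header array in the calling scope, which lets you check the response status after the POST. A sketch, reusing the $context built above:
<?php
$html = file_get_contents('http://localhost/e/admin/test.html', false, $context);
if ($html === false) {
    die("POST request failed");
}
echo $http_response_header[0]; // status line, e.g. "HTTP/1.1 200 OK"
?>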
Sample code 4: Use fsockopen to open the URL and fetch the complete data with a GET request, including header and body
The code is as follows:
<?php
function get_url($url, $cookie = false) {
    $url = parse_url($url);
    $query = $url['path'] . (isset($url['query']) ? "?" . $url['query'] : "");
    echo "Query: " . $query;
    $fp = fsockopen($url['host'], isset($url['port']) ? $url['port'] : 80, $errno, $errstr, 30);
    if (!$fp) {
        return false;
    } else {
        $request  = "GET $query HTTP/1.1\r\n";
        $request .= "Host: {$url['host']}\r\n";
        $request .= "Connection: Close\r\n";
        if ($cookie) $request .= "Cookie: $cookie\r\n";
        $request .= "\r\n";
        fwrite($fp, $request);
        $result = '';
        while (!@feof($fp)) {
            $result .= @fgets($fp, 1024);
        }
        fclose($fp);
        return $result;
    }
}
// Get the HTML body of the URL, with the header removed
function GetUrlHTML($url, $cookie = false) {
    $rowdata = get_url($url, $cookie);
    if ($rowdata) {
        $body = stristr($rowdata, "\r\n\r\n");
        $body = substr($body, 4);
        return $body;
    }
    return false;
}
?>
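A usage sketch for the two functions above (the URL is illustrative):
<?php
// get_url() returns headers + body; GetUrlHTML() returns the body only.
$raw  = get_url("http://www.baidu.com/");
$body = GetUrlHTML("http://www.baidu.com/");
if ($body !== false) {
    echo $body;
}
?>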
Sample code 5: Use fsockopen to open the URL and fetch the complete data with a POST request, including header and body
The code is as follows:
<?php
function HTTP_Post($URL, $data, $cookie, $referrer = "") {
    // parse the given URL
    $URL_Info = parse_url($URL);
    // build the referrer: if not given, use a placeholder
    if ($referrer == "")
        $referrer = "111";
    // build the query string from $data
    $values = array();
    foreach ($data as $key => $value)
        $values[] = "$key=" . urlencode($value);
    $data_string = implode("&", $values);
    // find out which port is needed - if not given, use the standard port (80)
    if (!isset($URL_Info["port"]))
        $URL_Info["port"] = 80;
    // build the POST request:
    $request  = "POST " . $URL_Info["path"] . " HTTP/1.1\r\n";
    $request .= "Host: " . $URL_Info["host"] . "\r\n";
    $request .= "Referer: $referrer\r\n";
    $request .= "Content-type: application/x-www-form-urlencoded\r\n";
    $request .= "Content-length: " . strlen($data_string) . "\r\n";
    $request .= "Connection: close\r\n";
    $request .= "Cookie: $cookie\r\n";
    $request .= "\r\n";
    $request .= $data_string . "\r\n";
    $fp = fsockopen($URL_Info["host"], $URL_Info["port"]);
    fputs($fp, $request);
    $result = '';
    while (!feof($fp)) {
        $result .= fgets($fp, 1024);
    }
    fclose($fp);
    return $result;
}
?>
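A usage sketch for HTTP_Post() (the URL and form field are illustrative; pass an empty string when no cookie is needed):
<?php
$result = HTTP_Post("http://localhost/e/admin/test.html",
                    array('foo' => 'bar'),   // form data
                    "");                     // no cookie
echo $result; // full response: headers + body
?>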
Sample code 6: Use the curl library. Before using it, you may need to check php.ini to confirm that the curl extension has been enabled.
The code is as follows:
<?php
$ch = curl_init();
$timeout = 5;
curl_setopt ($ch, CURLOPT_URL, 'http://www.baidu.com/');
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$file_contents = curl_exec($ch);
curl_close($ch);
echo $file_contents;
?>
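To check for the extension at runtime rather than by reading php.ini, a small sketch:
<?php
// Sketch: verify the curl extension is available before using it.
if (!extension_loaded('curl')) {
    die("The curl extension is not enabled; enable it in php.ini first.");
}
?>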
About the curl library:
curl official website: http://curl.haxx.se/
curl is a tool for transferring files with URL syntax, supporting FTP, FTPS, HTTP, HTTPS, SCP, SFTP, TFTP, TELNET, DICT, FILE and LDAP. curl supports SSL certificates, HTTP POST, HTTP PUT, FTP upload, Kerberos, HTTP form-based upload, proxies, cookies, user+password authentication, file transfer resume, HTTP proxy tunneling and many other useful tricks.
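Since curl supports HTTP POST, here is a minimal hedged sketch that redoes sample 3 with curl (the localhost URL and the foo field are the same illustrative values as in sample 3):
<?php
$ch = curl_init('http://localhost/e/admin/test.html');
curl_setopt($ch, CURLOPT_POST, true);                  // send a POST request
curl_setopt($ch, CURLOPT_POSTFIELDS,
            http_build_query(array('foo' => 'bar'))); // urlencoded body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);        // return instead of printing
$response = curl_exec($ch);
if ($response === false) {
    echo curl_error($ch);
}
curl_close($ch);
echo $response;
?>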
Finally, the printarr() output helper used in the samples above; the code is as follows:
<?php
// Assumes it runs in a web page, so it uses <br> for line breaks
// (the HTML tags were stripped in the original extraction).
function printarr(array $arr)
{
    echo "<br>";
    foreach ($arr as $key => $value) {
        echo "$key=$value<br>";
    }
}
?>
=========================================================
PHP code for fetching data from a remote website
Many programming enthusiasts run into the same question: how do you crawl the HTML of other sites the way a search engine does, and then collect and organize that code into data that is useful to you? Here are some simple examples.
Ⅰ. Example: grabbing the title of a remote web page
The code is as follows:
<?php
/*
+------------------------------------------------------------
+ Grab a web page title. Copy this snippet into a .php file and run it.
+------------------------------------------------------------
*/
error_reporting(7);
$file = fopen("http://www.dnsing.com/", "r");
if (!$file) {
    echo "Unable to open remote file.\n";
    exit;
}
while (!feof($file)) {
    $line = fgets($file, 1024);
    // eregi() was removed in PHP 7; preg_match() with the /i flag is the
    // modern equivalent. The <title> tags were lost in the original
    // extraction and are reconstructed here.
    if (preg_match("/<title>(.*)<\/title>/i", $line, $out)) {
        $title = $out[1];
        echo $title;
        break;
    }
}
fclose($file);
//End
?>
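Note that the line-by-line loop above misses a title that spans multiple lines. A hedged alternative that matches against the whole document instead:
<?php
$html = file_get_contents("http://www.dnsing.com/");
if ($html !== false && preg_match("/<title>(.*?)<\/title>/is", $html, $out)) {
    echo $out[1]; // the /s flag lets the title span line breaks
}
?>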
Ⅱ. Example: grabbing the HTML code of a remote web page
The code is as follows:
<?php
/*
+----------------
+ DNSing Spider
+----------------
*/
$fp = fsockopen("www.dnsing.com", 80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)\n";
} else {
    $out  = "GET / HTTP/1.1\r\n";
    $out .= "Host: www.dnsing.com\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fputs($fp, $out);
    while (!feof($fp)) {
        echo fgets($fp, 128);
    }
    fclose($fp);
}
//End
?>
Copy the two snippets above and run them to see the effect. They are only prototypes of web-page grabbing; since every use case is different, program enthusiasts should adapt and study them on their own.
=================================
The slightly more meaningful functions here are get_content_by_socket(), get_url(), get_content_url() and get_content_object(). They may give you some ideas.
The code is as follows:
<?php
// Fetch all content URLs and save them to a file
function get_index($save_file, $prefix="index_"){
$count = 68;
$i = 1;
if (file_exists($save_file)) @unlink($save_file);
$fp = fopen($save_file, "a+") or die("Open ". $save_file ." failed");
while($i<$count){
$url = $prefix . $i .".htm";
echo "Get ". $url ."...";
$url_str = get_content_url(get_url($url));
echo " OKn";
fwrite($fp, $url_str);
++$i;
}
fclose($fp);
}
// Fetch the target multimedia objects
function get_object($url_file, $save_file, $split="|--:**:--|"){
if (!file_exists($url_file)) die($url_file ." not exist");
$file_arr = file($url_file);
if (!is_array($file_arr) || empty($file_arr)) die($url_file ." not content");
$url_arr = array_unique($file_arr);
if (file_exists($save_file)) @unlink($save_file);
$fp = fopen($save_file, "a+") or die("Open save file ". $save_file ." failed");
foreach($url_arr as $url){
if (empty($url)) continue;
echo "Get ". $url ."...";
$html_str = get_url($url);
echo $html_str;
echo $url;
exit;
$obj_str = get_content_object($html_str);
echo " OKn";
fwrite($fp, $obj_str);
}
fclose($fp);
}
// Traverse a directory and read the content of each file
function get_dir($save_file, $dir){
$dp = opendir($dir);
if (file_exists($save_file)) @unlink($save_file);
$fp = fopen($save_file, "a+") or die("Open save file ". $save_file ." failed");
while(($file = readdir($dp)) !== false){ // strict comparison, in case a file is named "0"
if ($file!="." && $file!=".."){
echo "Read file ". $file ."...";
$file_content = file_get_contents($dir . $file);
$obj_str = get_content_object($file_content);
echo " OKn";
fwrite($fp, $obj_str);
}
}
fclose($fp);
}
// Fetch the content of the given URL
function get_url($url){
$reg = '/^http:\/\/[^\/].+$/'; // the escaping backslashes were lost in extraction and are restored here
if (!preg_match($reg, $url)) die($url ." invalid");
$fp = fopen($url, "r") or die("Open url: ". $url ." failed.");
while($fc = fread($fp, 8192)){
$content .= $fc;
}
fclose($fp);
if (empty($content)){
die("Get url: ". $url ." content failed.");
}
return $content;
}
// Fetch a given web page using a socket
function get_content_by_socket($url, $host){
$fp = fsockopen($host, 80) or die("Open ". $url ." failed");
$header = "GET /".$url ." HTTP/1.1rn";
$header .= "Accept: */*rn";
$header .= "Accept-Language: zh-cnrn";
$header .= "Accept-Encoding: gzip, deflatern";
$header .= "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Maxthon; InfoPath.1; .NET CLR 2.0.50727)rn";
$header .= "Host: ". $host ."rn";
$header .= "Connection: Keep-Alivern";
//$header .= "Cookie: cnzz02=2; rtime=1; ltime=1148456424859; cnzz_eid=56601755-rnrn";
$header .= "Connection: Closernrn";
fwrite($fp, $header);
while (!feof($fp)) {
$contents .= fgets($fp, 8192);
}
fclose($fp);
return $contents;
}
// Extract the URLs contained in the given content
function get_content_url($host_url, $file_contents){
//$reg = '/^(#|javascript.*?|ftp:\/\/.+|http:\/\/.+|.*?href.*?|play.*?|index.*?|.*?asp)+$/i';
//$reg = '/^(down.*?\.html|\d+_\d+\.htm.*?)$/i';
// The escaping backslashes in these patterns were lost in extraction and are restored here.
$rex = "/([hH][rR][eE][Ff])\s*=\s*['\"]*([^>'\"\s]+)[\"'>]*\s*/i";
$reg = '/^(down.*?\.html)$/i';
preg_match_all ($rex, $file_contents, $r);
$result = ""; //array();
foreach($r as $c){
if (is_array($c)){
foreach($c as $d){
if (preg_match($reg, $d)){ $result .= $host_url . $d."n"; }
}
}
}
return $result;
}
// Extract the multimedia file references from the given content
function get_content_object($str, $split="|--:**:--|"){
$regx = "/hrefs*=s*['"]*([^>'"s]+)["'>]*s*(.*?)/i";
preg_match_all($regx, $str, $result);
if (count($result) == 3){
$result[2] = str_replace("多媒体: ", "", $result[2]);
$result[2] = str_replace("", "", $result[2]);
$result = $result[1][0] . $split .$result[2][0] . "n";
}
return $result;
}
?>
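A usage sketch for get_content_by_socket() above: the function prepends "GET /" itself, so pass the path without a leading slash (host and path are illustrative):
<?php
$contents = get_content_by_socket("index.htm", "www.baidu.com");
echo $contents;
?>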
=========================================================
When the same domain name corresponds to multiple IPs: a PHP function to fetch remote page content from a specific server
file_get_contents (fgc) simply reads the page and encapsulates the whole operation.
fopen adds some encapsulation too, but you have to read in a loop to get all the data.
fsockopen is a bare socket operation.
If you just need to read an HTML page, file_get_contents is the better choice.
If your company accesses the Internet through a firewall, a plain file_get_contents call generally will not work. It is possible to write HTTP requests to the proxy directly through socket operations, but that is more troublesome; plain streams can also be pointed at a proxy via a stream context, as sketched below.
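A hedged sketch of the stream-context approach; the proxy address is an illustrative placeholder:
<?php
// Route file_get_contents through an HTTP proxy. The 'proxy' and
// 'request_fulluri' options of the http stream context are standard PHP.
$ctx = stream_context_create(array(
    'http' => array(
        'proxy'           => 'tcp://192.168.0.1:8080',
        'request_fulluri' => true, // most proxies require the full URI in the request line
    ),
));
$html = file_get_contents('http://www.baidu.com/', false, $ctx);
?>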
If you can confirm the file is small, either of the two methods above works, e.g. fopen or join('', file($file)); for files under about 1 KB, file_get_contents is best.
If the file is large, or you cannot determine its size, file streams are best: fopen on a 1 KB file and fopen on a 1 GB file cost about the same to open, and longer content simply takes longer to read instead of killing the script. A chunked-read sketch follows.
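A minimal sketch of that stream approach, reading in fixed-size chunks (the URL is illustrative):
<?php
$fp = fopen('http://www.baidu.com/', 'r');
if ($fp) {
    while (!feof($fp)) {
        $chunk = fread($fp, 8192); // 8 KB at a time
        // process $chunk here instead of accumulating the whole file
    }
    fclose($fp);
}
?>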
---------------------------------------------------------
http://www.phpcake.cn/archives/tag/fsockopen
PHP has many ways to fetch remote web content, such as the built-in file_get_contents and fopen functions.
The code is as follows:
<?php
echo file_get_contents("http://blog.s135.com/abc.php");
?>
However, under load balancing such as DNS round-robin, the same domain name may correspond to multiple servers and multiple IPs. Assume blog.s135.com is resolved by DNS to three IP addresses: 72.249.146.213, 72.249.146.214 and 72.249.146.215. Each time a user accesses blog.s135.com, the system picks one of the servers according to the load-balancing algorithm.
Last week, while working on a video project, I ran into this requirement: I needed to access a PHP interface program (call it abc.php) on each server in turn, to query each server's transfer status.
In this case you cannot simply point file_get_contents at http://blog.s135.com/abc.php, because it may keep hitting the same server repeatedly.
Visiting http://72.249.146.213/abc.php, http://72.249.146.214/abc.php and http://72.249.146.215/abc.php in sequence does not work either when the web servers on those three machines host multiple virtual hosts.
Nor can you set the local hosts file, because hosts cannot map the same domain name to multiple IPs.
The only way is through PHP and the HTTP protocol itself: when accessing abc.php, send the blog.s135.com domain name in the Host header. So I wrote the following PHP function:
The code is as follows:
<?php
/************************************
 * Function: when the same domain name corresponds to multiple IPs,
 *           fetch the remote page content from the specified server.
 * Parameters:
 *   $ip    server IP address
 *   $host  server host name
 *   $url   server URL path (excluding the domain name)
 * Return value:
 *   the fetched remote page content, or false if the access failed
 ************************************/
function HttpVisit($ip, $host, $url)
{
$errstr = '';
$errno = '';
$fp = fsockopen ($ip, 80, $errno, $errstr, 90);
if (!$fp)
{
return false;
}
else
{
$out = "GET {$url} HTTP/1.1rn";
$out .= "Host:{$host}rn";
$out .= "Connection: closernrn";
fputs ( $fp, $out);
while($line = fread($fp, 4096)){
$response .= $line;
}
fclose( $fp );
//Remove Header information
$pos = strpos($response, "rnrn");
$response = substr($response, $pos + 4);
return $response;
}
}
//Calling method:
$server_info1 = HttpVisit("72.249.146.213", "blog.s135.com", "/abc.php");
$server_info2 = HttpVisit("72.249.146.214", "blog.s135.com", "/abc.php");
$server_info3 = HttpVisit("72.249.146.215", "blog.s135.com", "/abc.php");
?>