
Classic PHP methods for fetching network data

WBOY
Release: 2016-07-29 08:53:38

1. file_get_contents

Fetch data with a GET request

$url = 'http://blog.csdn.net/guugle2010';
$html = file_get_contents($url);
echo $html;

Send data with a POST request

        $data = array(
                'name' => 'guugle',
                'blog' => 'blog.csdn.net/guugle2010'
        );      
        $data = http_build_query($data);
        $options = array(
                'http' => array(
                        'method' => 'POST',
                        'header' => 'Content-type: application/x-www-form-urlencoded',
                        'content' => $data
                )
        );
        $url = "http://localhost/test.php";
        $context = stream_context_create($options);
        $result = file_get_contents($url, false, $context);
        echo $result;
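The POST example above does not set a timeout or check for failure. The following is a minimal sketch of how that could be added, assuming a hypothetical helper named build_post_options (not part of the original article). It uses the documented 'timeout' and 'ignore_errors' HTTP context options; file_get_contents returns false when the request fails.

```php
<?php
// Hypothetical helper (not from the original article): builds the options
// array for stream_context_create() for a form-encoded POST.
function build_post_options(array $fields, $timeoutSeconds = 5)
{
    $body = http_build_query($fields);
    return array(
        'http' => array(
            'method'        => 'POST',
            'header'        => "Content-Type: application/x-www-form-urlencoded\r\n"
                             . 'Content-Length: ' . strlen($body) . "\r\n",
            'content'       => $body,
            'timeout'       => $timeoutSeconds, // read timeout in seconds
            'ignore_errors' => true,            // still return the body on 4xx/5xx
        ),
    );
}

$options = build_post_options(array('name' => 'guugle'));
$context = stream_context_create($options);

// Uncomment to send the request; the URL is the placeholder used above.
// $result = file_get_contents('http://localhost/test.php', false, $context);
// if ($result === false) {
//     echo "Request failed\n";
// }
```

Keeping the option building in its own function makes it possible to inspect the request settings without touching the network.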
2. fopen method
$url = 'http://blog.csdn.net/guugle2010';
$handle = fopen($url, 'r');
$html = '';
while(!feof($handle)){
    $html .= fgets($handle);
}
echo $html;
fclose($handle);
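fopen returns false when the URL cannot be opened, and the read loop above cannot work with an invalid handle. A minimal sketch of a guarded version follows, assuming a hypothetical helper named read_all; it reads the stream with stream_get_contents instead of a manual fgets loop.

```php
<?php
// Hypothetical helper: read a whole stream, or return null if it cannot be opened.
function read_all($url)
{
    // @ suppresses the warning; failure is reported through the return value.
    $handle = @fopen($url, 'r');
    if ($handle === false) {
        return null;
    }
    $contents = stream_get_contents($handle);
    fclose($handle);
    return $contents === false ? null : $contents;
}

// Usage with the URL from the example above (commented out: needs network access).
// $html = read_all('http://blog.csdn.net/guugle2010');
// echo $html === null ? "Could not open URL\n" : $html;
```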
3. cURL library (the curl extension must be enabled)
$url = 'http://blog.csdn.net/guugle2010';
$ch = curl_init();
$timeout = 5;
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$file_contents = curl_exec($ch);
curl_close($ch);
 
echo $file_contents;
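curl_exec returns false on failure, which the snippet above would print as an empty string. The sketch below, built around a hypothetical wrapper named fetch_with_curl, shows one way to report the HTTP status code and the error message using curl_getinfo and curl_error. The overall CURLOPT_TIMEOUT is an addition that the original example does not set.

```php
<?php
// Hypothetical wrapper around the cURL calls shown above.
function fetch_with_curl($url, $timeout = 5)
{
    $ch = curl_init();
    curl_setopt_array($ch, array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => $timeout, // seconds allowed for connecting
        CURLOPT_TIMEOUT        => $timeout, // seconds allowed for the whole transfer
    ));
    $body = curl_exec($ch);

    return array(
        'ok'     => $body !== false,
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE), // 0 if no response arrived
        'error'  => curl_error($ch),                       // '' when there was no error
        'body'   => $body === false ? '' : $body,
    );
}

// Usage (commented out: needs network access).
// $r = fetch_with_curl('http://blog.csdn.net/guugle2010');
// echo $r['ok'] ? $r['body'] : "cURL error: {$r['error']}\n";
```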

4. Open the connection with fsockopen

Fetch the complete response (headers and body) with a GET request

$url = 'http://blog.csdn.net/guugle2010';
function get_url($url,$cookie=false)
{
    $url = parse_url($url);
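    // fall back to "/" when the URL has no path, and append the query string only if present
    $query = (isset($url['path']) ? $url['path'] : '/') . (isset($url['query']) ? '?'.$url['query'] : '');
    $fp = fsockopen($url['host'], isset($url['port']) ? $url['port'] : 80, $errno, $errstr, 30);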
    if (!$fp)
    {
        return false;
    }
    else {
        $request = "GET $query HTTP/1.1\r\n";
        $request .= "Host: $url[host]\r\n";
        $request .= "Connection: Close\r\n";
        if($cookie) $request.="Cookie: $cookie\r\n";
        $request.="\r\n";
        fwrite($fp,$request);
        $result = '';
        while(!feof($fp))
        {
            $result .= @fgets($fp, 1024);
        }
        fclose($fp);
        return $result;
    }
}
// Get the HTML part of the response for a URL, with the header stripped
function GetUrlHTML($url,$cookie=false)
{
    $rowdata = get_url($url,$cookie);
    if($rowdata)
    {
        $body= stristr($rowdata,"\r\n\r\n");
        $body=substr($body,4,strlen($body));
        return $body;
    }

    return false;
}

echo get_url($url);

echo GetUrlHTML($url);
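GetUrlHTML above discards the header block. When the status line or individual headers are needed as well, the raw response can be split at the first empty line. The following is a minimal sketch with a hypothetical helper named split_http_response. It assumes the body is not chunked; a server answering an HTTP/1.1 request may use chunked transfer encoding, which this sketch does not decode.

```php
<?php
// Hypothetical helper: split a raw HTTP response into status line, headers and body.
function split_http_response($raw)
{
    // The first blank line (CRLF CRLF) separates the header block from the body.
    $parts = explode("\r\n\r\n", $raw, 2);
    $headerLines = explode("\r\n", $parts[0]);
    $statusLine = array_shift($headerLines);

    $headers = array();
    foreach ($headerLines as $line) {
        $pair = explode(':', $line, 2);
        if (count($pair) === 2) {
            // Header names are case-insensitive, so normalise them to lower case.
            $headers[strtolower(trim($pair[0]))] = trim($pair[1]);
        }
    }

    return array(
        'status'  => $statusLine,
        'headers' => $headers,
        'body'    => isset($parts[1]) ? $parts[1] : '',
    );
}

$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>hi</html>";
$response = split_http_response($raw);
echo $response['status'], "\n";                  // HTTP/1.1 200 OK
echo $response['headers']['content-type'], "\n"; // text/html
echo $response['body'], "\n";                    // <html>hi</html>
```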

Fetch the complete response (headers and body) with a POST request
$url = 'http://blog.csdn.net/guugle2010';
function HTTP_Post($URL,$data,$cookie, $referer="")
{

	// parsing the given URL
    $URL_Info=parse_url($URL);

	// Building referrer
    if($referer=="") // if not given use this script as referrer
        $referer="blog.csdn.net";

	// making string from $data
    $values = array();
    foreach($data as $key => $value) {
        $values[] = urlencode($key)."=".urlencode($value);
    }
    $data_string=implode("&",$values);

	// Find out which port is needed - if not given use standard (=80)
    if(!isset($URL_Info["port"]))
        $URL_Info["port"]=80;
	
	$request = '';
	// building POST-request:
    $request.="POST ".$URL_Info["path"]." HTTP/1.1\r\n";
    $request.="Host: ".$URL_Info["host"]."\r\n";
    $request.="Referer: $referer\r\n";
    $request.="Content-type: application/x-www-form-urlencoded\r\n";
    $request.="Content-length: ".strlen($data_string)."\r\n";
    $request.="Connection: close\r\n";

    // send the Cookie header only when a cookie was supplied
    if($cookie!="") $request.="Cookie: $cookie\r\n";

    // an empty line ends the headers; the body follows with no trailing newline
    $request.="\r\n";
    $request.=$data_string;

    $fp = fsockopen($URL_Info["host"],$URL_Info["port"]);
    if(!$fp) return false; // connection failed
    fputs($fp, $request);
    $result = '';
    while(!feof($fp))
    {
        $result .= fgets($fp, 1024);
    }
    fclose($fp);

    return $result;
}

$data = array(
	'site'=>'blog.csdn.net/guugle2010',
	'name'=>'guugle'); 
	
$cookie = '';
$referer = 'http://blog.csdn.net/';
	
echo HTTP_Post($url, $data, $cookie, $referer);
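In a hand-built request, each header line ends with CRLF and Content-Length must equal the exact byte length of the body. The sketch below separates the request building from the socket code so that the text can be inspected before it is sent. The helper name build_post_request is hypothetical, and http_build_query (also used in section 1) replaces the manual encoding loop.

```php
<?php
// Hypothetical helper: build the raw text of a form-encoded POST request.
function build_post_request($host, $path, array $fields, $cookie = '')
{
    $body = http_build_query($fields); // URL-encodes keys and values

    $lines = array(
        "POST $path HTTP/1.1",
        "Host: $host",
        'Content-Type: application/x-www-form-urlencoded',
        'Content-Length: ' . strlen($body),
        'Connection: close',
    );
    if ($cookie !== '') {
        $lines[] = "Cookie: $cookie";
    }

    // Header lines end with CRLF; an empty line separates headers from body.
    return implode("\r\n", $lines) . "\r\n\r\n" . $body;
}

echo build_post_request('blog.csdn.net', '/guugle2010', array('name' => 'guugle'));
```

The returned string can be passed to fputs on a socket opened with fsockopen, as in HTTP_Post above.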

These are the classic ways to fetch network data in PHP: file_get_contents, fopen, the cURL library, and fsockopen.

source:php.cn