
Scraping page content with PHP

Release: 2016-06-23 14:17:03

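I first scrape the useful links from one page, then scrape the content behind each of those links in a for loop. The loop fails on its second iteration, though.
I've thought about it for a long time and can't work out where the problem is. Could someone please take a look?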

$url = 'http://www.meishij.net/chufang/diy/?page=1#listnav';
$opts = array(
  'http' => array(
    'user_agent' => "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)",
  )
);
$context = stream_context_create($opts);
$con = file_get_contents("$url", false, $context);

// List entries
$preg = '#<strong class="title"><a target="_blank" title="(.*)" href="(.*)">(.*)</a></strong>#';
preg_match_all($preg, $con, $arr); // store the matches in an array

for ($i = 0; $i < 20; $i++) // entries on a single page
{
    //print_r($arr[0][$i]);
    $ss = $arr[2][$i];
    echo $ss;
    echo "</br>";

    $opts = array(
      'http' => array(
        'user_agent' => "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)",
      )
    );
    $context = stream_context_create($opts);
    $cons = file_get_contents("$ss", false, $context);

    // Title
    $preg = '#<h2 class="cpc_h2">(.*)</h2>#';
    preg_match_all($preg, $cons, $arr); // store the matches in an array
    print_r($arr[0][0]);

    // Content
    $preg = '#<p><em class=(.*)>(.*)</em>(.*)</p>#';
    preg_match_all($preg, $cons, $arr); // store the matches in an array
    print_r($arr[0][0]); echo "</br>";
    print_r($arr[0][1]); echo "</br>";
    print_r($arr[0][2]); echo "</br>";

    // Images
    $preg = '#<p><img class="conimg" src="(.*)" alt="(.*)" width="(.*)" height="(.*)" /></p>#';
    preg_match_all($preg, $cons, $arr); // store the matches in an array
    print_r($arr[0][0]); echo "</br>";
    print_r($arr[0][1]); echo "</br>";
    print_r($arr[0][2]); echo "</br>";
    print_r($arr[0][3]); echo "</br>";
    print_r($arr[0][4]); echo "</br>";
    print_r($arr[0][5]); echo "</br>";
    print_r($arr[0][6]);
}


Replies (Solutions)

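If any expert knows the answer, please tell me. Thanks.

Still looking for an answer: why doesn't the loop get past the first iteration?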

print_r($arr); // check what $arr contains
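A minimal sketch of what that check can reveal, assuming two made-up HTML strings in place of the real pages (the variable names `$listHtml` and `$pageHtml` and the simplified patterns are illustrative, not taken from the question). It shows that `preg_match_all` replaces the whole contents of the array it is given, so a second call with the same variable discards the earlier matches.

```php
<?php
// Hypothetical stand-ins for the list page and for one linked page.
$listHtml = '<a href="p1">A</a><a href="p2">B</a>';
$pageHtml = '<h2>Soup</h2>';

// First call: $arr[1] holds the links, $arr[2] the link texts.
preg_match_all('#<a href="(.*?)">(.*?)</a>#', $listHtml, $arr);
$before = $arr;

// Second call reuses $arr: the link matches are replaced by the <h2> matches.
preg_match_all('#<h2>(.*?)</h2>#', $pageHtml, $arr);
$after = $arr;

print_r($before);
print_r($after);
```

Printing `$arr` at the top of each loop iteration in the question's code would show the same kind of change between the first and the second pass.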

preg_match_all($preg, $con, $arr); // store the matches in an array

for ($i = 0; $i < 20; $i++)
{
    //print_r($arr[0][$i]);
    $ss = $arr[2][$i];
    echo $ss;
    echo "</br>";

    $opts = array(
      'http' => array(
        'user_agent' => "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)",
      )
    );
    $context = stream_context_create($opts);
    $cons = file_get_contents("$ss", false, $context);
    print_r($cons);
    die();

    // Title
    $preg = '#<h2 class="cpc_h2">(.*)</h2>#';
    preg_match_all($preg, $cons, $arr); // store the matches in an array
    print_r($arr[0][0]);

    // Content
    $preg = '#<p><em class=(.*)>(.*)</em>(.*)</p>#';
    preg_match_all($preg, $cons, $arr); // store the matches in an array
    print_r($arr[0][0]);
    echo "</br>";
}


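Look at $arr in the code above (marked in red in the original post): the array inside the loop has the same name as the one outside it. Won't that cause a problem?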
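One way to act on that observation is to keep the link list and the per-page matches in separate variables. The sketch below is only an illustration under some assumptions: the function names (`extractLinks`, `extractTitle`, `fetchPage`, `scrapeTitles`) are invented here, the patterns are the ones from the question made non-greedy with `(.*?)`, and only the title is extracted to keep it short. It also iterates over the links actually found with `foreach` instead of assuming there are always 20.

```php
<?php
// Return the href values of the list entries (pattern taken from the question,
// with non-greedy groups so each match stops at the first closing quote).
function extractLinks(string $html): array
{
    $pattern = '#<strong class="title"><a target="_blank" title="(.*?)" href="(.*?)">(.*?)</a></strong>#';
    preg_match_all($pattern, $html, $linkMatches);
    return $linkMatches[2]; // second capture group = href
}

// Return the text of the first <h2 class="cpc_h2"> on a page, or null if absent.
function extractTitle(string $html): ?string
{
    if (preg_match('#<h2 class="cpc_h2">(.*?)</h2>#', $html, $titleMatch)) {
        return $titleMatch[1];
    }
    return null;
}

// Download a page with the same user agent as in the question.
function fetchPage(string $url): string
{
    $context = stream_context_create([
        'http' => ['user_agent' => 'Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)'],
    ]);
    $html = file_get_contents($url, false, $context);
    return $html === false ? '' : $html;
}

// Fetch the list page, then every linked page. $fetch is any callable that
// maps a URL to HTML, so the loop can be tried without network access.
function scrapeTitles(string $listUrl, callable $fetch): array
{
    $links = extractLinks($fetch($listUrl)); // outer result, never overwritten below
    $titles = [];
    foreach ($links as $link) {
        $pageHtml = $fetch($link);
        $titles[$link] = extractTitle($pageHtml); // inner result has its own variable
    }
    return $titles;
}

// Usage (performs real HTTP requests):
// print_r(scrapeTitles('http://www.meishij.net/chufang/diy/?page=1#listnav', 'fetchPage'));
```

In the question's code the same effect can be had by simply renaming the inner arrays, for example using a different variable for each of the three `preg_match_all` calls inside the loop, so that `$arr[2][$i]` still refers to the link list on the second pass.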
source:php.cn