"/> ">

PHP获取页面中所有链接的正则

WBOY
Release: 2016-06-20 13:03:49
Original
1105 people have browsed it

总结了一个PHP获取页面中的所有链接的函数,函数代码如下:

<p>/*</p>*PHP获取页面中的所有链接<br />*/<br />function getPageLink($url){<br />	set_time_limit(0);<br />	$html=file_get_contents($url);<br />	preg_match_all("/<a(s*[^>]+s*)href=([\"|']?)([^\"'>\s]+)([\"|']?)/ies",$html,$out);<br />	$arrLink=$out[3];<br />	$arrUrl=parse_url($url);<br />	$dir='';<br />	if(isset($arrUrl['path'])&&!empty($arrUrl['path'])){<br />		$dir=str_replace("\\","/",$dir=dirname($arrUrl['path']));<br />		if($dir=="/"){<br />			$dir="";<br />		}<br />	}<br />	if(is_array($arrLink)&&count($arrLink)>0){<br />		$arrLink=array_unique($arrLink);<br />		foreach($arrLink as $key=>$val){<br />			$val=strtolower($val);<br />			if(preg_match('/^#*$/isU',$val)){<br />				unset($arrLink[$key]);<br />			}elseif(preg_match('/^\//isU',$val)){<br />				$arrLink[$key]='http://'.$arrUrl['host'].$val;<br />			}elseif(preg_match('/^javascript/isU',$val)){<br />				unset($arrLink[$key]);<br />			}elseif(preg_match('/^mailto:/isU',$val)){<br />				unset($arrLink[$key]);<br />			}elseif(!preg_match('/^\//isU',$val)&&strpos($val,'http://')===FALSE){<br />				$arrLink[$key]='http://'.$arrUrl['host'].$dir.'/'.$val;<br />			}<br />		}<br />	}<br />	sort($arrLink);<br />	return $arrLink;<br />}
Copy after login

函数用法如下:

<p>$links=getPageLink('http://www.scutephp.com');</p>echo "<pre class="brush:php;toolbar:false">";<br />print_r($links);
Copy after login


Related labels:
source:php.cn
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template