From: http://www.uphtm.com/php/253.html
Developers use this technique fairly often; we used it in a project to collect friendly links from other websites. Today I saw that a friend had put together a PHP snippet that gets all the links in a specified URL's page, so let's take a look at it.
The following code obtains all links in the specified URL's page, that is, the href attribute of every a tag:
// Get the HTML code of the page
$html = file_get_contents('http://www.111cn.net');
$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate('/html/body//a');
for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $url  = $href->getAttribute('href');
    echo $url.'<br />';
}
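The same idea can be sketched without XPath: DOMDocument's getElementsByTagName('a') also walks every anchor in document order. The following is a minimal variant, assuming we parse a local HTML string instead of fetching a live page; the function name extract_hrefs is hypothetical, not from the original article.

```php
<?php
// Collect the href attribute of every <a> tag in an HTML string.
function extract_hrefs(string $html): array
{
    $dom = new DOMDocument();
    @$dom->loadHTML($html); // @ suppresses warnings on malformed HTML
    $urls = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        $urls[] = $a->getAttribute('href');
    }
    return $urls;
}
```

Because it takes the HTML as a string, the same function works whether the markup comes from file_get_contents(), cURL, or a test fixture.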
This code gets the href attribute of every a tag, but an href value is not necessarily an absolute link. We can filter the results and keep only addresses starting with http:
// Get the HTML code of the page
$html = file_get_contents('http://www.111cn.net');
$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate('/html/body//a');
for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $url  = $href->getAttribute('href');

    // Keep links starting with http
    if (substr($url, 0, 4) == 'http')
        echo $url.'<br />';
}
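Note that the substr check also matches https:// links (since 'http' is a prefix of 'https'), but it would equally accept a malformed value like 'httpfoo'. A stricter filter can inspect the parsed scheme; this is a sketch, and the helper name is_http_link is my own, not from the article.

```php
<?php
// Accept only URLs whose scheme is exactly http or https.
function is_http_link(string $url): bool
{
    $scheme = parse_url($url, PHP_URL_SCHEME);
    return in_array($scheme, ['http', 'https'], true);
}
```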
The next example uses fopen() to read all the links in a specified web page and number them. It is suitable for places where web page content needs to be collected. Here the Baidu homepage is read as an example and every link address on it is printed out; the code has been tested and works:
<?php
if (empty($url)) $url = "http://www.baidu.com/"; // URL of the page to collect links from
$site = substr($url, 0, strpos($url, "/", 8));   // site root, e.g. http://www.baidu.com
$base = substr($url, 0, strrpos($url, "/") + 1); // directory the file is in
$fp = fopen($url, "r");                          // open the URL's page
$contents = "";
while (!feof($fp)) $contents .= fread($fp, 1024);
$pattern = "|href=['\"]?([^'\" ]+)['\" ]|U";
preg_match_all($pattern, $contents, $regArr, PREG_SET_ORDER); // match every href=
for ($i = 0; $i < count($regArr); $i++) {
    // eregi() was removed in PHP 7; strpos() checks for "://" instead
    if (strpos($regArr[$i][1], "://") === false) {   // relative path, i.e. no ://
        if (substr($regArr[$i][1], 0, 1) == "/")     // root-relative to the site
            echo "link".($i+1).": ".$site.$regArr[$i][1]."<br />"; // site root
        else
            echo "link".($i+1).": ".$base.$regArr[$i][1]."<br />"; // current directory
    } else {
        echo "link".($i+1).": ".$regArr[$i][1]."<br />";           // absolute path
    }
}
fclose($fp);
?>
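The three echo branches above all implement the same rule: absolute URLs pass through, root-relative paths get the site prefix, and everything else gets the base directory. That rule can be condensed into one helper, sketched below; resolve_link is a hypothetical name for illustration.

```php
<?php
// Resolve an href against the site root and the current directory,
// mirroring the branch logic of the fopen() example.
function resolve_link(string $href, string $site, string $base): string
{
    if (strpos($href, '://') !== false) {
        return $href;             // already absolute
    }
    if (substr($href, 0, 1) === '/') {
        return $site . $href;     // root-relative: prepend the site root
    }
    return $base . $href;         // relative to the current directory
}
```

Keeping the resolution in one function also makes it easy to test each branch on its own, which the inline echo statements do not allow.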
The above has introduced how to get all the links in a specified URL's page with PHP, including the full code. I hope it is helpful to friends interested in PHP tutorials.