
PHP gets all links in the specified URL page

WBOY
Release: 2016-08-08 09:25:58

From: http://www.uphtm.com/php/253.html

Developers need this fairly often, for example when collecting the friendly links (link exchanges) listed on other websites for a project. Today I came across a PHP function a friend put together that gets all the links on a specified URL page. Let's take a look.

The following code can obtain all links in the specified URL page, that is, the href attribute of all a tags:

  // Fetch the HTML of the target page
  $html = file_get_contents('http://www.111cn.net');
  $dom = new DOMDocument();
  // @ suppresses the warnings that malformed HTML would otherwise trigger
  @$dom->loadHTML($html);
  $xpath = new DOMXPath($dom);
  // Select every a element inside body
  $hrefs = $xpath->evaluate('/html/body//a');
  for ($i = 0; $i < $hrefs->length; $i++) {
      $href = $hrefs->item($i);
      $url = $href->getAttribute('href');
      // Print one href value per line
      echo $url . "\n";
  }
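The same idea can be wrapped in a reusable function. The following is only a minimal sketch, not the original author's code: the helper name `extract_hrefs` and the sample markup are made up for illustration, and it uses `libxml_use_internal_errors()` in place of the `@` operator to keep parser warnings quiet. It works on an HTML string so it can be tried without network access.

```php
<?php
// Hypothetical helper: returns the href value of every <a> element in $html.
function extract_hrefs(string $html): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them,
    // as an alternative to the @ operator.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $hrefs = [];
    // Only <a> elements that actually carry an href attribute.
    foreach ($xpath->query('//a[@href]') as $a) {
        $hrefs[] = $a->getAttribute('href');
    }
    return $hrefs;
}

// Made-up sample markup so the sketch runs offline.
$sample = '<html><body><a href="http://example.com/">Home</a>'
        . '<a href="/about.html">About</a>'
        . '<a name="top">No href</a></body></html>';

foreach (extract_hrefs($sample) as $href) {
    echo $href, "\n";
}
```

To use it on a live page, the string returned by `file_get_contents()` could be passed in as `$html`.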

This code gets the href attribute of every a tag, but an href value is not necessarily a full link: it may be a page anchor, a javascript: call, or a relative path. We can filter the results and keep only the addresses that start with http:

  // Fetch the HTML of the target page
  $html = file_get_contents('http://www.111cn.net');
  $dom = new DOMDocument();
  // @ suppresses the warnings that malformed HTML would otherwise trigger
  @$dom->loadHTML($html);
  $xpath = new DOMXPath($dom);
  // Select every a element inside body
  $hrefs = $xpath->evaluate('/html/body//a');
  for ($i = 0; $i < $hrefs->length; $i++) {
      $href = $hrefs->item($i);
      $url = $href->getAttribute('href');
      // Keep only links starting with http (this also matches https)
      if (substr($url, 0, 4) == 'http') {
          echo $url . "\n";
      }
  }
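A prefix check on the first four characters also accepts values such as `httpd.html`. As a possible refinement, the sketch below checks the URL scheme with `parse_url()` instead. The helper name `is_http_link` and the candidate values are assumptions made for illustration, not part of the original code.

```php
<?php
// Hypothetical helper: true only for absolute http:// or https:// URLs.
function is_http_link(string $url): bool
{
    // parse_url() returns null when there is no scheme, false on a malformed URL.
    $scheme = parse_url($url, PHP_URL_SCHEME);
    return is_string($scheme)
        && in_array(strtolower($scheme), ['http', 'https'], true);
}

// Made-up href values of the kinds typically found on a page.
$candidates = [
    'http://example.com/',
    'https://example.com/a',
    '#top',
    'mailto:someone@example.com',
    '/about.html',
    'httpd.html',
];

foreach ($candidates as $candidate) {
    if (is_http_link($candidate)) {
        echo $candidate, "\n";
    }
}
```

Only the first two candidates are printed; the relative path `httpd.html` is rejected even though it begins with the letters http.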

The next example uses fopen() to read all the links on a specified web page and number them. It suits situations where the content of a web page needs to be collected. Here the Baidu homepage is read and every link address found on it is listed. Opening a URL with fopen() requires allow_url_fopen to be enabled. The original author reports that the code has been tested and works:

  <?php
  // URL of the page to collect links from (defaults to the Baidu homepage)
  if (empty($url)) $url = "http://www.baidu.com/";
  // Scheme and host: everything before the first "/" after "http://"
  $site = substr($url, 0, strpos($url, "/", 8));
  // Directory where the page is located, including the trailing "/"
  $base = substr($url, 0, strrpos($url, "/") + 1);
  $fp = fopen($url, "r"); // Open the URL
  $contents = '';
  while (!feof($fp)) $contents .= fread($fp, 1024);
  // Match every href= value, quoted or unquoted
  $pattern = '|href=[\'"]?([^ \'"]+)[\'" ]|U';
  preg_match_all($pattern, $contents, $regArr, PREG_SET_ORDER);
  for ($i = 0; $i < count($regArr); $i++) {
      $link = $regArr[$i][1];
      // No "://" means the link is relative (eregi() was removed in PHP 7)
      if (strpos($link, "://") === false) {
          if (substr($link, 0, 1) == "/") // Relative to the site root
              echo "link" . ($i + 1) . ":" . $site . $link . "\n";
          else // Relative to the current directory
              echo "link" . ($i + 1) . ":" . $base . $link . "\n";
      } else // Already an absolute URL
          echo "link" . ($i + 1) . ":" . $link . "\n";
  }
  fclose($fp);
  ?>
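The three cases in the listing above (absolute link, link relative to the site root, link relative to the current directory) can be isolated in one function. This is a rough sketch under stated assumptions: the helper name `resolve_link` and the example.com URLs are hypothetical, and `parse_url()` is used here instead of the listing's `substr()`/`strpos()` arithmetic. It does not handle `../` segments, protocol-relative `//host` links, or query strings.

```php
<?php
// Hypothetical helper: turns a link found on $pageUrl into an absolute URL.
function resolve_link(string $pageUrl, string $link): string
{
    // Case 1: the link already contains "://", so it is absolute.
    if (strpos($link, '://') !== false) {
        return $link;
    }

    // Assumes $pageUrl is a well-formed absolute URL with scheme and host.
    $parts = parse_url($pageUrl);
    $site = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $site .= ':' . $parts['port'];
    }

    // Case 2: a leading "/" means relative to the site root.
    if (substr($link, 0, 1) === '/') {
        return $site . $link;
    }

    // Case 3: relative to the directory that contains the page.
    $path = $parts['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $site . $dir . $link;
}

// Made-up page address for illustration.
$page = 'http://www.example.com/docs/index.html';
foreach (['http://other.example.org/x', '/a.html', 'page.html'] as $link) {
    echo resolve_link($page, $link), "\n";
}
```

Keeping the resolution logic in one place makes it easier to test each case separately from the fetching and pattern matching.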


