PHP URL completion function for article scraping (FormatUrl)

WBOY
Release: 2016-07-21 15:17:06

A URL completion function, called FormatUrl here, is essential when writing a collection (scraping) program.
When collecting articles, the links in a page are often relative paths or absolute root paths rather than absolute full paths, so the URLs cannot be fetched as they stand.

This function therefore rewrites the href and src attributes of the links and images in the fetched HTML into absolute full paths, so that the correct URLs can be collected directly.

Background on path types
Relative path: starts with "../" or "./", or has no prefix at all
Absolute root path: /path/xxx.html
Absolute full path: http://www.xxx.com/path/xxx.html
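
The three path types can be told apart with a few string checks. The following is a minimal sketch, assuming a hypothetical helper named classifyPath that is not part of the original function:

```php
<?php
// Hypothetical helper (not from the original article): report which of the
// three path types a URL belongs to.
function classifyPath($url) {
    // Absolute full path: starts with a scheme such as http:// or https://
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        return 'absolute full path';
    }
    // Absolute root path: starts with a slash
    if (substr($url, 0, 1) === '/') {
        return 'absolute root path';
    }
    // Everything else ("../x", "./x", or a bare "x") is relative
    return 'relative path';
}

echo classifyPath('../img/a.png'), "\n";                     // relative path
echo classifyPath('/path/xxx.html'), "\n";                   // absolute root path
echo classifyPath('http://www.xxx.com/path/xxx.html'), "\n"; // absolute full path
```

Only the first two types need completing; an absolute full path can be collected as it is.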
Usage example:

<?php
$surl="http://www.jb51.net/";
$gethtm = 'HomepageResolution';
echo formaturl($gethtm,$surl);
?>

Output: HomepageResolution
---------Demo Example------------
Original path code: http://www.newnew.cn/newnewindex.aspx
Output demo code: http://www.maifp.com/aaa/test.php
The function code is as follows:

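<?php
// Rewrite every <img src> and <a href> in the HTML $l1 to an absolute full path,
// using $l2 as the base URL of the page the HTML came from.
function formaturl($l1,$l2){
// Match <img> and <a> tags whose src/href value is double- or single-quoted.
if(preg_match_all("/(<img[^>]+src=\"([^\"]+)\"[^>]*>)|(<a[^>]+href=\"([^\"]+)\"[^>]*>)|(<img[^>]+src='([^']+)'[^>]*>)|(<a[^>]+href='([^']+)'[^>]*>)/i",$l1,$regs)){
foreach($regs[0] as $num => $url){
$l1 = str_replace($url,lIIIIl($url,$l2),$l1);
}
}
return $l1;
}
// Complete the href/src value of a single tag $l1 against the base URL $l2.
function lIIIIl($l1,$l2){
$I2 = ""; // raw attribute value, quotes included
if(preg_match("/(.*)(href|src)=(.+?)( |\/>|>).*/i",$l1,$regs)){$I2 = $regs[3];}
if(strlen($I2)>0){
$I1 = str_replace(chr(34),"",$I2); // strip double quotes
$I1 = str_replace(chr(39),"",$I1); // strip single quotes
}else{return $l1;}
if($I1===""){return $l1;} // empty attribute value, nothing to complete
$url_parsed = parse_url($l2);
$scheme = isset($url_parsed["scheme"]) ? $url_parsed["scheme"]."://" : "";
$host = isset($url_parsed["host"]) ? $url_parsed["host"] : "";
$l3 = $scheme.$host;
if(strlen($l3)==0){return $l1;}
$base = isset($url_parsed["path"]) ? $url_parsed["path"] : "";
// A base path ending in "/" is already a directory; otherwise use its parent directory.
$path = (substr($base,-1)=="/") ? rtrim($base,"/") : dirname($base);
if($path=="/"||$path=="\\"||$path=="."){$path="";} // dirname() result for the site root
$pos = strpos($I1,"#");
if($pos===0){return $l1;} // fragment-only link, leave unchanged
if($pos>0){$I1 = substr($I1,0,$pos);}
// Determine the URL type
if(preg_match("/^(https?|ftp):\/\//i",$I1)){return $l1;} // already an absolute full path, skip
elseif($I1[0]=="/"){$I1 = $l3.$I1;} // absolute root path
elseif(substr($I1,0,3)=="../"){ // relative path into a parent directory
while(substr($I1,0,3)=="../"){
$I1 = substr($I1,3);
if(strlen($path)>0){
$path = dirname($path);
if($path=="/"||$path=="\\"||$path=="."){$path="";}
}
}
$I1 = $l3.$path."/".$I1;
}
elseif(substr($I1,0,2)=="./"){ // relative path in the current directory
$I1 = $l3.$path.substr($I1,1);
}
elseif(strtolower(substr($I1,0,7))=="mailto:"||strtolower(substr($I1,0,11))=="javascript:"){
return $l1; // not a fetchable URL, leave unchanged
}else{ // relative path with no prefix
$I1 = $l3.$path."/".$I1;
}
return str_replace($I2,"\"$I1\"",$l1);
}
?>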
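
The resolution rules used above (skip absolute URLs, prepend the host to root paths, climb one directory per "../") can also be written as a standalone helper that works on a single URL string instead of on HTML. The following is a simplified sketch, not the article's function: the name resolveUrl is hypothetical, and ports, query strings in the base URL, and fragments are ignored.

```php
<?php
// Hypothetical helper (not from the original article): resolve one link
// against a base URL and return an absolute full path.
function resolveUrl($link, $base) {
    // Already absolute, or a scheme that must not be rewritten
    if (preg_match('#^(https?|ftp)://#i', $link) || preg_match('#^(mailto|javascript):#i', $link)) {
        return $link;
    }
    $parts = parse_url($base);
    if (!isset($parts['scheme'], $parts['host'])) {
        return $link; // base URL is unusable, leave the link alone
    }
    $root = $parts['scheme'] . '://' . $parts['host'];
    $basePath = isset($parts['path']) ? $parts['path'] : '/';

    if (substr($link, 0, 1) === '/') {
        $full = $link; // absolute root path
    } else {
        // Relative path: append to the directory part of the base path
        $dir = substr($basePath, 0, strrpos($basePath, '/') + 1);
        $full = $dir . $link;
    }

    // Collapse "." and ".." segments; never climb above the site root
    $out = array();
    foreach (explode('/', $full) as $seg) {
        if ($seg === '..') {
            if (count($out) > 1) {
                array_pop($out);
            }
        } elseif ($seg !== '.') {
            $out[] = $seg;
        }
    }
    return $root . implode('/', $out);
}

echo resolveUrl('../img/a.png', 'http://www.maifp.com/aaa/test.php'), "\n"; // http://www.maifp.com/img/a.png
echo resolveUrl('b.html', 'http://www.maifp.com/aaa/test.php'), "\n";       // http://www.maifp.com/aaa/b.html
echo resolveUrl('/daxue/', 'http://www.jb51.net/'), "\n";                   // http://www.jb51.net/daxue/
```

Splitting the path into segments avoids the repeated dirname() calls and handles "../" appearing in the middle of a path as well as at the start.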

The link below is a place to learn PHP regular expressions; it is kept here so that it does not get lost.

source:php.cn