Using PHP regular expressions to clean up the layout of scraped content

WBOY
Release: 2016-07-21 15:04:57

A common problem when scraping content is cleaning up its layout. I spent some time writing a function that uses regular expressions to strip unwanted HTML tags and inline styles, and I am sharing it here.

The code is as follows:

/**
 * Clean up the formatting of scraped HTML content.
 *
 * Requires the tidy extension. The HTML tags inside the patterns and
 * replacement strings below are reconstructed from context.
 *
 * @param string $content HTML to clean, preferably UTF-8 encoded
 * @return string
 */
function removeFormat($content) {
    $replaces = array(
        "/<font.*?>/i" => '',
        "/<\/font>/i" => '',
        "/<strong>/i" => '',
        "/<\/strong>/i" => '',
        "/<span.*?>/i" => '',
        "/<\/span>/i" => '',
        "/<div.*?>/i" => "<p>",
        "/<\/div>/i" => "</p>",
        "/<!--.*?-->/i" => '',
        /* Leave these disabled when the content contains tables.
        "/<table.*?>/i" => '',
        "/<\/table>/i" => '',
        "/<tbody.*?>/i" => '',
        "/<\/tbody>/i" => '',
        "/<tr.*?>/i" => '<p>',
        "/<\/tr>/i" => '</p>',
        "/<td.*?>/i" => '', */
        "/style=.+?['\"]/i" => '',
        "/class=.+?['\"]/i" => '',
        "/id=.+?['\"]/i" => '',
        "/lang=.+?['\"]/i" => '',
        //"/width=.+?['\"]/i" => '', // Hard to control, so left commented out
        //"/height=.+?['\"]/i" => '',
        "/border=.+?['\"]/i" => '',
        "/face=.+?['\"]/i" => '',
        // Turn line breaks into paragraph boundaries.
        "/<br.*?>[ ]*/i" => "</p><p>",
        "/<iframe.*?>.*<\/iframe>/i" => '',
        // Replace &nbsp; entities with plain spaces.
        "/&nbsp;/i" => ' ',
        // Strip half-width spaces, full-width spaces (U+3000) and line breaks
        // at the start of each paragraph, then indent with &nbsp; entities to
        // avoid encoding problems when the result is written to the database.
        '/<p.*?>[ \x{3000}\r\n]*/ui' => '<p>&nbsp;&nbsp;&nbsp;&nbsp;',
    );
    $config = array(
        //'indent' => TRUE,       // Whether to indent the output
        'output-html' => TRUE,    // Output as HTML
        'show-body-only' => TRUE, // Return only the contents of <body>
        'wrap' => 0               // Disable line wrapping
    );

    // Repair the HTML with PHP's tidy extension first; otherwise the
    // replacements below tend to produce odd results on malformed markup.
    $content = tidy_repair_string($content, $config, 'utf8');
    $content = trim($content);
    foreach ($replaces as $k => $v) {
        $content = preg_replace($k, $v, $content);
    }

    // Some content may be missing the <p> tag at the beginning.
    if (strpos($content, '<p>') > 6) {
        $content = '<p>&nbsp;&nbsp;&nbsp;&nbsp;' . $content;
    }

    // Repair once more to remove the empty tags left by the replacements.
    $content = tidy_repair_string($content, $config, 'utf8');
    $content = trim($content);
    return $content;
}
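
The attribute patterns in the function above match any occurrence of, for example, `id=`, including one inside a URL query string. The following is a rough sketch, not the author's code, of how the same replacement idea could be written with stricter patterns and without tidy. The names `stripPresentation` and `$sample` are hypothetical, and the sketch assumes PHP 7 or later and UTF-8 input.

```php
<?php
// Hypothetical helper (not part of the original article): applies an
// ordered set of regex replacements to an HTML fragment, without tidy.
function stripPresentation(string $html): string
{
    $rules = [
        // Drop presentational attributes. Requiring leading whitespace and a
        // quoted value keeps "?id=5" inside a URL untouched.
        '/\s+(?:style|class|id|lang|border|face)\s*=\s*(?:"[^"]*"|\'[^\']*\')/i' => '',
        // Unwrap inline formatting tags but keep the text they contain.
        '/<\/?(?:font|span|strong)\b[^>]*>/i' => '',
        // Remove ASCII spaces, full-width spaces (U+3000) and line breaks
        // that directly follow an opening <p> tag.
        '/<p\b[^>]*>[ \x{3000}\r\n]*/u' => '<p>',
    ];

    // preg_replace applies the patterns in order, each to the previous result.
    // It returns null on a regex error, in which case the input is returned.
    return preg_replace(array_keys($rules), array_values($rules), $html) ?? $html;
}

$sample = '<p class="a" style=\'color:red\'>' . "\u{3000} "
        . '<font face="Arial">Hello</font> <span id="x">world</span></p>';

echo stripPresentation($sample), "\n"; // prints <p>Hello world</p>
```

Grouping the attribute names into one alternation keeps the rule list short, but it still relies on regular expressions rather than an HTML parser, so it should only be trusted on markup that has already been repaired.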

