PHP internship tips (how to generate a simple summary)

WBOY

Released: 2016-07-29 08:50:51

Original

Generate summary

Recently I had to implement a new requirement: for the send_article interface, I need to extract the Chinese text from the HTML code and turn it into a summary. I tried several methods, starting with this one:

<code>// Match Chinese characters in UTF-8 text
function utf8_summary($article) {
    // No ^ or $ anchors, so each run of Chinese characters is matched wherever it appears
    $match = '/[\x{4e00}-\x{9fa5}]+/u';
    preg_match_all($match, $article, $temp);
    $summary = "";
    // Join the matched Chinese runs together
    foreach ($temp as $key => $value) {
        $sum = implode('', $value);
        $summary = $summary . $sum;
    }
    return $summary;
}
</code>

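One detail worth checking in a pattern of this kind is anchoring. The sketch below is only an illustration (the sample string and variable names are made up, not taken from the send_article interface); it compares a pattern anchored with ^ and $ against an unanchored one on mixed HTML input, assuming the file is saved as UTF-8.

```php
<?php
// Made-up sample input: Chinese text mixed with HTML markup.
$html = '<p>你好</p><b>世界</b>';

// Anchored: ^ and $ require the ENTIRE string to consist of Chinese characters.
$anchored = '/^[\x{4e00}-\x{9fa5}]+$/u';
// Unanchored: matches every run of Chinese characters anywhere in the string.
$unanchored = '/[\x{4e00}-\x{9fa5}]+/u';

preg_match_all($anchored, $html, $a);
preg_match_all($unanchored, $html, $b);

// $a[0] and $b[0] hold the full-pattern matches.
$anchoredResult = implode('', $a[0]);   // empty: the markup prevents a whole-string match
$unanchoredResult = implode('', $b[0]); // the Chinese runs joined together

echo ($anchoredResult === '' ? '(no match)' : $anchoredResult) . "\n";
echo $unanchoredResult . "\n"; // 你好世界
```

With the anchors, any tag or Latin character in the input makes the whole match fail, so it is worth ruling this out before suspecting the encoding.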

In my tests there were problems:
1. When the input consists only of consecutive Chinese characters, nothing can be extracted.
2. The method only works when Chinese characters are mixed with other characters.
Possible reason:
When the text is pure Chinese, it may arrive in an encoding other than UTF-8, so the regular expression cannot match; when Chinese is mixed with other characters, the encoding is UTF-8, so it matches. The client could wrap the Chinese text in a tag and add a header specifying charset=utf8, but the client's entity class was already written and changing it would be too much trouble. So I had to find a solution on the server side, and tried a second method:

<code>function url_summary($article) {
    $article = urlencode($article);
    // Match %XX groups anywhere in the string (no ^ anchor).
    // Note: URL-encoded ASCII punctuation such as < and > is matched too.
    $match = '/%[0-9A-Fa-f]{2}/';
    preg_match_all($match, $article, $temp);
    $summary = "";
    foreach ($temp as $key => $value) {
        $sum = implode('', $value);
        $summary = $summary . $sum;
    }
    $summary = urldecode($summary);
    return $summary;
}</code>


The idea behind this method comes from an observation: after URL encoding, characters other than letters, digits and a few symbols such as - _ . become sequences like %E7. Those sequences are extracted and concatenated, and decoding the result yields the Chinese characters.
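To make the idea concrete, here is a small sketch with a made-up sample string. It is a variation on the approach rather than the original code: the pattern keeps only %XX groups whose first hex digit is 8 to F (bytes of 0x80 and above), which is my own assumption for skipping URL-encoded ASCII punctuation. It keeps every non-ASCII character, not just Chinese ones, and assumes the input is UTF-8.

```php
<?php
// Made-up sample: markup, ASCII letters and two Chinese characters.
$article = '<p>abc中文</p>';

$encoded = urlencode($article);
echo $encoded . "\n"; // %3Cp%3Eabc%E4%B8%AD%E6%96%87%3C%2Fp%3E

// Keep only %XX groups for bytes >= 0x80. In UTF-8 such bytes occur only
// inside multi-byte (non-ASCII) characters, so %3C, %3E and %2F are skipped.
preg_match_all('/(?:%[89A-F][0-9A-F])+/', $encoded, $matches);

// Join the matched groups and decode them back to text.
$summary = urldecode(implode('', $matches[0]));
echo $summary . "\n"; // 中文
```

The encoded string shows why a pattern that accepts any %XX group also brings back characters such as < and / after decoding.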

Later I found out that there is actually a function that converts between encodings:

<code>iconv("gbk", "utf-8", "php中文转码"); // convert Chinese text from GBK to UTF-8
iconv("utf-8", "gbk", "php中文转码"); // convert Chinese text from UTF-8 to GBK</code>


But to use this function, you have to enable extension=php_iconv.dll in the php.ini file and install the iconv library, which is a bit of a hassle.
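Where iconv is available, the conversion could be combined with a validity check, roughly as sketched here. The helper name to_utf8 is hypothetical, and treating every input that is not valid UTF-8 as GBK is an assumption that would need to match what the client really sends.

```php
<?php
// Hypothetical helper (not part of the original interface): return $text as UTF-8,
// ASSUMING that anything that is not valid UTF-8 is GBK.
function to_utf8($text)
{
    // An empty pattern with the /u modifier only matches if the subject is valid UTF-8.
    if (preg_match('//u', $text) === 1) {
        return $text;
    }
    if (!function_exists('iconv')) {
        return false; // iconv extension is not available
    }
    return iconv('GBK', 'UTF-8', $text);
}

$utf8 = '中文';             // this file is assumed to be saved as UTF-8
$gbk  = "\xD6\xD0\xCE\xC4"; // the same two characters as GBK bytes

echo to_utf8($utf8) . "\n"; // 中文
echo to_utf8($gbk) . "\n";  // 中文
```

Running the input through such a check before applying a /u pattern would show whether the encoding is really what makes the match fail.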
Finally, I found that the strip_tags() function solves the problem.
It removes the HTML tags, after which a section of the text can be cut out:
$summary = mb_substr($summary, 0, 50); // take the first 50 characters
Escape sequences (HTML entities) also need to be removed, for example:
$summary = str_replace('&nbsp;', '', $summary); // remove the escape sequence
This is enough to generate a summary; more features, such as sentence segmentation and line wrapping, can be added later.
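Put together, the steps might look like the sketch below. The function name make_summary, the sample input and the choice of `&nbsp;` as the escape sequence to remove are assumptions for illustration; the code also assumes UTF-8 input and that the mbstring extension is loaded.

```php
<?php
// Hypothetical helper: build a plain-text summary from an HTML fragment.
// Assumes UTF-8 input and the mbstring extension (for mb_substr).
function make_summary($html, $length = 50)
{
    $text = strip_tags($html);                    // remove the HTML tags
    $text = str_replace('&nbsp;', '', $text);     // remove the escape sequence (assumed: &nbsp;)
    $text = trim($text);                          // drop leading and trailing whitespace
    return mb_substr($text, 0, $length, 'UTF-8'); // first $length characters, not bytes
}

// Made-up sample input.
echo make_summary('<p>PHP&nbsp;中文<b>摘要</b>生成</p>', 7) . "\n"; // PHP中文摘要
```

mb_substr counts characters rather than bytes, so a multi-byte Chinese character is never cut in half at the end of the summary.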
