Stuck while learning: asking experienced developers for pointers - PHP Tutorial - php.cn


Jun 13, 2016 pm 01:46 PM


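I've run into a problem while learning and would appreciate pointers from people who have been through it.
When reading framework source code (e.g. CodeIgniter, "CI"), what should you do when you hit a construct you don't understand and there is no expert nearby to ask?
Take the code below: when strtolower($charset) != 'utf-8' the string is handled one way, otherwise it is handled another way. Why? If I ask on a forum it can take half a day to get an answer, and with a question like this I don't even know what to search for. It's frustrating. What should I do? Any pointers would be much appreciated.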

PHP code

    public function entity_decode($str, $charset='UTF-8')
    {
        if (stristr($str, '&') === FALSE) return $str;

        // The reason we are not using html_entity_decode() by itself is because
        // while it is not technically correct to leave out the semicolon
        // at the end of an entity most browsers will still interpret the entity
        // correctly.  html_entity_decode() does not convert entities without
        // semicolons, so we are left with our own little solution here. Bummer.

        if (function_exists('html_entity_decode') &&
            (strtolower($charset) != 'utf-8'))
        {
            $str = html_entity_decode($str, ENT_COMPAT, $charset);
            $str = preg_replace('~&#x(0*[0-9a-f]{2,5})~ei', 'chr(hexdec("\\1"))', $str);
            return preg_replace('~&#([0-9]{2,4})~e', 'chr(\\1)', $str);
        }

        // Numeric Entities
        $str = preg_replace('~&#x(0*[0-9a-f]{2,5});{0,1}~ei', 'chr(hexdec("\\1"))', $str);
        $str = preg_replace('~&#([0-9]{2,4});{0,1}~e', 'chr(\\1)', $str);

        // Literal Entities - Slightly slow so we do another check
        if (stristr($str, '&') === FALSE)
        {
            $str = strtr($str, array_flip(get_html_translation_table(HTML_ENTITIES)));
        }

        return $str;
    }


------Solution--------------------
...I don't really understand it either. Bumping the thread for you.
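------Solution--------------------
OK. First of all, this function is flawed.

Try this:
echo entity_decode('&#21494;','utf-8');
echo "\n";
echo html_entity_decode('&#21494;',ENT_COMPAT,'utf-8');

&#21494; is the character "叶".

So I think it would be better to remove
&& (strtolower($charset) != 'utf-8')
from the condition. In other words, use html_entity_decode first whenever it is available, and only then handle the entities that lack a trailing semicolon.

Since you only asked about the UTF-8 part, I assume the rest is clear to you, so I won't go on about it.

When reading other people's code, you can read closely or you can skim.
For this function, if what you really care about lies elsewhere, it is enough to know that it is a variant of html_entity_decode.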
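
The suggestion above (call html_entity_decode first, then deal with entities that have no trailing semicolon) could look roughly like the following. This is only a sketch under assumptions: entity_decode_sketch is a hypothetical name, not part of any framework, and it assumes PHP 5.3 or later for the closure. It reuses html_entity_decode for the semicolon-less numeric entities by appending the missing semicolon.

```php
<?php
// Hypothetical helper illustrating the suggested order of operations.
function entity_decode_sketch($str, $charset = 'UTF-8')
{
    // Nothing to decode if there is no ampersand at all.
    if (strpos($str, '&') === false) {
        return $str;
    }

    // Step 1: decode every well-formed entity (those ending in a semicolon).
    $str = html_entity_decode($str, ENT_COMPAT, $charset);

    // Step 2: whatever numeric entities remain lack the semicolon.
    // Append it and let html_entity_decode() do the charset-aware conversion.
    $decode = function ($m) use ($charset) {
        $entity  = '&#' . $m[1] . ';';
        $decoded = html_entity_decode($entity, ENT_COMPAT, $charset);
        // Leave the original text alone if the code point cannot be decoded.
        return $decoded === $entity ? $m[0] : $decoded;
    };

    return preg_replace_callback('~&#(x[0-9a-f]{1,6}|[0-9]{1,7})~i', $decode, $str);
}
```

With this sketch, entity_decode_sketch('&#21494;') and entity_decode_sketch('&#21494') both return 叶 when the charset is UTF-8. Like the original, it decodes in two passes, so a string such as &amp;#65 ends up as A.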

------Solution--------------------
addslashes()
------Solution--------------------
Here it is with some comments added:

public function entity_decode($str, $charset='UTF-8')
{
    if (stristr($str, '&') === FALSE) return $str; // No & in the string: return it unchanged

    // The reason we are not using html_entity_decode() by itself is because
    // while it is not technically correct to leave out the semicolon
    // at the end of an entity most browsers will still interpret the entity
    // correctly. html_entity_decode() does not convert entities without
    // semicolons, so we are left with our own little solution here. Bummer.
    // (In short: html_entity_decode() is not used on its own because it does
    // not convert entities that lack the trailing semicolon.)

    if (function_exists('html_entity_decode') &&
        (strtolower($charset) != 'utf-8'))
    {
        // The charset is not UTF-8:

        $str = html_entity_decode($str, ENT_COMPAT, $charset); // Decode according to the charset
        $str = preg_replace('~&#x(0*[0-9a-f]{2,5})~ei', 'chr(hexdec("\\1"))', $str); // Replace leftover hex entities
        return preg_replace('~&#([0-9]{2,4})~e', 'chr(\\1)', $str); // Replace leftover decimal entities and return
    }

    // The charset is UTF-8:
    // Numeric Entities: replace any numeric entities that are present
    $str = preg_replace('~&#x(0*[0-9a-f]{2,5});{0,1}~ei', 'chr(hexdec("\\1"))', $str);
    $str = preg_replace('~&#([0-9]{2,4});{0,1}~e', 'chr(\\1)', $str);
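
The two numeric-entity replacements annotated above can be tried in isolation to see what they actually produce. The sketch below is an illustration, not the framework's code: numeric_entities_like_original is a hypothetical name, and because the /e modifier was removed in PHP 7.0, preg_replace_callback stands in for it while the patterns and the chr() calls stay the same.

```php
<?php
// Hypothetical stand-in for the two "Numeric Entities" lines of the UTF-8 branch.
function numeric_entities_like_original($str)
{
    // Hex entities, semicolon optional: same pattern, callback instead of /e.
    $str = preg_replace_callback(
        '~&#x(0*[0-9a-f]{2,5});{0,1}~i',
        function ($m) { return chr(hexdec($m[1])); },
        $str
    );

    // Decimal entities, semicolon optional, at most four digits.
    return preg_replace_callback(
        '~&#([0-9]{2,4});{0,1}~',
        function ($m) { return chr((int) $m[1]); },
        $str
    );
}
```

For ASCII code points this works: '&#65' and '&#x41;' both become 'A'. chr() only ever returns a single byte, though, so anything above 255 is mangled: '&#x53F6;' becomes the lone byte 0xF6, and '&#21494;' becomes 'e4;' because the pattern stops after four digits and chr(2149) wraps around to 'e'. That is the flaw the earlier reply points at.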
