
Parsing the escape function in php_PHP tutorial

WBOY
Release: 2016-07-21 15:02:22

JavaScript's escape() is used to encode the Chinese characters in a URL.
After the link is clicked, the resulting URL is:
Quote: http://127.0.0.1/shop/product_list.php?p_sort=PHP%u5F00%u53D1%u8D44%u6E90%u7F51
A value in this %uXXXX form obviously cannot be decoded with PHP's urldecode() or base64_decode().
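A quick check shows why the built-in decoders do not help. This is a minimal sketch; the variable names are illustrative and not from the original code. urldecode() and rawurldecode() only decode a percent sign followed by two hexadecimal digits, so the %u sequences pass through unchanged:

```php
<?php
// The query-string value from the URL above.
$encoded = 'PHP%u5F00%u53D1%u8D44%u6E90%u7F51';

// "%u" is not "%" followed by two hex digits, so nothing is decoded.
$viaUrldecode = urldecode($encoded);
$viaRawurldecode = rawurldecode($encoded);

var_dump($viaUrldecode === $encoded);    // bool(true)
var_dump($viaRawurldecode === $encoded); // bool(true)
```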
The solution is to write the inverse function in PHP.

The code is as follows:

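function js_unescape($str)
{
    $ret = '';
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        if ($str[$i] == '%' && isset($str[$i + 1]) && $str[$i + 1] == 'u') {
            // %uXXXX: a 16-bit code unit written as four hex digits; emit it as UTF-8
            $val = hexdec(substr($str, $i + 2, 4));
            if ($val < 0x80) {
                $ret .= chr($val);
            } elseif ($val < 0x800) {
                $ret .= chr(0xc0 | ($val >> 6)) . chr(0x80 | ($val & 0x3f));
            } else {
                $ret .= chr(0xe0 | ($val >> 12)) . chr(0x80 | (($val >> 6) & 0x3f)) . chr(0x80 | ($val & 0x3f));
            }
            $i += 5;
        } elseif ($str[$i] == '%') {
            // %XX: an ordinary percent-encoded byte
            $ret .= urldecode(substr($str, $i, 3));
            $i += 2;
        } else {
            $ret .= $str[$i];
        }
    }
    return $ret;
}

The function returns a UTF-8 string. If the page uses GB2312, the result still has to be converted with iconv(). But if you use UTF-8 encoding, this step is not necessary.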

The code is as follows:

print iconv('utf-8', 'gb2312', js_unescape($_REQUEST['p_sort']));

At this point, the string encoded by JavaScript's escape() has been decoded successfully.
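The same decoding can also be written with a regular expression callback. The following is only a sketch under stated assumptions, not the article's function: the names codepoint_to_utf8 and unescape_to_utf8 are hypothetical, the output is always UTF-8, and only code units in the Basic Multilingual Plane are handled (surrogate pairs are not combined).

```php
<?php
// Hypothetical helper: encode one code point (U+0000..U+FFFF) as UTF-8 bytes.
function codepoint_to_utf8(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);                                   // 1 byte: 0xxxxxxx
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6))                      // 2 bytes: 110xxxxx 10xxxxxx
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xE0 | ($cp >> 12))                         // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical decoder for strings produced by JavaScript's escape().
// %uXXXX and %XX both carry a code point; everything else is copied as-is.
function unescape_to_utf8(string $s): string
{
    return preg_replace_callback(
        '/%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})/',
        function (array $m): string {
            // $m[1] is '' when the two-digit alternative matched.
            $hex = $m[1] !== '' ? $m[1] : $m[2];
            return codepoint_to_utf8((int) hexdec($hex));
        },
        $s
    );
}

echo bin2hex(codepoint_to_utf8(0x5F00)), "\n"; // e5bc80
echo unescape_to_utf8('a%20b'), "\n";          // a b
```

As with js_unescape(), the result would still need iconv() if the page is not UTF-8.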
In addition, I found a function that implements JavaScript's escape() encoding in PHP:



The code is as follows:

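function phpescape($str)
{
    $sublen = strlen($str);
    $returnString = "";
    for ($i = 0; $i < $sublen; $i++) {
        if (ord($str[$i]) >= 0x80) {
            // Lead byte of a two-byte GB2312 character: convert the pair to UCS-2.
            // "UCS-2BE" pins the byte order. With plain "ucs-2" the order depends on
            // the platform, and the original suggested enabling this swap on Windows:
            // $tmpString = substr($tmpString, 2, 2) . substr($tmpString, 0, 2);
            $tmpString = bin2hex(iconv("gb2312", "UCS-2BE", substr($str, $i, 2)));
            $returnString .= "%u" . $tmpString;
            $i++;
        } else {
            // Single-byte character: always emit two hex digits
            $returnString .= sprintf("%%%02x", ord($str[$i]));
        }
    }
    return $returnString;
}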
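The byte-order remark in the function above is easier to see with concrete bytes. This is a small sketch, assuming the iconv extension is available and using UTF-8 input so that it does not depend on the encoding of the source file; the variable names are illustrative.

```php
<?php
// U+5F00 written with PHP 7's "\u{...}" escape, i.e. as UTF-8 bytes.
$char = "\u{5F00}";

// The same character in both UCS-2 byte orders.
$bigEndian = bin2hex(iconv('UTF-8', 'UCS-2BE', $char));
$littleEndian = bin2hex(iconv('UTF-8', 'UCS-2LE', $char));

echo '%u' . strtoupper($bigEndian), "\n";    // %u5F00  (what unescape() expects)
echo '%u' . strtoupper($littleEndian), "\n"; // %u005F  (wrong character)

// Swapping the two hex pairs repairs a little-endian result.
$swapped = substr($littleEndian, 2, 2) . substr($littleEndian, 0, 2);
echo '%u' . strtoupper($swapped), "\n";      // %u5F00
```

Requesting UCS-2BE explicitly avoids having to decide per platform whether the swap is needed.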

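Chinese text sent as JSON can be lost or garbled: PHP's json_encode() expects UTF-8 input, so a string in another encoding such as GB2312 must be encoded before it is sent. Because the data is parsed with JavaScript on the receiving side, and JavaScript has an unescape() function, it is convenient to have an escape() function in PHP: encode the data on the server, then decode it with unescape() on the client.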
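The JSON problem described above can be reproduced with a short sketch. It assumes PHP 7 or later with the json extension; the sample strings are illustrative, and the two high bytes simply stand in for text in a non-UTF-8 encoding such as GBK.

```php
<?php
$utf8 = "PHP\u{5F00}\u{53D1}"; // valid UTF-8 text
$notUtf8 = "PHP\xbf\xaa";      // bytes that are not valid UTF-8

// Valid UTF-8 is escaped to \uXXXX sequences by default.
$ok = json_encode($utf8);
var_dump($ok === '"PHP\u5f00\u53d1"'); // bool(true)

// Input that is not UTF-8 makes json_encode() fail.
$bad = json_encode($notUtf8);
$error = json_last_error();
var_dump($bad);                        // bool(false)
var_dump($error === JSON_ERROR_UTF8);  // bool(true)
```

Converting the data to UTF-8 with iconv() before calling json_encode() is another way around the problem.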
A quick search online turns up many PHP implementations of escape(), all much alike. Here is one of them.
The code is as follows:

function phpEscape($str) {
    preg_match_all("/[\x80-\xff].|[\x01-\x7f]+/", $str, $r);
    $ar = $r[0];
    foreach ($ar as $k => $v) {
        if (ord($v[0]) < 128)
            $ar[$k] = rawurlencode($v);
        else
            // UCS-2BE fixes the byte order so the hex digits come out as %uXXXX expects
            $ar[$k] = "%u" . bin2hex(iconv("GB2312", "UCS-2BE", $v));
    }
    return join("", $ar);
}

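This function works well, but beginners (like me) may not understand how it works and never feel at ease using it, so I will explain its principle here. Reusing other people's code is like standing on the shoulders of giants, but if you do not understand that code, sooner or later you will fall back to the ground.
The first statement, preg_match_all("/[\x80-\xff].|[\x01-\x7f]+/",$str,$r);, uses a regular expression to match every character in the string. [\x80-\xff]. matches a Chinese character: \x introduces the hexadecimal code of the byte to match, [ ] is a character class, and "." stands for any single character. So [\x80-\xff]. matches two bytes, the first of which lies between hexadecimal 80 and ff, which is exactly the range of the first byte of a Chinese character in GB2312/GBK. That way one complete Chinese character is matched. For details on how Chinese characters are encoded, you can search online. Likewise, [\x01-\x7f]+ matches English text: English was originally encoded in ASCII, whose values are below 128, that is, hexadecimal 01 to 7f, and "+" means one or more characters, so [\x01-\x7f]+ matches a run of consecutive English characters.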
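To see how this pattern splits a string, here is a minimal sketch. The input is an illustrative byte string (ASCII mixed with two double-byte sequences in the GBK range), and the pieces are printed as hex so the output does not depend on the terminal encoding.

```php
<?php
// "PHP", two double-byte sequences, then "ok".
$str = "PHP\xbf\xaa\xb7\xa2ok";

preg_match_all('/[\x80-\xff].|[\x01-\x7f]+/', $str, $r);
$pieces = array_map('bin2hex', $r[0]);

foreach ($pieces as $piece) {
    echo $piece, "\n";
}
// 504850   <- the ASCII run "PHP" as one piece
// bfaa     <- first double-byte character
// b7a2     <- second double-byte character
// 6f6b     <- the ASCII run "ok"
```

Each double-byte character becomes its own element, while consecutive ASCII characters stay together, which is why the loop only needs to inspect the first byte of each element.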
The code is as follows:

$ar = $r[0]; // $r[0] holds the array of matches
foreach ($ar as $k => $v) {
    if (ord($v[0]) < 128) // a first byte below 128 means an English (ASCII) run
        $ar[$k] = rawurlencode($v); // encode it directly with rawurlencode
    else
        $ar[$k] = "%u" . bin2hex(iconv("GB2312", "UCS-2BE", $v)); // otherwise use iconv to convert the Chinese character to UCS-2, that is, Unicode
}

The result can then be decoded with unescape() in JavaScript.
Both \u0391-\uFFE5 and \u4e00-\u9fa5 are used to match Chinese.
However, the former seems to cover more than Chinese characters, everything from Α (U+0391) to ￥ (U+FFE5), while the latter is probably limited to pure Chinese characters.
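The difference between the two ranges can be checked directly. This is a sketch in PHP rather than JavaScript, so the ranges are written in PCRE syntax (\x{...} with the /u modifier); the variable names are illustrative.

```php
<?php
$alpha = "\u{0391}"; // GREEK CAPITAL LETTER ALPHA, not a Chinese character
$kai = "\u{5F00}";   // a CJK ideograph

$wideRange = '/^[\x{0391}-\x{FFE5}]$/u';
$hanRange = '/^[\x{4e00}-\x{9fa5}]$/u';

var_dump(preg_match($wideRange, $alpha)); // int(1): the wide range accepts a Greek letter
var_dump(preg_match($hanRange, $alpha));  // int(0)
var_dump(preg_match($wideRange, $kai));   // int(1)
var_dump(preg_match($hanRange, $kai));    // int(1)
```

So the first range is the looser test, and the second is the one to use when only Chinese characters should match.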
The decoding function is as follows:

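function unescape($str) {
    $str = rawurldecode($str);
    preg_match_all("/%u.{4}|&#x.{4};|&#\d+;|.+/U", $str, $r);
    $ar = $r[0];
    foreach ($ar as $k => $v) {
        if (substr($v, 0, 2) == "%u")
            // %uXXXX: four hex digits, packed big-endian
            $ar[$k] = iconv("UCS-2BE", "GBK", pack("H4", substr($v, -4)));
        elseif (substr($v, 0, 3) == "&#x")
            // &#xXXXX; hexadecimal character reference
            $ar[$k] = iconv("UCS-2BE", "GBK", pack("H4", substr($v, 3, -1)));
        elseif (substr($v, 0, 2) == "&#") {
            // &#NNNN; decimal character reference
            $ar[$k] = iconv("UCS-2BE", "GBK", pack("n", substr($v, 2, -1)));
        }
    }
    return join("", $ar);
}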
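The three branches of the decoding function all reduce their input to the same two big-endian bytes before calling iconv(). The sketch below illustrates this for one character; the sample strings are illustrative, and the final conversion targets UTF-8 instead of GBK so the result can be compared without depending on the file encoding.

```php
<?php
// Three notations for U+5F00 (decimal 24320).
$fromPercentU = pack('H4', substr('%u5F00', -4));             // last four hex digits
$fromHexEntity = pack('H4', substr('&#x5F00;', 3, -1));       // between "&#x" and ";"
$fromDecEntity = pack('n', (int) substr('&#24320;', 2, -1));  // between "&#" and ";"

echo bin2hex($fromPercentU), "\n";  // 5f00
echo bin2hex($fromHexEntity), "\n"; // 5f00
echo bin2hex($fromDecEntity), "\n"; // 5f00

// "H4" and "n" both produce big-endian bytes, so UCS-2BE is the matching source encoding.
$utf8 = iconv('UCS-2BE', 'UTF-8', $fromPercentU);
var_dump($utf8 === "\u{5F00}");     // bool(true)
```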



Encoding ranges

1. GBK (GB2312/GB18030)
\x00-\xff GBK double-byte encoding range
\x20-\x7f ASCII
\xa1-\xff Chinese
\x80-\xff Chinese

2. UTF-8 (Unicode)
\u4e00-\u9fa5 (Chinese)
\u3130-\u318F (Korean)
\uAC00-\uD7A3 (Korean)
\u0800-\u4e00 (Japanese)
ps: the Korean syllables (\uAC00-\uD7A3) are characters larger than [\u9fa5]

Regular expression examples:
preg_replace("/([\x80-\xff])/","",$str);
preg_replace("/([\x{4e00}-\x{9fa5}])/u","",$str);
(PCRE, which PHP uses, has no \u escape. Code points are written as \x{...} together with the /u modifier.)
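As a usage sketch for the two regular expressions: the first works on raw bytes and suits GBK-encoded input, while the second works on code points and needs UTF-8 input with the /u modifier. The sample strings and variable names are illustrative.

```php
<?php
$gbkBytes = "PHP\xbf\xaa\xb7\xa2";  // ASCII plus two double-byte sequences
$utf8 = "PHP\u{5F00}\u{53D1}";      // ASCII plus two CJK ideographs in UTF-8

// Byte-level: drop every byte in the 0x80-0xff range.
$asciiFromGbk = preg_replace('/[\x80-\xff]/', '', $gbkBytes);

// Code-point-level: drop every character in U+4E00..U+9FA5.
$asciiFromUtf8 = preg_replace('/[\x{4e00}-\x{9fa5}]/u', '', $utf8);

var_dump($asciiFromGbk);  // string(3) "PHP"
var_dump($asciiFromUtf8); // string(3) "PHP"
```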


http://www.bkjia.com/PHPjc/327931.html

source:php.cn