
PHP encoding conversion function: automatic character set conversion with array support

WBOY
Release: 2016-07-21 15:14:55

The code is as follows:

// Automatically convert between character sets; arrays are converted recursively
function auto_charset($fContents, $from = 'gbk', $to = 'utf-8') {
    // Normalize the alias "UTF8" to "utf-8"
    $from = strtoupper($from) == 'UTF8' ? 'utf-8' : $from;
    $to = strtoupper($to) == 'UTF8' ? 'utf-8' : $to;
    if (strtoupper($from) === strtoupper($to) || empty($fContents) || (is_scalar($fContents) && !is_string($fContents))) {
        // Do not convert if the encodings match, the value is empty, or it is a non-string scalar
        return $fContents;
    }
    if (is_string($fContents)) {
        if (function_exists('mb_convert_encoding')) {
            return mb_convert_encoding($fContents, $to, $from);
        } elseif (function_exists('iconv')) {
            return iconv($from, $to, $fContents);
        } else {
            // Neither extension is available: return the input unchanged
            return $fContents;
        }
    } elseif (is_array($fContents)) {
        foreach ($fContents as $key => $val) {
            // Convert the key as well as the value
            $_key = auto_charset($key, $from, $to);
            $fContents[$_key] = auto_charset($val, $from, $to);
            // Remove the old entry only if the key actually changed
            if ($key !== $_key) {
                unset($fContents[$key]);
            }
        }
        return $fContents;
    } else {
        // Objects, resources and null are returned unchanged
        return $fContents;
    }
}
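
A short usage sketch may help show what the array support means in practice. To keep the snippet self-contained it uses a compact stand-in, convert_recursive() (a hypothetical name, not part of the original code), and it assumes the mbstring extension is loaded. auto_charset() above is intended to treat this kind of input in the same way.

```php
<?php
// Hypothetical compact stand-in for auto_charset(); assumes mbstring is loaded.
function convert_recursive($data, string $from = 'GBK', string $to = 'UTF-8')
{
    if (is_string($data)) {
        return mb_convert_encoding($data, $to, $from);
    }
    if (is_array($data)) {
        $out = [];
        foreach ($data as $key => $value) {
            // Array keys can be strings too, so they are converted as well.
            $newKey = is_string($key) ? mb_convert_encoding($key, $to, $from) : $key;
            $out[$newKey] = convert_recursive($value, $from, $to);
        }
        return $out;
    }
    return $data; // ints, floats, bools, null and objects are returned unchanged
}

// "\xD6\xD0\xCE\xC4" is the GBK byte sequence for the characters U+4E2D U+6587.
$gbkWord = "\xD6\xD0\xCE\xC4";
$input   = ['title' => $gbkWord, 'tags' => [$gbkWord, 'php'], 'id' => 42];
$output  = convert_recursive($input);
```

Nested arrays are walked recursively, strings are transcoded, and the integer under 'id' passes through untouched.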

When a server accepts data from clients it does not control, those clients may not all use the same encoding, yet the server ultimately has to process everything in a single encoding. The received strings therefore have to be converted to that encoding.
The obvious approach is to transcode with iconv, but iconv requires both the input encoding and the output encoding, and we do not know which encoding the received string uses. What we need is a way to determine the encoding of the received data.
There are generally two solutions to this problem.

Option 1
Require the client to state the encoding it used when it submits data. This needs an extra variable that carries the encoding name.
$string = $_GET['charset'] === 'gbk' ? iconv('gbk', 'utf-8', $_GET['str']) : $_GET['str'];
This only works when there is an agreement with the client. If there is no such agreement, or we cannot control the client, this option is not practical.
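
Because the declared encoding comes from the client, it is probably safer to look it up in a whitelist than to pass it straight to iconv. The following is a minimal sketch under that assumption; convert_declared() and ALLOWED_CHARSETS are hypothetical names introduced here, and the iconv extension is assumed to be available.

```php
<?php
// Sketch only: convert_declared() and ALLOWED_CHARSETS are hypothetical names.
// Assumes the iconv extension is available.
const ALLOWED_CHARSETS = ['gbk' => 'GBK', 'gb2312' => 'GB2312', 'big5' => 'BIG5'];

function convert_declared(string $value, ?string $declared, string $to = 'UTF-8'): string
{
    $key = strtolower(trim((string) $declared));
    if (!isset(ALLOWED_CHARSETS[$key])) {
        // No declaration, or one we do not recognise: assume the value is already in $to.
        return $value;
    }
    // iconv() returns false (and raises a notice) on invalid input; fall back to the raw value.
    $converted = @iconv(ALLOWED_CHARSETS[$key], $to, $value);
    return $converted === false ? $value : $converted;
}

$str = convert_declared($_GET['str'] ?? '', $_GET['charset'] ?? null);
```

An unrecognised or missing charset value simply leaves the input as it was, so a malformed request cannot select an arbitrary conversion.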

Option 2
The server detects the encoding of the received data itself.
This is the ideal solution. The question is how to detect the encoding of a string. In PHP, mb_check_encoding from the mbstring extension provides what we need: it reports whether a string is valid in a given encoding.
$str = mb_check_encoding($_GET['str'], 'gbk') ? iconv('gbk', 'utf-8', $_GET['str']) : $_GET['str'];
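
One caveat is worth keeping in mind: a validity check is not the same as detection, because some byte sequences are valid in more than one encoding. The UTF-8 bytes of "é" (C3 A9), for example, also form a valid GBK double-byte character. Checking for UTF-8 first tends to be more reliable, since UTF-8 has stricter multi-byte rules. The sketch below assumes mbstring is loaded and that clients send either UTF-8 or GBK; to_utf8_checked() is a hypothetical helper name.

```php
<?php
// Sketch only: to_utf8_checked() is a hypothetical helper. It assumes mbstring
// is loaded and that the input is either UTF-8 or GBK.
function to_utf8_checked(string $value): string
{
    // Test UTF-8 first: its multi-byte rules are strict, so GBK text rarely passes.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    if (mb_check_encoding($value, 'GBK')) {
        return mb_convert_encoding($value, 'UTF-8', 'GBK');
    }
    return $value; // neither: left untouched for the caller to reject or log
}

$str = to_utf8_checked($_GET['str'] ?? '');
```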
However, this requires the mbstring extension, which is not always enabled on production servers. In that case, the following functions can be used to determine the encoding instead.
The following functions were not written by me.
The code is as follows:

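function isGb2312($string) {
    $len = strlen($string);
    for ($i = 0; $i < $len; $i++) {
        $v = ord($string[$i]);
        if ($v > 127) {
            // 228-233 (0xE4-0xE9) are the UTF-8 lead bytes of the CJK Unified Ideographs
            if (($v >= 228) && ($v <= 233)) {
                // Fewer than two bytes follow, so this cannot be a 3-byte UTF-8 sequence
                if (($i + 2) >= $len) return true;
                $v1 = ord($string[$i + 1]);
                $v2 = ord($string[$i + 2]);
                // Two continuation bytes (0x80-0xBF) follow: treat the string as UTF-8
                if (($v1 >= 128) && ($v1 <= 191) && ($v2 >= 128) && ($v2 <= 191))
                    return false;
                else
                    return true;
            }
        }
    }
    return true;
}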
function isUtf8($string) {
    return preg_match('%^(?:
          [\x09\x0A\x0D\x20-\x7E]            # ASCII
        | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
        | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
        | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
        | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
    )*$%xs', $string);
}

Either of the functions above can be used to detect the encoding, after which the string can be converted to the target encoding.
$str = isGb2312($_GET['str']) ? iconv('gbk', 'utf-8', $_GET['str']) : $_GET['str'];
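
As an alternative to the hand-written checks, PCRE itself can validate UTF-8 when mbstring is missing: with the u modifier, preg_match() returns false if the subject is not valid UTF-8. The sketch below swaps in that technique; it is not the method used by the functions above. to_utf8_fallback() is a hypothetical name, and the code assumes that PCRE has UTF-8 support, that iconv is available, and that non-UTF-8 input is GBK.

```php
<?php
// Sketch only: to_utf8_fallback() is a hypothetical helper. It assumes PCRE has
// UTF-8 support and that iconv is available; mbstring is not used.
function to_utf8_fallback(string $value, string $assumedLegacy = 'GBK'): string
{
    // With the u modifier, preg_match() returns false for subjects that are not valid UTF-8.
    if (preg_match('//u', $value) === 1) {
        return $value;
    }
    // Not valid UTF-8: assume the legacy encoding and convert.
    $converted = @iconv($assumedLegacy, 'UTF-8', $value);
    return $converted === false ? $value : $converted;
}

$str = to_utf8_fallback($_GET['str'] ?? '');
```

The legacy encoding is still a guess here, so this only suits deployments where non-UTF-8 clients are known to send GBK.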

source:php.cn