There are plenty of articles about detecting file encoding on Baidu, but none of the ones I found actually works. Many people suggest mb_detect_encoding, but for some reason it did not work for me: nothing was output. I also saw someone post an "enhanced version" that relies on the BOM, which I dismissed outright, since many files carry no BOM at all and the approach is completely unreliable. In the end, I wrote a detection function based on an example in the notes under mb_detect_encoding in the PHP manual.
Also included is a function, with source code, that automatically detects the encoding, reads the file, and converts the content to the specified encoding.
The code is as follows:
/**
* Detect file encoding
* @param string $file file path
* @return string|null returns encoding name or null
*/
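function detect_encoding($file) {
    // Candidate encodings, tried in order; the first match wins.
    $list = array('GBK', 'UTF-8', 'UTF-16LE', 'UTF-16BE', 'ISO-8859-1');
    $str = file_get_contents($file);
    if ($str === false) {
        return null; // the file could not be read
    }
    foreach ($list as $item) {
        // Convert from an encoding to itself: bytes that are invalid in that
        // encoding get replaced, so the result no longer matches the input.
        $tmp = mb_convert_encoding($str, $item, $item);
        // Strict comparison avoids PHP's loose "numeric string" matching of hashes.
        if (md5($tmp) === md5($str)) {
            return $item;
        }
    }
    return null;
}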
/**
* Automatically parse the encoding and read the file
* @param string $file file path
* @param string $charset target encoding of the returned content
* @return string Returns the read content
*/
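function auto_read($file, $charset='UTF-8') {
    // Candidate encodings, tried in order; the first match wins.
    $list = array('GBK', 'UTF-8', 'UTF-16LE', 'UTF-16BE', 'ISO-8859-1');
    $str = file_get_contents($file);
    if ($str === false) {
        return ""; // the file could not be read
    }
    foreach ($list as $item) {
        // Same check as in detect_encoding(): the content is unchanged by a
        // conversion from $item to $item only if it is valid in $item.
        $tmp = mb_convert_encoding($str, $item, $item);
        if (md5($tmp) === md5($str)) {
            // Convert from the detected encoding to the requested one.
            return mb_convert_encoding($str, $charset, $item);
        }
    }
    return "";
}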
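
The check used by both functions can also be tried on strings directly, without touching any files. The following is only a minimal sketch, not part of the original code: detect_string_encoding is a hypothetical helper name, and it swaps the md5 comparison for mbstring's mb_check_encoding, which reports whether the bytes are valid in a given encoding. The candidate list here is also an assumption chosen for the demonstration.

```php
<?php
// Hypothetical helper (not from the original post): detect the encoding of an
// in-memory string by testing each candidate with mb_check_encoding().
function detect_string_encoding($str, array $candidates) {
    foreach ($candidates as $encoding) {
        // mb_check_encoding() returns true when $str is valid in $encoding.
        if (mb_check_encoding($str, $encoding)) {
            return $encoding;
        }
    }
    return null; // no candidate matched
}

// UTF-8 is listed before ISO-8859-1 because every byte string is valid
// ISO-8859-1, so that candidate would otherwise always win.
$candidates = array('UTF-8', 'ISO-8859-1');

var_dump(detect_string_encoding("abc", $candidates));      // plain ASCII text
var_dump(detect_string_encoding("\xE9", $candidates));     // lone byte, invalid as UTF-8
var_dump(detect_string_encoding("\xE9", array('UTF-8')));  // no candidate fits
```

Whichever check is used, the order of the candidates matters: the first encoding in which the bytes happen to be valid is returned, so permissive encodings are best placed at the end of the list.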