What is a BOM header?
In a UTF-8 encoded file, the BOM (byte order mark) sits at the very beginning of the file and occupies three bytes (EF BB BF). It marks the file as UTF-8 encoded. Many programs recognize the BOM, but some do not. PHP, for example, does not recognize it and treats the three bytes as ordinary output, which is why errors appear after a UTF-8 PHP file has been edited and saved with Notepad.
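To make the three-byte signature concrete, here is a minimal sketch that checks whether a string starts with the bytes EF BB BF. The function name hasUtf8Bom is hypothetical and chosen only for illustration, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of the original script):
// returns true when $contents begins with the UTF-8 BOM bytes EF BB BF.
function hasUtf8Bom(string $contents): bool
{
    // strncmp compares at most the first 3 bytes of both strings.
    return strncmp($contents, "\xEF\xBB\xBF", 3) === 0;
}

var_dump(hasUtf8Bom("\xEF\xBB\xBF<?php echo 1;")); // bool(true)
var_dump(hasUtf8Bom("<?php echo 1;"));             // bool(false)
```

A byte-wise comparison is used here because the BOM is a fixed byte sequence, so no multibyte string functions are needed.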
The code for batch removal of BOM headers is as follows:
<?php
// Directory to scan: taken from ?dir=..., defaulting to the current directory.
if (isset($_GET['dir'])) {
    $basedir = $_GET['dir'];
} else {
    $basedir = '.';
}

$auto = 1; // 1 = remove any BOM that is found; otherwise only report it

checkdir($basedir);

// Recursively walk $basedir and check every file for a BOM.
function checkdir($basedir)
{
    if ($dh = opendir($basedir)) {
        while (($file = readdir($dh)) !== false) {
            if ($file != '.' && $file != '..') {
                if (!is_dir($basedir . "/" . $file)) {
                    echo "filename: $basedir/$file " . checkBOM("$basedir/$file") . " <br>";
                } else {
                    $dirname = $basedir . "/" . $file;
                    checkdir($dirname);
                }
            }
        }
        closedir($dh);
    }
}

// Check whether $filename starts with the UTF-8 BOM (bytes 239 187 191)
// and, when $auto is 1, rewrite the file without it.
function checkBOM($filename)
{
    global $auto;
    $contents = file_get_contents($filename);
    $charset[1] = substr($contents, 0, 1);
    $charset[2] = substr($contents, 1, 1);
    $charset[3] = substr($contents, 2, 1);
    if (ord($charset[1]) == 239 && ord($charset[2]) == 187 && ord($charset[3]) == 191) {
        if ($auto == 1) {
            $rest = substr($contents, 3);
            rewrite($filename, $rest);
            return ("<font color=red>BOM found, automatically removed._<a href=http://www.joyphper.net>http://www.joyphper.net</a></font>");
        } else {
            return ("<font color=red>BOM found.</font>");
        }
    } else {
        return ("BOM Not Found.");
    }
}

// Overwrite $filename with $data while holding an exclusive lock.
function rewrite($filename, $data)
{
    $filenum = fopen($filename, "w");
    flock($filenum, LOCK_EX);
    fwrite($filenum, $data);
    fclose($filenum);
}
?>
PS: There are two simple ways to remove the BOM by hand:
1. How to remove the BOM header with EditPlus
When an editor is set to save files as UTF-8, it may put a hidden byte sequence (the BOM) at the front of the saved file. Editors use it to recognize that the file is UTF-8 encoded.
Run EditPlus, click Tools, select Preferences, select Files, and set the UTF-8 signature option to always remove the signature.
After that, PHP files that you edit and save will no longer have a BOM.
2. How to remove the BOM header with UltraEdit
Open the file, choose Save As, select the UTF-8 encoding without a BOM header in the encoding options, and confirm.
As you can see, removing the BOM is easy.
More about the UTF-8 BOM
A BOM here means that the PHP file itself was saved as UTF-8 with a BOM. Garbled Chinese text on ordinary pages is generally not caused by the BOM.
header("Content-type: text/html; charset=utf-8");
This statement controls the character encoding declared for the HTML output page.
A BOM typically appears when a file is saved as UTF-8 with Notepad on Windows. You can use WinHex to delete the first 3 bytes (EF BB BF).
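The same trimming can be done in PHP instead of a hex editor. The following is a minimal sketch, assuming the UTF-8 BOM is the three bytes EF BB BF. The name stripUtf8Bom is hypothetical, and the code assumes PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of the original script):
// returns $contents without a leading UTF-8 BOM (EF BB BF), if one is present.
function stripUtf8Bom(string $contents): string
{
    if (strncmp($contents, "\xEF\xBB\xBF", 3) === 0) {
        // Drop exactly the three BOM bytes and keep everything after them.
        return substr($contents, 3);
    }
    return $contents; // no BOM: return the input unchanged
}

$clean = stripUtf8Bom("\xEF\xBB\xBFhello");
echo strlen($clean), "\n"; // 5
```

Checking for the BOM before cutting matters: removing the first bytes unconditionally would damage files that never had a BOM.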
The encoding settings in Dreamweaver let you choose whether to include a BOM. Generally, as long as the PHP output is not an image (GDI stream), the BOM will not cause problems.
An image stream with extra characters at the beginning is displayed as a red cross (the broken-image icon).
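To illustrate why extra leading bytes break an image, here is a small sketch that uses PNG as an assumed example format. A PNG stream must begin with a fixed 8-byte signature, so a BOM placed in front of it makes the data unrecognizable. The function name startsWithPngSignature and the placeholder image data are hypothetical.

```php
<?php
// Hypothetical helper (illustration only):
// returns true when $bytes begins with the 8-byte PNG signature.
function startsWithPngSignature(string $bytes): bool
{
    return strncmp($bytes, "\x89PNG\r\n\x1a\n", 8) === 0;
}

// Placeholder payload: only the signature is real, the rest is dummy data.
$png = "\x89PNG\r\n\x1a\n" . "...image data...";

var_dump(startsWithPngSignature($png));                  // bool(true)
var_dump(startsWithPngSignature("\xEF\xBB\xBF" . $png)); // bool(false)
```

The second call fails because the three BOM bytes shift the signature away from the start of the stream, which is the situation a BOM in the PHP source file produces.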