One. Encoding of the PHP web page
1. The encoding of the PHP file itself and the encoding declared for the web page should match.
a. To use gb2312 encoding, PHP should output the header header("Content-Type: text/html; charset=gb2312"), and static pages should include <meta http-equiv="Content-Type" content="text/html; charset=gb2312">. All files should be saved in ANSI format: open each file in Notepad, choose Save As, select ANSI as the encoding, and overwrite the source file.
b. To use utf-8 encoding, PHP should output the header header("Content-Type: text/html; charset=utf-8"), and static pages should include <meta http-equiv="Content-Type" content="text/html; charset=utf-8">. All files should be saved as utf-8. Saving as utf-8 can be a bit troublesome, because utf-8 files often have a BOM at the beginning. The BOM is sent to the browser as output before any PHP code runs, so anything that needs to send headers, such as sessions, will have problems. EditPlus can save the file without it: go to Tools -> Preferences -> Files -> UTF-8 signature, select Always remove, then save the file to strip the BOM.
2. PHP strings are byte strings, not Unicode, so byte-oriented functions such as substr must be replaced with their multibyte counterparts such as mb_substr (the mbstring extension must be installed); iconv can also be used to transcode.
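The two points above (the BOM at the start of a utf-8 file, and substr versus mb_substr) can be illustrated with a minimal sketch. It assumes the script itself is saved as utf-8 without a BOM and that the mbstring extension is loaded; strip_utf8_bom() is a hypothetical helper name, not a built-in function.

```php
<?php
// Assumption: this file is saved as UTF-8 without BOM and mbstring is loaded.

/** Hypothetical helper: remove a leading UTF-8 byte order mark (EF BB BF). */
function strip_utf8_bom(string $text): string
{
    $bom = "\xEF\xBB\xBF";
    return strncmp($text, $bom, 3) === 0 ? substr($text, 3) : $text;
}

$withBom = "\xEF\xBB\xBF" . "你好，世界";
$clean   = strip_utf8_bom($withBom);

// substr() counts bytes. Each of these characters takes 3 bytes in UTF-8,
// so cutting after 4 bytes splits the second character in half.
$byteCut = substr($clean, 0, 4);

// mb_substr() counts characters when it is told the encoding.
$charCut = mb_substr($clean, 0, 2, 'UTF-8');

echo strlen($clean), " bytes, ", mb_strlen($clean, 'UTF-8'), " characters\n";
echo $charCut, "\n";
echo mb_check_encoding($byteCut, 'UTF-8') ? "valid" : "invalid",
     " UTF-8 after substr()\n";
```

Stripping the BOM at run time only helps for text that is read in as data; a BOM in the PHP source file itself still has to be removed in the editor as described above.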
Two. Data interaction between PHP and MySQL
The encoding used by PHP and the encoding of the database should be consistent.
1. Modify the MySQL configuration file my.ini or my.cnf. It is best to use utf8 encoding for MySQL:
[mysql]
default-character-set=utf8
[mysqld]
default-character-set=utf8
default-storage-engine=MyISAM
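Also add under [mysqld] (MySQL 5.5 and later no longer accept default-character-set and default-collation in this section; use character-set-server and collation-server there instead):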
default-collation=utf8_bin
init_connect='SET NAMES utf8'
2. Before the PHP program performs any database operations, call mysql_query("SET NAMES 'encoding'"); where the encoding matches the PHP encoding. If the PHP encoding is gb2312, set the MySQL encoding to gb2312; if it is utf-8, set it to utf8. Data can then be inserted and retrieved without garbled characters.
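As a rough sketch of keeping the connection encoding in step with the page encoding: the mysql_* functions used in this article were removed in PHP 7, so the example assumes the mysqli extension instead. mysql_charset_for() and connect_with_charset() are hypothetical helper names, and the host, user, password and database name are placeholders.

```php
<?php
// Hypothetical helper: map the charset name used in the page header to the
// name MySQL expects. Only the encodings discussed in this article are listed.
function mysql_charset_for(string $pageCharset): string
{
    $map = [
        'utf-8'  => 'utf8',
        'utf8'   => 'utf8',
        'gb2312' => 'gb2312',
        'gbk'    => 'gbk',
    ];
    $key = strtolower(trim($pageCharset));
    if (!isset($map[$key])) {
        throw new InvalidArgumentException("Unsupported charset: $pageCharset");
    }
    return $map[$key];
}

// Sketch of a connection that uses the helper. The credentials below are
// placeholders, so this function is defined here but not called.
function connect_with_charset(string $pageCharset): mysqli
{
    $db = new mysqli('localhost', 'user', 'password', 'example_db');
    // mysqli::set_charset() plays the role of "SET NAMES" for this connection.
    $db->set_charset(mysql_charset_for($pageCharset));
    return $db;
}

echo mysql_charset_for('UTF-8'), "\n";
```

Keeping the mapping in one place means the page header and the connection encoding are derived from the same setting, which is the consistency this section asks for.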
Three. PHP and the operating system
Windows and Linux use different encodings. In a Windows environment, PHP functions that take a file name, such as move_uploaded_file(), filesize() and readfile(), fail when the argument is utf-8 encoded. These functions are used often when handling uploads and downloads, and calling them may produce the following errors:
Warning: move_uploaded_file() [function.move-uploaded-file]: failed to open stream: Invalid argument in ...
Warning: move_uploaded_file() [function.move-uploaded-file]: Unable to move '' to '' in ...
Warning: filesize() [function.filesize]: stat failed for ... in ...
Warning: readfile() [function.readfile]: failed to open stream: Invalid argument in ...
On Linux these errors do not occur when gb2312 encoding is used, but the saved file name comes out garbled and the file cannot be read. In this case, first convert the parameter into the encoding the operating system recognizes. The conversion can be done with mb_convert_encoding(string, new encoding, original encoding) or iconv(original encoding, new encoding, string). The saved file name is then no longer garbled and the file can be read normally, so files with Chinese names can be uploaded and downloaded.
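The conversion described above might look like the following sketch. It assumes the mbstring extension is loaded, that names arrive as utf-8, and that the file system expects GBK (an assumption that fits a Chinese-locale Windows server, not every system). to_fs_name() and from_fs_name() are hypothetical helper names.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is loaded.

/** Hypothetical helper: convert a UTF-8 name to the file system encoding. */
function to_fs_name(string $utf8Name, string $fsEncoding = 'GBK'): string
{
    // mb_convert_encoding(string, new encoding, original encoding)
    return mb_convert_encoding($utf8Name, $fsEncoding, 'UTF-8');
}

/** Hypothetical helper: convert a file system name back to UTF-8. */
function from_fs_name(string $fsName, string $fsEncoding = 'GBK'): string
{
    return mb_convert_encoding($fsName, 'UTF-8', $fsEncoding);
}

$original = '你好.txt';
$onDisk   = to_fs_name($original);

// The converted name is what would be passed to move_uploaded_file(),
// filesize() or readfile(); here we only inspect its bytes.
echo bin2hex($onDisk), "\n";
echo from_fs_name($onDisk), "\n";
```

The same conversion could be written with iconv('UTF-8', 'GBK', $name); which one to use mostly depends on which extension is available.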
In fact, there is a better solution: decouple from the system entirely, so that its encoding no longer matters. Generate a sequence of letters and digits only as the file name, and save the original name containing Chinese characters in the database. Calling move_uploaded_file() then causes no problems, and when downloading you only need to change the file name back to the original Chinese name. The code to implement the download is as follows:
header("Pragma: public");
header("Expires: 0");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header("Content-Type: $file_type");
header("Content-Length: $file_size");
// The inner quotes must be escaped so the original name is sent as the download name
header("Content-Disposition: attachment; filename=\"$file_name\"");
header("Content-Transfer-Encoding: binary");
// Send the stored file to the client
readfile($file_path);
$file_type is the MIME type of the file, $file_size is its size in bytes, $file_name is the original name, and $file_path is the path where the file is saved on the server.
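The upload side of this approach, generating a name made only of letters and digits, could be sketched as follows. make_storage_name() is a hypothetical helper; it assumes PHP 7 or later for random_bytes(), and the sample file name is made up for illustration.

```php
<?php
/**
 * Hypothetical helper: build a storage name from letters and digits only.
 * The extension is kept only when it is plain ASCII.
 */
function make_storage_name(string $originalName): string
{
    $name = bin2hex(random_bytes(16));   // 32 hexadecimal characters

    $dot = strrpos($originalName, '.');
    $ext = $dot === false ? '' : strtolower(substr($originalName, $dot + 1));
    if (preg_match('/^[a-z0-9]{1,10}$/', $ext) === 1) {
        $name .= '.' . $ext;
    }
    return $name;
}

$originalName = '年度报告.pdf';                      // saved in the database
$storedName   = make_storage_name($originalName);   // used on disk

echo $storedName, "\n";
```

Because the stored name never contains Chinese characters, it can be passed to move_uploaded_file() and readfile() unchanged on either Windows or Linux, and only the database row has to carry the original name.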
Four. Let's summarize why garbled characters appear.
Generally speaking, garbled characters have two causes. The first is an incorrect encoding (charset) setting, which makes the browser parse the page with the wrong encoding and fills the screen with unreadable "heavenly script". The second is a file being opened with the wrong encoding and then saved; for example, a text file originally encoded in GB2312 is opened as UTF-8 and then saved. To solve these problems, you first need to know which aspects of development involve encoding:
1. File encoding: the encoding in which the page file (.html, .php, etc.) itself is saved. Notepad and Dreamweaver detect the file encoding automatically when opening a page, so they cause fewer problems. ZendStudio, however, does not detect the encoding automatically; it opens the file in whatever encoding is configured in its preferences. If you accidentally open a file with the wrong encoding while working and save it after making changes, garbled characters will appear (I learned this the hard way).
2. Page declaration encoding: in the HEAD of the HTML code, you can use <meta http-equiv="Content-Type" content="text/html; charset=XXX" /> to tell the browser what encoding the page uses. Chinese website development currently mainly uses GB2312 and UTF-8.
3. Database connection encoding: the encoding used to transmit data to the database when performing database operations. Be careful not to confuse it with the encoding of the database itself. For example, MySQL's internal default encoding is latin1, which means that MySQL stores data in latin1, and data transmitted to MySQL in other encodings is converted into latin1.
Once you understand where encodings are involved in web development, the cause of garbled characters is clear: the three encoding settings above are inconsistent. Since most encodings are ASCII-compatible, English letters and symbols are unaffected, while Chinese characters are out of luck.
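The effect of mismatched encodings can be reproduced in a few lines. This is only an illustrative sketch: it assumes the script is saved as utf-8 and that mbstring is loaded, and misread_as() is a hypothetical helper that decodes bytes with the wrong encoding, which is roughly what a browser does when the declared charset does not match the real one.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is loaded.

/** Hypothetical helper: decode bytes using the wrong encoding. */
function misread_as(string $bytes, string $wrongEncoding): string
{
    // Interpret the bytes as $wrongEncoding, then re-encode as UTF-8 for display.
    return mb_convert_encoding($bytes, 'UTF-8', $wrongEncoding);
}

$ascii   = 'Hello, PHP';
$chinese = '你好';

// ASCII bytes mean the same thing in GBK, so this text survives.
echo misread_as($ascii, 'GBK'), "\n";

// UTF-8 bytes of Chinese text decoded as GBK turn into other characters.
echo misread_as($chinese, 'GBK'), "\n";
```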
Five. Some common error situations and solutions:
1. The database uses UTF8 encoding and the page declares GB2312. This is the most common cause of garbled characters. Data SELECTed directly in the PHP script comes out garbled, so before querying you need to call mysql_query("SET NAMES GBK"); to set the MySQL connection encoding, making sure that the page declaration encoding matches the connection encoding set here (GBK is an extension of GB2312). If the page is UTF-8 encoded, use mysql_query("SET NAMES UTF8");
Note that it is UTF8, not the commonly written UTF-8. If the encoding declared by the page matches the internal encoding of the database, you do not need to set the connection encoding.
Note: MySQL data input and output are actually more complicated than described above. The MySQL configuration file my.ini defines two default encodings, default-character-set in [client] and default-character-set in [mysqld], which set the default encoding used by client connections and the encoding used internally by the database, respectively. The encoding specified with SET NAMES above actually sets character_set_client for the connection (along with character_set_connection and character_set_results), which tells the MySQL server what encoding the data it receives from the client is in, so the default encoding is not used.
2. The page declaration encoding is inconsistent with the encoding of the file itself. This rarely happens at creation time, because if the encodings were inconsistent, the garbled text would already be visible in the browser while the page was being made. More often it appears after publishing, when a small bug is fixed by opening the page in the wrong encoding and saving it, or when the file is edited directly online with FTP software such as CuteFTP and converted to the wrong encoding because the software's encoding is misconfigured.
3. Some people who rent virtual hosts find that garbled characters still appear even though the three encodings above are set correctly. For example, a web page encoded in GB2312 is always recognized as UTF-8 when opened in browsers such as IE, even though the HEAD of the page declares GB2312; after the browser encoding is manually changed to GB2312, the page displays normally. The reason is that the Apache server sets a global default encoding through AddDefaultCharset UTF-8 in httpd.conf. The server then sends this encoding in the HTTP header, which takes priority over the encoding declared in the page, so the browser naturally gets it wrong. There are two solutions: ask the administrator to add AddDefaultCharset GB2312 to the configuration of your own virtual host to override the global setting, or configure it yourself in .htaccess in your own directory.
Summary: in a word, the best and fastest way to solve garbled Chinese in PHP is to make the encoding declared by the page consistent with the internal encoding of the database. If the encoding declared by the page is inconsistent with the internal encoding of the database, set the connection encoding with mysql_query("SET NAMES XXX");
Passing Chinese parameters directly in a URL (for example "...&b=你好") will cause an internal error.
Solution: "test.php?a=".urlencode("你好")."&b=".urlencode("你好")
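A minimal sketch of the urlencode() solution, assuming the script is saved as utf-8 (the percent-encoded bytes depend on the encoding of the source string). test.php and the parameter names a and b come from the example above; http_build_query() is shown as an alternative that encodes a whole array at once.

```php
<?php
// Assumption: this file is saved as UTF-8.
$params = ['a' => '你好', 'b' => '你好'];

// Encode each value by hand ...
$manual = 'test.php?a=' . urlencode($params['a'])
        . '&b=' . urlencode($params['b']);

// ... or let http_build_query() encode the whole array.
$built = 'test.php?' . http_build_query($params, '', '&');

echo $manual, "\n";
echo $built, "\n";

// PHP decodes $_GET values automatically on the receiving side;
// urldecode() is used here only to show the round trip.
echo urldecode('%E4%BD%A0%E5%A5%BD'), "\n";
```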