utf-8 - character encoding in php-PHP Tutorial-php.cn

WBOY

Release: 2023-03-01 21:04:02

<code>$str1 = "\xe4\xb8\xad";

$str2 = '\xe4\xb8\xad';

$str3 = '中';</code>

Can you explain in detail the difference between the three, and whether they can be converted into each other?

Reply content:

First time answering a question on segmentfault.

In PHP string literals, double quotes and single quotes mean different things.

Escape sequences are interpreted inside double quotes, but not inside single quotes.
Inside double quotes, text of the form $xxxx is replaced by the value of the corresponding variable. Single quotes have no such effect.

E.g.

$abc='123';
echo "$abc"; // prints 123
echo '$abc'; // prints $abc
echo "\n"; // prints a newline character
echo '\n'; // prints the two characters \n (a backslash and an n)

Back to the question.
The UTF-8 encoding of the Chinese character "中" is the three bytes 0xe4, 0xb8, 0xad.
So in a double-quoted string, \xe4\xb8\xad is unescaped into "中". The leading \x means that the next two characters are a byte value written in hexadecimal, much like the &#xe4; notation in HTML (although an HTML reference names a code point rather than a byte).
In a single-quoted string, \xe4\xb8\xad is output as-is.

If your environment uses UTF-8, str1 and str3 are equivalent: echoing either one outputs "中", and a byte-by-byte comparison of the three bytes shows they are identical. PHP stores strings as raw bytes, in whatever encoding they were written in.
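The equivalence of str1 and str3 can be checked byte by byte. The following is a minimal sketch, assuming the script file itself is saved as UTF-8; if it is saved in another encoding, the last two checks will give different results.

```php
<?php
// Assumption: this file is saved as UTF-8.
$str1 = "\xe4\xb8\xad"; // escape sequences produce the bytes E4 B8 AD
$str3 = '中';           // the bytes are whatever the editor wrote to disk

var_dump(strlen($str1));   // strlen() counts bytes, not characters
var_dump(bin2hex($str1));  // hex dump of the bytes in $str1
var_dump(bin2hex($str3));  // hex dump of the bytes in $str3
var_dump($str1 === $str3); // true only when the file is saved as UTF-8
```

bin2hex() is convenient here because it shows the stored bytes directly, independent of how the terminal or browser decodes them.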

If your environment encoding is not UTF-8 (for example GBK), str1 will basically display as garbled text, and str1 and str3 are no longer equivalent.

As for str2, it always outputs \xe4\xb8\xad (without quotation marks). In a single-quoted string, only the single quote itself (written \') and the backslash (written \\) need escaping; everything else is treated as ordinary characters.
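On the question of converting between the forms, here is one possible sketch. It uses the built-in stripcslashes() to turn the literal text in str2 into the bytes of str1, and bin2hex() with str_split() to go the other way. The variable names $decoded and $encoded are only illustrative.

```php
<?php
$str1 = "\xe4\xb8\xad"; // 3 bytes: E4 B8 AD
$str2 = '\xe4\xb8\xad'; // 12 literal characters

// str2 -> str1: stripcslashes() interprets C-style escapes, including \xHH.
$decoded = stripcslashes($str2);

// str1 -> str2: write each byte as a backslash, an x, and two hex digits.
$encoded = '\x' . implode('\x', str_split(bin2hex($str1), 2));

var_dump(strlen($str2));      // 12
var_dump($decoded === $str1); // true
var_dump($encoded === $str2); // true
```

Note that stripcslashes() also interprets other escapes such as \n and \t, so it is only a suitable choice when the input is known to be escaped text of this kind.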

I will only explain the difference between the first and the second, that is, the difference between double quotes and single quotes.

Double quotes: the content inside the quotes is escaped
Single quotes: the content inside the quotes is not escaped

$a = 123;

echo "output:$a";//output:123
echo 'output:$a';//output:$a

// The examples below apply only to php-cli on Linux
echo "new line\nsecond line";
/*
Breaks the line and outputs:
new line
second line
*/

echo 'no new line\n aaa';
/*
Does not break the line and outputs:
no new line\n aaa
*/

Only the escaping matters here; nothing else does. PHP itself does not distinguish character encodings. In other words, $str1 is a three-byte string whose bytes are (in hexadecimal) E4 B8 AD. Interpreted as UTF-8, that is the character 中. In other encodings it is not necessarily so.

$str2 is a 12-byte string consisting of exactly the characters you typed.

$str3 is a string whose bytes depend on how the file is saved. If you save the file as UTF-8, it is identical to $str1. If you save it as GBK, it is the two bytes D6 D0; if you save it as BIG5, it is A4 A4.

UTF-8, GBK, BIG5, and many other encodings are all ASCII-compatible, which means ASCII characters are encoded identically in each of them. So whichever encoding you save the file in, the PHP code itself is unaffected. Non-ASCII characters are a different matter: for them to display correctly, the encoding the file is saved in must match the encoding declared for the output. If the output is HTML, the encoding is declared through a meta tag or in the HTTP header. If the two are inconsistent, garbled characters appear.
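The byte values mentioned above can be reproduced by converting the UTF-8 bytes into the other encodings. This is a rough sketch, assuming the iconv extension is available and that the underlying iconv library recognizes the names GBK and BIG5; the character is written with escapes so that the encoding of the script file does not influence the result.

```php
<?php
// Assumption: the iconv extension is loaded and supports GBK and BIG5.
$utf8 = "\xe4\xb8\xad"; // 中 as UTF-8

$gbk  = iconv('UTF-8', 'GBK', $utf8);
$big5 = iconv('UTF-8', 'BIG5', $utf8);

echo bin2hex($utf8), "\n"; // e4b8ad
echo bin2hex($gbk), "\n";  // d6d0
echo bin2hex($big5), "\n"; // a4a4

// Converting back restores the original three bytes.
var_dump(iconv('GBK', 'UTF-8', $gbk) === $utf8); // true
```

When the output is an HTML page, the matching declaration can be sent from PHP with header('Content-Type: text/html; charset=UTF-8') before any output is produced.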