1. I wrote an HTML file in Notepad and saved it as UTF-16 with a header (BOM), but the file itself declares UTF-8 in its <meta charset>. How does the browser know which encoding to decode with? To read the declaration it first has to decode the file, yet the decoding method is written inside the file. How does the browser choose the correct one?
2. Why is <meta charset> placed in the head instead of at the very front of the document? Logically speaking, shouldn't the encoding be a lower-level matter than the document type?
I'd appreciate any guidance.
If your server cannot read the UTF-16 header, the result will be garbled.
If your server does understand the UTF-16 header, there is a next step:
The <meta charset> tells the server's HTML-parsing module which encoding to use for output.
In other words, when the user requests it, it will be sent in UTF8 encoding?
That's wrong. The following should be correct:
Your files will eventually be placed on the server.
Whether they are saved as UTF-16 or UTF-8 doesn't matter; the key is that your server can understand them.
When the server sends: it doesn't matter what encoding is sent; the key is that the browser can understand it.
The browser contains an HTML parser, which follows HTML syntax (similar to XML syntax).
For example:
<!DOCTYPE html><html><head><meta charset="UTF-8"><title>Insert title here</title></head>
What gets sent? Doesn't the server have a default value?
In PHP you can change it like this:
header('Content-Type: xx');  // e.g. text/html; charset=UTF-8
The client can only say "the data I am sending is in this format" and "these are the formats I can accept". It cannot tell the server which format to reply in.
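To make the server side concrete, here is a minimal Python sketch of the same idea as the PHP header() call. It assumes a hypothetical server that decodes the file as it is stored on disk and re-encodes it before sending; many static file servers instead send the stored bytes unchanged. The function names build_response and charset_from_content_type are made up for this illustration.

```python
from email.message import Message


def build_response(stored: bytes, stored_encoding: str = "utf-16",
                   send_encoding: str = "utf-8"):
    """Decode the file as stored on disk, then re-encode it for sending."""
    # Python's "utf-16" codec consumes a leading byte order mark if present.
    text = stored.decode(stored_encoding)
    body = text.encode(send_encoding)
    headers = {
        # This header, not the file on disk, declares the encoding on the wire.
        "Content-Type": f"text/html; charset={send_encoding}",
        "Content-Length": str(len(body)),
    }
    return headers, body


def charset_from_content_type(value: str):
    """Client side: read the charset parameter from a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()  # None when no charset is declared
```

With this sketch, the encoding used for storage and the encoding used for sending are independent: the first is an argument to decode, the second is whatever the header and the body encoding agree on.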
I think I understand: once the files are on the server, it doesn't matter what encoding they are saved in, and the user's browser cannot tell the server which encoding to send. But when the user requests a page, the server sends it in the output encoding declared by the document's <meta charset>, such as UTF-8, and the client then decodes it as UTF-8. Is that right?
No. The <meta charset> is used by the browser's HTML parser, which parses the document into DOM objects, that is, a DOM tree.
The encoding the server uses for sending is set in the server configuration, or from a server-side language (PHP, .NET, Java).
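How can the browser's parser read a <meta charset> before it knows the encoding? In ASCII-compatible encodings (UTF-8, Latin-1, GBK and so on) the characters of the tag are the same bytes, so a parser can scan the raw bytes for it. The Python sketch below illustrates this under simplifying assumptions: the regular expression is a rough stand-in for a real browser's prescan, and the 1024-byte limit and the names prescan_meta_charset and same_bytes_everywhere are choices made for this example.

```python
import re

TAG = '<meta charset="UTF-8">'


def same_bytes_everywhere(text: str, encodings) -> bool:
    """True if the text encodes to identical bytes in every listed encoding."""
    return len({text.encode(enc) for enc in encodings}) == 1


def prescan_meta_charset(raw: bytes, limit: int = 1024):
    """Look for a charset declaration in the first bytes, without decoding them."""
    head = raw[:limit]
    match = re.search(rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_-]+)',
                      head, re.IGNORECASE)
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()
```

Note that this only works for ASCII-compatible encodings. In a file saved as UTF-16 every ASCII character is padded with a zero byte, so the byte scan does not find the tag, and the encoding has to be recognised some other way, such as the byte order mark.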
I’m sorry to have troubled you for so long.
However, there is one thing I don’t understand:
Premise: I save the HTML as UTF-16 with a header (BOM), set the sending encoding to UTF-8, and the <meta charset> is also UTF-8.
Server: as you said, the server can understand this HTML, and we can set the sending encoding for it ourselves, which is UTF-8.
#My question is here#
Client browser:
Assumption 1: The client browser does not know the page's encoding before it reads the HTML, so it cannot read the HTML content, cannot know what the <meta charset> says, and the page comes out garbled.
Assumption 2: The client browser inspects the first bytes of the HTML document, finds EF BB BF (the UTF-8 byte order mark, which is 3 bytes long), and can therefore parse the file and read the <meta charset>. But by then the browser already knows the encoding, so the charset in the <meta> is redundant.
Neither case makes sense, does it?
If the browser doesn't know the HTML's encoding, it can't read the document content. If it can't read the content, it doesn't know what the meta says. If it doesn't know what the meta says, it doesn't know which encoding to decode with.
If the browser is to know the encoding, either it inspects the file directly and finds EF BB BF at the start, meaning UTF-8, or the server has already told the client that the encoding is UTF-8. If the encoding is already settled that way, why do we still need the charset in the <meta>?
I have only been learning HTML for a few days, so please forgive me if the question is too basic.
First, search for a binary (hex) text editor.
I found http://www.editpadpro.com/hexadecimal.html which should work.
Pay attention to the difference between text and binary.
-》The HTML is placed on the server and must first be read by the server.
The server reads the HTML file much as Notepad does. What it reads is a byte string, and the header you mentioned is the beginning of that byte string. Because the program can read this header, it converts the bytes with the matching conversion routine into text we can understand.
-》The web server sends it. It also sends its headers first, which you can inspect with web debugging tools.
The browser examines the headers and finds that what follows is HTML, so it hands the content to its HTML parser.
-》The HTML parser, in turn, has its own fixed syntax.
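The "header of the byte string" and the order in which the hints are consulted can be sketched in Python as follows. This is a simplified illustration, not a browser's actual algorithm: it assumes the rough precedence of byte order mark first, then the charset from the HTTP Content-Type header, then the <meta charset>, then a default. The names sniff_bom and choose_encoding and the windows-1252 fallback are assumptions made for this example (real browsers pick a default that depends on locale).

```python
import codecs

# Byte order marks and the Python codec name each one implies.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading byte order mark, if any."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def choose_encoding(data: bytes, header_charset=None, meta_charset=None,
                    fallback: str = "windows-1252"):
    """Pick an encoding: BOM first, then HTTP header, then meta, then default."""
    return sniff_bom(data) or header_charset or meta_charset or fallback
```

Under these assumptions, a file saved with a UTF-16 header and sent unchanged would be decoded as UTF-16 even if both the HTTP header and the <meta charset> say UTF-8. The <meta charset> only decides the outcome when neither a byte order mark nor a header charset is present, which is the case for a file opened straight from disk without a BOM.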
Thank you for such a patient reply :)