Overview of the development process of the HTTP protocol

This article gives an overview of the HTTP protocol and how it has evolved. I hope it is a useful reference.

Here I only briefly organize some knowledge to help my own understanding and memory. It is far from complete; for more detail, consult books or other articles.

The development process of the HTTP protocol

HTTP is an application layer protocol built on top of TCP/IP. It does not deal with how packets are transmitted; it mainly specifies the communication format between the client and the server. Port 80 is used by default.

HTTP/0.9

Released in 1991, this version has only one command, GET. The protocol stipulates that the server can only respond with a string in HTML format and cannot respond in any other format.

HTTP/1.0

HTTP/1.0 was released in May 1996 and greatly expanded the protocol. First, content in any format can be sent. This allows the Internet to transmit not only text, but also images, videos, and binary files, which laid the foundation for the great development of the Internet. In addition to the GET command, the POST and HEAD commands were introduced, enriching the means of interaction between the browser and the server.

The format of HTTP requests and responses has also changed. In addition to the data part, each communication must include header information (HTTP header) to describe some metadata.
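To make the "header information plus data" layout concrete, here is a minimal sketch in JavaScript (Node.js) that splits a raw response into its status line, headers, and body. The function name `parseHttpResponse` and the sample message are made up for illustration; a real parser has to handle many more cases.

```javascript
// Split a raw HTTP/1.x response into status line, headers, and body.
// Illustrative only: real parsers also handle chunked bodies, repeated headers, etc.
function parseHttpResponse(raw) {
  // The header section and the body are separated by one empty line (CRLF CRLF).
  const separator = raw.indexOf("\r\n\r\n");
  const head = separator === -1 ? raw : raw.slice(0, separator);
  const body = separator === -1 ? "" : raw.slice(separator + 4);

  const [statusLine, ...headerLines] = head.split("\r\n");
  const [version, code, ...reasonParts] = statusLine.split(" ");

  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // skip malformed lines
    // Header names are case-insensitive, so normalize them to lower case.
    headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
  }
  return { version, status: Number(code), reason: reasonParts.join(" "), headers, body };
}

// Hypothetical sample message, written out by hand.
const sampleResponse =
  "HTTP/1.0 200 OK\r\n" +
  "Content-Type: text/html\r\n" +
  "Content-Length: 13\r\n" +
  "\r\n" +
  "<p>Hello</p>\n";

console.log(parseHttpResponse(sampleResponse));
```

The Content-Type header is the piece of metadata that lets HTTP/1.0 carry content other than HTML.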

Other new features include status codes, multi-character set support, multi-part types, authorization, caching, and content encoding.

**Disadvantages:**

Each TCP connection can only send one request. After sending the data, the connection is closed. If you want to request other resources, you must create a new connection.
The cost of establishing a TCP connection is high because it requires a three-way handshake between the client and the server, and the sending rate is slow at the beginning (slow start). Therefore, the performance of HTTP 1.0 version is relatively poor. As web pages load more and more external resources, this problem becomes more and more prominent.

To work around this problem, some browsers add a non-standard Connection field to the request:

Connection: keep-alive

This field asks for the TCP connection to be kept open and reused until the client or server actively closes it. However, it is not a standard field in HTTP/1.0 and may behave inconsistently across implementations, so it is not a fundamental solution.

HTTP/1.1

HTTP/1.1 was released in January 1997, only half a year after version 1.0. It further refined the HTTP protocol, remained in use more than 20 years later, and was still the most widely used version when this article was written.

The biggest change in version 1.1 is the introduction of persistent connection, that is, the TCP connection is not closed by default and can be reused by multiple requests without declaring Connection: keep-alive.

The client and server can actively close the connection if they find that the other party has not been active for a period of time. However, the standard approach is for the client to send Connection: close during the last request, explicitly requesting the server to close the TCP connection.
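The difference between the 1.0 and 1.1 defaults can be sketched as a small decision function. This is a simplified illustration, not how any particular server implements it: the name `shouldKeepConnectionOpen` is hypothetical, header names are assumed to be already lower-cased, and the Connection field is assumed to hold a single token.

```javascript
// Decide whether a TCP connection stays open after the current exchange.
// httpVersion is "1.0" or "1.1"; headers is a plain object with lower-case names.
function shouldKeepConnectionOpen(httpVersion, headers) {
  const connection = (headers["connection"] || "").toLowerCase();
  if (httpVersion === "1.1") {
    // HTTP/1.1: persistent by default, closed only on an explicit "close".
    return connection !== "close";
  }
  // HTTP/1.0: closed by default, kept open only on the non-standard "keep-alive".
  return connection === "keep-alive";
}

console.log(shouldKeepConnectionOpen("1.0", {}));                            // false
console.log(shouldKeepConnectionOpen("1.0", { connection: "keep-alive" })); // true
console.log(shouldKeepConnectionOpen("1.1", {}));                            // true
console.log(shouldKeepConnectionOpen("1.1", { connection: "close" }));      // false
```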

Version 1.1 also adds several new methods, such as PUT, OPTIONS, and DELETE. (HEAD was already available in 1.0, and PATCH was standardized separately later, in RFC 5789.)

**Disadvantages**

Although version 1.1 allows TCP connections to be reused, all data communication within the same TCP connection happens in sequence. The server does not move on to the next response until it has finished the current one. If one response is particularly slow, many later requests end up waiting in line behind it. This is called "head-of-line blocking".
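The effect can be illustrated with a toy calculation. The sketch below is an idealized model, not a measurement: each number is an assumed time (in ms) the server needs to produce one response, and both function names are made up for this example.

```javascript
// In-order delivery on one connection: each response waits for all earlier ones.
function completionTimesInOrder(processingTimes) {
  let clock = 0;
  return processingTimes.map((t) => (clock += t));
}

// Idealized comparison: every response is produced and delivered independently,
// so each one finishes after its own processing time only.
function completionTimesIndependent(processingTimes) {
  return processingTimes.slice();
}

const times = [500, 10, 10]; // one slow response followed by two fast ones
console.log(completionTimesInOrder(times));     // [ 500, 510, 520 ]
console.log(completionTimesIndependent(times)); // [ 500, 10, 10 ]
```

The two fast responses take 510 ms and 520 ms to arrive only because they are queued behind the slow one.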

There are only two ways to avoid this problem:
One is to reduce the number of requests;
The other is to open multiple persistent connections at the same time.
This has led to many web optimization techniques, such as merging scripts and style sheets, embedding images in CSS code, and domain sharding. If the HTTP protocol had been better designed, this extra work could have been avoided.

SPDY
In 2009, Google disclosed its self-developed SPDY protocol, which mainly solved the problem of low efficiency of HTTP/1.1. After this protocol was proven feasible on the Chrome browser, it was regarded as the basis of HTTP/2, and its main features were inherited in HTTP/2.

HTTP/2
In 2015, HTTP/2 was released. It is not called HTTP/2.0 because the standards committee does not plan to release any more subversions. The next new version will be HTTP/3.

In HTTP/1.1, the header information must be text (ASCII encoded), while the data body can be text or binary. HTTP/2 is a fully binary protocol: both the header information and the data body are binary.

One benefit of the binary protocol is that additional frames can be defined. HTTP/2 defines nearly ten types of frames, laying the foundation for future advanced applications. If you use text to implement this function, parsing the data will become very troublesome, but binary parsing is much more convenient.
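As a rough illustration of why binary parsing is convenient, the sketch below decodes the fixed 9-byte header that precedes every HTTP/2 frame (24-bit payload length, 8-bit type, 8-bit flags, one reserved bit, 31-bit stream identifier, as laid out in RFC 7540). The function name `parseFrameHeader` and the sample bytes are made up for this example.

```javascript
// Frame type names indexed by their numeric type code (RFC 7540).
const FRAME_TYPES = [
  "DATA", "HEADERS", "PRIORITY", "RST_STREAM", "SETTINGS",
  "PUSH_PROMISE", "PING", "GOAWAY", "WINDOW_UPDATE", "CONTINUATION",
];

// Decode the 9-byte frame header from a Node.js Buffer.
function parseFrameHeader(buf) {
  if (buf.length < 9) {
    throw new RangeError("an HTTP/2 frame header is 9 bytes long");
  }
  const length = buf.readUIntBE(0, 3);                // 24-bit payload length
  const type = buf.readUInt8(3);                      // 8-bit frame type
  const flags = buf.readUInt8(4);                     // 8-bit flags
  const streamId = buf.readUInt32BE(5) & 0x7fffffff;  // drop the reserved bit
  return { length, type, typeName: FRAME_TYPES[type] || "UNKNOWN", flags, streamId };
}

// Hand-written sample: a HEADERS frame with a 12-byte payload on stream 3.
const exampleHeader = Buffer.from([0x00, 0x00, 0x0c, 0x01, 0x04, 0x00, 0x00, 0x00, 0x03]);
console.log(parseFrameHeader(exampleHeader));
```

Every field sits at a fixed offset, so no scanning for delimiters is needed. The stream identifier is what lets frames from different requests be interleaved on one connection.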

HTTP/2 reuses TCP connections. Within one connection, both the client and the server can send multiple requests or responses at the same time, and they do not need to correspond one to one in order, thus avoiding "head-of-line blocking".

HTTPS
HTTPS is a secure version of the HTTP protocol. The data transmission of the HTTP protocol is clear text and is unsafe. HTTPS uses the SSL/TLS protocol for encryption.

Characteristics of the HTTP protocol

Stateless - each HTTP request is independent, and the protocol itself keeps no link between any two requests. In practice this is not entirely the case: the Cookie and Session mechanisms were introduced to correlate requests.
Connectionless - in the original model (HTTP/1.0), the connection is closed as soon as each request is completed; HTTP/1.1 keeps it open by default.
One-way application layer protocol - communication can only be initiated by the client; the server responds to requests.
Multiple requests - loading a web page usually takes more than one request. The server first returns the HTML page; when the browser receives it and finds that the page references other resources such as CSS, JS files, and images, it automatically sends further HTTP requests for them.

HTTP/1.1 supports pipelining: within the same TCP connection, the client can send several requests without waiting for each response, although the responses are still returned in order. This improves efficiency.

HTTP message structure


Request line

URL - Request URL
Request method - Request Method
Status code - Status Code
Server address - Remote Address

When a cross-origin request is rejected, you may see, for example, that the method is OPTIONS and the status code is 404 or 405 (many other combinations are possible).

**Common status codes:**
200 - The request completed successfully and the requested resource was sent back to the client
304 - The requested resource has not been modified since the last request, so the client should use its local cache
400 - The client request is malformed (for example, it may be rejected by a security module)
401 - The request is unauthorized
403 - Access is forbidden (for example, when the user is not logged in)
404 - Resource not found
500 - Internal server error
503 - Service unavailable
...
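The first digit of a status code identifies its general class, which is a quick way to read codes not listed above. A minimal sketch follows; the function name `describeStatusClass` is made up for this example.

```javascript
// Map a status code to its general class based on the first digit.
function describeStatusClass(statusCode) {
  switch (Math.floor(statusCode / 100)) {
    case 1: return "informational";
    case 2: return "success";
    case 3: return "redirection";
    case 4: return "client error";
    case 5: return "server error";
    default: return "unknown";
  }
}

for (const code of [200, 304, 404, 503]) {
  console.log(code, describeStatusClass(code));
}
```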
**HTTP request methods**
HTTP/1.1 supports nearly ten methods, such as GET and POST.


General Header field


HTTP Cookie

Essentially, cookies are an extension of HTTP. Two HTTP headers are specifically responsible for setting and sending cookies: Set-Cookie and Cookie.

An HTTP cookie (also called a web cookie or browser cookie) is a small piece of data that the server sends to the user's browser, which saves it locally. The next time the browser makes a request to the same server, the cookie is carried along and sent back. It is usually used to tell the server whether two requests come from the same browser, for example to keep the user logged in. Cookies make it possible to record stable state information on top of the stateless HTTP protocol.

Cookies are mainly used in the following three aspects:

Session state management (such as user login status, shopping cart, game scores or other information that needs to be recorded)
Personalized settings (such as user-defined settings, themes, etc.)
Browser behavior tracking (such as tracking and analyzing user behavior, etc.)

Cookies were once used as general client-side storage because there was no other suitable storage mechanism at the time. Now that modern browsers support a variety of storage methods, this use of cookies is gradually being phased out. Once the server sets a cookie, every request from the browser carries the cookie data, which adds performance overhead (especially in a mobile environment). New browser APIs allow developers to store data locally directly, such as the Web Storage API (localStorage and sessionStorage) or IndexedDB.

Create cookie
When the server receives an HTTP request, it can add a Set-Cookie header to the response. After the browser receives the response, it usually saves the cookie and then sends it back through the Cookie request header on every subsequent request to that server. The cookie's expiration time, domain, path, and applicable sites can all be specified as needed.

How to set cookies on the server side in Node.js (here response is the http.ServerResponse object passed to the request handler):

response.setHeader('Set-Cookie', ['type=ninja', 'language=javascript']);
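The attributes mentioned above (lifetime, domain, path, and so on) are appended to the same header value, separated by semicolons. The helper below is a minimal sketch of how such a value could be assembled. The name `buildSetCookie` and its option names are made up for this example; they are not part of the Node.js API.

```javascript
// Assemble one Set-Cookie header value from a name, a value, and optional attributes.
function buildSetCookie(name, value, options = {}) {
  const parts = [`${name}=${encodeURIComponent(value)}`];
  if (options.maxAge !== undefined) parts.push(`Max-Age=${options.maxAge}`);
  if (options.domain) parts.push(`Domain=${options.domain}`);
  if (options.path) parts.push(`Path=${options.path}`);
  if (options.secure) parts.push("Secure");
  return parts.join("; ");
}

console.log(buildSetCookie("type", "ninja", { maxAge: 3600, path: "/", secure: true }));
// prints: type=ninja; Max-Age=3600; Path=/; Secure
```

In a Node.js request handler, the returned string (or an array of such strings) is what would be passed as the value of the Set-Cookie header.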


Cookies are stored in the client. According to the location stored in the client, they can be divided into memory cookies and hard disk cookies.

Memory cookie
If the lifetime of the cookie is the browser session, the browser keeps the cookie in memory and clears it automatically when the browser is closed.

Hard disk cookie
The cookie is saved on the client's hard drive and is not cleared when the browser is closed. The next time you open the browser and visit the corresponding website, the cookie is automatically sent to the server again.

Cookies cannot cross domains
Many websites use cookies. For example, Google issues cookies to the client, and so does Baidu. Will the browser also carry cookies issued by Baidu when accessing Google? Can Google modify the cookies issued by Baidu?
The answer is no. Cookies cannot cross domain names. According to the cookie specification, when a browser accesses Google, it carries only Google's cookies and not Baidu's. Google can only operate on its own cookies, not on Baidu's.
Cookies are managed by the browser on the client side. The browser ensures that Google only operates on Google's cookies and not on Baidu's, which protects user privacy and security. The browser decides whether a website may operate on another website's cookie based on the domain name. The domain names of Google and Baidu are different, so Google cannot operate on Baidu's cookies.

Two second-level domain names under the same first-level domain name, such as www.helloweenvsfei.com and images.helloweenvsfei.com, cannot use cookies interchangeably because the domain names of the two are not strictly the same. If you want all second-level domain names under the helloweenvsfei.com name to be able to use this cookie, you need to set the domain parameter of the cookie, for example:

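// Assumes `response` is the HttpServletResponse of the current request (javax.servlet.http).
Cookie cookie = new Cookie("time", "20080808"); // create a new cookie
cookie.setDomain(".helloweenvsfei.com"); // make it valid for every subdomain
cookie.setPath("/"); // make it valid for every path
cookie.setMaxAge(Integer.MAX_VALUE); // lifetime in seconds (effectively never expires)
response.addCookie(cookie); // send it to the client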
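On the browser side, the domain rule described above amounts to a suffix check on the host name. The sketch below is a simplified illustration in JavaScript. The function name `cookieDomainMatches` is made up, and real browsers apply additional rules (for example, rejecting cookies set for public suffixes).

```javascript
// Return true if a cookie set for cookieDomain may be sent to requestHost.
function cookieDomainMatches(requestHost, cookieDomain) {
  const host = requestHost.toLowerCase();
  // A leading dot (".example.com") is ignored for the comparison.
  const domain = cookieDomain.toLowerCase().replace(/^\./, "");
  // Either the same host, or a subdomain separated by a dot.
  return host === domain || host.endsWith("." + domain);
}

console.log(cookieDomainMatches("www.helloweenvsfei.com", ".helloweenvsfei.com"));    // true
console.log(cookieDomainMatches("images.helloweenvsfei.com", ".helloweenvsfei.com")); // true
console.log(cookieDomainMatches("www.google.com", "baidu.com"));                      // false
```

Requiring the dot before the domain suffix prevents an unrelated host whose name merely ends with the same characters from matching.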


The validity period of the cookie
The validity period of a cookie is determined by its maxAge, in seconds. In the Cookie class, the maxAge attribute is read and written through the getMaxAge() and setMaxAge(int maxAge) methods. If maxAge is a positive number, the cookie expires automatically after maxAge seconds. The browser persists a cookie with a positive maxAge, that is, writes it to the corresponding cookie file. Whether or not the user closes the browser or the computer, the cookie is still valid when visiting the website as long as fewer than maxAge seconds have passed.

If maxAge is a negative number, the cookie is valid only within this browser window and the sub-windows it opens, and becomes invalid once the window is closed. Cookies with a negative maxAge are temporary: they are not persisted or written to the cookie file but are kept in the browser's memory, so they disappear when the browser is closed. If maxAge is 0, the cookie is deleted.
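The maxAge rules can be summarized in a few lines. This sketch follows the servlet Cookie convention described in this section, in which a value of 0 additionally means "delete the cookie". Both function names are made up for this example.

```javascript
// Classify a cookie by its maxAge (in seconds), following the servlet convention.
function describeMaxAge(maxAge) {
  if (maxAge > 0) return "persistent"; // written to disk, survives browser restarts
  if (maxAge === 0) return "delete";   // the browser removes the cookie
  return "session";                    // kept in memory until the browser closes
}

// A persistent cookie is valid while fewer than maxAge seconds have elapsed.
function isStillValid(createdAtMs, maxAgeSeconds, nowMs) {
  return maxAgeSeconds > 0 && nowMs - createdAtMs < maxAgeSeconds * 1000;
}

console.log(describeMaxAge(3600)); // persistent
console.log(describeMaxAge(-1));   // session
console.log(isStillValid(0, 60, 59000)); // true
console.log(isStillValid(0, 60, 61000)); // false
```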

Note: When reading Cookie from the client, other attributes including maxAge are unreadable and will not be submitted. When the browser submits a cookie, it will only submit the name and value attributes. The maxAge attribute is only used by the browser to determine whether the cookie has expired.
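Because only name and value pairs are submitted, what the server sees in the Cookie request header is a flat list such as `type=ninja; language=javascript`. A minimal parsing sketch follows; the function name `parseCookieHeader` is made up, and quoting or encoding of values is not handled.

```javascript
// Turn the value of a Cookie request header into a plain object of name/value pairs.
function parseCookieHeader(header) {
  const cookies = {};
  for (const pair of header.split(";")) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue; // ignore fragments without "="
    const name = pair.slice(0, eq).trim();
    const value = pair.slice(eq + 1).trim();
    if (name) cookies[name] = value;
  }
  return cookies;
}

console.log(parseCookieHeader("type=ninja; language=javascript"));
```

Attributes such as maxAge, domain, or path never appear here; they stay in the browser and only influence whether the cookie is sent at all.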

Cookie security attributes
The HTTP protocol is not only stateless, but also insecure. Data using the HTTP protocol is transmitted directly on the network without any encryption, and may be intercepted. Using the HTTP protocol to transmit very confidential content is a hidden danger. If you do not want cookies to be transmitted in non-secure protocols such as HTTP, you can set the secure attribute of the cookie to true. Browsers will only transmit such cookies over secure protocols such as HTTPS and SSL. The following code sets the secure attribute to true:

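// Assumes `response` is the HttpServletResponse of the current request (javax.servlet.http).
Cookie cookie = new Cookie("time", "20080808"); // create a new cookie
cookie.setSecure(true); // only send this cookie over secure connections
response.addCookie(cookie); // send it to the client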


Tip: The secure attribute does not encrypt the cookie content and therefore cannot guarantee absolute security. If high security is required, the cookie content needs to be encrypted and decrypted in the program to prevent leakage.

HTTP session

A session, like a cookie, is a mechanism for recording state across HTTP requests. The difference is that a cookie is stored on the client and its size is limited, while a session is stored on the server and is not subject to that size limit.

When the program needs to create a session for a client's request, the server first checks whether the request already contains a session identifier, called the session id. If it does, a session has already been created for this client, and the server looks up that session by its id and uses it (if the lookup fails, the server may create a new one). If the request does not contain a session id, the server creates a session for the client and generates a session id associated with it. The value of the session id should be a string that is never repeated and has no pattern that would make it easy to forge. The session id is returned to the client in the response so that the client can store it. A cookie can be used to store it, so that the browser automatically presents the identifier to the server on later requests according to the cookie rules. The name of this cookie is usually something like SESSIONID (for example, JSESSIONID in Java servlet containers).
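The lookup-or-create logic described above can be sketched with an in-memory map. This is an illustration under simplifying assumptions, not a production session manager: the names `sessionStore` and `getOrCreateSession` are made up, sessions never expire, and everything is lost when the process restarts.

```javascript
const nodeCrypto = require("crypto");

// Server-side storage: session id -> session data.
const sessionStore = new Map();

// Return the existing session for sessionId, or create a new one.
function getOrCreateSession(sessionId) {
  if (sessionId && sessionStore.has(sessionId)) {
    return { id: sessionId, data: sessionStore.get(sessionId), isNew: false };
  }
  // 16 random bytes as hex: hard to guess and extremely unlikely to repeat.
  const id = nodeCrypto.randomBytes(16).toString("hex");
  const data = {};
  sessionStore.set(id, data);
  return { id, data, isNew: true };
}

// First request: no id yet, so a session is created and its id goes back to the client.
const first = getOrCreateSession(undefined);
first.data.user = "alice";

// Later request: the client presents the id (for example via a cookie).
const later = getOrCreateSession(first.id);
console.log(later.isNew, later.data.user);
```

An unknown or forged id simply results in a fresh, empty session, which matches the "if it cannot be retrieved, create a new one" behaviour.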

Session creation usually relies on cookies, but cookies can be disabled by the user, so another mechanism is needed to pass the session id back to the server in that case. One option is to append the session id directly to the URL, after the path.

Note: When talking about the session mechanism, we often hear the misunderstanding "as long as you close the browser, the session disappears." Think of a membership card: unless the customer actively asks the store to cancel the card, the store does not casually delete the customer's information. The same is true for sessions. Unless the program tells the server to delete a session, the server keeps it. The program usually issues the instruction to delete the session when the user logs off.
However, the browser never notifies the server that it is about to close, so the server has no way of knowing that the browser has been closed. The illusion arises because most session mechanisms use a session cookie to store the session id. The session id disappears when the browser is closed, so the original session cannot be found when connecting to the server again. If the cookie set by the server is saved to the hard disk, or if the HTTP request headers sent by the browser are rewritten so that the original session id is sent to the server, the original session can still be found when the browser is opened again.
