1. The relationship between the HTTP protocol and the TCP/IP protocols
The long and short connections of HTTP are essentially the long and short connections of TCP. HTTP is an application layer protocol, using TCP protocol at the transport layer and IP protocol at the network layer. The IP protocol mainly solves network routing and addressing problems, and the TCP protocol mainly solves how to reliably transmit data packets above the IP layer, so that the other end of the network receives all packets sent by the originator, and the order is consistent with the order of sending. TCP has reliable, connection-oriented characteristics.
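To make the layering concrete, here is a minimal sketch in which an HTTP request is written by hand onto a TCP socket. It assumes a throwaway local server built with Python's standard http.server module; the names HelloHandler, fetch_raw and demo are made up for this illustration. The operating system does the TCP and IP work, and the code only ever sees a reliable, ordered byte stream.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a fixed text body."""

    def do_GET(self):
        body = b"hello over TCP"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_raw(host, port, path="/"):
    """Send a hand-written HTTP/1.0 request over a plain TCP socket."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection: response is complete
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    """Start a local server, fetch one page, return (status line, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        raw = fetch_raw(host, port)
    finally:
        server.shutdown()
        server.server_close()
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n")[0].decode("ascii")
    return status_line, body
```

Everything HTTP-specific here is just text inside the byte stream; the connection itself belongs to TCP.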
2. How to understand that the HTTP protocol is stateless
The HTTP protocol is stateless: the protocol keeps no memory of previous transactions, and the server does not retain the client's state between requests. In other words, opening a web page on a server has no connection with any page you opened on that server earlier. HTTP is a stateless protocol that runs over a connection-oriented transport. Statelessness does not mean that HTTP cannot keep a TCP connection open, nor does it mean that HTTP uses UDP (which is connectionless).
3. What are long connections and short connections?
In HTTP/1.0, short connections are used by default. That is, each time the browser and server perform an HTTP operation, a connection is established, and it is closed as soon as the task completes. If an HTML page or other Web page accessed by the client browser references other Web resources, such as JavaScript files, image files, or CSS files, the browser creates a new HTTP session each time it encounters such a resource.
Starting from HTTP/1.1, however, long connections are used by default so that the connection is kept open. When HTTP is used with long connections, this header line typically appears in the response headers:
Connection: keep-alive
With long connections, when a web page is opened, the TCP connection used to carry HTTP data between the client and server is not closed. If the client accesses a page on this server again, it continues to use the established connection. Keep-Alive does not hold the connection open forever: it has a retention time that can be configured in the server software (such as Apache). Long connections only work if both the client and the server support them.
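The difference can be observed by counting how many TCP connections a server accepts. The following is a minimal sketch using only Python's standard library, not a description of how any particular browser or server behaves; CountingHandler and count_connections are hypothetical names chosen for this example. It sends the same number of requests twice, once over a single reused connection and once with "Connection: close" on every request.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    # http.server only keeps connections open when the handler speaks HTTP/1.1.
    protocol_version = "HTTP/1.1"
    accepted = 0                 # TCP connections accepted so far
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        with CountingHandler.lock:
            CountingHandler.accepted += 1
        super().setup()

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def count_connections(requests, reuse):
    """Send `requests` GETs and return how many TCP connections the server saw."""
    CountingHandler.accepted = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        if reuse:
            # Long connection: one TCP connection carries every request.
            conn = http.client.HTTPConnection(host, port, timeout=5)
            for _ in range(requests):
                conn.request("GET", "/")
                conn.getresponse().read()  # drain the body before reusing
            conn.close()
        else:
            # Short connection: connect, request, close, every time.
            for _ in range(requests):
                conn = http.client.HTTPConnection(host, port, timeout=5)
                conn.request("GET", "/", headers={"Connection": "close"})
                conn.getresponse().read()
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.accepted
```

Calling count_connections(3, reuse=True) and count_connections(3, reuse=False) side by side shows how many handshakes the long connection saves for the same work.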
The long connections and short connections of the HTTP protocol are essentially the long connections and short connections of the TCP protocol.
3.1 TCP connection
When TCP is used for network communication, a connection must be established between the server and the client before any actual read or write takes place. Once the reads and writes are finished and neither side needs the connection any longer, they can release it. Establishing a connection takes a three-way handshake and releasing it takes a four-way handshake, so every connection costs resources and time. Classic three-way handshake diagram:
Classic four-way handshake closing diagram:
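The two exchanges are easiest to follow segment by segment. Below is a minimal sketch that lists the segments of each exchange together with their sequence and acknowledgment numbers. It is a simplified model, not a TCP implementation: Segment, three_way_handshake and four_way_close are illustrative names, the client is assumed to close first, and the server is assumed to send no data between its ACK and its own FIN.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    sender: str
    flags: str
    seq: int
    ack: Optional[int]  # None when the segment carries no acknowledgment


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Connection setup. A SYN consumes one sequence number."""
    return [
        Segment("client", "SYN", client_isn, None),
        Segment("server", "SYN+ACK", server_isn, client_isn + 1),
        Segment("client", "ACK", client_isn + 1, server_isn + 1),
    ]


def four_way_close(client_seq: int, server_seq: int) -> List[Segment]:
    """Connection release, client closing first. A FIN consumes one sequence number.

    Real FIN segments normally carry the ACK flag as well; the flags are
    shortened here to match the usual "FIN, ACK, FIN, ACK" description.
    """
    return [
        Segment("client", "FIN", client_seq, server_seq),
        Segment("server", "ACK", server_seq, client_seq + 1),
        Segment("server", "FIN", server_seq, client_seq + 1),
        Segment("client", "ACK", client_seq + 1, server_seq + 1),
    ]


if __name__ == "__main__":
    for segment in three_way_handshake(100, 300) + four_way_close(101, 301):
        print(segment)
```

Seven segments are exchanged in total before and after any useful data, which is the overhead that every short connection pays again.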
3.2 TCP short connection
Let us simulate a TCP short connection. The client initiates a connection request to the server, the server accepts the request, and the two parties establish a connection. The client sends a message to the server, the server responds, and one read and write is complete. At this point either party can initiate the close, but generally the client does so first. Why? The server generally does not close the connection immediately after replying to the client, although special cases cannot be ruled out. As this description shows, a short connection generally carries only one read and write exchange between client and server.
The advantages of short connections are that they are relatively simple to manage, every existing connection is a useful connection, and no additional control mechanism is required.
3.3 TCP long connection
Next, let’s simulate the situation of a long connection. The client initiates a connection to the server, the server accepts the client connection, and the two parties establish a connection. After the client and server complete a read and write, the connection between them will not be actively closed, and subsequent read and write operations will continue to use this connection.
First, consider the TCP keep-alive function described in TCP/IP Illustrated. The keep-alive function is mainly provided for server applications. A server application wants to know whether the client's host has crashed, so that it can release the resources it is holding on behalf of the client. If the client has disappeared, leaving a half-open connection on the server, and the server is waiting for data from the client, the server will wait forever. The keep-alive function tries to detect such half-open connections on the server side.
If there is no activity on a given connection for two hours, the server sends a probe segment to the client. The client host must be in one of the following four states:
(1) The client host is still running normally and is reachable from the server. The client's TCP responds normally, so the server knows the other side is fine. The server resets the keep-alive timer to fire again two hours later.
(2) The client host has crashed and is either down or in the middle of rebooting. In either case the client's TCP does not respond. The server receives no response to its probe and times out after 75 seconds. The server sends a total of 10 such probes, 75 seconds apart. If it receives no response to any of them, it assumes the client host has gone down and terminates the connection.
(3) The client host has crashed and already rebooted. The server receives a response to its keep-alive probe, but the response is a reset, which causes the server to terminate the connection.
(4) The client is running normally but is unreachable from the server. This case is similar to (2): all TCP can tell is that no response to the probes was received.
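The timing described above can be turned into a small calculation, and the same figures can be requested on a socket. This is a minimal sketch under stated assumptions: the function names are hypothetical, the detection time simply adds the two-hour idle period to ten probes of 75 seconds each as the text describes, and the per-socket tuning options (TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT) are platform-specific, so the code applies them only where Python's socket module exposes them.

```python
import socket

IDLE_SECONDS = 2 * 60 * 60   # quiet time before the first probe (two hours)
PROBE_INTERVAL = 75          # seconds between unanswered probes
PROBE_COUNT = 10             # probes sent before giving up


def worst_case_detection_seconds(idle=IDLE_SECONDS,
                                 interval=PROBE_INTERVAL,
                                 count=PROBE_COUNT):
    """Time until a dead peer is declared gone, per the figures in the text."""
    return idle + interval * count


def enable_keepalive(sock, idle=IDLE_SECONDS,
                     interval=PROBE_INTERVAL, count=PROBE_COUNT):
    """Turn on TCP keep-alive for one socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These option names are not available on every platform; skip the ones
    # that are missing and let the operating system defaults apply.
    tuning = (("TCP_KEEPIDLE", idle),
              ("TCP_KEEPINTVL", interval),
              ("TCP_KEEPCNT", count))
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


if __name__ == "__main__":
    seconds = worst_case_detection_seconds()
    print(f"dead peer detected after about {seconds} s ({seconds / 60:.1f} min)")
```

With the default figures, a crashed client can keep a server-side connection occupied for well over two hours, which is why keep-alive alone is a slow way to reclaim resources.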
3.4 Operation process of long connection and short connection
The operation steps of a short connection are:
Establish connection - data transfer - close connection ... establish connection - data transfer - close connection
The operation steps of a long connection are:
Establish connection - data transfer ... (keep connection) ... data transfer - close connection
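Both sequences can be reproduced with plain TCP sockets. The sketch below is one possible way to do it, assuming a small local echo server written for the purpose; EchoServer, EchoHandler, short_connections, long_connection and run_exchange are names invented for this example. The server counts how many connections it accepts so that the two patterns can be compared.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.StreamRequestHandler):
    def handle(self):
        with self.server.lock:
            self.server.accepted += 1   # one increment per TCP connection
        for line in self.rfile:         # runs until the client closes
            self.wfile.write(line)      # echo each line back


class EchoServer(socketserver.ThreadingTCPServer):
    daemon_threads = True
    allow_reuse_address = True

    def __init__(self, address):
        super().__init__(address, EchoHandler)
        self.accepted = 0
        self.lock = threading.Lock()


def send_and_receive(sock, message):
    """One read and write: send a line, wait for the echoed line."""
    sock.sendall(message + b"\n")
    reply = b""
    while not reply.endswith(b"\n"):
        chunk = sock.recv(1024)
        if not chunk:
            raise ConnectionError("server closed the connection early")
        reply += chunk
    return reply.rstrip(b"\n")


def short_connections(address, messages):
    # establish - transfer - close, repeated for every message
    replies = []
    for message in messages:
        with socket.create_connection(address, timeout=5) as sock:
            replies.append(send_and_receive(sock, message))
    return replies


def long_connection(address, messages):
    # establish - transfer ... transfer - close
    with socket.create_connection(address, timeout=5) as sock:
        return [send_and_receive(sock, message) for message in messages]


def run_exchange(pattern, messages):
    """Run one pattern against a fresh server; return (replies, accepted count)."""
    server = EchoServer(("127.0.0.1", 0))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        replies = pattern(server.server_address, messages)
    finally:
        server.shutdown()
        server.server_close()
    return replies, server.accepted
```

For example, run_exchange(short_connections, [b"a", b"b", b"c"]) and run_exchange(long_connection, [b"a", b"b", b"c"]) return the same replies but different connection counts.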
4. The advantages and disadvantages of long connections and short connections
As can be seen from the above, long connections save many TCP establishment and teardown operations, which reduces waste and saves time. For clients that request resources frequently, long connections are the better fit. There is a problem, however: the detection period of the keep-alive function is very long, and it only detects whether the TCP connection is still alive. It is a relatively gentle mechanism, and it is not enough when dealing with malicious connections. In long-connection scenarios the client generally does not actively close the connection. If connections between clients and the server are never closed, then as the number of client connections grows, the server will sooner or later be unable to cope. The server therefore needs some strategy, such as closing connections that have had no read or write events for a long time, which keeps malicious connections from damaging the server-side service. If conditions permit, it can also work at the granularity of the client machine and limit the maximum number of long connections per client, which prevents a single troublesome client from affecting the back-end service.
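The two server-side strategies mentioned here, closing connections that have been idle too long and capping the number of long connections per client, can be sketched as a small bookkeeping class. This is only an illustration under assumptions: ConnectionRegistry and its methods are hypothetical, timestamps are passed in explicitly to keep the example deterministic (a real server might use time.monotonic()), and actually closing the sockets is left to the caller.

```python
class ConnectionRegistry:
    """Tracks open long connections so a server can enforce two policies."""

    def __init__(self, idle_timeout, max_per_client):
        self.idle_timeout = idle_timeout      # seconds without read/write events
        self.max_per_client = max_per_client  # long connections allowed per client
        self._entries = {}                    # conn_id -> (client, last_active)

    def open(self, conn_id, client, now):
        """Admit a new connection unless the client is already at its limit."""
        held = sum(1 for owner, _ in self._entries.values() if owner == client)
        if held >= self.max_per_client:
            return False
        self._entries[conn_id] = (client, now)
        return True

    def touch(self, conn_id, now):
        """Record a read or write event on a connection."""
        client, _ = self._entries[conn_id]
        self._entries[conn_id] = (client, now)

    def close(self, conn_id):
        """Forget a connection that was closed normally."""
        self._entries.pop(conn_id, None)

    def reap_idle(self, now):
        """Drop and return the connections idle for longer than the timeout."""
        stale = [conn_id for conn_id, (_, last) in self._entries.items()
                 if now - last > self.idle_timeout]
        for conn_id in stale:
            del self._entries[conn_id]
        return stale
```

A server loop would call reap_idle periodically and close the sockets it returns, and would refuse or immediately close any connection for which open returns False.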
Short connections are relatively simple to manage for servers. All existing connections are useful connections and do not require additional control methods. But if the client requests frequently, time and bandwidth will be wasted on TCP establishment and shutdown operations.
Whether a connection ends up long or short comes down to the closing strategy adopted by the client and the server. Specific strategies suit specific application scenarios; there is no perfect choice, only a suitable one.
5. When to use long connections and short connections?
Long connections are mostly used when operations are frequent, communication is point-to-point, and the number of connections is not too large. Each TCP connection requires a three-way handshake, which takes time. If every operation first connected and then operated, processing speed would drop considerably. So the connection is not closed after each operation, and subsequent operations simply send their data packets without establishing a new TCP connection. For example, database connections use long connections: frequent communication over short connections can cause socket errors, and creating sockets frequently also wastes resources.
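A rough cost model shows why frequent operations favor a long connection. It is a deliberately simplified sketch: total_time is a hypothetical helper, every new connection is charged one handshake delay, and teardown, slow start and server load are ignored. The numbers in the usage note are made-up inputs, not measurements.

```python
def total_time(operations, handshake_ms, operation_ms, persistent):
    """Approximate elapsed time in milliseconds for a batch of operations.

    persistent=True models one long connection shared by all operations;
    persistent=False models a fresh short connection per operation.
    """
    if persistent:
        connections = min(operations, 1)  # no connection needed for zero work
    else:
        connections = operations
    return connections * handshake_ms + operations * operation_ms
```

With 100 operations of 10 ms each and an assumed 50 ms handshake, the model gives 6000 ms for short connections and 1050 ms for one long connection; with a single operation the two come out the same, which is the case where short connections lose nothing.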
HTTP services such as Web sites generally use short connections, because a long connection consumes some resources on the server. For a Web site with thousands or even hundreds of millions of clients connecting this frequently, short connections save some resources. If long connections were used with thousands of simultaneous users, each occupying a connection, the cost is easy to imagine. So when concurrency is high but each user does not need frequent operations, short connections are the better choice.