This article introduces the key ideas behind IO multiplexing in Java.
What we want from a server's concurrency is this: in every millisecond, the server can promptly process the hundreds of messages that arrive on different TCP connections within that millisecond, while it also holds hundreds of thousands of relatively inactive connections that have not sent or received any message in the last few seconds. Handling multiple connections at the same time is called concurrency; handling tens or hundreds of thousands of connections at the same time is high concurrency. The goal of concurrent server programming is to handle as many concurrent connections as possible while using resources such as the CPU efficiently, until the physical resources themselves are exhausted.
There are many concurrent programming models. The simplest binds connections to threads: one thread handles the entire life cycle of one connection. Its advantages are that the model is simple, it can implement complex business scenarios, and the number of threads can be much greater than the number of CPUs. However, the number of threads cannot grow without limit. Why? Because when a thread runs is decided by the scheduling algorithm of the operating system kernel, and that algorithm does not take into account that a given thread may serve only one connection. It applies the same rule to everyone: a thread runs when its time slice comes up, even if it has nothing to do and must go straight back to sleep. This waking and sleeping of threads is cheap when it happens rarely, but it becomes expensive when the total number of threads in the operating system is large, because the scheduling overhead eats into the time in which threads run business code.
To use an analogy, the threads that serve inactive connections are like state-owned enterprises: their efficiency is low, and they keep waking up and going back to sleep doing useless work. When they wake up and compete for the CPU, the private-enterprise threads that serve active connections get fewer chances to run. The CPU is the core competitive resource, and wasting it lowers the total GDP, that is, the throughput. We are aiming for hundreds of thousands of concurrent connections, yet once the system has thousands of threads its execution efficiency can no longer sustain high concurrency.
For high-concurrency programming there is currently only one model that is fundamentally effective. Message processing on a connection can be divided into two stages: waiting for the message to be ready, and processing the message.
With the default blocking socket (for example, the one-thread-per-connection model above), these two stages are usually merged into one, so the thread running the socket code has to sleep while it waits for the message to be ready. Under high concurrency this makes threads sleep and wake up frequently, which hurts CPU efficiency.
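The one-thread-per-connection model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `handle_connection`, `serve_one_thread_per_connection` and `demo_threads` are made up for this example, `socket.socketpair()` stands in for accepted TCP connections, and "processing" is reduced to upper-casing the message.

```python
import socket
import threading


def handle_connection(conn: socket.socket) -> None:
    """Serve one connection for its whole life cycle on a dedicated thread."""
    with conn:
        while True:
            # The two stages are fused here: recv() blocks (the thread sleeps)
            # until a message is ready, then the same thread processes it.
            data = conn.recv(4096)
            if not data:  # the peer closed the connection
                return
            conn.sendall(data.upper())  # stand-in for real message processing


def serve_one_thread_per_connection(connections):
    """Start one thread per connection and return the started threads."""
    threads = []
    for conn in connections:
        t = threading.Thread(target=handle_connection, args=(conn,), daemon=True)
        t.start()
        threads.append(t)
    return threads


def demo_threads(n: int = 3):
    # socketpair() yields already-connected sockets, so no listening port is needed.
    pairs = [socket.socketpair() for _ in range(n)]
    threads = serve_one_thread_per_connection([server for server, _ in pairs])
    replies = []
    for _, client in pairs:
        client.sendall(b"ping")
        replies.append(client.recv(4096))
        client.close()  # lets the handler thread see EOF and exit
    for t in threads:
        t.join(timeout=2)
    return replies, len(threads)
```

Note that the thread count equals the connection count, whether or not a connection is active. That is the cost the text describes.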
The high-concurrency approach is, of course, to separate the two stages: the code that waits for a message to be ready is separated from the code that processes the message. This also requires the socket to be non-blocking; otherwise the message-processing code can easily put the thread to sleep when a condition is not met. So the question is how to implement the stage that waits for the message to be ready. After all, it is still waiting, which means a thread still has to sleep. The solution is to query actively, or to let one thread wait for all connections. This is IO multiplexing.
Multiplexing is only about waiting for messages to be ready, but it handles many connections at once. It can also "wait", so it can also put the thread to sleep, but this does not matter: the relationship is one-to-many, and it monitors all connections. When our thread is woken up, there must be some connections ready for our code to process, which is efficient. There are no longer many threads competing over the "waiting for the message to be ready" stage, and the world is finally quiet.
There are many implementations of multiplexing. On Linux, through the 2.4 kernel, the main ones were select and poll; today the mainstream is epoll. Their usage looks very different, but the essence is the same. Their efficiency differs, though, which is why epoll has replaced select.
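As a rough sketch of the idea, the following Python example lets a single thread wait on all connections and then process only the ready ones. It uses the standard `selectors` module, which picks epoll, poll or select depending on the platform. The function names `serve_ready` and `demo_selector` are invented for this illustration, and `socket.socketpair()` again stands in for real client connections.

```python
import selectors
import socket


def serve_ready(sel: selectors.BaseSelector, timeout: float = 1.0) -> int:
    """Wait once on ALL registered sockets, then process the ready ones."""
    handled = 0
    # Stage 1: a single wait covers every registered connection.
    for key, _events in sel.select(timeout):
        conn = key.fileobj
        # Stage 2: the socket is known to be readable, so recv() does not block.
        data = conn.recv(4096)
        if data:
            conn.sendall(data.upper())
            handled += 1
        else:  # the peer closed the connection
            sel.unregister(conn)
            conn.close()
    return handled


def demo_selector(total: int = 20, active: int = 3):
    sel = selectors.DefaultSelector()
    pairs = [socket.socketpair() for _ in range(total)]
    for server, _client in pairs:
        server.setblocking(False)  # the text requires non-blocking sockets
        sel.register(server, selectors.EVENT_READ)
    # Only a few of the monitored connections become active.
    for _server, client in pairs[:active]:
        client.sendall(b"ping")
    handled = serve_ready(sel)
    replies = [client.recv(4096) for _server, client in pairs[:active]]
    sel.close()
    for server, client in pairs:
        server.close()
        client.close()
    return handled, replies
```

One thread serves all twenty connections here, and it wakes up only because some of them actually have data.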
Let’s briefly talk about why epoll replaces select.
As mentioned earlier, the core of high concurrency is to have one thread handle "waiting for messages to be ready" for all connections. epoll and select agree on this point. But select misjudged one thing. As we said at the beginning, when hundreds of thousands of concurrent connections exist, perhaps only a few hundred are active in any given millisecond, while the remaining hundreds of thousands are inactive during that millisecond. select is used as follows:
Returned active connections = select(all connections to be monitored)
When is select called? Whenever you think you need to find out which connections have received packets. So under high concurrency select is called very frequently, and it becomes necessary to ask whether this frequently called method is efficient, because even a slight inefficiency is amplified by the word "frequent". Does it have an inefficiency? Obviously: hundreds of thousands of connections are passed in to be monitored, and only a few hundred active connections come back, which is inefficient in itself. Once amplified, you find that select cannot cope with tens of thousands of concurrent connections.
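The shape of the select call can be seen directly in Python's `select.select`, which wraps the same system call. In this small sketch (the names `active_with_select` and `demo_select` are illustrative only), the complete list of monitored sockets is handed over on every call, even though only a handful come back as active.

```python
import select
import socket


def active_with_select(monitored):
    """Return the readable sockets; the FULL list is passed in on every call."""
    readable, _writable, _errors = select.select(monitored, [], [], 0)
    return readable


def demo_select(total: int = 100, active: int = 3):
    pairs = [socket.socketpair() for _ in range(total)]
    monitored = [server for server, _client in pairs]
    # Make only a few connections active.
    for _server, client in pairs[:active]:
        client.sendall(b"x")
    ready = active_with_select(monitored)
    result = (len(monitored), len(ready))  # (passed in, returned)
    for server, client in pairs:
        server.close()
        client.close()
    return result
```

With the defaults, 100 sockets go in and 3 come out. The ratio only gets worse as the number of idle connections grows.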
The figures make this concrete. When the number of concurrent connections is below 1,000, select is not executed very often, and there seems to be little difference from epoll.
However, once the number of concurrent connections grows, the shortcomings of select are magnified by "frequent execution", and the higher the concurrency, the more obvious they become.
Now let's see how epoll solves this. It cleverly uses three methods to do what the single select method does:
New epoll descriptor = epoll_create()
epoll_ctl(epoll descriptor, add or delete connections to be monitored)
Returned active connections = epoll_wait(epoll descriptor)
The main benefit is that this separates the frequently called operation from the infrequently called ones. epoll_ctl is called rarely, while epoll_wait is called very frequently. epoll_wait takes almost no input parameters, which makes it much more efficient than select, and its input does not grow as the number of concurrent connections grows, so kernel execution efficiency does not degrade.
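The three calls map closely onto Python's `select.epoll` object, which is available on Linux only. The sketch below assumes a Linux host; `demo_epoll` and the `by_fd` lookup table are names chosen for this example. Registration happens once per connection, while the frequent call, `poll()`, receives no connection list at all.

```python
import select
import socket


def demo_epoll(total: int = 100, active: int = 3):
    pairs = [socket.socketpair() for _ in range(total)]
    ep = select.epoll()  # corresponds to epoll_create
    by_fd = {}
    for server, _client in pairs:
        # Corresponds to epoll_ctl(ADD): done once per connection, not per wait.
        ep.register(server.fileno(), select.EPOLLIN)
        by_fd[server.fileno()] = server
    for _server, client in pairs[:active]:
        client.sendall(b"x")
    # Corresponds to epoll_wait: no list of monitored connections is passed in.
    events = ep.poll(1.0)
    ready = [by_fd[fd] for fd, mask in events if mask & select.EPOLLIN]
    for sock in ready:
        ep.unregister(sock.fileno())  # corresponds to epoll_ctl(DEL)
    remaining = ep.poll(0)  # the idle connections produce no events
    ep.close()
    for server, client in pairs:
        server.close()
        client.close()
    return len(ready), len(remaining)
```

Compared with select, the per-wait cost no longer includes handing the whole set of connections to the kernel.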
How is epoll implemented? It is actually quite simple. As the three methods show, epoll is smarter than select in that the frequently called epoll_wait, which asks "which connections already have a message ready", does not need to be passed all monitored connections every time. This means that epoll maintains a data structure in kernel space that stores all the connections to be monitored. This data structure is a red-black tree, and nodes are added to and removed from it through epoll_ctl.
In the figure, the red-black tree at the lower left holds all the connections to be monitored, and the linked list at the upper left holds all currently active connections. So when epoll_wait runs, it only needs to check the linked list at the upper left and return the connections in it to the user. With that design, how could epoll_wait be slow?
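The two structures can be mimicked with a toy user-space model. This is not kernel code and makes simplifying assumptions: a `dict` stands in for the red-black tree, a `deque` stands in for the ready list, and `on_event` is a hypothetical hook representing the moment the kernel notices activity on a connection. The class name `ToyEpoll` and its methods are invented for illustration.

```python
from collections import deque


class ToyEpoll:
    """Toy model of epoll's interest set and ready list (not the real kernel code)."""

    def __init__(self):
        self.interest = {}    # stands in for the red-black tree: fd -> event mask
        self.ready = deque()  # stands in for the list of active connections

    def ctl_add(self, fd, events):
        self.interest[fd] = events

    def ctl_del(self, fd):
        self.interest.pop(fd, None)
        if fd in self.ready:
            self.ready.remove(fd)

    def on_event(self, fd):
        """Hypothetical hook: activity was seen on fd, so queue it once."""
        if fd in self.interest and fd not in self.ready:
            self.ready.append(fd)

    def wait(self, maxevents=64):
        """Work done here depends on the READY count, not on len(self.interest)."""
        out = []
        while self.ready and len(out) < maxevents:
            out.append(self.ready.popleft())
        return out
```

For example, after registering 100,000 descriptors and signalling activity on two of them, `wait()` touches only those two entries.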
Finally, let's look at the two modes that epoll provides, ET and LT, that is, edge-triggered and level-triggered. The names are fairly apt. Both modes are still about efficiency, but the question now is how to make the set of connections returned by epoll_wait more precise.
For example, suppose we need to monitor whether a connection's write buffer has free space. When the connection is "writable", we can call write from user space to send the response to the client. But perhaps, when the connection becomes writable, the content of our response is still on disk. What if the disk read has not finished yet? The thread must not block, so the response is not sent. Yet the next call to epoll_wait may return this connection again, and you have to check again whether you want to process it. Our program probably has a separate module that handles disk IO and sends the response once the disk IO completes. So does it meet the user's expectations if every epoll_wait returns this "writable" connection that cannot be processed right away?
This is why the ET and LT modes exist. In LT mode, every connection that is in the expected state is returned by every epoll_wait, so all connections are treated alike, on the same level. ET is different: it prefers to return connections more precisely. In the example above, after the connection first becomes writable, if the program writes no data to it, the next epoll_wait will not return the connection again. ET stands for edge-triggered, meaning that epoll_wait returns a connection only when it changes from one state to another. ET programming is clearly more complicated. At the very least, the application must avoid two mistakes with a connection returned by epoll_wait: not writing data when it is writable and then waiting for the next "writable" event, and not reading data when it is readable and then waiting for the next "readable" event.
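The difference between the two modes can be observed with a short experiment, sketched here with Python's `select.epoll` (Linux only). The helper names `count_wakeups` and `drain` are assumptions made for this example. The first function makes a socket readable, deliberately never reads from it, and counts how many consecutive polls report it. The second shows the usual defensive pattern under ET: keep reading until the socket would block.

```python
import select
import socket


def count_wakeups(extra_flags: int, polls: int = 3) -> int:
    """Poll a readable socket several times WITHOUT reading it; count the reports."""
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        b.sendall(b"x")  # a is now readable
        # extra_flags = 0 gives level-triggered; select.EPOLLET gives edge-triggered.
        ep.register(a.fileno(), select.EPOLLIN | extra_flags)
        wakeups = 0
        for _ in range(polls):
            if ep.poll(0):  # timeout 0: return immediately
                wakeups += 1
        return wakeups
    finally:
        ep.close()
        a.close()
        b.close()


def drain(sock: socket.socket) -> bytes:
    """Under ET, read until the socket would block so no data is left behind."""
    sock.setblocking(False)
    chunks = []
    while True:
        try:
            data = sock.recv(4096)
        except BlockingIOError:  # nothing more to read right now
            break
        if not data:  # the peer closed the connection
            break
        chunks.append(data)
    return b"".join(chunks)
```

In LT mode the unread socket is reported on every poll. In ET mode it is reported once, and an application that does not drain it at that point will wait for a notification that never arrives.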
Of course, in typical application scenarios the performance difference is small. The possible advantage of ET is that epoll_wait is called fewer times and, in some scenarios, a connection is not woken up when there is no need (woken up here means returned by epoll_wait). But as in the example above, sometimes this is not purely a network question; it depends on the application scenario. Many open-source frameworks are written on top of ET; a framework pursues purely technical goals and naturally strives for perfection.