1. Facing the crowd
If the site architecture meets the following points, then the optimization plan in this article will be very suitable:
1) Use scripting languages such as php as the development language
2) Need to connect to back-end services, such as RPC services, memcache or redis, etc.
3) The traffic is very large
2. Problems to be solved
The common web architecture is as above:
1) The front end is the APP or web page
2) The upper layer of the server is web-server Access
3) The php script language calls back-end data, completes business logic, and splices pages
4) The last end is services, caches, and databases
php is a scripting language, unlike C++/Java where processes can It is resident, so it uses short connections to connect to back-end services:
The picture above is a typical scenario. The site php is deployed on machine A, and the cache memcache is deployed on machine B. They communicate through short connections. , the process is:
1) php establishes tcp short connection
2) sends data according to memcache protocol
3) receives data returned by memcache
4) php closes tcp short connection
When the site traffic is small, the above process does not work Any problem, when the site traffic is very large and the QPS is very high, the overhead of PHP establishing + closing the TCP short connection of memcache cannot be ignored. It may become a performance bottleneck. How to optimize it is the core of this article. .
3. Introduction to UNIX Domain Socket
Changing the subject, let’s first take a look at UNIX Domain Socket technology.
UNIX Domain Socket is an inter-process IPC communication mechanism. It does not need to go through the network protocol stack, packaging and unpacking, calculating checksums, maintaining sequence numbers and responses, etc. It just copies application layer data from one process to another. a process. It can be used for two unrelated processes on the same host, and is full-duplex, providing an IPC mechanism for reliable message delivery (messages are not lost, repeated, or confused).
4. Optimization plan
It can be seen that the efficiency of UNIX Domain Socket will be much higher than that of tcp short connection, but it can only be used for process communication between the same host, and PHP applications and back-end services are often deployed on On different machines, can it be used for optimization at this time? The answer is yes.
The simplified architecture diagram is as above. A local-proxy is deployed on the php application server. UNIX Domain Socket is used to communicate between php and local-proxy, and local-proxy makes a TCP long connection with the back-end service. Communication, this greatly improves communication efficiency and eliminates the overhead of establishing + closing TCP short connections for each request.
5. Key points of local-proxy
To realize the above optimization plan, local-proxy is the key point of implementation. When implementing local-proxy, there are several points that need to be paid attention to
1) Protocol design: local-proxy itself does not have any business Logic is only responsible for request forwarding. The upstream sends the memcache protocol and transparently transmits it to the back-end memcache. In this case, the upstream client does not need to make any code modifications
2) Communication method: As mentioned above, local-proxy and upstream Use UNIX Domain Socket to communicate, and use tcp long connection to communicate with the downstream
3) Efficient framework: This solution is to solve the efficiency loss of tcp short connection, so the efficiency requirements for local-proxy are very high, you can choose mature Efficient network frameworks (such as libevent) and tcp long connection connection pool technology are used to achieve
4) Request mapping: it is necessary to map the requests sent from the upstream to the requests sent to the downstream one by one, so that the upper request package can be correctly mapped with response package