Connection failure problem
Example
Common Redis errors:
Configuration item: timeout
Error message: Error while reading line from the server
Redis can be configured so that if a client sends no data to the Redis server for a given number of seconds, the connection is closed.
MySQL common errors:
Configuration items: wait_timeout & interactive_timeout
Error message: MySQL server has gone away
Like the Redis server, MySQL also cleans up idle connections regularly.
How to solve
1. Reconnect when using
2. Send heartbeats regularly to maintain the connection
Reconnect when using
The advantage is that it is simple. The disadvantage is that it runs into the problems of short connections, because every reconnect has to set the connection up again.
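The reconnect-on-use approach can be sketched roughly as follows. This is a minimal illustration, not taken from any library: the class name ReconnectingClient, the $connect factory, and the assumption that a dropped connection surfaces as a RuntimeException are all hypothetical, so check which exception your Redis or MySQL client actually throws.

```php
<?php
// Minimal sketch: keep one connection and rebuild it when an operation fails.
// ReconnectingClient and its callables are hypothetical names for illustration.
final class ReconnectingClient
{
    /** @var callable factory that returns a fresh connection */
    private $connect;

    /** @var mixed current connection, or null when none is open */
    private $connection = null;

    public function __construct(callable $connect)
    {
        $this->connect = $connect;
    }

    /**
     * Run $operation with the connection, reconnecting up to $maxRetries times.
     * Assumes a dropped connection is reported as a RuntimeException.
     */
    public function run(callable $operation, int $maxRetries = 1)
    {
        for ($attempt = 0; ; $attempt++) {
            if ($this->connection === null) {
                $this->connection = ($this->connect)();
            }
            try {
                return $operation($this->connection);
            } catch (\RuntimeException $e) {
                // Drop the broken connection so the next attempt reconnects.
                $this->connection = null;
                if ($attempt >= $maxRetries) {
                    throw $e;
                }
            }
        }
    }
}
```

With a real client, $connect would create and return the Redis or MySQL connection, and $operation would issue the command. The extra connection setup after every timeout is the short-connection cost mentioned above.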
Send heartbeat regularly to maintain connection
Recommended.
How to maintain a long connection
tcp_keepalive implemented in the tcp protocol
The operating system provides a set of tcp_keepalive settings.
Configuration:
tcp_keepalive_time (integer; default: 7200; since Linux 2.2)
    The number of seconds a connection needs to be idle before TCP begins sending out keep-alive probes. Keep-alives are sent only when the SO_KEEPALIVE socket option is enabled. The default value is 7200 seconds (2 hours). An idle connection is terminated after approximately an additional 11 minutes (9 probes an interval of 75 seconds apart) when keep-alive is enabled. Note that underlying connection tracking mechanisms and application timeouts may be much shorter.

tcp_keepalive_intvl (integer; default: 75; since Linux 2.4)
    The number of seconds between TCP keep-alive probes.

tcp_keepalive_probes (integer; default: 9; since Linux 2.2)
    The maximum number of TCP keep-alive probes to send before giving up and killing the connection if no response is obtained from the other end.

Swoole exposes these settings, for example:
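The arithmetic behind "approximately an additional 11 minutes" can be checked with a small helper. This is only a sketch of the approximation described in the man page text; the function name keepaliveDetectionSeconds is made up for illustration and actual kernel timing may differ slightly.

```php
<?php
// Approximate time until a dead idle connection is dropped:
// idle time before the first probe, plus one interval per unanswered probe.
// keepaliveDetectionSeconds is a hypothetical helper name.
function keepaliveDetectionSeconds(int $idle, int $interval, int $probes): int
{
    return $idle + $interval * $probes;
}

// Linux defaults: 7200 s idle, 75 s between probes, 9 probes.
$probingPhase = 75 * 9;                                   // 675 s, about 11 minutes
$total = keepaliveDetectionSeconds(7200, 75, 9);          // 7875 s in total

printf("probing phase: %d s, total: %d s\n", $probingPhase, $total);
```

With the defaults, a dead peer is only noticed after more than two hours, which is why much shorter values are usually configured for long connections.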
<?php
$server = new \Swoole\Server('127.0.0.1', 6666, SWOOLE_PROCESS);
$server->set([
    'worker_num' => 1,
    'open_tcp_keepalive' => 1,
    'tcp_keepidle' => 4,     // corresponds to tcp_keepalive_time
    'tcp_keepinterval' => 1, // corresponds to tcp_keepalive_intvl
    'tcp_keepcount' => 5,    // corresponds to tcp_keepalive_probes
]);
Among them:
'open_tcp_keepalive' => 1, // master switch that enables tcp_keepalive
'tcp_keepidle' => 4,       // start probing after 4s without any data transfer
// The probing strategy is as follows:
'tcp_keepinterval' => 1,   // probe once every 1s, i.e. send the client a packet every 1s (the client may reply with an ACK packet; if the server receives that ACK, the connection is alive)
'tcp_keepcount' => 5,      // number of probes; if the client still has not replied with an ACK after 5 probes, close the connection
Let's try it out in practice. The server script is as follows:
<?php
$server = new \Swoole\Server('127.0.0.1', 6666, SWOOLE_PROCESS);
$server->set([
    'worker_num' => 1,
    'open_tcp_keepalive' => 1, // enable tcp_keepalive
    'tcp_keepidle' => 4,       // start probing after 4s without any data transfer
    'tcp_keepinterval' => 1,   // probe once every 1s
    'tcp_keepcount' => 5,      // number of probes; close the connection if there is still no reply after 5 probes
]);
$server->on('connect', function ($server, $fd) {
    var_dump("Client: Connect $fd");
});
$server->on('receive', function ($server, $fd, $reactor_id, $data) {
    var_dump($data);
});
$server->on('close', function ($server, $fd) {
    var_dump("close fd $fd");
});
$server->start();
Let’s start this server:
~/codeDir/phpCode/hyperf-skeleton # php server.php
Then capture the packet through tcpdump:
~/codeDir/phpCode/hyperf-skeleton # tcpdump -i lo port 6666
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
We are listening for data packets on port 6666 on lo at this time.
Then we use the client to connect to it:
~/codeDir/phpCode/hyperf-skeleton # nc 127.0.0.1 6666
At this time, the server will print out the message:
~/codeDir/phpCode/hyperf-skeleton # php server.php
string(17) "Client: Connect 1"
The output information of tcpdump is as follows:
01:48:40.178439 IP localhost.33933 > localhost.6666: Flags [S], seq 43162537, win 43690, options [mss 65495,sackOK,TS val 9833698 ecr 0,nop,wscale 7], length 0
01:48:40.178484 IP localhost.6666 > localhost.33933: Flags [S.], seq 1327460565, ack 43162538, win 43690, options [mss 65495,sackOK,TS val 9833698 ecr 9833698,nop,wscale 7], length 0
01:48:40.178519 IP localhost.33933 > localhost.6666: Flags [.], ack 1, win 342, options [nop,nop,TS val 9833698 ecr 9833698], length 0
01:48:44.229926 IP localhost.6666 > localhost.33933: Flags [.], ack 1, win 342, options [nop,nop,TS val 9834104 ecr 9833698], length 0
01:48:44.229951 IP localhost.33933 > localhost.6666: Flags [.], ack 1, win 342, options [nop,nop,TS val 9834104 ecr 9833698], length 0
01:48:44.229926 IP localhost.6666 > localhost.33933: Flags [.], ack 1, win 342, options [nop,nop,TS val 9834104 ecr 9833698], length 0
01:48:44.229951 IP localhost.33933 > localhost.6666: Flags [.], ack 1, win 342, options [nop,nop,TS val 9834104 ecr 9833698], length 0
01:48:44.229926 IP localhost.6666 > localhost.33933: Flags [.], ack 1, win 342, options [nop,nop,TS val 9834104 ecr 9833698], length 0
// the rest of the output is omitted
At the start, the three-way handshake packets are printed:
01:48:40.178439 IP localhost.33933 > localhost.6666: Flags [S], seq 43162537, win 43690, options [mss 65495,sackOK,TS val 9833698 ecr 0,nop,wscale 7], length 0
01:48:40.178484 IP localhost.6666 > localhost.33933: Flags [S.], seq 1327460565, ack 43162538, win 43690, options [mss 65495,sackOK,TS val 9833698 ecr 9833698,nop,wscale 7], length 0
01:48:40.178519 IP localhost.33933 > localhost.6666: Flags [.], ack 1, win 342, options [nop,nop,TS val 9833698 ecr 9833698], length 0
Then there is no packet output for 4s.
After that, a pair of packets is printed roughly every 1 second:
01:52:54.359341 IP localhost.6666 > localhost.43101: Flags [.], ack 1, win 342, options [nop,nop,TS val 9859144 ecr 9858736], length 0
01:52:54.359377 IP localhost.43101 > localhost.6666: Flags [.], ack 1, win 342, options [nop,nop,TS val 9859144 ecr 9855887], length 0
In fact, this is the strategy we configured:
'tcp_keepinterval' => 1, // probe once every 1s
'tcp_keepcount' => 5,    // number of probes; close the connection if there is still no reply after 5 probes
Because the client's operating system automatically answers each probe with an ACK, the connection is not closed after 5 probes. The operating system keeps sending pairs of packets like this:
01:52:54.359341 IP localhost.6666 > localhost.43101: Flags [.], ack 1, win 342, options [nop,nop,TS val 9859144 ecr 9858736], length 0
01:52:54.359377 IP localhost.43101 > localhost.6666: Flags [.], ack 1, win 342, options [nop,nop,TS val 9859144 ecr 9855887], length 0
To see the connection closed after 5 unanswered probes, we can drop the packets sent to port 6666:
~/codeDir/phpCode/hyperf-skeleton # iptables -A INPUT -p tcp --dport 6666 -j DROP
This drops all incoming packets destined for port 6666, so the server can no longer receive the ACK packets sent by the client.
Then, after 5 seconds, the server prints the close message (the server actively calls the close method and sends a FIN packet to the client):
~/codeDir/phpCode/hyperf-skeleton # php server.php
string(17) "Client: Connect 1"
string(10) "close fd 1"
Let’s restore the iptables rules:
~/codeDir/phpCode # iptables -D INPUT -p tcp -m tcp --dport 6666 -j DROP
That is, the rule we added is deleted.
Implementing the heartbeat through tcp_keepalive has the advantage of being simple: no code needs to be written, and the heartbeat packets are small. The disadvantage is that it depends on the system's network environment. Both the server and the client must implement keep-alive, and the client has to cooperate by answering the heartbeat packets.
A more serious shortcoming is that if the client and the server are not connected directly but through a proxy, such as a socks5 proxy, the proxy forwards only application-layer packets and not the lower-level TCP probe packets, so the heartbeat stops working.
So Swoole provides another solution: a set of configuration options for detecting dead connections.
'heartbeat_check_interval' => 1, // check once every 1s
'heartbeat_idle_time' => 5,      // close the connection if no data packet has been sent for 5s
heartbeat implemented by swoole
Let’s test it:
<?php
$server = new \Swoole\Server('127.0.0.1', 6666, SWOOLE_PROCESS);
$server->set([
    'worker_num' => 1,
    'heartbeat_check_interval' => 1, // check once every 1s
    'heartbeat_idle_time' => 5,      // close the connection if no data packet has been sent for 5s
]);
$server->on('connect', function ($server, $fd) {
    var_dump("Client: Connect $fd");
});
$server->on('receive', function ($server, $fd, $reactor_id, $data) {
    var_dump($data);
});
$server->on('close', function ($server, $fd) {
    var_dump("close fd $fd");
});
$server->start();
Then start the server:
~/codeDir/phpCode/hyperf-skeleton # php server.php
Then start tcpdump:
~/codeDir/phpCode # tcpdump -i lo port 6666
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
Then start the client:
~/codeDir/phpCode/hyperf-skeleton # nc 127.0.0.1 6666
At this time, the server prints:
~/codeDir/phpCode/hyperf-skeleton # php server.php
string(17) "Client: Connect 1"
Then tcpdump prints:
02:48:32.516093 IP localhost.42123 > localhost.6666: Flags [S], seq 1088388248, win 43690, options [mss 65495,sackOK,TS val 10193342 ecr 0,nop,wscale 7], length 0
02:48:32.516133 IP localhost.6666 > localhost.42123: Flags [S.], seq 80508236, ack 1088388249, win 43690, options [mss 65495,sackOK,TS val 10193342 ecr 10193342,nop,wscale 7], length 0
02:48:32.516156 IP localhost.42123 > localhost.6666: Flags [.], ack 1, win 342, options [nop,nop,TS val 10193342 ecr 10193342], length 0
This is the three-way handshake information.
Then after 5 seconds, tcpdump will print out:
02:48:36.985027 IP localhost.6666 > localhost.42123: Flags [F.], seq 1, ack 1, win 342, options [nop,nop,TS val 10193789 ecr 10193342], length 0
02:48:36.992172 IP localhost.42123 > localhost.6666: Flags [.], ack 2, win 342, options [nop,nop,TS val 10193790 ecr 10193789], length 0
That is, the server sent a FIN packet. Because the client sent no data, Swoole closed the connection.
Then the server will print:
~/codeDir/phpCode/hyperf-skeleton # php server.php
string(17) "Client: Connect 1"
string(10) "close fd 1"
So there are some differences between heartbeat and TCP keepalive. TCP keepalive can keep a connection alive, whereas heartbeat simply detects connections that have sent no data and closes them. It can only be configured on the server side. If a connection needs to be kept alive, the client has to cooperate by sending heartbeats.
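The idea behind heartbeat_check_interval and heartbeat_idle_time can be illustrated with a small idle sweep. This is a simplified sketch and not Swoole's actual implementation: the function name findIdleConnections, the timestamp array, and the exact comparison are assumptions made for illustration.

```php
<?php
/**
 * Return the connection ids that have been silent for at least $idleTime seconds.
 * Hypothetical helper: a server would run this once per check interval
 * and close every fd it returns.
 *
 * @param array<int,int> $lastReceivedAt fd => unix timestamp of the last received data
 * @return int[] fds considered dead
 */
function findIdleConnections(array $lastReceivedAt, int $now, int $idleTime): array
{
    $idle = [];
    foreach ($lastReceivedAt as $fd => $timestamp) {
        if ($now - $timestamp >= $idleTime) {
            $idle[] = $fd;
        }
    }
    return $idle;
}
```

Note that nothing is sent to the client during the sweep. The server only looks at when data last arrived, which is why this mechanism detects dead connections but does not keep them alive.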
If we do not want the server to close the connection, we have to keep sending data packets at the application layer to keep it alive. For example, I keep sending packets from the nc client:
~/codeDir/phpCode/hyperf-skeleton # nc 127.0.0.1 6666
ping
ping
ping
ping
ping
ping
ping
ping
ping
I sent 9 ping packets to the server. The tcpdump output is as follows:
// the three-way handshake packets are omitted
02:57:53.697363 IP localhost.44195 > localhost.6666: Flags [P.], seq 1:6, ack 1, win 342, options [nop,nop,TS val 10249525 ecr 10249307], length 5
02:57:53.697390 IP localhost.6666 > localhost.44195: Flags [.], ack 6, win 342, options [nop,nop,TS val 10249525 ecr 10249525], length 0
02:57:55.309532 IP localhost.44195 > localhost.6666: Flags [P.], seq 6:11, ack 1, win 342, options [nop,nop,TS val 10249686 ecr 10249525], length 5
02:57:55.309576 IP localhost.6666 > localhost.44195: Flags [.], ack 11, win 342, options [nop,nop,TS val 10249686 ecr 10249686], length 0
02:57:58.395206 IP localhost.44195 > localhost.6666: Flags [P.], seq 11:16, ack 1, win 342, options [nop,nop,TS val 10249994 ecr 10249686], length 5
02:57:58.395239 IP localhost.6666 > localhost.44195: Flags [.], ack 16, win 342, options [nop,nop,TS val 10249994 ecr 10249994], length 0
02:58:01.858094 IP localhost.44195 > localhost.6666: Flags [P.], seq 16:21, ack 1, win 342, options [nop,nop,TS val 10250341 ecr 10249994], length 5
02:58:01.858126 IP localhost.6666 > localhost.44195: Flags [.], ack 21, win 342, options [nop,nop,TS val 10250341 ecr 10250341], length 0
02:58:04.132584 IP localhost.44195 > localhost.6666: Flags [P.], seq 21:26, ack 1, win 342, options [nop,nop,TS val 10250568 ecr 10250341], length 5
02:58:04.132609 IP localhost.6666 > localhost.44195: Flags [.], ack 26, win 342, options [nop,nop,TS val 10250568 ecr 10250568], length 0
02:58:05.895704 IP localhost.44195 > localhost.6666: Flags [P.], seq 26:31, ack 1, win 342, options [nop,nop,TS val 10250744 ecr 10250568], length 5
02:58:05.895728 IP localhost.6666 > localhost.44195: Flags [.], ack 31, win 342, options [nop,nop,TS val 10250744 ecr 10250744], length 0
02:58:07.150265 IP localhost.44195 > localhost.6666: Flags [P.], seq 31:36, ack 1, win 342, options [nop,nop,TS val 10250870 ecr 10250744], length 5
02:58:07.150288 IP localhost.6666 > localhost.44195: Flags [.], ack 36, win 342, options [nop,nop,TS val 10250870 ecr 10250870], length 0
02:58:08.349124 IP localhost.44195 > localhost.6666: Flags [P.], seq 36:41, ack 1, win 342, options [nop,nop,TS val 10250990 ecr 10250870], length 5
02:58:08.349156 IP localhost.6666 > localhost.44195: Flags [.], ack 41, win 342, options [nop,nop,TS val 10250990 ecr 10250990], length 0
02:58:09.906223 IP localhost.44195 > localhost.6666: Flags [P.], seq 41:46, ack 1, win 342, options [nop,nop,TS val 10251145 ecr 10250990], length 5
02:58:09.906247 IP localhost.6666 > localhost.44195: Flags [.], ack 46, win 342, options [nop,nop,TS val 10251145 ecr 10251145], length 0
There are 9 pairs of packets. (Flags [P.] here stands for Push.)
At this point the server has not closed the connection, so the client has kept the connection alive. Then we stop sending ping, and after 5 seconds tcpdump outputs one more pair:
02:58:14.811761 IP localhost.6666 > localhost.44195: Flags [F.], seq 1, ack 46, win 342, options [nop,nop,TS val 10251636 ecr 10251145], length 0
02:58:14.816420 IP localhost.44195 > localhost.6666: Flags [.], ack 2, win 342, options [nop,nop,TS val 10251637 ecr 10251636], length 0
The server sent a FIN packet, which means the server closed the connection. The server output is as follows:
~/codeDir/phpCode/hyperf-skeleton # php server.php
string(17) "Client: Connect 1"
string(5) "ping
"
string(5) "ping
"
string(5) "ping
"
string(5) "ping
"
string(5) "ping
"
string(5) "ping
"
string(5) "ping
"
string(5) "ping
"
string(5) "ping
"
string(10) "close fd 1"
Then we press ctrl + c on the client side to close the connection:
~/codeDir/phpCode/hyperf-skeleton # nc 127.0.0.1 6666
ping
ping
ping
ping
ping
ping
ping
ping
ping
^Cpunt!
~/codeDir/phpCode/hyperf-skeleton #
At this point, the tcpdump output is as follows:
03:03:02.257667 IP localhost.44195 > localhost.6666: Flags [F.], seq 46, ack 2, win 342, options [nop,nop,TS val 10280414 ecr 10251636], length 0
03:03:02.257734 IP localhost.6666 > localhost.44195: Flags [R], seq 2678621620, win 0, length 0
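The pings typed by hand into nc above could instead be sent by the client on a timer. The following is a rough sketch under stated assumptions: the function name sendHeartbeats, the "ping\n" payload, and the fixed count are illustrative choices, not part of Swoole or of the original test.

```php
<?php
/**
 * Write $count heartbeat payloads to an open stream, one every $intervalSeconds.
 * Hypothetical helper; returns how many heartbeats were written.
 *
 * @param resource $stream an open socket stream, e.g. from stream_socket_client()
 */
function sendHeartbeats($stream, int $count, float $intervalSeconds, string $payload = "ping\n"): int
{
    $sent = 0;
    for ($i = 0; $i < $count; $i++) {
        if (@fwrite($stream, $payload) === false) {
            break; // the write failed, e.g. the connection was already torn down
        }
        $sent++;
        if ($i < $count - 1) {
            // Wait before the next heartbeat; must stay below the server's idle time.
            usleep((int) round($intervalSeconds * 1000000));
        }
    }
    return $sent;
}
```

Against the test server above, a client could open the connection with stream_socket_client('tcp://127.0.0.1:6666', $errno, $errstr, 3) and call sendHeartbeats($stream, 9, 2.0). Any interval shorter than heartbeat_idle_time (5s in this test) should prevent the server from closing the connection.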
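Application-layer heartbeat
1. Define a ping/pong protocol (MySQL and others come with their own ping protocol)
2. The client sends ping heartbeat packets as flexibly as it needs
3. The server's onReceive handler checks availability and replies with pong
For example:
$server->on('receive', function (\Swoole\Server $server, $fd, $reactor_id, $data) {
    // trim() strips the trailing newline that clients such as nc append
    if (trim($data) === 'ping') {
        // Check the dependencies before answering (placeholder functions defined elsewhere)
        checkDB();
        checkServiceA();
        checkRedis();
        // send() takes the connection's fd as its first argument
        $server->send($fd, 'pong');
    }
});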
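The decision made inside the receive callback can also be written as a plain function, which makes it easy to test without a running server. This is a sketch only: handleHeartbeat, the $checks array, and the error reply format are hypothetical and not part of the original example.

```php
<?php
/**
 * Build the reply for one incoming message.
 * Hypothetical helper: returns null for non-heartbeat data, 'pong' when every
 * check passes, and an error string naming the first failing check otherwise.
 *
 * @param array<string,callable> $checks name => callable returning true when healthy
 */
function handleHeartbeat(string $data, array $checks): ?string
{
    if (trim($data) !== 'ping') {
        return null; // not a heartbeat; leave it to the normal message handling
    }
    foreach ($checks as $name => $check) {
        if ($check() !== true) {
            return "error: $name unavailable";
        }
    }
    return 'pong';
}
```

Inside the receive callback, the returned string (when not null) would be passed to $server->send($fd, $reply). Replying with an error instead of staying silent is a design choice that lets the client tell a broken dependency apart from a dead connection.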
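Conclusion
1. TCP keepalive is the simplest, but it has compatibility problems and is not flexible enough
2. The keepalive provided by Swoole is the most practical, but it requires the client to cooperate; its complexity is moderate
3. Application-layer keepalive is the most flexible but also the most troublesome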