The project team reported that a customer ran into trouble deploying the product with the tools we provide. The deployment failed at the host addition step, the implementation team could not continue, and they asked us for help.
Environment: kylin10
Architecture: ARM
During deployment we use Ansible scripts to run batch operations on hosts. In this case the run appeared to hang, and the first suspicion was that Ansible itself was blocking during execution. To verify this, I sent a command to the on-site team to test:
localhost$ date
2024年02月19日星期 17:30:41 CST
localhost$ ansible all -i "192.168.2.84," -m shell -a 'date' --become --become-method=sudo --become-user=root -u test
192.168.2.84 | CHANGED | rc=0 >>
2024年02月19日星期 17:33:34 CST
Sure enough, this simple Ansible command took more than 2 minutes to return a result in that environment. This is where the problem lies, and it gives us a general direction to investigate.
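The measurement above can also be scripted, so that the on-site team reports a single number instead of two timestamps. The following is a minimal shell sketch; the function name elapsed_seconds is made up for illustration and is not part of Ansible.

```shell
# elapsed_seconds CMD [ARGS...]
# Hypothetical helper: run a command, discard its output, and print
# how many whole seconds it took. The body runs in a subshell so the
# helper variables do not leak into the caller.
elapsed_seconds() (
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo "$((end - start))"
)

# Example (host and user taken from the command above; not run here):
#   elapsed_seconds ansible all -i "192.168.2.84," -m shell -a 'date' -u test
```

Running the same helper around a plain ssh login to the same host gives a quick comparison of how much of the delay belongs to SSH and how much to Ansible.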
Ansible relies on SSH connections to do its actual work. We have seen slow SSH connections before, so the first guess was that slow SSH connection setup was delaying the command's return.
Check the SSH server parameters in /etc/ssh/sshd_config:
GSSAPIAuthentication no # disable GSS authentication on the server
On many Linux distributions, sshd performs a reverse DNS lookup on connecting clients by default. This can add a long delay to every login, so it should be turned off. Note that on such systems, even though the line UseDNS yes is commented out in the configuration file, the built-in default is still yes.
UseDNS no # disable reverse DNS resolution on the server
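Because a commented-out line says nothing about the value actually in effect, it can help to check whether a keyword is set explicitly. The sketch below is one way to do that; sshd_option is a hypothetical helper, and it deliberately ignores Match blocks and Include directives.

```shell
# sshd_option FILE KEYWORD
# Hypothetical helper: print the first uncommented value of KEYWORD in
# an sshd_config-style FILE, or "unset" when the keyword is absent or
# only appears in a comment (in which case sshd uses its built-in default).
sshd_option() {
    awk -v key="$2" '
        {
            sub(/#.*/, "")          # drop comments
            sub(/^[ \t]+/, "")      # drop leading whitespace
            n = split($0, f, /[ \t=]+/)
        }
        n >= 2 && tolower(f[1]) == tolower(key) { print f[2]; found = 1; exit }
        END { if (!found) print "unset" }
    ' "$1"
}

# Example (not run here):
#   sshd_option /etc/ssh/sshd_config UseDNS
#   sshd_option /etc/ssh/sshd_config GSSAPIAuthentication
```

On a host where root access is available, sshd -T prints the effective configuration including defaults, which is the more authoritative check.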
The two SSH settings on the project's hosts already matched the values above, and a manual SSH login to the target host was very fast.
With no other leads, I fell back on the Linux strace command to trace the system calls.
The strace log shows a large number of select waits, which suggests the process is blocked for a long time on some operation.
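The exact strace invocation is not recorded here, so the following is only a sketch of one way to carry out this step. It assumes strace was run with -T, which appends the time spent in each call as <seconds>, and that the trace was written to a log file. The path /tmp/ansible.strace and the helper name slow_syscalls are illustrative.

```shell
# Record a trace (not run here; -f follows child processes, -tt adds
# timestamps, -T appends the time spent in each syscall, -o writes a log):
#   strace -f -tt -T -o /tmp/ansible.strace \
#       ansible all -i "192.168.2.84," -m shell -a 'date' -u test

# slow_syscalls LOG [MIN_SECONDS]
# Hypothetical helper: print the lines of a `strace -T` log whose
# syscall took at least MIN_SECONDS (default 1).
slow_syscalls() {
    awk -v min="${2:-1}" '
        match($0, /<[0-9]+\.[0-9]+>$/) {
            secs = substr($0, RSTART + 1, RLENGTH - 2)
            if (secs + 0 >= min + 0) print
        }
    ' "$1"
}

# Example:
#   slow_syscalls /tmp/ansible.strace 5
```

Filtering by duration narrows a long trace down to the few calls that account for the wait, such as the select calls noted above.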
Debugging with Ansible
ansible all -i "192.168.2.84," -m shell -a 'date' --become --become-method=sudo --become-user=root -u test -vvv
Appending -vvv to the command prints the detailed execution process.
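The -vvv output carries no timestamps, so it can be hard to tell which step the long pause belongs to. One option, sketched here as an assumption rather than something done in this investigation, is to pipe the output through a small filter that prefixes each line with the seconds elapsed since the start. The name stamp_lines is hypothetical.

```shell
# stamp_lines
# Hypothetical filter: copy stdin to stdout, prefixing each line with the
# number of seconds elapsed since the filter started. A long gap between
# two consecutive prefixes marks the step that is blocking.
stamp_lines() (
    start=$(date +%s)
    while IFS= read -r line; do
        now=$(date +%s)
        printf '[%4ds] %s\n' "$((now - start))" "$line"
    done
)

# Example (not run here):
#   ansible all -i "192.168.2.84," -m shell -a 'date' -u test -vvv 2>&1 | stamp_lines
```

If the output arrives in bursts, the tool may be buffering what it writes to the pipe, in which case the stamps are less precise.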
The debug output contains the error "mux_client_read_packet: read header failed: Broken pipe". It also shows Python commands being run while the script executes. Ansible depends on Python, so could the problem be related to the Python version?
Searching online turned up reports of compatibility issues between certain Ansible and Python versions.
So I checked the Python version. The default python on the system is python2, but python3 is also installed. To verify the theory, I changed the python soft link to point to python3.7.
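The commands used for this check are not listed, so the following is a minimal sketch of how one might inspect which interpreter a name resolves to before changing any link. resolve_interpreter is a hypothetical helper, and the path /usr/bin/python3 in the comments is an assumption about this system.

```shell
# resolve_interpreter NAME
# Hypothetical helper: print the real file that command NAME resolves to
# after following symbolic links, or "not found" if it is not on PATH.
resolve_interpreter() {
    if command -v "$1" >/dev/null 2>&1; then
        readlink -f "$(command -v "$1")"
    else
        echo "not found"
        return 1
    fi
}

# Example (not run here):
#   resolve_interpreter python      # which binary the default name points to
#   resolve_interpreter python3
#   python --version
#
# Ansible also has the ansible_python_interpreter variable, which selects
# the interpreter on the managed host without touching system links, e.g.
# (the path is an assumption):
#   ansible all -i "192.168.2.84," -m shell -a 'date' -u test \
#       -e ansible_python_interpreter=/usr/bin/python3
```

Setting the variable for one run is a less invasive way to test the theory, because other programs on the host that expect python to be python2 keep working.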
Run the ansible command again.
The execution time is now 1.3 s. It appears that an incompatibility between the Ansible and Python versions caused the problem.