Recently, I ran into a very strange problem in a program written in Go: the process disappeared by itself. After some troubleshooting, I found the cause and learned a few lessons. Below I share the problem and the solution.
Our product runs a scheduled task every day. Over the past few days, I noticed that the task always stopped after a few minutes. I redirected the process's log output to a file, and the log showed that the process ran for only a few minutes and then disappeared on its own. This was very strange, because I had never seen anything similar in normal development and testing.
First, I tried the simplest approach: adding debug information to the code. I logged a "start" message when the process started, and logged a corresponding message every time it performed an important operation. Then I restarted the task, waited for it to stop, and checked the log. The process had stopped just a few minutes after starting, but the log contained no error message. It looked as if the process had terminated itself.
Next, I used the strace command to trace the system calls of the process and see why it terminated. The structure of this process is relatively complex, with multiple goroutines running, so I started by using strace to follow the system calls of one of them, ndeliver. The following is the relevant code of the ndeliver goroutine:
c := make(chan os.Signal, 1)
signal.Notify(c, syscall.SIGINT, syscall.SIGTERM)

go func() {
	sig := <-c
	log.Errorf("main: received signal %s, shutting down server", sig.String())
	server.Stop()
	os.Exit(0)
}()

go func() {
	err := server.Start()
	if err != nil {
		log.Fatalf("ndeliver: server start error: %s", err)
	}
}()
This code registers a signal handler for the process (for SIGINT and SIGTERM) and starts a goroutine that runs server.Start(), which blocks until the process exits.
With strace, I found nothing abnormal in this goroutine: it exited without encountering any errors. However, the process contains other goroutines, so I kept using strace to trace them, and then I found the problem. One goroutine had panicked, the panic was not handled, and so the entire process crashed.
By looking at the code, I found that this panic was caused by a file being deleted, but our code did not handle this error. When a goroutine panic is not handled, the entire process will crash, which is why the process disappears by itself.
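As a complement to recovering from panics, the missing file can also be handled where it is read, so that no panic is raised in the first place. The sketch below is only an illustration under assumptions: the original code is not shown, so the helper name readIfExists and the file path are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// readIfExists is a hypothetical helper. It reads a file and reports
// whether the file was present. A missing file is treated as a normal
// condition (ok == false) instead of something to panic over.
func readIfExists(path string) (data []byte, ok bool, err error) {
	data, err = os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		// The file was deleted or never existed: not a fatal error here.
		return nil, false, nil
	}
	if err != nil {
		// Any other error (permissions, I/O) is returned to the caller.
		return nil, false, err
	}
	return data, true, nil
}

func main() {
	// The path is an assumption chosen for illustration.
	data, ok, err := readIfExists("task-input.dat")
	switch {
	case err != nil:
		fmt.Println("read failed:", err)
	case !ok:
		fmt.Println("input file is missing, skipping this run")
	default:
		fmt.Printf("read %d bytes\n", len(data))
	}
}
```

Returning an error value lets the caller decide whether to skip, retry, or alert, rather than letting the failure take the whole process down.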
To solve this problem, we need to handle the panic so that it cannot crash the entire process. Go's built-in recover function captures a panic so that it can be handled, but it only works when called inside a deferred function in the same goroutine that panicked.
The following is a code example for handling panic:
defer func() {
	if r := recover(); r != nil {
		log.Errorf("goroutine panic: %v", r)
		// TODO: handle the panic
	}
}()
// code snippet
Because the deferred function runs when the goroutine's function exits, even if it exits through a panic, we can capture the panic there and handle it accordingly. Here we simply log the panic information, but we could also do other processing, such as sending an alert or logging more details about the error.
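To make the pattern concrete, here is a minimal, self-contained sketch of a goroutine whose panic is recovered so the process keeps running. The helper name runSafely and the panic message are assumptions made for illustration, and the standard fmt package stands in for the logging library used in the article.

```go
package main

import (
	"fmt"
	"sync"
)

// runSafely is a hypothetical helper: it runs fn and converts a panic
// raised inside fn into an ordinary error value.
func runSafely(fn func()) (err error) {
	defer func() {
		// recover only has an effect inside a deferred function that
		// runs in the goroutine that panicked.
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	fn()
	return nil
}

func main() {
	var wg sync.WaitGroup
	wg.Add(1)

	go func() {
		defer wg.Done()
		// Simulated failure; in the article the real cause was a deleted file.
		err := runSafely(func() { panic("file was deleted") })
		if err != nil {
			fmt.Println("worker failed:", err)
		}
	}()

	wg.Wait()
	fmt.Println("process is still running")
}
```

Wrapping the body of each long-running goroutine this way keeps one failing worker from terminating the others. Whether to restart the worker or only report the error depends on the task.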
In Go, an unrecovered panic in any goroutine terminates the entire process, not just the goroutine that panicked. We therefore have to take this into account when writing code and handle it explicitly. Adding panic handling to goroutines is very important, because it helps us avoid similar problems in production.