Disk space is full
Linux has no recycle bin, so on this production server every file that is to be deleted is first moved to the system's /tmp directory, and the contents of /tmp are then purged on a regular schedule. There is nothing wrong with this strategy in itself, but an inspection showed that the server had no separate /tmp partition, so the data under /tmp actually occupied space on the root partition. With the cause identified, the fix is to delete the files that take up the most space in /tmp. Start by listing the three largest entries in /tmp:
du -sh /tmp/* | sort -rh | head -3
The command output shows a 66GB file named access_log in the /tmp directory. This is most likely an access log generated by Apache, and judging from its size, the Apache logs have not been cleaned up for a long time. This file is almost certainly what filled the root partition. After confirming that the file can be deleted, remove it:
rm /tmp/access_log
Then check whether the space on the root partition has been released. The output shows that it has not. What is going on?
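The check described above can be sketched roughly as follows. It assumes the root filesystem is mounted at / and that a POSIX-style df is available; the variable name tmp_mount is only illustrative.

```shell
# Which filesystem holds /tmp? If the mount point printed is "/",
# /tmp has no partition of its own and shares space with the root partition.
tmp_mount=$(df -P /tmp | awk 'NR==2 {print $6}')
echo "/tmp lives on: $tmp_mount"

# Current usage of the root partition, in human-readable units.
df -h /
```

Running this before and after the rm makes it easy to see whether the used space actually changed.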
The space is not released after deleting the file
Generally speaking, deleting a file releases its space, but there are exceptions, for example when the file is locked by a process or a process is still writing data to it. To understand this problem, you need to know how files are stored under Linux.
The data and pointer part of the file
A file in the file system is stored in two parts: the data part and the pointer part. The pointer lives in the file system's meta-data, while the data is stored on disk. When a file is deleted, its pointer is cleared from the meta-data; the data itself stays on disk, but once the pointer is gone the space it occupies can be overwritten with new content. The space was not released after deleting access_log because the httpd process was still writing to the file. The file's name was removed, but because the process still holds the file open, the pointer has not been cleared from the meta-data, and as long as the pointer exists the kernel considers the file not deleted.
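The behavior is easy to reproduce on a small scale. The following is a minimal sketch for a Linux shell, not the server's actual setup: the scratch file from mktemp stands in for access_log, and descriptor 3 of the current shell plays the role of the httpd process.

```shell
# Sketch (Linux): rm removes the file's name, but the data stays
# as long as some process still holds the file open.

demo_file=$(mktemp)           # hypothetical scratch file standing in for access_log
exec 3>>"$demo_file"          # keep descriptor 3 open for appending, like a logger
echo "written before rm" >&3

rm "$demo_file"               # the directory entry is gone...
echo "written after rm" >&3   # ...but writes through the open descriptor still succeed

# /proc exposes the open descriptor; its link target ends in " (deleted)".
link_target=$(readlink "/proc/$$/fd/3")
content=$(cat "/proc/$$/fd/3")
echo "$link_target"
echo "$content"

exec 3>&-                     # closing the last descriptor lets the kernel free the space
```

Both lines are still readable after the rm, which is the small-scale version of what happened with the 66GB log.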
Find the list of deleted files occupied by the application
That is why the df command shows that the space has not been released. With this idea of the cause, the next step is to see whether a process is still writing to the access_log file. This calls for the lsof command, which can list deleted files that are still held open by applications:
lsof | grep delete
The output shows that /tmp/access_log is held open by the httpd process, which is still writing log data to it. Column 7 shows that the log file is about 70GB in size, while the root partition is only 100GB in total, so this file is clearly the culprit behind the full root partition. The "deleted" status in the last column indicates that the log file has been deleted, but because the process is still writing to it, the space has not been released.
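On a machine where lsof is not installed, roughly the same list can be read straight from /proc. This is a sketch under the assumption of a Linux /proc filesystem; the function name list_deleted_open_files is made up for illustration, and it needs root to see descriptors of other users' processes.

```shell
# Print every open descriptor whose target file has been unlinked.
# Output format: /proc/<pid>/fd/<n> -> <original path> (deleted)
list_deleted_open_files() {
    for fd in /proc/[0-9]*/fd/*; do
        # Descriptors can vanish or be unreadable; skip those quietly.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *" (deleted)") printf '%s -> %s\n' "$fd" "$target" ;;
        esac
    done
}

list_deleted_open_files
```

The process ID and descriptor number appear in the /proc path on the left, which is the same information lsof reports in its PID and FD columns.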
Clear files correctly
There are many ways to solve this type of problem. The simplest is to stop or restart the httpd process; restarting the operating system also works, but neither is the best method. For a process that keeps writing logs to a file, the best way to release the disk space is to empty the file in place while the service keeps running, rather than delete it. This can be done with the following command:
[root@localhost ~]# echo " " >/tmp/access_log
In this way, the disk space is released immediately and the process can continue writing logs to the file. This method is often used to clean up, without downtime, the log files generated by Web services such as Apache, Tomcat, and Nginx.
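The difference between emptying and deleting can be tried out safely on a scratch file. The sketch below assumes Linux and a writer that opens its log in append mode; the mktemp file and descriptor 3 of the current shell are stand-ins for access_log and httpd. The second half covers the case where the file has already been removed with rm: the path then no longer leads to the old data, but the open descriptor can still be truncated through /proc.

```shell
# Sketch: empty a log in place while a writer keeps it open.
log_file=$(mktemp)                  # hypothetical stand-in for /tmp/access_log
exec 3>>"$log_file"                 # the writer opens the log in append mode
echo "old entry that takes up space" >&3

: > "$log_file"                     # truncate to zero bytes; the name and the open descriptor stay valid
size_after_truncate=$(wc -c < "$log_file")

echo "new entry" >&3                # the writer carries on with the same descriptor
content_after_write=$(cat "$log_file")

# Already-deleted case: the name is gone, so truncate via the descriptor.
# Here the writer is this shell ($$) and the descriptor is 3.
rm "$log_file"
: > "/proc/$$/fd/3"
size_of_deleted=$(wc -c < "/proc/$$/fd/3")

echo "$size_after_truncate $content_after_write $size_of_deleted"
exec 3>&-
```

On a real server the process ID and descriptor number would come from the lsof output rather than from $$ and 3.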