Have you ever deleted files in a Linux environment only to find that the space was not released? This short article describes one scenario in which this happens and the corresponding solution.
One of our application servers, running Red Hat Linux, raised a monitoring alarm: usage of the /opt/applog file system had exceeded the threshold. The file system's total capacity is 50G, but the files actually in it add up to only 20G. What is using the remaining 30G?
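One way to confirm such a gap is to compare the space the file system reports as used with the total size of the files still visible in the directory tree. The sketch below is only a rough illustration, not the procedure used in this article. The helper names visible_bytes and usage_gap are hypothetical. It sums apparent file sizes, which can differ from allocated blocks, and it assumes the path passed in is the mount point of the file system being checked.

```python
import os
import shutil
import stat


def visible_bytes(root):
    """Sum the apparent sizes of regular files still reachable under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # the file vanished between listing and stat
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total


def usage_gap(mount_point):
    """Bytes the file system counts as used but that no visible file accounts for.

    shutil.disk_usage reports on the whole file system containing the path,
    so the result is only meaningful when mount_point really is a mount point.
    """
    used = shutil.disk_usage(mount_point).used
    return used - visible_bytes(mount_point)
```

A large positive gap on a file system such as /opt/applog suggests that something other than the visible files is holding space, which is the situation described here.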
In Linux, everything is represented as a file. When an application opens a file, the kernel hands it a file descriptor, which gives the application a uniform interface for interacting with the operating system. An open file keeps occupying space for as long as a process holds it, so this is where the lsof command helps: it lists the files currently open on the system.
> lsof
COMMAND     PID USER   FD   TYPE DEVICE  SIZE/OFF    NODE NAME
...
filebeat 111442  app   1r    REG  253,3 209715229 1040407 /opt/applog/E.20171016.info.012.log
filebeat 111442  app   2r    REG  253,3 209715254  385080 /opt/applog/E.20171015.info.001.log (deleted)
...
The fields in the header have the following meanings:
COMMAND: name of the process
PID: process ID
USER: owner of the process
FD: file descriptor, through which the application identifies the file. It is either a descriptor number with an access mode (such as 1r, where r means open for reading) or a special entry such as cwd or txt.
TYPE: file type, such as DIR (directory) or REG (regular file)
DEVICE: device numbers (major,minor) of the disk or partition holding the file
SIZE/OFF: size of the file, or the file offset, in bytes
NODE: inode number (identifies the file on disk)
NAME: full name of the open file
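Given this column layout, a short script can pick out the entries whose NAME ends in (deleted) and total their sizes per process. The following is a minimal sketch that assumes the nine-column layout shown in the sample output above. Real lsof output can contain extra or empty columns, so treat the parsing as illustrative. The names OpenFile, parse_lsof_line and deleted_bytes_by_process are hypothetical.

```python
from typing import NamedTuple, Optional


class OpenFile(NamedTuple):
    command: str
    pid: int
    user: str
    fd: str
    size: int
    name: str
    deleted: bool


DELETED_SUFFIX = " (deleted)"


def parse_lsof_line(line: str) -> Optional[OpenFile]:
    """Parse one nine-column lsof line; return None for headers and odd rows."""
    parts = line.split(None, 8)  # NAME may contain spaces, so split 8 times only
    if len(parts) != 9 or not parts[1].isdigit() or not parts[6].isdigit():
        return None  # header, "..." filler, or an offset such as 0t0
    command, pid, user, fd, _type, _device, size, _node, name = parts
    name = name.rstrip()
    deleted = name.endswith(DELETED_SUFFIX)
    if deleted:
        name = name[: -len(DELETED_SUFFIX)]
    return OpenFile(command, int(pid), user, fd, int(size), name, deleted)


def deleted_bytes_by_process(lines):
    """Total SIZE/OFF of deleted-but-open files, keyed by (command, pid)."""
    totals = {}
    for line in lines:
        entry = parse_lsof_line(line)
        if entry is not None and entry.deleted:
            key = (entry.command, entry.pid)
            totals[key] = totals.get(key, 0) + entry.size
    return totals


SAMPLE = """\
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
filebeat 111442 app 1r REG 253,3 209715229 1040407 /opt/applog/E.20171016.info.012.log
filebeat 111442 app 2r REG 253,3 209715254 385080 /opt/applog/E.20171015.info.001.log (deleted)
"""

if __name__ == "__main__":
    print(deleted_bytes_by_process(SAMPLE.splitlines()))
```

If a process has the same deleted file open on several descriptors, this simple sum counts it more than once, so the totals are an upper bound rather than an exact figure.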
In some lines, the NAME column ends with (deleted):
/opt/applog/E.20171015.info.001.log (deleted)
This means the file has been deleted, but the handle to the open file has not been closed. The COMMAND column shows filebeat and the USER column shows app: this is our log collection process, filebeat, which was started by the app user.
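The behavior is easy to reproduce. The sketch below assumes a POSIX system such as Linux, and the function name demo_deleted_but_open is hypothetical. It writes a temporary file, deletes it while keeping the descriptor open, and then inspects the descriptor: the path is gone, yet the data is still there and still readable.

```python
import os
import tempfile


def demo_deleted_but_open(size=64 * 1024):
    """Delete a file while holding it open and report what the kernel still sees."""
    fd, path = tempfile.mkstemp(suffix=".log")
    try:
        os.write(fd, b"x" * size)
        os.unlink(path)               # removes the directory entry only
        st = os.fstat(fd)             # the open descriptor is still valid
        os.lseek(fd, 0, os.SEEK_SET)
        first_byte = os.read(fd, 1)
        return {
            "path_exists": os.path.exists(path),
            "link_count": st.st_nlink,        # typically 0 on a local file system
            "size_still_held": st.st_size,
            "still_readable": first_byte == b"x",
        }
    finally:
        os.close(fd)                  # only after this can the space be reclaimed


if __name__ == "__main__":
    print(demo_deleted_but_open())
```

While the descriptor is open, lsof would list this file with the (deleted) marker, just like the filebeat entry above.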
An aside on the log collection platform
The traditional open source log platform, ELK, consists of three open source tools: Elasticsearch, Logstash and Kibana.
(Figure: a common ELK deployment diagram.)
What is the filebeat mentioned above, and how does it relate to ELK?
Rao Chenlin, author of the "ELKstack Authoritative Guide", gives an insightful introduction on Zhihu, quoted here from https://www.zhihu.com/question/54058964/answer/137882919:
Because Logstash runs on the JVM and consumes a lot of resources, its author later used Golang to write logstash-forwarder, a lightweight tool with fewer functions but low resource consumption. However, the author was just one person. After joining [...]

Simply put, filebeat is the agent process for log collection, responsible for collecting the application's log files.

Back to my problem. The reason there were so many (deleted) entries with unreleased file handles is that, because disk space was very limited, I had temporarily added a scheduled task that runs every hour and deletes logs older than 12 hours. The scheduled task was deleting files that filebeat still had open at that moment, so the files themselves were deleted but their space was not released.

Solution 1: To release the occupied space quickly, the most direct method is to kill -9 the filebeat process. The space is released as soon as the process exits. But this is not a fundamental solution: the scheduled task will go on deleting files that filebeat has open, and the space will fill up again.

Solution 2: Filebeat's configuration file, filebeat.yml, has two relevant parameters.

close_older
Description: Close older closes the file handler for which were not modified for longer then close_older. Time strings like 2h (2 hours), 5m (5 minutes) can be used.
That is, if a file has not been updated within the given period, the handle of the monitored file is closed. The default is 1 hour.

force_close_files
Description: This option closes a file, as soon as the file name changes. This config option is recommended on windows only. Filebeat keeps the files it's reading open. This can cause issues when the file is removed, as the file will not be fully removed until also Filebeat closes the reading. Filebeat closes the file handler after ignore_older. During this time no new file with the same name can be created. Turning this feature on the other hand can lead to loss of data on rotate files. It can happen that after file rotation the beginning of the new file is skipped, as the reading starts at the end. We recommend to leave this option on false but lower the ignore_older value to release files faster.
That is, when the file name changes, whether by renaming or deletion, the file is closed automatically.

Combining the two parameters according to our application requirements, a handle should be closed if its file has not been updated within 30 minutes, and also if the file is renamed or deleted:

close_older: 30m
force_close_files: true

This meets the basic requirements of collecting logs with filebeat while regularly deleting historical files.
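To make the effect of the two options concrete, here is a toy reader that applies the same two rules: close the handle when the file has been idle for longer than a threshold, and optionally close it when the path has been renamed or removed. It is a sketch of the idea only, not Filebeat's implementation. The class Harvester and its method names are hypothetical, and the parameters close_older_seconds and force_close_files merely mirror the option names quoted above.

```python
import os
import time


class Harvester:
    """Toy reader that holds one log file open, for illustration only."""

    def __init__(self, path):
        self.path = path
        self.handle = open(path, "rb")

    def was_renamed_or_removed(self):
        """True when the path no longer refers to the file we have open."""
        opened = os.fstat(self.handle.fileno())
        try:
            current = os.stat(self.path)
        except FileNotFoundError:
            return True  # the path is gone: deleted or moved away
        return (opened.st_dev, opened.st_ino) != (current.st_dev, current.st_ino)

    def is_idle(self, close_older_seconds, now=None):
        """True when the open file was last modified too long ago."""
        now = time.time() if now is None else now
        mtime = os.fstat(self.handle.fileno()).st_mtime
        return now - mtime > close_older_seconds

    def close_if_needed(self, close_older_seconds, force_close_files, now=None):
        """Apply both rules; return True if the handle is closed afterwards."""
        if self.handle.closed:
            return True
        removed = force_close_files and self.was_renamed_or_removed()
        if removed or self.is_idle(close_older_seconds, now):
            self.handle.close()  # releasing the handle lets the space be freed
            return True
        return False
```

With a 30-minute threshold (1800 seconds) and force_close_files enabled, a file removed by the hourly cleanup task would have its handle dropped at the next check instead of being held until the process exits.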