The Linux kernel module that supports user-space file systems is called FUSE. The name stands for "Filesystem in Userspace" and refers to a file system implemented entirely in user mode. It is often used on Linux to mount network locations into the local file system, and it is an important component of a general-purpose operating system.
The operating environment of this tutorial: linux7.3 system, Dell G3 computer.
What is Linux FUSE
A user-space file system (Filesystem in Userspace) is a file system implemented entirely in user mode. On Linux it is a module used to mount certain network locations, such as a directory reached over SSH, into the local file system. Related content can be found on SourceForge.
The kernel module that Linux uses to support user-space file systems is called FUSE, and the term FUSE is sometimes used to mean the user-space file system under Linux itself. File systems are an important part of a general-purpose operating system, and operating systems have traditionally supported them at the kernel level. Kernel-mode code, however, is generally hard to debug, and developing it is slow.
In a "user-mode file system", the data and metadata of the file system are provided by a user-mode process (called the "daemon"). For a microkernel operating system, implementing a file system in user mode is nothing unusual, but for Linux, which has a monolithic kernel, it is a different matter.
Although it is called a user-mode file system, the kernel is still involved. In Linux, all file access goes through the kernel interface provided by the VFS layer (such as open/read), so when a process (called the "user" process) accesses the file system implemented by the daemon, the access still has to pass through VFS.
When VFS receives a file access request from the user process and determines, from the mount type, that the file belongs to a user-mode file system, it hands the request to the kernel module named "fuse". The "fuse" module then converts the request into the protocol format agreed with the daemon and passes it to the daemon process.
In this three-party relationship, the "fuse" kernel module acts as a relay: it establishes communication between VFS (or, equivalently, the user process) and the daemon. In layman's terms, its role is that of an "agent".
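As a rough illustration of the "agreed protocol format", the sketch below packs and parses the fixed header that precedes every request the fuse module hands to the daemon. The field layout and the FUSE_LOOKUP opcode value follow struct fuse_in_header in the kernel's include/uapi/linux/fuse.h as far as I know. Little-endian byte order, the helper names and the sample values are assumptions made for this example.

```python
import struct

# struct fuse_in_header, 40 bytes (little-endian assumed for this sketch):
# len(u32) opcode(u32) unique(u64) nodeid(u64) uid(u32) gid(u32) pid(u32) padding(u32)
IN_HEADER = struct.Struct("<IIQQIIII")
FUSE_LOOKUP = 1  # opcode for "look up a name in a directory"


def pack_lookup(unique, parent_nodeid, name, uid, gid, pid):
    """Build a lookup request as the kernel side would (hypothetical helper)."""
    payload = name.encode() + b"\0"          # the name is sent NUL-terminated
    total_len = IN_HEADER.size + len(payload)
    header = IN_HEADER.pack(total_len, FUSE_LOOKUP, unique,
                            parent_nodeid, uid, gid, pid, 0)
    return header + payload


def unpack_request(buf):
    """Parse a request as the daemon side would (hypothetical helper)."""
    length, opcode, unique, nodeid, uid, gid, pid, _pad = IN_HEADER.unpack_from(buf)
    body = buf[IN_HEADER.size:length]
    return {
        "len": length, "opcode": opcode, "unique": unique, "nodeid": nodeid,
        "uid": uid, "gid": gid, "pid": pid,
        "name": body.rstrip(b"\0").decode(),
    }


request = pack_lookup(unique=7, parent_nodeid=1, name="newfile",
                      uid=1000, gid=1000, pid=4242)
print(unpack_request(request))
```

The "unique" field is what lets the kernel match a reply from the daemon to the request that is still waiting for it.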
The implementation of this whole framework in Linux is FUSE (Filesystem in Userspace). As shown in Figure 1, the part in the red box is the concrete implementation of a FUSE-type file system, and it is where designers of user-mode file systems have room to work. There are currently more than a hundred file systems built on FUSE (some kernel-based file systems, such as ZFS and NTFS, can also be ported to user mode), and this article uses the ready-made fuse-sshfs for demonstration.
First install the fuse-sshfs software package and use the following command to mount the file system (mount the "remote-dir" directory of the remote machine to the "local-dir" directory of the local machine):
sshfs:
After that, a folder named "fuse" appears in the "/sys/fs" directory. You can also see that the "fuse" kernel module has been loaded (its corresponding device is "/dev/fuse") and that the type of the local mount directory has become "fuse.sshfs".
The device node exists so that user mode can control the module, but for file-system-level applications, accessing the device directly with calls such as ioctl() is still cumbersome because too many details are exposed. libfuse emerged as an intermediate layer for this reason: the daemon process actually operates on the fuse device file through the interface that libfuse provides.
Next, take using the "touch" command to create a new file in the "fuse.sshfs" file system as an example, and look at how the fuse kernel module and the daemon process (that is, "sshfs") interact (the code part is based on kernel version 5.2.0):
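One way to check the mount type from a program is to scan the mount table. The following is a minimal sketch that filters "/proc/mounts"-style text for FUSE mounts; the function name and the sample mount table are made up for illustration.

```python
def fuse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs for FUSE mounts in /proc/mounts-style text."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        # "fuse.<subtype>" is what sshfs and similar daemons show up as;
        # "fusectl" is the control file system, not a FUSE mount, so it is excluded.
        if fs_type in ("fuse", "fuseblk") or fs_type.startswith("fuse."):
            found.append((mount_point, fs_type))
    return found


# Hypothetical sample in the format of /proc/mounts.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
user@remote:/remote-dir /home/user/local-dir fuse.sshfs rw,nosuid,nodev,relatime,user_id=1000,group_id=1000 0 0
"""

print(fuse_mounts(SAMPLE))  # [('/home/user/local-dir', 'fuse.sshfs')]
```

On a live system the same function can be fed the contents of "/proc/mounts", for example fuse_mounts(open("/proc/mounts").read()).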
[First round]
The first step is a permission check, but this check is not the same as the VFS permission check. Its main purpose is to prevent other users from accessing someone's private fuse file system.
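The idea behind this check can be modeled in a few lines. The sketch below is a simplified Python model based on my reading of the kernel's behavior, not the kernel code itself: unless the file system was mounted with the "allow_other" option, only a process whose user and group IDs all match the mount owner is let through. The class and function names are hypothetical, and user-namespace handling is left out.

```python
from dataclasses import dataclass


@dataclass
class Cred:
    """Credentials of the process making the request (simplified)."""
    uid: int
    euid: int
    suid: int
    gid: int
    egid: int
    sgid: int


@dataclass
class FuseConn:
    """Owner recorded at mount time, plus the allow_other mount option."""
    user_id: int
    group_id: int
    allow_other: bool = False


def allow_current_process(fc, cred):
    """Return True if the caller may touch this fuse mount (simplified model)."""
    if fc.allow_other:
        return True  # assumption: namespace checks omitted in this model
    same_user = cred.uid == cred.euid == cred.suid == fc.user_id
    same_group = cred.gid == cred.egid == cred.sgid == fc.group_id
    return same_user and same_group


mount = FuseConn(user_id=1000, group_id=1000)
owner = Cred(1000, 1000, 1000, 1000, 1000, 1000)
root = Cred(0, 0, 0, 0, 0, 0)
print(allow_current_process(mount, owner))  # True
print(allow_current_process(mount, root))   # False
```

Note that in this model even root is turned away from another user's mount, which is why the check is separate from ordinary VFS permissions.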
Then the inode of the file is looked up from the file path. Because the file is newly created, its inode is not in the kernel's inode cache, so a "lookup" request has to be sent to the daemon.
Such requests are placed on a pending queue to wait for the daemon's reply, and the user process goes to sleep.
As the daemon, the sshfs process obtains data by reading the "/dev/fuse" device file. If the pending queue is empty, the read blocks and the daemon waits.
When requests arrive on the pending queue, the daemon process is woken up and handles them. A request being handled is moved to the processing queue. After the daemon replies to the fuse kernel module, the user process is woken up and the corresponding request is removed from the processing queue.
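The hand-off between the two queues can be imitated with ordinary threads. This is a toy model in Python, not kernel code: the class, method names and the "STOP" sentinel are invented for the demonstration, and a dictionary keyed by request ID stands in for the processing queue.

```python
import queue
import threading


class FuseConnModel:
    """Toy model of the pending/processing queues between user process and daemon."""

    def __init__(self):
        self.pending = queue.Queue()   # requests not yet read by the daemon
        self.processing = {}           # requests read but not yet answered
        self.lock = threading.Lock()
        self._next_unique = 0

    def send_and_wait(self, opcode, arg=None):
        """User-process side: queue a request and sleep until the reply arrives."""
        with self.lock:
            self._next_unique += 1
            unique = self._next_unique
        req = {"unique": unique, "opcode": opcode, "arg": arg,
               "done": threading.Event(), "reply": None}
        self.pending.put(req)
        req["done"].wait()             # the caller sleeps here
        return req["reply"]

    def dev_read(self):
        """Daemon side: blocks while the pending queue is empty."""
        req = self.pending.get()
        with self.lock:
            self.processing[req["unique"]] = req
        return req

    def dev_write(self, unique, reply):
        """Daemon side: deliver the reply and wake the sleeping caller."""
        with self.lock:
            req = self.processing.pop(unique)
        req["reply"] = reply
        req["done"].set()


def daemon_loop(conn):
    while True:
        req = conn.dev_read()
        if req["opcode"] == "STOP":    # hypothetical sentinel to end the demo
            conn.dev_write(req["unique"], None)
            return
        conn.dev_write(req["unique"], f"handled {req['opcode']} {req['arg']}")


conn = FuseConnModel()
daemon = threading.Thread(target=daemon_loop, args=(conn,))
daemon.start()
print(conn.send_and_wait("lookup", "newfile"))  # handled lookup newfile
conn.send_and_wait("STOP")
daemon.join()
```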
[Second round]
Next come the other system calls triggered by the "touch" command. If the data or metadata they need has been accessed before, it is probably already in the cache, and the fuse kernel module can answer on its own without a round trip to user space. Otherwise the request still has to be passed to the daemon process.
Here get_fuse_conn() obtains the "fuse_conn" structure instance that was created when the fuse file system was mounted. As the link between the daemon process and the kernel, this connection persists until the daemon process dies or the corresponding fuse file system is unmounted.
The daemon side performs similar operations as before. Note the difference between the two families of functions, fuse_write/read() and fuse_dev_write/read(). The former handle the read and write requests that arrive through VFS when a user process accesses files on the fuse file system, so they operate on regular files. The latter are used when the daemon process reads and writes the "/dev/fuse" device that represents the fuse kernel module, in order to fetch requests and send replies.
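The effect of the cache on round trips can be shown with a small model. The sketch below assumes, as in the FUSE protocol, that the daemon's lookup reply carries a validity period for the entry; everything else (class name, callback signature, the fake clock) is hypothetical and only meant to show when a round trip to user space is skipped.

```python
class EntryCache:
    """Toy model of the kernel-side cache for lookup results."""

    def __init__(self, ask_daemon, clock):
        self.ask_daemon = ask_daemon   # callable(parent, name) -> (nodeid, valid_seconds)
        self.clock = clock             # callable() -> current time in seconds
        self.entries = {}              # (parent, name) -> (nodeid, expiry time)
        self.round_trips = 0

    def lookup(self, parent, name):
        key = (parent, name)
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and now < hit[1]:
            return hit[0]              # served from cache, daemon not involved
        self.round_trips += 1          # cache miss or expired: go to user space
        nodeid, valid_seconds = self.ask_daemon(parent, name)
        self.entries[key] = (nodeid, now + valid_seconds)
        return nodeid


# Fake clock and daemon so the example is deterministic.
now = [0.0]
cache = EntryCache(ask_daemon=lambda parent, name: (42, 1.0),
                   clock=lambda: now[0])

cache.lookup(1, "newfile")   # first access: round trip to the daemon
cache.lookup(1, "newfile")   # still valid: answered by the cache
now[0] = 2.0                 # validity period has passed
cache.lookup(1, "newfile")   # expired: another round trip
print(cache.round_trips)     # 2
```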
[Third round]
In the last round of interaction between the fuse kernel module and the daemon process, the inode number is obtained from the superblock that represents the fuse file system, and the related metadata is filled in.
It is easy to see that in a fuse file system even a fairly simple "touch" operation involves frequent switches between user mode and kernel mode, accompanied by multiple data copies. Compared with a traditional in-kernel file system, overall I/O throughput is lower and latency is higher.
Then why does fuse still hold a place among the file systems an operating system supports? The answer is that developing in user mode has many advantages. First, it is easy to debug, which makes it particularly suitable for quickly validating a new file system prototype, so it is very popular in academic research. In the kernel you can only use C, whereas user mode has far fewer restrictions: all kinds of libraries and programming languages can be used.
Second, a kernel bug can easily bring down the entire system (this is more serious in virtualized environments, because a crash of the host takes down every virtual machine running on it), while the damage caused by a user-mode bug is comparatively limited.
So one side of the coin is ease of development, although how convenient it feels is ultimately subjective. The other side is the performance cost, which can be verified objectively with experimental data. What method, then, gives a reasonably accurate measure of the overhead introduced by fuse?
We again use fuse-sshfs, but this time with a local mount instead of a remote one (assuming that the "dir-src" directory on the local machine resides on an ext4 file system):
sshfs localhost:
When the daemon process receives a request, it has to enter the kernel again to reach the ext4 kernel module (this file system arrangement is called "stackable").
Take a user process issuing a write() request to the fuse file system as an example: the red box on the right is the native ext4 call path, and the extra path on the left is what is added once fuse is introduced.
According to the data given in this article, the "getxattr" request generated by this system call requires twice the amount of "user-kernel" interaction. Compared with the native ext4 file system, I/O throughput drops by 27% for sequential writes and by 44% for random writes.
However, in the many years since the fuse file system was born, many optimizations have been devised for it. For example, sequential reads and writes can be sent to the daemon process in batches (this does not suit random reads and writes).
Another is splicing, a zero-copy technique. The splicing mechanism provided by the Linux kernel lets user space move data between two in-kernel memory buffers without copying it, so it is especially suitable in stackable mode for passing data directly from the fuse kernel module to the ext4 kernel module (splicing is usually used for requests larger than 4K, not for small reads and writes).
After these efforts, what performance can a fuse file system achieve? According to the test results listed in this report, compared with native ext4 the performance loss of fuse can be kept below 5% in the best case, but reaches 83% in the worst case. Its CPU usage also increases by 31%.
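Why batching helps sequential access but not random access can be seen from a simple coalescing rule: a write can join the current batch only if it starts exactly where the batch ends. The sketch below illustrates that rule only; it is not the algorithm the kernel uses, and the function name and size limit are assumptions.

```python
def coalesce(writes, max_bytes):
    """Merge contiguous (offset, length) writes into batches of at most max_bytes."""
    batches = []
    for offset, length in writes:
        if batches:
            start, size = batches[-1]
            # Join the current batch only if this write continues it and still fits.
            if start + size == offset and size + length <= max_bytes:
                batches[-1] = (start, size + length)
                continue
        batches.append((offset, length))
    return batches


PAGE = 4096
sequential = [(i * PAGE, PAGE) for i in range(8)]
scattered = [(0, PAGE), (10 * PAGE, PAGE), (2 * PAGE, PAGE)]

print(len(coalesce(sequential, 4 * PAGE)))  # 2 requests instead of 8
print(len(coalesce(scattered, 4 * PAGE)))   # 3 requests, nothing merged
```

With sequential offsets, eight page-sized writes collapse into two requests to the daemon; with scattered offsets every write still costs its own round trip.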
From the sdcard daemon that existed in Android v4.4 through v7.0 to Ceph and GlusterFS in recent years, many systems have adopted or still use FUSE-based implementations. FUSE has proven useful in both network file system and virtualization applications. Its emergence and development are not meant to replace file systems implemented in kernel mode, but to serve as a useful supplement (in theory FUSE can also be used to implement the root file system, but this is not recommended: "can do" and "should do" are two different things).