In Linux, a pipe is a communication mechanism that connects the output of one program directly to the input of another. A pipe is essentially a kind of file, but it differs from an ordinary file in ways that solve two problems of using files for communication: the size of the pipe is bounded, and a reading process that runs faster than the writing process waits for data instead of seeing end-of-file.
The operating environment of this tutorial: linux7.3 system, Dell G3 computer.
Pipes are a very important communication method in Linux: they connect the output of one program directly to the input of another. The word "pipe" usually refers to an unnamed pipe. Unnamed pipes can only be used between related processes (for example, a parent and the children it forks), which is the biggest difference between them and named pipes.
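As a rough illustration of connecting one program's output to another's input, the sketch below wires up the equivalent of the shell command `echo WORD | wc -c` with pipe(), fork(), dup2() and execlp(). The function name run_echo_wc and the second pipe used to capture the result are assumptions made for this example, and it presumes that `echo` and `wc` are available on the PATH.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs the equivalent of `echo WORD | wc -c` and
 * returns the number printed by wc, or -1 on error. */
long run_echo_wc(const char *word)
{
    assert(word != NULL);

    int link[2];    /* echo's stdout -> wc's stdin */
    int result[2];  /* wc's stdout   -> this process */
    if (pipe(link) < 0)
        return -1;
    if (pipe(result) < 0) {
        close(link[0]);
        close(link[1]);
        return -1;
    }

    pid_t producer = fork();
    if (producer == 0) {
        dup2(link[1], STDOUT_FILENO);   /* stdout now feeds the pipe */
        close(link[0]);
        close(link[1]);
        close(result[0]);
        close(result[1]);
        execlp("echo", "echo", word, (char *)NULL);
        _exit(127);                     /* only reached if exec failed */
    }

    pid_t consumer = fork();
    if (consumer == 0) {
        dup2(link[0], STDIN_FILENO);    /* stdin now drains the pipe */
        dup2(result[1], STDOUT_FILENO); /* wc's output goes to the parent */
        close(link[0]);
        close(link[1]);
        close(result[0]);
        close(result[1]);
        execlp("wc", "wc", "-c", (char *)NULL);
        _exit(127);
    }

    /* The parent must close the ends it does not use; otherwise wc
     * would never see end-of-file on its input. */
    close(link[0]);
    close(link[1]);
    close(result[1]);

    char buf[64];
    ssize_t n = read(result[0], buf, sizeof buf - 1);
    close(result[0]);
    if (producer > 0)
        waitpid(producer, NULL, 0);
    if (consumer > 0)
        waitpid(consumer, NULL, 0);
    if (n <= 0)
        return -1;
    buf[n] = '\0';
    return strtol(buf, NULL, 10);
}
```

Calling run_echo_wc("hello") should return 6, since echo emits the five letters plus a newline. Both children inherit the pipe descriptors through fork(), which is why this only works between related processes.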
A named pipe is also called a FIFO (first in, first out) and can be created with the function mkfifo().
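A minimal sketch of mkfifo() in use, assuming a writable directory for the FIFO path. The helper name fifo_roundtrip and the path passed to it are hypothetical; a fork() is used here only to keep the example in one program, since two unrelated programs could open the same path just as well.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: creates a FIFO at `path`, sends `msg` through it
 * from a child process and reads it back in the parent.
 * Returns the number of bytes read, or -1 on error. */
ssize_t fifo_roundtrip(const char *path, const char *msg,
                       char *out, size_t outsz)
{
    assert(path != NULL && msg != NULL && out != NULL && outsz > 0);

    unlink(path);                   /* drop a stale FIFO, ignore errors */
    if (mkfifo(path, 0600) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        /* Child: open() blocks until the parent opens the read end. */
        int wfd = open(path, O_WRONLY);
        if (wfd < 0)
            _exit(1);
        ssize_t w = write(wfd, msg, strlen(msg));
        close(wfd);
        _exit(w < 0 ? 1 : 0);
    }

    /* Parent: open() blocks until the child opens the write end. */
    ssize_t n = -1;
    int rfd = open(path, O_RDONLY);
    if (rfd >= 0) {
        n = read(rfd, out, outsz);
        close(rfd);
    }
    waitpid(pid, NULL, 0);
    unlink(path);                   /* the FIFO is a real directory entry */
    return n;
}
```

Unlike an unnamed pipe, the FIFO has a name in the file system, so it has to be removed with unlink() when it is no longer needed.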
Linux pipe implementation mechanism
In Linux, pipes are a very frequently used communication mechanism. A pipe is essentially a kind of file, but it differs from an ordinary file, and it overcomes two problems of using files for communication. Specifically:
Limiting the size of the pipe. A pipe is effectively a fixed-size buffer. In early Linux kernels this buffer was one page, or 4 KB; since Linux 2.6.11 the default capacity is 16 pages (65,536 bytes with 4 KB pages). Either way, its size does not grow unchecked like a file. Using a fixed buffer can also cause problems: the pipe may become full while it is being written. When this happens, subsequent write() calls to the pipe block by default, waiting for some data to be read so that there is enough space for the write() call to complete.
The reading process may work faster than the writing process. The pipe becomes empty once all of the data currently in it has been read. When this happens, a subsequent read() call blocks by default, waiting for some data to be written, which avoids the problem of read() returning end-of-file while the writer is still running.
Note: Reading data from the pipe is a one-time operation. Once the data is read, it is discarded from the pipe to free up space for writing more data.
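The consuming nature of reads, and the difference between an empty pipe and end-of-file, can be seen in a single process. The sketch below is only an illustration (the function name read_twice_then_eof is made up for this example): it writes six bytes, closes the write end, and then reads three times.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: writes "abcdef" into a new pipe, closes the write
 * end, then performs two 3-byte reads and one further read.
 * `first` and `second` must each have room for 4 bytes.
 * Returns the result of the final read(), or -1 if an earlier step failed. */
ssize_t read_twice_then_eof(char *first, char *second)
{
    assert(first != NULL && second != NULL);

    int fd[2];
    if (pipe(fd) < 0)
        return -1;
    if (write(fd[1], "abcdef", 6) != 6) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);                       /* no writer remains */

    ssize_t a = read(fd[0], first, 3);  /* takes "abc" out of the pipe */
    ssize_t b = read(fd[0], second, 3); /* "abc" is gone, so this is "def" */
    char extra;
    ssize_t c = read(fd[0], &extra, 1); /* empty and no writer: EOF */
    close(fd[0]);

    if (a != 3 || b != 3)
        return -1;
    first[3] = '\0';
    second[3] = '\0';
    return c;
}
```

The final read() returns 0 (end-of-file) only because the write end was closed first; with the write end still open, the same call would block as described above.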
1. The structure of the pipe
In Linux, pipes are not implemented with a dedicated data structure; they rely instead on the file structure of the file system and the VFS index node (inode). Two file structures point to the same temporary VFS index node, which in turn points to a physical page.
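The shared index node is visible from user space. As a small sketch (the function name pipe_ends_share_inode is an assumption for this example), fstat() on the two descriptors returned by pipe() should report a FIFO with the same device and inode number on Linux:

```c
#include <assert.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: fills `rd` and `wr` with fstat() results for the
 * read and write ends of a new pipe. Returns 1 if both ends are FIFOs
 * backed by the same inode, 0 if not, -1 on error. */
int pipe_ends_share_inode(struct stat *rd, struct stat *wr)
{
    assert(rd != NULL && wr != NULL);

    int fd[2];
    if (pipe(fd) < 0)
        return -1;

    int rc = -1;
    if (fstat(fd[0], rd) == 0 && fstat(fd[1], wr) == 0) {
        rc = S_ISFIFO(rd->st_mode) && S_ISFIFO(wr->st_mode) &&
             rd->st_dev == wr->st_dev &&
             rd->st_ino == wr->st_ino;
    }
    close(fd[0]);
    close(fd[1]);
    return rc;
}
```

The two descriptors correspond to the two file structures; the matching st_ino values correspond to the single temporary inode they both reference.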
2. Reading and writing of pipes
The source code of the pipe implementation is in fs/pipe.c. Of the many functions in pipe.c, two are the most important: the pipe read function pipe_read() and the pipe write function pipe_write(). The pipe write function writes data by copying bytes into the physical memory pointed to by the VFS index node, while the pipe read function reads data by copying bytes out of that memory. Of course, the kernel must use some mechanism to synchronize access to the pipe. To do this, the kernel uses locks, wait queues, and signals.
When the writing process writes to the pipe, it uses the standard library function write(). The system finds the file structure of the file from the file descriptor passed by the library function. The file structure specifies the address of the function that performs write operations (that is, the write function), so the kernel calls this function to complete the write. Before the write function writes data to memory, it must first check the information in the VFS index node. The actual memory copy can be performed only when both of the following conditions are met:
There is enough space in memory to hold all the data to be written;
The memory is not locked by the reader.
If both conditions are met, the write function first locks the memory and then copies the data from the address space of the writing process into it. Otherwise, the writing process sleeps in the wait queue of the VFS index node; the kernel then calls the scheduler, which chooses another process to run. The writing process is in an interruptible wait state. When there is enough space in memory to hold the data, or when the memory is unlocked, the reading process wakes the writing process, which then receives the signal. After the data has been written into memory, the memory is unlocked and all reading processes sleeping on the index node are woken up.
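The write path described above can be modelled in user space. The following is a deliberately simplified, single-threaded sketch and not kernel code: struct model_pipe, model_pipe_write() and model_pipe_read() are invented for this illustration, the buffer is one 4096-byte page as in the description, and "sleeping" is represented by returning 0 instead of blocking.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PIPE_SIZE 4096    /* one page, as in the description */

/* Invented model of the pipe's page plus its bookkeeping. */
struct model_pipe {
    char   data[MODEL_PIPE_SIZE];
    size_t start;   /* index of the oldest unread byte */
    size_t len;     /* number of bytes currently stored */
    int    locked;  /* nonzero while the other side holds the page */
};

/* Returns 1 if the data was copied, 0 where the kernel would instead put
 * the writer to sleep on the wait queue. */
int model_pipe_write(struct model_pipe *p, const char *src, size_t n)
{
    assert(p != NULL && src != NULL);
    if (p->locked || n > MODEL_PIPE_SIZE - p->len)
        return 0;               /* not enough room, or page is locked */

    p->locked = 1;              /* lock the memory, then copy */
    for (size_t i = 0; i < n; i++)
        p->data[(p->start + p->len + i) % MODEL_PIPE_SIZE] = src[i];
    p->len += n;
    p->locked = 0;              /* the kernel would wake readers here */
    return 1;
}

/* Copies out up to n bytes and removes them from the buffer.
 * Returns the byte count; 0 means the reader would have to wait. */
size_t model_pipe_read(struct model_pipe *p, char *dst, size_t n)
{
    assert(p != NULL && dst != NULL);
    if (p->locked || p->len == 0)
        return 0;
    if (n > p->len)
        n = p->len;

    p->locked = 1;
    for (size_t i = 0; i < n; i++)
        dst[i] = p->data[(p->start + i) % MODEL_PIPE_SIZE];
    p->start = (p->start + n) % MODEL_PIPE_SIZE;
    p->len -= n;
    p->locked = 0;              /* the kernel would wake writers here */
    return n;
}
```

In this model a write that does not fit is refused as a whole, mirroring the "enough space for all the data" condition; a real kernel adds wait queues, signal handling and support for writes larger than the buffer.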
Reading from the pipe works much like writing. However, depending on the open mode of the file or pipe, a process can return an error immediately when there is no data or when the memory is locked, instead of blocking. Otherwise, the process sleeps in the wait queue of the index node and waits for a writing process to write data. When all processes have finished with the pipe, the pipe's inode is discarded and the shared data pages are released.
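The "return an error immediately" mode corresponds to the O_NONBLOCK flag. A short sketch, with a function name chosen only for this example, that reads from an empty pipe whose read end has been switched to non-blocking mode:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: reads one byte from an empty pipe whose read end
 * is O_NONBLOCK while the write end is still open. Stores the resulting
 * errno in *err. Returns read()'s return value, or -2 if setup failed. */
ssize_t nonblocking_read_on_empty_pipe(int *err)
{
    assert(err != NULL);

    int fd[2];
    if (pipe(fd) < 0)
        return -2;

    int flags = fcntl(fd[0], F_GETFL);
    if (flags < 0 || fcntl(fd[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd[0]);
        close(fd[1]);
        return -2;
    }

    char c;
    errno = 0;
    ssize_t n = read(fd[0], &c, 1);     /* no data, but a writer exists */
    *err = errno;

    close(fd[0]);
    close(fd[1]);
    return n;
}
```

Here read() returns -1 with errno set to EAGAIN rather than putting the process to sleep; without O_NONBLOCK the same call would block until data arrived.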
Because the pipe implementation involves many file operations, the code in pipe.c is not difficult to understand once you have finished studying the file system.
Linux pipes are simpler to create and use than Windows pipes, simply because they require fewer parameters. To achieve the same pipe creation goals as on Windows, Linux and UNIX use the following code snippet:
Creating a Linux pipe
    int fd1[2];
    if (pipe(fd1)) {    /* pipe() returns nonzero on failure */
        printf("pipe() FAILED: errno=%d", errno);
        return 1;
    }
Linux pipes limit how much a write operation can store before it blocks. In early kernels the kernel-level buffer used for each pipe was exactly 4096 bytes (the default capacity has been 65,536 bytes since Linux 2.6.11). A write larger than the free space in the buffer blocks until the reader drains the pipe. In practice this is not a limitation, since reading and writing are performed in different threads.
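The capacity can be checked on a given system rather than assumed. The sketch below, whose function name pipe_capacity is made up for this example, asks the kernel with the Linux-specific fcntl() command F_GETPIPE_SZ and also counts how many bytes a non-blocking writer can store before write() fails with EAGAIN. No particular result is promised, because the value depends on the kernel version and system limits.

```c
#define _GNU_SOURCE             /* exposes the Linux-specific F_GETPIPE_SZ */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: measures a new pipe in two ways.
 *   *reported: value from fcntl(F_GETPIPE_SZ), or -1 if unavailable
 *   *accepted: bytes a non-blocking writer stored before EAGAIN
 * Returns 0 on success, -1 on error. */
int pipe_capacity(long *reported, long *accepted)
{
    assert(reported != NULL && accepted != NULL);

    int fd[2];
    if (pipe(fd) < 0)
        return -1;

#ifdef F_GETPIPE_SZ
    *reported = fcntl(fd[1], F_GETPIPE_SZ);
#else
    *reported = -1;
#endif

    int flags = fcntl(fd[1], F_GETFL);
    if (flags < 0 || fcntl(fd[1], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    static const char chunk[512];       /* zero-filled payload */
    long total = 0;
    int full = 0;
    for (;;) {
        ssize_t n = write(fd[1], chunk, sizeof chunk);
        if (n > 0) {
            total += n;
        } else if (n < 0 && errno == EINTR) {
            continue;                   /* interrupted, try again */
        } else {
            /* EAGAIN means the pipe is full and nobody is reading. */
            full = (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));
            break;
        }
    }

    close(fd[0]);
    close(fd[1]);
    *accepted = total;
    return full ? 0 : -1;
}
```

A blocking writer would simply stop at the point where this loop sees EAGAIN, which is why a second thread or process must read from the pipe for large transfers to make progress.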
Linux also supports named pipes. An early commentator on these numbers suggested that I should, for the sake of fairness, compare Linux's named pipes to Windows' named pipes. I wrote another program that uses named pipes on Linux. I found that there was no difference in the results for named and unnamed pipes on Linux.
Linux pipes are much faster than Windows 2000 named pipes, and Windows 2000 named pipes are much faster than Windows XP named pipes.
Example:
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main()
    {
        int n, fd[2];   // fd is the array of file descriptors for the pipe: fd[0] reads, fd[1] writes
        pid_t pid;
        char line[100];

        if (pipe(fd) < 0) {             // create the pipe
            printf("pipe create error\n");
            exit(1);
        }
        if ((pid = fork()) < 0) {       // create a new process with fork()
            printf("fork error\n");
            exit(1);
        } else if (pid > 0) {           // parent: close the read end, then write "hello world" to the write end
            close(fd[0]);
            write(fd[1], "hello world\n", 12);
        } else {                        // child: close the write end, then read the data from the read end
            close(fd[1]);
            n = read(fd[0], line, 100);
            if (n > 0)
                write(STDOUT_FILENO, line, n);
        }
        exit(0);
    }