From beginner to proficient, learn Linux redirection and pipeline tools to speed up your workflow!

PHPz

Feb 09, 2024 pm 11:36 PM

Improving efficiency, tuning the operating system, and automating work are goals every IT practitioner pursues. On Linux, fluent use of redirection and pipes on the command line is an essential skill. This article explains how redirection and pipes are used and how they work, with examples.

I like Linux a lot, and some of its design is genuinely elegant. For example, a complex problem can be broken into several small ones and solved flexibly with existing tools, combined through pipes and redirection. Written up as a shell script, this is very efficient.

This article shares some pitfalls I have run into when using redirection and pipes in practice. Understanding a few of the underlying principles makes script writing much more efficient.

Pitfalls of the > and >> redirection operators

Let's start with the first question: what happens if we run the following command?

$ cat file.txt > file.txt

Reading a file and writing it back to itself looks like it should change nothing, right?

In fact, running the command above empties file.txt.

PS: Some Linux distributions may report an error right away. You can try cat < file.txt > file.txt to get around that check.

As described in the earlier article on Linux processes and file descriptors, a program does not need to care where its standard input and output point. It is the shell that changes where they point, through pipes and redirection operators.

So when we run cat file.txt > file.txt, the shell first opens file.txt. Because the redirection operator is >, opening the file empties it. The shell then sets the standard output of cat to file.txt, and only then does cat start running.

That is the following process:

1. The shell opens file.txt and empties it.
2. The shell points the standard output of cat at file.txt.
3. The shell runs cat, which reads an empty file.
4. cat writes nothing to its standard output (file.txt).

So, the final result is that file.txt becomes an empty file.
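
The four steps above can be checked with a small experiment. This is a minimal sketch that works in a throwaway directory created with mktemp; the variable names (workdir, size_before, size_after) are illustrative and not from the original article.

```shell
# Work in a throwaway directory so nothing real is overwritten.
workdir=$(mktemp -d)
printf 'hello world\n' > "$workdir/file.txt"
size_before=$(wc -c < "$workdir/file.txt" | tr -d ' ')

# The shell truncates file.txt while setting up the redirection,
# before cat even starts, so cat finds nothing to read.
# (|| true: some cat implementations may complain about the shared file.)
cat "$workdir/file.txt" > "$workdir/file.txt" || true
size_after=$(wc -c < "$workdir/file.txt" | tr -d ' ')

echo "before: $size_before bytes, after: $size_after bytes"
rm -rf "$workdir"
```

The file is already empty by the time cat opens it, which is why the size afterwards is zero regardless of what the file held before.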

We know that > will clear the target file, and >> will append content to the end of the target file, so what will happen if the redirection symbol > is changed to >>?

$ echo hello world > file.txt # the file contains a single line
$ cat file.txt >> file.txt # this command loops forever

We first write one line to file.txt. After running cat file.txt >> file.txt, we would expect the file to contain two lines.

Unfortunately, that is not what happens. The command keeps appending hello world to file.txt in an infinite loop, the file quickly grows very large, and the only way to stop it is Ctrl+C.

Why is there an infinite loop? With a little analysis you can work out the reason.

First, recall how cat behaves. If you run cat with no arguments, it reads keyboard input from the terminal and echoes it back every time you press Enter. In other words, cat repeatedly reads whatever input is available and writes it out, and it stops only when it reaches end of file.

The command cat file.txt >> file.txt therefore runs as follows:

1. The shell opens file.txt, ready to append to the end of the file.
2. The shell points the standard output of cat at file.txt.
3. cat reads a line from file.txt and writes it to standard output, which appends it to file.txt.
4. Because a line has just been appended, cat finds there is still unread content in file.txt and repeats step 3.

This is like iterating over a list while appending to it: the iteration never finishes, so the command loops forever.
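
The read-then-append cycle can be reproduced safely with a bounded loop. This sketch does not run cat itself: a shell while/read loop stands in for it, and the cap of 5 iterations is an assumption added here so the experiment terminates.

```shell
workdir=$(mktemp -d)
file="$workdir/file.txt"
printf 'hello world\n' > "$file"

appended=0
# Read the file line by line while appending every line back to it.
# Each append gives the reader one more line, so end of file is never reached;
# the cap of 5 iterations is the only thing that stops the loop.
while IFS= read -r line; do
  printf '%s\n' "$line" >> "$file"
  appended=$((appended + 1))
  if [ "$appended" -ge 5 ]; then
    break
  fi
done < "$file"

line_count=$(wc -l < "$file" | tr -d ' ')
echo "appended $appended lines, file now has $line_count lines"
rm -rf "$workdir"
```

A file that started with one line ends up with six, and removing the cap would let it grow without limit.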

Using the > redirection operator together with the | pipe

A common requirement is to keep the first N lines of a file and delete the rest.

On Linux, the head command extracts the first few lines of a file:

$ cat file.txt # file.txt contains five lines
1
2
3
4
5
$ head -n 2 file.txt # head reads the first two lines
1
2
$ cat file.txt | head -n 2 # head can also read from standard input
1
2

If we want to keep the first 2 lines of the file and delete the rest, we might try the following command:

$ head -n 2 file.txt > file.txt

But this repeats the mistake described above: file.txt ends up empty, which is not what we want.

Can we avoid the pitfall by writing the command like this?

$ cat file.txt | head -n 2 > file.txt

The answer is no: the file still ends up empty.

What? Is the pipe leaking, so that all the data got lost?

As I explained in the previous article, Linux processes and file descriptors, a pipe essentially connects the standard output of one command to the standard input of the next.

If you expected this command to work, you probably assumed that commands connected by a pipe run one after another. That is a common mistake: commands connected by a pipe actually run in parallel.

You might picture the shell first running cat file.txt, which reads all of file.txt normally, and then passing that content through the pipe to head -n 2 > file.txt.

By then file.txt has been emptied, but head reads from the pipe rather than from the file, so it should still be able to write two lines to file.txt correctly.

In reality, that understanding is wrong. The shell runs the commands connected by a pipe in parallel. For example, run the following command:

$ sleep 5 | sleep 5

The shell starts both sleep processes at the same time, so the command finishes after 5 seconds, not 10.
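
The timing is easy to measure. This sketch uses 2-second sleeps instead of 5 to keep the run short, and it assumes date +%s (seconds since the epoch) is available, which holds on common Linux and macOS systems.

```shell
start=$(date +%s)
# Both sleep processes are started together; the pipeline ends when both exit.
sleep 2 | sleep 2
end=$(date +%s)
elapsed=$((end - start))
echo "pipeline took about ${elapsed}s"
```

If the two commands ran one after the other, the elapsed time would be at least 4 seconds. With whole-second timestamps the measured value can be off by one, but it stays well below that.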

This is somewhat counterintuitive. Take this common command, for example:

$ cat filename | grep 'pattern'

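Intuitively, it seems that cat runs first and reads the entire contents of filename in one go, then hands them to grep to search.

In reality, cat and grep run at the same time. We get the expected result because grep 'pattern' blocks while waiting on its standard input, and cat writes data to grep's standard input through a Linux pipe.

Run the following command to see for yourself that cat and grep execute at the same time, with grep processing our keyboard input as it arrives: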

$ cat | grep 'pattern'
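
The same behavior can be observed without typing at the keyboard. In this sketch a command group with a pause plays the role of the slow producer and a while/read loop stands in for grep; both stand-ins, and the names arrivals, first_at and second_at, are assumptions made for illustration.

```shell
start=$(date +%s)
# The producer emits one line, pauses, then emits another.
# The consumer prints each line with the seconds elapsed when it arrived.
arrivals=$(
  { echo first; sleep 2; echo second; } |
    while IFS= read -r line; do
      echo "$line $(( $(date +%s) - start ))"
    done
)
echo "$arrivals"
first_at=$(echo "$arrivals" | awk 'NR == 1 { print $2 }')
second_at=$(echo "$arrivals" | awk 'NR == 2 { print $2 }')
```

The first line is handled about two seconds before the second, which shows the consumer is already running while the producer is still working.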

With all that said, let's return to the original question:

$ cat file.txt | head -n 2 > file.txt

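cat and head run in parallel, so which one goes first is not determined, and neither is the result.

If head runs before cat, file.txt is emptied first and cat has nothing to read. Conversely, if cat reads the file's contents first, we get the expected result.

However, in my experiment (repeating this race 10,000 times), the bad outcome of file.txt being emptied turned out to be far more likely than the expected result. I don't yet know why; it is presumably related to how the Linux kernel implements processes and pipes.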
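
A scaled-down version of that experiment might look like the sketch below. It uses 200 runs rather than 10,000, works in a temporary directory, and the counter names are illustrative. The counts depend on the machine and its load, so no particular split is claimed. One plausible factor, stated here as an assumption rather than a verified cause, is that the shell applies > before head is even executed, whereas cat opens the file only after its own program has been loaded.

```shell
workdir=$(mktemp -d)
file="$workdir/file.txt"
runs=200
emptied=0
kept=0
i=0
while [ "$i" -lt "$runs" ]; do
  printf '1\n2\n3\n4\n5\n' > "$file"
  # cat and head start concurrently; the > truncation races with cat's read.
  cat "$file" | head -n 2 > "$file"
  if [ -s "$file" ]; then
    kept=$((kept + 1))        # cat won the race: two lines survived
  else
    emptied=$((emptied + 1))  # truncation won the race: file is empty
  fi
  i=$((i + 1))
done
echo "emptied: $emptied, kept first two lines: $kept (out of $runs)"
rm -rf "$workdir"
```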

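Solution

After all this talk about how pipes and redirection behave, how do we avoid the pitfall of emptying the file?

The most reliable approach is to never read and write the same file at the same time, and to go through a temporary file instead.

For example, to keep only the first two lines of file.txt, you can write: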

# Write the data to a temporary file first, then overwrite the original file
$ cat file.txt | head -n 2 > temp.txt && mv temp.txt file.txt

This is the simplest, most reliable, foolproof method.
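
For use in scripts, the temporary-file approach can be wrapped in a function. This is a sketch under stated assumptions: keep_first_lines is a hypothetical helper name, and mktemp is used to pick a unique temporary name instead of the fixed temp.txt, so that an existing file of that name is not clobbered.

```shell
# keep_first_lines N FILE (hypothetical helper name, not from the article)
keep_first_lines() {
  n=$1
  target=$2
  # Put the temporary file next to the target so that mv is a rename
  # within one filesystem.
  tmp=$(mktemp "$target.XXXXXX") || return 1
  if head -n "$n" "$target" > "$tmp"; then
    mv "$tmp" "$target"
  else
    # Leave the original untouched if head failed.
    rm -f "$tmp"
    return 1
  fi
}

workdir=$(mktemp -d)
printf '1\n2\n3\n4\n5\n' > "$workdir/file.txt"
keep_first_lines 2 "$workdir/file.txt"
result=$(cat "$workdir/file.txt")
echo "$result"
rm -rf "$workdir"
```

Note that the replaced file takes on the permissions of the temporary file, which mktemp typically creates as readable and writable by the owner only.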

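If you find that command too long, you can install the moreutils package with a package manager such as apt, brew, or yum. It provides a sponge command, used like this: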

# Pass the data to sponge first, then let sponge write it to the original file
$ cat file.txt | head -n 2 | sponge file.txt

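The name sponge is quite apt: it first "soaks up" all the input, and only then writes it to file.txt. The core idea is similar to using a temporary file. The "sponge" acts like a temporary file, which avoids opening the same file for reading and writing at the same time.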
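
Where moreutils is not installed, the soak-then-write idea can be approximated in plain shell. The function below is a hypothetical stand-in named soak; it illustrates the idea only and is not how sponge itself is implemented.

```shell
# soak FILE: hypothetical stand-in for sponge, not its real implementation.
soak() {
  target=$1
  buffer=$(mktemp) || return 1
  # cat returns only at end of input, i.e. after every upstream command
  # has finished writing and closed its end of the pipe.
  cat > "$buffer"
  # Only now is the target opened (and truncated) for writing.
  cat "$buffer" > "$target"
  rm -f "$buffer"
}

workdir=$(mktemp -d)
printf '1\n2\n3\n4\n5\n' > "$workdir/file.txt"
cat "$workdir/file.txt" | head -n 2 | soak "$workdir/file.txt"
soaked=$(cat "$workdir/file.txt")
echo "$soaked"
rm -rf "$workdir"
```

The key difference from a plain > redirection is timing: the target file is not opened until all input has arrived, so the upstream commands have already read what they need.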

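On Linux, redirection and pipes are very useful command-line tools that give us a better handle on the system's state and information. Mastering them helps with system tuning and automation, and so with working more efficiently. Hopefully this article has given you a deeper understanding of how redirection and pipes work and how to use them.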
