


Cutting and merging large files under Linux
往往是因为网络传输的限制,导致很多时候,我们需要在 Linux 系统下进行大文件的切割。这样将一个大文件切割成为多个小文件,进行传输,传输完毕之后进行合并即可。

文件切割 - split
在 Linux 系统下使用 split 命令进行大文件切割很方便
搜索公众号GitHub猿后台回复“打飞机”,获取一份惊喜礼包。
命令语法
-a: #指定输出文件名的后缀长度(默认为2个:aa,ab...) -d: #指定输出文件名的后缀用数字代替 -l: #行数分割模式(指定每多少行切成一个小文件;默认行数是1000行) -b: #二进制分割模式(支持单位:k/m) -C: #文件大小分割模式(切割时尽量维持每行的完整性) split [-a] [-d] [-l <行数>] [-b <字节>] [-C <字节>] [要切割的文件] [输出文件名]
使用实例
# 行切割文件 $ split -l 300000 users.sql /data/users_ # 使用数字后缀 $ split -d -l 300000 users.sql /data/users_ # 按字节大小分割 $ split -d -b 100m users.sql /data/users_ ```bash **帮助信息** ```bash # 帮助信息 $ split --help Usage: split [OPTION]... [FILE [PREFIX]] Output pieces of FILE to PREFIXaa, PREFIXab, ...; default size is 1000 lines, and default PREFIX is 'x'. With no FILE, or when FILE is -, read standard input. Mandatory arguments to long options are mandatory for short options too. -a, --suffix-length=N generate suffixes of length N (default 2) 后缀名称的长度(默认为2) --additional-suffix=SUFFIX append an additional SUFFIX to file names -b, --bytes=SIZE put SIZE bytes per output file 每个输出文件的字节大小 -C, --line-bytes=SIZE put at most SIZE bytes of records per output file 每个输出文件的最大字节大小 -d use numeric suffixes starting at 0, not alphabetic 使用数字后缀代替字母后缀 --numeric-suffixes[=FROM] same as -d, but allow setting the start value -e, --elide-empty-files do not generate empty output files with '-n' 不产生空的输出文件 --filter=COMMAND write to shell COMMAND; file name is $FILE 写入到shell命令行 -l, --lines=NUMBER put NUMBER lines/records per output file 设定每个输出文件的行数 -n, --number=CHUNKS generate CHUNKS output files; see explanation below 产生chunks文件 -t, --separator=SEP use SEP instead of newline as the record separator; 使用新字符分割 '\0' (zero) specifies the NUL character -u, --unbuffered immediately copy input to output with '-n r/...' 无需缓存 --verbose print a diagnostic just before each 显示分割进度 output file is opened --help display this help and exit 显示帮助信息 --version output version information and exit 显示版本信息 The SIZE argument is an integer and optional unit (example: 10K is 10*1024). Units are K,M,G,T,P,E,Z,Y (powers of 1024) or KB,MB,... (powers of 1000). CHUNKS may be: N split into N files based on size of input K/N output Kth of N to stdout l/N split into N files without splitting lines/records l/K/N output Kth of N to stdout without splitting lines/records r/N like 'l' but use round robin distribution r/K/N likewise but only output Kth of N to stdout GNU coreutils online help: <http://www.gnu.org/software/coreutils/> Full documentation at: <http://www.gnu.org/software/coreutils/split> or available locally via: info '(coreutils) split invocation'
文件合并 - cat
在 Linux 系统下使用 cat 命令进行多个小文件的合并也很方便
命令语法
-n: #显示行号 -e: #以$字符作为每行的结尾 -t: #显示TAB字符(^I) cat [-n] [-e] [-t] [输出文件名]
使用实例
# 合并文件 $ cat /data/users_* > users.sql 帮助信息 # 帮助信息 $ cat --h Usage: cat [OPTION]... [FILE]... Concatenate FILE(s) to standard output. With no FILE, or when FILE is -, read standard input. -A, --show-all equivalent to -vET -b, --number-nonblank number nonempty output lines, overrides -n -e equivalent to -vE -E, --show-ends display $ at end of each line -n, --number number all output lines -s, --squeeze-blank suppress repeated empty output lines -t equivalent to -vT -T, --show-tabs display TAB characters as ^I -u (ignored) -v, --show-nonprinting use ^ and M- notation, except for LFD and TAB --help display this help and exit --version output version information and exit Examples: cat f - g Output f's contents, then standard input, then g's contents. cat Copy standard input to standard output. GNU coreutils online help: <http://www.gnu.org/software/coreutils/> Full documentation at: <http://www.gnu.org/software/coreutils/cat> or available locally via: info '(coreutils) cat invocation'
The above is the detailed content of Cutting and merging large files under Linux. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics



The steps to start Apache are as follows: Install Apache (command: sudo apt-get install apache2 or download it from the official website) Start Apache (Linux: sudo systemctl start apache2; Windows: Right-click the "Apache2.4" service and select "Start") Check whether it has been started (Linux: sudo systemctl status apache2; Windows: Check the status of the "Apache2.4" service in the service manager) Enable boot automatically (optional, Linux: sudo systemctl

When the Apache 80 port is occupied, the solution is as follows: find out the process that occupies the port and close it. Check the firewall settings to make sure Apache is not blocked. If the above method does not work, please reconfigure Apache to use a different port. Restart the Apache service.

In Debian systems, readdir system calls are used to read directory contents. If its performance is not good, try the following optimization strategy: Simplify the number of directory files: Split large directories into multiple small directories as much as possible, reducing the number of items processed per readdir call. Enable directory content caching: build a cache mechanism, update the cache regularly or when directory content changes, and reduce frequent calls to readdir. Memory caches (such as Memcached or Redis) or local caches (such as files or databases) can be considered. Adopt efficient data structure: If you implement directory traversal by yourself, select more efficient data structures (such as hash tables instead of linear search) to store and access directory information

To restart the Apache server, follow these steps: Linux/macOS: Run sudo systemctl restart apache2. Windows: Run net stop Apache2.4 and then net start Apache2.4. Run netstat -a | findstr 80 to check the server status.

This guide will guide you to learn how to use Syslog in Debian systems. Syslog is a key service in Linux systems for logging system and application log messages. It helps administrators monitor and analyze system activity to quickly identify and resolve problems. 1. Basic knowledge of Syslog The core functions of Syslog include: centrally collecting and managing log messages; supporting multiple log output formats and target locations (such as files or networks); providing real-time log viewing and filtering functions. 2. Install and configure Syslog (using Rsyslog) The Debian system uses Rsyslog by default. You can install it with the following command: sudoaptupdatesud

Apache cannot start because the following reasons may be: Configuration file syntax error. Conflict with other application ports. Permissions issue. Out of memory. Process deadlock. Daemon failure. SELinux permissions issues. Firewall problem. Software conflict.

The Internet does not rely on a single operating system, but Linux plays an important role in it. Linux is widely used in servers and network devices and is popular for its stability, security and scalability.

Steps to fix the Apache vulnerability include: 1. Determine the affected version; 2. Apply security updates; 3. Restart Apache; 4. Verify the fix; 5. Enable security features.
