HDFS File Commands

Jun 07, 2016, 04:41 PM

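The HDFS shell is modeled on the Linux file commands, so anyone familiar with those will pick it up quickly. Note that Hadoop DFS has no notion of a working directory (there is no pwd), so paths are normally given in full; a relative path is resolved against the user's home directory, /user/<username> by default. (This article is based on Hadoop 2.5, CDH 5.2.1.)
The commands below list the available commands, their usage and their help text, and show how to select a namenode other than the one set in the configuration files.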

hdfs dfs -usage
hdfs dfs -usage ls
hdfs dfs -help
-fs <local|namenode:port>      specify a namenode
hdfs dfs -fs hdfs://test1:9000 -ls /
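Because there is no working directory, it can help to see how a relative path is interpreted. The following is a minimal sketch of that rule, assuming the default home directory prefix /user; `hdfs_abspath` and `HDFS_USER` are hypothetical names used only for illustration and are not part of Hadoop.

```shell
# Resolve a path the way the HDFS shell treats it by default:
# absolute paths are kept, relative ones go under /user/<username>.
# HDFS_USER is a hypothetical override; it falls back to the local user name.
hdfs_abspath() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;
        *)  printf '/user/%s/%s\n' "${HDFS_USER:-$(id -un)}" "$1" ;;
    esac
}

HDFS_USER=spark
hdfs_abspath hello.txt       # prints /user/spark/hello.txt
hdfs_abspath /tmp/hdfs.txt   # prints /tmp/hdfs.txt
```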

——————————————————————————–
-df [-h] [path …] :
Shows the capacity, free and used space of the filesystem. If the filesystem has
multiple partitions, and no path to a particular partition is specified, then
the status of the root partitions will be shown.

$ hdfs dfs -df
Filesystem                 Size   Used     Available  Use%
hdfs://test1:9000  413544071168  98304  345612906496    0%
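Size, Used and Available do not have to add up; the remainder is typically space on the datanode volumes that HDFS neither uses nor can allocate. A small sketch that derives that remainder from the `-df` output (without -h); the function name `df_unaccounted` is hypothetical.

```shell
# Read `hdfs dfs -df` output on stdin and print, per filesystem,
# the bytes that are neither used by HDFS nor available to it.
# %.0f is used instead of %d so large byte counts are not truncated.
df_unaccounted() {
    awk 'NR > 1 { printf "%s %.0f\n", $1, $2 - $3 - $4 }'
}

# Sample taken from the output above
df_unaccounted <<'EOF'
Filesystem                 Size   Used     Available  Use%
hdfs://test1:9000  413544071168  98304  345612906496    0%
EOF
# prints: hdfs://test1:9000 67931066368
```

On a live cluster the same function could be fed with `hdfs dfs -df | df_unaccounted`.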

——————————————————————————–
-mkdir [-p] path … :
Create a directory in specified location.

-p Do not fail if the directory already exists

-rmdir dir … :
Removes the directory entry specified by each directory argument, provided it is
empty.

hdfs dfs -mkdir /tmp
hdfs dfs -mkdir /tmp/txt
hdfs dfs -rmdir /tmp/txt
hdfs dfs -mkdir -p /tmp/txt/hello

——————————————————————————–
-copyFromLocal [-f] [-p] localsrc … dst :
Identical to the -put command.

-copyToLocal [-p] [-ignoreCrc] [-crc] src … localdst :
Identical to the -get command.

-moveFromLocal localsrc … dst :
Same as -put, except that the source is deleted after it’s copied.

-put [-f] [-p] localsrc … dst :
Copy files from the local file system into fs. Copying fails if the file already
exists, unless the -f flag is given. Passing -p preserves access and
modification times, ownership and the mode. Passing -f overwrites the
destination if it already exists.

-get [-p] [-ignoreCrc] [-crc] src … localdst :
Copy files that match the file pattern src to the local name. src is kept.
When copying multiple files, the destination must be a directory. Passing -p
preserves access and modification times, ownership and the mode.

-getmerge [-nl] src localdst :
Get all the files in the directories that match the source file pattern and
merge and sort them to only one file on local fs. src is kept.

-nl Add a newline character at the end of each file.

-cat [-ignoreCrc] src … :
Fetch all files that match the file pattern src and display their content on
stdout.

# Create two small local files and a 1 GB file of zeros
echo "Hello, Hadoop" > hadoop.txt
echo "Hello, HDFS" > hdfs.txt
dd if=/dev/zero of=/tmp/test.zero bs=1M count=1024
    1024+0 records in
    1024+0 records out
    1073741824 bytes (1.1 GB) copied, 0.93978 s, 1.1 GB/s
# Upload them (-moveFromLocal deletes the local copy afterwards)
hdfs dfs -moveFromLocal /tmp/test.zero /tmp
hdfs dfs -put *.txt /tmp
# Wildcards: ? * {} []
hdfs dfs -cat /tmp/*.txt
    Hello, Hadoop
    Hello, HDFS
hdfs dfs -cat /tmp/h?fs.txt
    Hello, HDFS
hdfs dfs -cat /tmp/h{a,d}*.txt
    Hello, Hadoop
    Hello, HDFS
hdfs dfs -cat /tmp/h[a-d]*.txt
    Hello, Hadoop
    Hello, HDFS
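The `?`, `*` and `[]` wildcards behave much like the patterns of a local shell `case` statement, which makes it easy to check a pattern before running it against the cluster. The sketch below only illustrates the matching rules on the three file names used above; `glob_match` is a hypothetical helper, and HDFS performs its own matching (including `{}` alternation, which `case` does not support).

```shell
# Return 0 if file name $2 matches glob pattern $1.
# The unquoted $1 is interpreted as a pattern by `case`.
glob_match() {
    case "$2" in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

for name in hadoop.txt hdfs.txt test.zero; do
    for pat in '*.txt' 'h?fs.txt' 'h[a-d]*.txt'; do
        if glob_match "$pat" "$name"; then
            echo "$pat matches $name"
        fi
    done
done
```

When a pattern could also match local files, quoting it (for example `hdfs dfs -cat '/tmp/*.txt'`) keeps the local shell from expanding it first.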

——————————————————————————–
-ls [-d] [-h] [-R] [path …] :
List the contents that match the specified file pattern. If path is not
specified, the contents of /user/currentUser will be listed. Directory entries
are of the form:
permissions - userId groupId sizeOfDirectory(in bytes)
modificationDate(yyyy-MM-dd HH:mm) directoryName

and file entries are of the form:
permissions numberOfReplicas userId groupId sizeOfFile(in bytes)
modificationDate(yyyy-MM-dd HH:mm) fileName

-d Directories are listed as plain files.
-h Formats the sizes of files in a human-readable fashion rather than a number
of bytes.
-R Recursively list the contents of directories.

hdfs dfs -ls /tmp
hdfs dfs -ls -d /tmp
hdfs dfs -ls -h /tmp
  Found 4 items
  -rw-r--r--   3 hdfs supergroup         14 2014-12-18 10:00 /tmp/hadoop.txt
  -rw-r--r--   3 hdfs supergroup         12 2014-12-18 10:00 /tmp/hdfs.txt
  -rw-r--r--   3 hdfs supergroup        1 G 2014-12-18 10:19 /tmp/test.zero
  drwxr-xr-x   - hdfs supergroup          0 2014-12-18 10:07 /tmp/txt
hdfs dfs -ls -R -h /tmp
  -rw-r--r--   3 hdfs supergroup         14 2014-12-18 10:00 /tmp/hadoop.txt
  -rw-r--r--   3 hdfs supergroup         12 2014-12-18 10:00 /tmp/hdfs.txt
  -rw-r--r--   3 hdfs supergroup        1 G 2014-12-18 10:19 /tmp/test.zero
  drwxr-xr-x   - hdfs supergroup          0 2014-12-18 10:07 /tmp/txt
  drwxr-xr-x   - hdfs supergroup          0 2014-12-18 10:07 /tmp/txt/hello
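Since the first column distinguishes directories from files, the listing is easy to filter in a script. A minimal sketch, assuming paths without spaces; `ls_dirs` is a hypothetical helper name.

```shell
# Read `hdfs dfs -ls` output on stdin and print only directory paths.
# Directory entries start with "d"; the path is the last field.
ls_dirs() {
    awk '$1 ~ /^d/ { print $NF }'
}

# Sample taken from the listing above
ls_dirs <<'EOF'
Found 4 items
-rw-r--r--   3 hdfs supergroup         14 2014-12-18 10:00 /tmp/hadoop.txt
-rw-r--r--   3 hdfs supergroup         12 2014-12-18 10:00 /tmp/hdfs.txt
-rw-r--r--   3 hdfs supergroup        1 G 2014-12-18 10:19 /tmp/test.zero
drwxr-xr-x   - hdfs supergroup          0 2014-12-18 10:07 /tmp/txt
EOF
# prints: /tmp/txt
```

Against a cluster this would be used as `hdfs dfs -ls /tmp | ls_dirs`.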

——————————————————————————–
-checksum src … :
Dump checksum information for files that match the file pattern src to stdout.
Note that this requires a round-trip to a datanode storing each block of the
file, and thus is not efficient to run on a large number of files. The checksum
of a file depends on its content, block size and the checksum algorithm and
parameters used for creating the file.

hdfs dfs -checksum /tmp/test.zero
  /tmp/test.zero	MD5-of-262144MD5-of-512CRC32C	000002000000000000040000f960570129a4ef3a7e179073adceae97
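One use of `-checksum` is to compare two HDFS files without downloading them. The sketch below compares the algorithm and checksum columns; `same_checksum` is a hypothetical helper. As the help text notes, the checksum also depends on block size and checksum parameters, so a mismatch does not by itself prove that the contents differ.

```shell
# Return 0 if two HDFS files report the same checksum algorithm and value.
# Output columns of -checksum: path, algorithm, checksum.
same_checksum() {
    sc_a=$(hdfs dfs -checksum "$1" | awk '{ print $2, $3 }')
    sc_b=$(hdfs dfs -checksum "$2" | awk '{ print $2, $3 }')
    [ -n "$sc_a" ] && [ "$sc_a" = "$sc_b" ]
}
```

A possible call would be `same_checksum /tmp/hdfs.txt /tmp/hdfs.txt.bak && echo identical`, where the second path is only an example.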

——————————————————————————–
-appendToFile localsrc … dst :
Appends the contents of all the given local files to the given dst file. The dst
file will be created if it does not exist. If localSrc is -, then the input is
read from stdin.

hdfs dfs -appendToFile *.txt hello.txt
hdfs dfs -cat hello.txt
  Hello, Hadoop
  Hello, HDFS

——————————————————————————–
-tail [-f] file :
Show the last 1KB of the file.

-f Shows appended data as the file grows.

hdfs dfs -tail -f hello.txt
# waits for new output; press Ctrl+C to stop
# in another terminal:
hdfs dfs -appendToFile - hello.txt
# then type some text (Ctrl+D ends the input)

——————————————————————————–
-cp [-f] [-p | -p[topax]] src … dst :
Copy files that match the file pattern src to a destination. When copying
multiple files, the destination must be a directory. Passing -p preserves status
[topax] (timestamps, ownership, permission, ACLs, XAttr). If -p is specified
with no arg, then preserves timestamps, ownership, permission. If -pa is
specified, then preserves permission also because ACL is a super-set of
permission. Passing -f overwrites the destination if it already exists. raw
namespace extended attributes are preserved if (1) they are supported (HDFS
only) and, (2) all of the source and target pathnames are in the /.reserved/raw
hierarchy. raw namespace xattr preservation is determined solely by the presence
(or absence) of the /.reserved/raw prefix and not by the -p option.

-mv src … dst :
Move files that match the specified file pattern src to a destination dst.
When moving multiple files, the destination must be a directory.

-rm [-f] [-r|-R] [-skipTrash] src … :
Delete all files that match the specified file pattern. Equivalent to the Unix
command “rm src”

-skipTrash option bypasses trash, if enabled, and immediately deletes src
-f If the file does not exist, do not display a diagnostic message or
modify the exit status to reflect an error.
-[rR] Recursively deletes directories

-stat [format] path … :
Print statistics about the file/directory at path in the specified format.
Format accepts filesize in blocks (%b), group name of owner(%g), filename (%n),
block size (%o), replication (%r), user name of owner(%u), modification date
(%y, %Y)

hdfs dfs -stat /tmp/hadoop.txt
    2014-12-18 02:00:08
hdfs dfs -cp -p -f /tmp/hadoop.txt /tmp/hadoop.txt.bak
hdfs dfs -stat /tmp/hadoop.txt.bak
hdfs dfs -rm /tmp/not_exists
    rm: `/tmp/not_exists': No such file or directory
echo $?
    1
hdfs dfs -rm -f /tmp/123321123123123
echo $?
    0

——————————————————————————–
-count [-q] path … :
Count the number of directories, files and bytes under the paths
that match the specified file pattern. The output columns are:
DIR_COUNT FILE_COUNT CONTENT_SIZE FILE_NAME or
QUOTA REMAINING_QUOTA SPACE_QUOTA REMAINING_SPACE_QUOTA
DIR_COUNT FILE_COUNT CONTENT_SIZE FILE_NAME

-du [-s] [-h] path … :
Show the amount of space, in bytes, used by the files that match the specified
file pattern. The following flags are optional:

-s Rather than showing the size of each individual file that matches the
pattern, shows the total (summary) size.
-h Formats the sizes of files in a human-readable fashion rather than a number
of bytes.

Note that, even without the -s option, this only shows size summaries one level
deep into a directory.

The output is in the form
size name(full path)

hdfs dfs -count /tmp
           3            3         1073741850 /tmp
hdfs dfs -du /tmp
    14          /tmp/hadoop.txt
    12          /tmp/hdfs.txt
    1073741824  /tmp/test.zero
    0           /tmp/txt
hdfs dfs -du -s /tmp
    1073741850  /tmp
hdfs dfs -du -s -h /tmp
    1.0 G  /tmp
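The numbers above are consistent with each other: the `-du -s` total is the sum of the per-entry sizes, and `-h` scales the byte count by powers of 1024. A rough sketch that reproduces both figures; `du_total` and `human_size` are hypothetical helpers, and the rounding only approximates what the HDFS shell prints.

```shell
# Sum the first column of `hdfs dfs -du` output read from stdin.
du_total() {
    awk '{ s += $1 } END { printf "%.0f\n", s }'
}

# Scale a byte count by powers of 1024, one decimal place.
human_size() {
    awk -v n="$1" 'BEGIN {
        split("K M G T P", unit, " ")
        i = 0
        while (n >= 1024 && i < 5) { n /= 1024; i++ }
        if (i == 0) printf "%d\n", n
        else        printf "%.1f %s\n", n, unit[i]
    }'
}

# Sample taken from the -du output above
du_total <<'EOF'
14          /tmp/hadoop.txt
12          /tmp/hdfs.txt
1073741824  /tmp/test.zero
0           /tmp/txt
EOF
# prints: 1073741850

human_size 1073741850   # prints: 1.0 G
```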

——————————————————————————–
-chgrp [-R] GROUP PATH… :
This is equivalent to -chown … :GROUP …

-chmod [-R] MODE[,MODE]… | OCTALMODE PATH… :
Changes permissions of a file. This works similar to the shell’s chmod command
with a few exceptions.

-R modifies the files recursively. This is the only option currently
supported.
MODE Mode is the same as mode used for the shell’s command. The only
letters recognized are ‘rwxXt’, e.g. +t,a+r,g-w,+rwx,o=r.
OCTALMODE Mode specified in 3 or 4 digits. If 4 digits, the first may be 1 or
0 to turn the sticky bit on or off, respectively. Unlike the
shell command, it is not possible to specify only part of the
mode, e.g. 754 is same as u=rwx,g=rx,o=r.

If none of ‘augo’ is specified, ‘a’ is assumed and unlike the shell command, no
umask is applied.

-chown [-R] [OWNER][:[GROUP]] PATH… :
Changes owner and group of a file. This is similar to the shell’s chown command
with a few exceptions.

-R modifies the files recursively. This is the only option currently
supported.

If only the owner or group is specified, then only the owner or group is
modified. The owner and group names may only consist of digits, alphabet, and
any of [-_./@a-zA-Z0-9]. The names are case sensitive.

WARNING: Avoid using ‘.’ to separate user name and group though Linux allows it.
If user names have dots in them and you are using local file system, you might
see surprising results since the shell command ‘chown’ is used for local files.

-touchz path … :
Creates a file of zero length at path with current time as the timestamp of
that path. An error is returned if the file exists with non-zero length

hdfs dfs -mkdir -p /user/spark/tmp
hdfs dfs -chown -R spark:hadoop /user/spark
hdfs dfs -chmod -R 775 /user/spark/tmp
hdfs dfs -ls -d /user/spark/tmp
    drwxrwxr-x   - spark hadoop          0 2014-12-18 14:51 /user/spark/tmp
hdfs dfs -chmod +t /user/spark/tmp
# as user spark
hdfs dfs -touchz /user/spark/tmp/own_by_spark
# as user hadoop (member of group hadoop, but not the owner of the file)
useradd -g hadoop hadoop
su - hadoop
id
    uid=502(hadoop) gid=492(hadoop) groups=492(hadoop)
hdfs dfs -rm /user/spark/tmp/own_by_spark
    rm: Permission denied by sticky bit setting: user=hadoop, inode=own_by_spark
# A superuser (dfs.permissions.superusergroup = hdfs) can ignore the sticky bit setting
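To relate an OCTALMODE to the permission string shown by `-ls`, the digits can be expanded by hand. The following is a minimal sketch of that mapping for 3-digit modes and for 4-digit modes whose first digit is the sticky bit (0 or 1); `octal_to_symbolic` is a hypothetical helper, not an HDFS command.

```shell
# Convert an octal mode such as 775 or 1775 into rwx notation.
octal_to_symbolic() {
    ots_mode=$1
    ots_sticky=0
    # A 4-digit mode carries the sticky bit in its first digit.
    case "$ots_mode" in
        ????) ots_sticky=${ots_mode%???}; ots_mode=${ots_mode#?} ;;
    esac
    ots_out=""
    for ots_digit in $(printf '%s\n' "$ots_mode" | sed 's/./& /g'); do
        case "$ots_digit" in
            0) ots_out="${ots_out}---" ;;
            1) ots_out="${ots_out}--x" ;;
            2) ots_out="${ots_out}-w-" ;;
            3) ots_out="${ots_out}-wx" ;;
            4) ots_out="${ots_out}r--" ;;
            5) ots_out="${ots_out}r-x" ;;
            6) ots_out="${ots_out}rw-" ;;
            7) ots_out="${ots_out}rwx" ;;
        esac
    done
    # The sticky bit replaces the last character: t if x is set, else T.
    if [ "$ots_sticky" = 1 ]; then
        case "$ots_out" in
            *x) ots_out="${ots_out%x}t" ;;
            *)  ots_out="${ots_out%-}T" ;;
        esac
    fi
    printf '%s\n' "$ots_out"
}

octal_to_symbolic 775    # prints rwxrwxr-x
octal_to_symbolic 1775   # prints rwxrwxr-t
octal_to_symbolic 754    # prints rwxr-xr--
```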

——————————————————————————–
-test -[defsz] path :
Answer various questions about path, with result via exit status.
-d return 0 if path is a directory.
-e return 0 if path exists.
-f return 0 if path is a file.
-s return 0 if file path is greater than zero bytes in size.
-z return 0 if file path is zero bytes in size, else return 1.

hdfs dfs -test -d /tmp
echo $?
    0
hdfs dfs -test -f /tmp/txt
echo $?
    1
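Because `-test` reports its answer through the exit status, it combines naturally with `if` in scripts. A minimal sketch that uploads a file only when the destination does not exist yet; `put_if_absent` is a hypothetical helper, and the destination is expected to be a full file path rather than a directory.

```shell
# Upload local file $1 to HDFS path $2 unless $2 already exists.
put_if_absent() {
    pia_src=$1
    pia_dst=$2
    if hdfs dfs -test -e "$pia_dst"; then
        echo "skip: $pia_dst already exists"
    else
        hdfs dfs -put "$pia_src" "$pia_dst"
    fi
}
```

For example, `put_if_absent hadoop.txt /tmp/hadoop.txt` would leave an existing /tmp/hadoop.txt untouched, whereas `-put -f` would overwrite it.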

——————————————————————————–
-setrep [-R] [-w] rep path … :
Set the replication level of a file. If path is a directory then the command
recursively changes the replication factor of all files under the directory tree
rooted at path.
-w It requests that the command waits for the replication to complete. This
can potentially take a very long time.

hdfs fsck /tmp/test.zero -blocks -locations
    Average block replication:	3.0
hdfs dfs -setrep -w 4  /tmp/test.zero
    Replication 4 set: /tmp/test.zero
    Waiting for /tmp/test.zero .... done
hdfs fsck /tmp/test.zero -blocks
    Average block replication:	4.0
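To check the result of `-setrep` from a script, the replication figure can be pulled out of the fsck report. A small sketch, assuming the report contains an "Average block replication:" line as in the output above; `avg_replication` is a hypothetical helper name.

```shell
# Read `hdfs fsck` output on stdin and print the average block replication.
avg_replication() {
    awk -F: '/Average block replication/ {
        gsub(/[ \t]/, "", $2)   # strip the padding around the value
        print $2
    }'
}

# Sample line in the shape shown above
printf ' Average block replication:\t3.0\n' | avg_replication
# prints: 3.0
```

Against a cluster this would be used as `hdfs fsck /tmp/test.zero -blocks | avg_replication`.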