Creating Files on HDFS

Start HDFS

Run the script start-dfs.sh

[root@master ~]# start-dfs.sh
Starting namenodes on [master]
Last login: Sat Apr 16 20:09:41 CST 2022 from 192.168.166.1 on pts/0
Starting datanodes
Last login: Sat Apr 16 20:13:05 CST 2022 on pts/0
Starting secondary namenodes [master]
Last login: Sat Apr 16 20:13:07 CST 2022 on pts/0

Run the script start-yarn.sh

[root@master ~]# start-yarn.sh
Starting resourcemanager
Last login: Sat Apr 16 20:13:14 CST 2022 on pts/0
Starting nodemanagers
Last login: Sat Apr 16 20:32:18 CST 2022 on pts/0

Inserting Data into HDFS

Create a local file

echo "I am xxx" > file.txt
[root@master ~]# echo "I am zzq" > file.txt
[root@master ~]# ls
anaconda-ks.cfg file.txt hadoop-3.1.4.tar.gz jdk-8u202-linux-x64.rpm
[root@master ~]# cat file.txt
I am zzq
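echo works for a one-line file; when a test file needs several lines, printf is a convenient alternative. This is a local-only sketch (nothing here touches HDFS), and file2.txt is a hypothetical name used just for illustration:

```shell
# Create a multi-line local file with printf; each '\n' starts a new line.
# Local-only sketch: no HDFS involved.
printf 'I am zzq\nline two\n' > file2.txt
cat file2.txt
```

The resulting file2.txt could then be uploaded with the same put command shown below.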

Creating a directory on HDFS

Create a user directory on HDFS:

[-mkdir [-p] <path> ...]

hadoop fs -mkdir /user

Uploading a local file to HDFS

Use the put command to transfer a data file from the local filesystem to a specified location on HDFS.

[-put [-f] [-p] [-l] [-d] <localsrc> ... <dst>]

hadoop fs -put file.txt /user

Check whether the upload succeeded

[-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [-e] [<path> ...]]

hadoop fs -ls /user

Output:

[root@master ~]# hadoop fs -mkdir /user
[root@master ~]# hadoop fs -put file.txt /user
[root@master ~]# hadoop fs -ls /user
Found 1 items
-rw-r--r-- 3 root supergroup 0 2022-04-16 20:39 /user/file.txt

Retrieving Data from HDFS

View the contents of file.txt on HDFS

Use the cat command: [-cat [-ignoreCrc] <src> ...]

hadoop fs -cat /user/file.txt
[root@master ~]# hadoop fs -cat /user/file.txt
I am zzq

Downloading a file from HDFS to the local filesystem

Use the get command: [-get [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]

Download the file /user/file.txt on HDFS to the local file file1.txt.

hadoop fs -get /user/file.txt file1.txt
[root@master ~]# hadoop fs -get /user/file.txt file1.txt
[root@master ~]# cat file1.txt
I am zzq
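Together, put and get form a round trip: the downloaded file should be byte-for-byte identical to the original. The idea can be sketched entirely on the local filesystem (no cluster required); the cp calls below merely stand in for the hadoop fs commands:

```shell
# Local analogy of the put/get round trip; cp stands in for the HDFS transfer.
echo "I am zzq" > file.txt
cp file.txt hdfs_copy.txt     # stand-in for: hadoop fs -put file.txt /user
cp hdfs_copy.txt file1.txt    # stand-in for: hadoop fs -get /user/file.txt file1.txt
diff file.txt file1.txt && echo "round trip ok"
```

On a real cluster, the same diff between file.txt and file1.txt is a quick integrity check after the get.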

Advanced HDFS Operations

Appending to and overwriting files

Upload another text file, test.txt, containing arbitrary content, to the HDFS directory user:

[root@master ~]# echo "Test file" > test.txt
[root@master ~]# hadoop fs -put test.txt /user/test.txt
[root@master ~]# hadoop fs -cat /user/test.txt
Test file

Append to the end of file.txt, then overwrite its original contents, checking the result each time.

Append the contents of test.txt to the end of the HDFS file file.txt: [-appendToFile <localsrc> ... <dst>]

[root@master ~]# hadoop fs -appendToFile test.txt /user/file.txt
[root@master ~]# hadoop fs -cat /user/file.txt
I am zzq
Test file

Overwrite the original file.txt:

[-copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst>]

[root@master ~]# hadoop fs -copyFromLocal -f test.txt /user/file.txt
[root@master ~]# hadoop fs -cat /user/file.txt
Test file
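The distinction between the two operations mirrors shell redirection on the local filesystem: -appendToFile adds to the end, while copyFromLocal -f replaces the file. A local-only sketch of the same semantics (the hadoop commands in the comments are what each line stands in for):

```shell
# Local analogy: append vs. overwrite.
echo "I am zzq" > file.txt
echo "Test file" > test.txt
cat test.txt >> file.txt   # append, like: hadoop fs -appendToFile test.txt /user/file.txt
cat file.txt               # now two lines: "I am zzq" then "Test file"
cp -f test.txt file.txt    # overwrite, like: hadoop fs -copyFromLocal -f test.txt /user/file.txt
cat file.txt               # only "Test file" remains
```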

Viewing information about a specified HDFS file

Display the read/write permissions, size, creation time, path, and other information for a specified file on HDFS.

[root@master ~]# hadoop fs -ls /user
Found 2 items
-rw-r--r-- 3 root supergroup 10 2022-04-16 21:29 /user/file.txt
-rw-r--r-- 3 root supergroup 10 2022-04-16 21:21 /user/test.txt

Recursively listing all files in a directory

List the read/write permissions, size, creation time, path, and other information for every file under the user directory; when an entry is itself a directory, recursively list the information for all files it contains.

hadoop fs -ls -R <path>

[root@master ~]# hadoop fs -ls -R /user
-rw-r--r-- 3 root supergroup 10 2022-04-16 21:29 /user/file.txt
-rw-r--r-- 3 root supergroup 10 2022-04-16 21:21 /user/test.txt
drwxr-xr-x - root supergroup 0 2022-04-16 21:32 /user/u1
-rw-r--r-- 3 root supergroup 9 2022-04-16 21:32 /user/u1/file1.txt
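The recursive walk that -ls -R performs is the same traversal a local find does. A local-only sketch, rebuilding a small tree shaped like the one in the transcript above (the directory names here are illustrative, not an HDFS path):

```shell
# Local analogy of 'hadoop fs -ls -R /user': build a nested tree, then
# list every file at any depth beneath it.
mkdir -p user/u1
echo "hello" > user/file.txt
echo "hello" > user/u1/file1.txt
find user -type f   # prints both files, including the one nested in u1/
```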

Moving files

Move a file from its source path to a new path; first create a new directory for it, e.g. input.

Then check the file under the new path.

[root@master ~]# hadoop fs -mkdir /user/input
[root@master ~]# hadoop fs -mv /user/file.txt /user/input
[root@master ~]# hadoop fs -ls /user/input
Found 1 items
-rw-r--r-- 3 root supergroup 10 2022-04-16 21:29 /user/input/file.txt
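As with the local mv, moving a file into a directory keeps its name; only its parent path changes. A local-only sketch of the same sequence (directory names illustrative):

```shell
# Local analogy of the mv step: create a target directory, move the file in,
# then list the destination to confirm.
mkdir -p user/input
echo "Test file" > user/file.txt
mv user/file.txt user/input/   # like: hadoop fs -mv /user/file.txt /user/input
ls user/input                  # file.txt
```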

Shutting Down HDFS

Shut down the cluster by running the scripts stop-dfs.sh and stop-yarn.sh in turn.

[root@master ~]# stop-dfs.sh
Stopping namenodes on [master]
Last login: Sat Apr 16 20:32:20 CST 2022 on pts/0
Stopping datanodes
Last login: Sat Apr 16 21:36:38 CST 2022 on pts/0
Stopping secondary namenodes [master]
Last login: Sat Apr 16 21:36:39 CST 2022 on pts/0
[root@master ~]# stop-yarn.sh
Stopping nodemanagers
Last login: Sat Apr 16 21:36:41 CST 2022 on pts/0
Stopping resourcemanager
Last login: Sat Apr 16 21:36:51 CST 2022 on pts/0