Hadoop Commands Cheat Sheet
Apache Hadoop is a framework for distributed storage and processing of large datasets on commodity hardware. Here are some important Hadoop commands that you may find useful:
- hadoop fs -ls <path>: This command lists the files and directories in the specified path.
- hadoop fs -mkdir <path>: This command creates a new directory at the specified path.
- hadoop fs -rm <path>: This command deletes the specified file. To delete a directory and its contents, add the -r option.
- hadoop fs -put <local file> <hdfs file>: This command copies a file from the local file system to HDFS (Hadoop Distributed File System).
- hadoop fs -get <hdfs file> <local file>: This command copies a file from HDFS to the local file system.
- hadoop fs -mv <src> <dst>: This command moves a file or directory from the source path to the destination path.
- hadoop jar <jar file> <main class> <arguments>: This command runs a Java application packaged in a JAR file on a Hadoop cluster.
- yarn application -list: This command lists applications on the YARN (Yet Another Resource Negotiator) cluster. By default it shows only applications in the SUBMITTED, ACCEPTED, and RUNNING states; add -appStates ALL to include completed ones.
- yarn node -list: This command lists the nodes (machines) that are currently registered with the YARN cluster.
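As a rough illustration of how the file commands above fit together, here is a minimal sketch that stages a local file in HDFS, lists the target directory, and copies the file back. The run helper, the DRY_RUN variable, the stage_file function, and the paths are assumptions made for this example and are not part of Hadoop. With DRY_RUN left at 1 the script only prints the commands it would run, so it can be tried without a cluster.

```shell
#!/bin/sh
# Sketch: stage a local file in HDFS, list it, and copy it back.
# DRY_RUN, run, stage_file, and the paths are hypothetical names for this example.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only; 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# stage_file <local file> <hdfs dir>
stage_file() {
  local_file="$1"
  hdfs_dir="$2"
  name="$(basename "$local_file")"
  run hadoop fs -mkdir -p "$hdfs_dir"                      # create the target directory (and parents)
  run hadoop fs -put "$local_file" "$hdfs_dir/$name"       # local -> HDFS
  run hadoop fs -ls "$hdfs_dir"                            # confirm the file is there
  run hadoop fs -get "$hdfs_dir/$name" "$local_file.copy"  # HDFS -> local
}

stage_file /tmp/sample.txt /user/demo/input
```

Setting DRY_RUN=0 on a machine with a configured Hadoop client would execute the same four commands instead of printing them.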
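These are just a few examples of Hadoop commands. There are many other commands available for managing and monitoring a Hadoop cluster. The hdfs dfs commands listed below behave the same as their hadoop fs counterparts when the target file system is HDFS: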
- hdfs dfs -ls: This command is used to list the files and directories in HDFS. Example: hdfs dfs -ls /
- hdfs dfs -mkdir: This command is used to create a directory in HDFS. Example: hdfs dfs -mkdir /data
- hdfs dfs -put: This command is used to copy a file from the local file system to HDFS. Example: hdfs dfs -put /local/file.txt /data/file.txt
- hdfs dfs -get: This command is used to copy a file from HDFS to the local file system. Example: hdfs dfs -get /data/file.txt /local/file.txt
- hdfs dfs -cp: This command is used to copy a file or directory within HDFS. Example: hdfs dfs -cp /data/file.txt /data/copy.txt
- hdfs dfs -mv: This command is used to move a file or directory within HDFS. Example: hdfs dfs -mv /data/file.txt /data/newfile.txt
- hdfs dfs -rm: This command is used to delete a file from HDFS. Example: hdfs dfs -rm /data/file.txt
- hdfs dfs -rmdir: This command is used to delete an empty directory from HDFS. Example: hdfs dfs -rmdir /data/empty
- hdfs dfs -rm -r: This command is used to recursively delete a directory and all its contents from HDFS. The older -rmr form is deprecated in favor of -rm -r. Example: hdfs dfs -rm -r /data/dir
- hdfs dfs -du: This command is used to display the size of a file or directory in HDFS. Example: hdfs dfs -du /data/file.txt
- hdfs dfs -cat: This command is used to display the contents of a file in HDFS. Example: hdfs dfs -cat /data/file.txt
- hdfs dfs -tail: This command is used to display the last kilobyte of a file in HDFS. Example: hdfs dfs -tail /data/file.txt
- hdfs dfs -head: This command is used to display the first kilobyte of a file in HDFS. Example: hdfs dfs -head /data/file.txt
- hdfs dfs -count: This command is used to count the directories, files, and bytes under a path in HDFS. Example: hdfs dfs -count /data
- hdfs dfs -chmod: This command is used to change the permissions of a file or directory in HDFS. Example: hdfs dfs -chmod 755 /data/file.txt
- hdfs dfs -chown: This command is used to change the owner of a file or directory in HDFS. Example: hdfs dfs -chown user:group /data/file.txt
- hdfs dfs -getmerge: This command is used to merge multiple files in HDFS into a single file in the local file system. Example: hdfs dfs -getmerge /data/files /local/merged.txt
- hdfs dfs -df: This command is used to display the amount of disk space used and available in HDFS. Example: hdfs dfs -df /
- hdfs dfs -expunge: This command is used to empty the trash in HDFS by permanently deleting trash checkpoints older than the retention threshold. Example: hdfs dfs -expunge
- hdfs dfs -copyFromLocal: This command is used to copy a file from the local file system to HDFS. It is similar to the “-put” command, except that the source must be a local file. Example: hdfs dfs -copyFromLocal /local/file.txt /data/file.txt
- hdfs dfs -copyToLocal: This command is used to copy a file from HDFS to the local file system. It is similar to the “-get” command, except that the destination must be a local file. Example: hdfs dfs -copyToLocal /data/file.txt /local/file.txt
- hdfs dfs -setrep: This command is used to set the replication factor of a file in HDFS. Example: hdfs dfs -setrep 3 /data/file.txt
- hdfs dfs -text: This command is used to display the contents of a file in HDFS as text, decoding supported formats such as zip and TextRecordInputStream rather than printing raw bytes. Example: hdfs dfs -text /data/file.txt
- hdfs dfs -stat: This command is used to display statistics about a file or directory in HDFS, such as its modification time. Example: hdfs dfs -stat /data/file.txt
- hdfs dfs -touchz: This command is used to create an empty file in HDFS. Example: hdfs dfs -touchz /data/empty.txt
- hdfs dfs -usage: This command is used to display the usage information for a file system shell command. Its argument is a command name, not a path. Example: hdfs dfs -usage ls
- hdfs fsck: This command is used to check the health of the file system and detect any issues. Example: hdfs fsck /
- hdfs balancer: This command is used to redistribute blocks across the DataNodes in the cluster so that storage use is more even. Example: hdfs balancer
- hdfs dfsadmin: This command is used to perform administrative operations on HDFS. Example: hdfs dfsadmin -report
- hdfs oiv: This command runs the Offline Image Viewer, which dumps the contents of an fsimage file into a human-readable format. Example: hdfs oiv -i /image -o /local/output
- hdfs version: This command is used to display the installed Hadoop version. Example: hdfs version
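The output of several of these commands is easy to post-process in a script. As one small sketch, the helper below turns a line of hdfs dfs -count output into a readable summary, relying on the default column order of that command: directory count, file count, content size in bytes, and path name. The function name summarize_count and the sample values are made up for this example, and the sketch assumes the path contains no spaces.

```shell
#!/bin/sh
# Sketch: summarize one line of `hdfs dfs -count` output read from stdin.
# Default column order: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME.
# summarize_count is a hypothetical helper; paths with spaces are not handled.
summarize_count() {
  awk '{ printf "%s: %d dirs, %d files, %d bytes\n", $4, $1, $2, $3 }'
}

# Sample line in the format printed by -count (the values are made up).
echo "           3           12            4096 /data" | summarize_count

# On a machine with a configured Hadoop client, the real output can be piped in:
# hdfs dfs -count /data | summarize_count
```

For the sample line, the helper prints "/data: 3 dirs, 12 files, 4096 bytes".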