Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2006/06/14 19:26:07 UTC

[Lucene-hadoop Wiki] Update of "hadoop-0.1-dev/bin/hadoop dfs" by YoramArnon

The following page has been changed by YoramArnon:
http://wiki.apache.org/lucene-hadoop/hadoop-0%2e1-dev/bin/hadoop_dfs

New page:
hadoop dfs is the command used to execute dfs commands. The full syntax is:
hadoop dfs [-local | -dfs <namenode:port>] [-ls <path>] [-lsr <path>] [-du <path>] [-mv <src> <dst>] [-cp <src> <dst>] [-rm <src>] [-put <localsrc> <dst>] [-copyFromLocal <localsrc> <dst>] [-moveFromLocal <localsrc> <dst>] [-get <src> <localdst>] [-cat <src>] [-copyToLocal <src> <localdst>] [-moveToLocal <src> <localdst>] [-mkdir <path>] [-report] [-setrep [-R] <rep> <path/file>]

[-local | -dfs <namenode:port>]: if not specified, the current configuration is used, taken from the following, in increasing precedence:
   * hadoop-default.xml inside the hadoop jar file
   * hadoop-default.xml in $HADOOP_CONF_DIR
   * hadoop-site.xml in $HADOOP_CONF_DIR
-local means use the local file system as your DFS; -dfs <namenode:port> specifies a particular name node to contact.
This argument is optional, but if used it must appear first on the command line.
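
The precedence rule above can be pictured as successive dictionary updates, where each later source overrides the earlier ones. This is only a sketch: the function name effective_config is hypothetical, the property name and values are assumed for illustration, and nothing here reads real Hadoop configuration files.

```python
def effective_config(*sources):
    """Merge configuration sources listed in increasing precedence."""
    merged = {}
    for source in sources:
        merged.update(source)  # a later source overrides earlier ones
    return merged


# Hypothetical contents of the three sources, lowest precedence first.
jar_default = {"fs.default.name": "local"}    # hadoop-default.xml in the jar
conf_default = {"fs.default.name": "local"}   # hadoop-default.xml in $HADOOP_CONF_DIR
conf_site = {"fs.default.name": "namenode.example.com:9000"}  # hadoop-site.xml

config = effective_config(jar_default, conf_default, conf_site)
print(config["fs.default.name"])
```

Passing -local or -dfs <namenode:port> on the command line bypasses this lookup for the file system choice.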

Exactly one additional argument must be specified.
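
The ordering rule can be sketched as a small helper that assembles the argument list, placing the optional file system selector before the single action. The helper name build_dfs_command and the name node address in the usage line are hypothetical; only the flags come from the syntax above.

```python
def build_dfs_command(action, local=False, namenode=None):
    """Return the argument list for one hadoop dfs invocation.

    action is a list such as ["-ls", "/user/alice"].
    """
    if local and namenode:
        raise ValueError("-local and -dfs cannot be combined")
    if not action or not action[0].startswith("-"):
        raise ValueError("exactly one action such as -ls is required")
    argv = ["hadoop", "dfs"]
    # The file system selector, if any, must come first.
    if local:
        argv.append("-local")
    elif namenode:
        argv += ["-dfs", namenode]
    return argv + list(action)


print(" ".join(build_dfs_command(["-ls", "/tmp"], namenode="namenode.example.com:9000")))
```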
A word about paths: a path may be relative or absolute. An absolute path starts with a '/'; a relative path does not, and is always resolved against /user/<currentUser>. There is no notion of a current working directory.
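
The path rule can be written down in a few lines. This is a minimal sketch of the behaviour described above, not the code hadoop itself uses; the function name resolve_dfs_path and the user name alice are hypothetical.

```python
def resolve_dfs_path(path, current_user):
    """Resolve a DFS path the way the text describes."""
    home = "/user/" + current_user
    if not path:
        return home          # no path given: fall back to the user's directory
    if path.startswith("/"):
        return path          # absolute paths are used as-is
    return home + "/" + path  # relative paths hang off /user/<currentUser>


print(resolve_dfs_path("logs/day1", "alice"))
print(resolve_dfs_path("/tmp/data", "alice"))
```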

-ls: list the contents of the specified path. If path is not specified, the contents of /user/<currentUser> will be listed. The output contains one line of the form
 Found n items
followed by one line per directory and one line per file. Directory entries are of the form
 dirName   <dir>
and file entries are of the form
 fileName   <r n>   size
where n is the number of replicas specified for the file and size is the size of the file, in bytes.
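
For scripts that consume this output, the file entry form can be matched with a regular expression. The sketch below assumes that fields are separated by whitespace and uses made-up sample output; lines that do not match the file form, such as the "Found n items" header, are skipped. The name parse_file_entries is hypothetical.

```python
import re

# fileName   <r n>   size
FILE_ENTRY = re.compile(r"^(?P<name>\S.*?)\s+<r (?P<replicas>\d+)>\s+(?P<size>\d+)$")


def parse_file_entries(output):
    """Return (name, replicas, size_in_bytes) for each file line."""
    entries = []
    for line in output.splitlines():
        match = FILE_ENTRY.match(line.strip())
        if match:
            entries.append(
                (match.group("name"),
                 int(match.group("replicas")),
                 int(match.group("size")))
            )
    return entries


# Hypothetical sample output, for illustration only.
sample = (
    "Found 2 items\n"
    "/user/alice/a.txt\t<r 3>\t1024\n"
    "/user/alice/b.log   <r 2>   17\n"
)
for entry in parse_file_entries(sample):
    print(entry)
```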

-lsr: recursively list the contents of the specified path. Behaves like hadoop dfs -ls, except that the first line (Found n items) is omitted and entries are produced for the entire subtree.