Posted to common-commits@hadoop.apache.org by tu...@apache.org on 2012/12/20 14:41:43 UTC

svn commit: r1424459 [2/2] - in /hadoop/common/trunk/hadoop-common-project/hadoop-common: ./ src/main/docs/src/documentation/content/xdocs/ src/site/apt/

Added: hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/CommandsManual.apt.vm
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/CommandsManual.apt.vm?rev=1424459&view=auto
==============================================================================
--- hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/CommandsManual.apt.vm (added)
+++ hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/CommandsManual.apt.vm Thu Dec 20 13:41:43 2012
@@ -0,0 +1,490 @@
+~~ Licensed to the Apache Software Foundation (ASF) under one or more
+~~ contributor license agreements.  See the NOTICE file distributed with
+~~ this work for additional information regarding copyright ownership.
+~~ The ASF licenses this file to You under the Apache License, Version 2.0
+~~ (the "License"); you may not use this file except in compliance with
+~~ the License.  You may obtain a copy of the License at
+~~
+~~     http://www.apache.org/licenses/LICENSE-2.0
+~~
+~~ Unless required by applicable law or agreed to in writing, software
+~~ distributed under the License is distributed on an "AS IS" BASIS,
+~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+~~ See the License for the specific language governing permissions and
+~~ limitations under the License.
+
+  ---
+  Hadoop Commands Guide
+  ---
+  ---
+  ${maven.build.timestamp}
+
+%{toc}
+
+Overview
+
+   All hadoop commands are invoked by the <<<bin/hadoop>>> script. Running the
+   hadoop script without any arguments prints the description for all
+   commands.
+
+   Usage: <<<hadoop [--config confdir] [COMMAND] [GENERIC_OPTIONS] [COMMAND_OPTIONS]>>>
+
+   Hadoop has an option parsing framework that supports parsing generic
+   options as well as running classes.
+
+*-----------------------+---------------+
+|| COMMAND_OPTION       || Description
+*-----------------------+---------------+
+| <<<--config confdir>>>| Overrides the default configuration directory.  Default is <<<${HADOOP_HOME}/conf>>>.
+*-----------------------+---------------+
+| GENERIC_OPTIONS       | The common set of options supported by multiple commands.
+*-----------------------+---------------+
+| COMMAND_OPTIONS       | Various commands with their options are described in the following sections. The commands have been grouped into User Commands and Administration Commands.
+*-----------------------+---------------+
+
+Generic Options
+
+   The following options are supported by {{dfsadmin}}, {{fs}}, {{fsck}},
+   {{job}} and {{fetchdt}}. Applications should implement {{{some_useful_url}Tool}} to support
+   {{{another_useful_url}GenericOptions}}.
+
+*------------------------------------------------+-----------------------------+
+||            GENERIC_OPTION                     ||            Description
+*------------------------------------------------+-----------------------------+
+|<<<-conf \<configuration file\> >>>             | Specify an application
+                                                 | configuration file.
+*------------------------------------------------+-----------------------------+
+|<<<-D \<property\>=\<value\> >>>                | Use value for given property.
+*------------------------------------------------+-----------------------------+
+|<<<-jt \<local\> or \<jobtracker:port\> >>>     | Specify a job tracker.
+                                                 | Applies only to job.
+*------------------------------------------------+-----------------------------+
+|<<<-files \<comma separated list of files\> >>> | Specify comma separated files
+                                                 | to be copied to the map
+                                                 | reduce cluster.  Applies only
+                                                 | to job.
+*------------------------------------------------+-----------------------------+
+|<<<-libjars \<comma separated list of jars\> >>>| Specify comma separated jar
+                                                 | files to include in the
+                                                 | classpath. Applies only to
+                                                 | job.
+*------------------------------------------------+-----------------------------+
+|<<<-archives \<comma separated list of archives\> >>> | Specify comma separated
+                                                 | archives to be unarchived on
+                                                 | the compute machines. Applies
+                                                 | only to job.
+*------------------------------------------------+-----------------------------+
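+
+   Generic options appear after the command name and before the
+   command-specific options, as in the following sketch (the file names and
+   property values below are only illustrative):
+
++---+
+# override a configuration property for a single invocation
+hadoop fs -D dfs.replication=2 -put localfile.txt /user/hadoop/
+# load an additional application configuration file
+hadoop job -conf my-job-overrides.xml -list all
++---+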
+
+User Commands
+
+   Commands useful for users of a Hadoop cluster.
+
+* <<<archive>>>
+
+   Creates a hadoop archive. More information can be found at Hadoop
+   Archives.
+
+   Usage: <<<hadoop archive -archiveName NAME <src>* <dest> >>>
+
+*-------------------+-------------------------------------------------------+
+||COMMAND_OPTION    ||                   Description
+*-------------------+-------------------------------------------------------+
+| -archiveName NAME |  Name of the archive to be created.
+*-------------------+-------------------------------------------------------+
+| src               | Filesystem pathnames which work as usual with regular
+                    | expressions.
+*-------------------+-------------------------------------------------------+
+| dest              | Destination directory which would contain the archive.
+*-------------------+-------------------------------------------------------+
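+
+   For example, the following invocation (the archive name and paths are
+   illustrative) bundles two directories into a single archive:
+
++---+
+hadoop archive -archiveName foo.har /user/hadoop/dir1 /user/hadoop/dir2 /user/hadoop/archives
++---+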
+
+* <<<distcp>>>
+
+   Copies files or directories recursively. More information can be found at
+   Hadoop DistCp Guide.
+
+   Usage: <<<hadoop distcp <srcurl> <desturl> >>>
+
+*-------------------+--------------------------------------------+
+||COMMAND_OPTION    || Description
+*-------------------+--------------------------------------------+
+| srcurl            | Source Url
+*-------------------+--------------------------------------------+
+| desturl           | Destination Url
+*-------------------+--------------------------------------------+
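+
+   For example, to copy a directory between two clusters (host names and
+   paths are illustrative):
+
++---+
+hadoop distcp hdfs://nn1.example.com/user/hadoop/input hdfs://nn2.example.com/user/hadoop/input
++---+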
+
+* <<<fs>>>
+
+   Usage: <<<hadoop fs [GENERIC_OPTIONS] [COMMAND_OPTIONS]>>>
+
+   Deprecated, use <<<hdfs dfs>>> instead.
+
+   Runs a generic filesystem user client.
+
+   The various COMMAND_OPTIONS can be found at File System Shell Guide.
+
+* <<<fsck>>>
+
+   Runs an HDFS filesystem checking utility. See {{Fsck}} for more info.
+
+   Usage: <<<hadoop fsck [GENERIC_OPTIONS] <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]>>>
+
+*------------------+---------------------------------------------+
+||  COMMAND_OPTION || Description
+*------------------+---------------------------------------------+
+|   <path>         | Start checking from this path.
+*------------------+---------------------------------------------+
+|   -move          | Move corrupted files to /lost+found
+*------------------+---------------------------------------------+
+|   -delete        | Delete corrupted files.
+*------------------+---------------------------------------------+
+|   -openforwrite  | Print out files opened for write.
+*------------------+---------------------------------------------+
+|   -files         | Print out files being checked.
+*------------------+---------------------------------------------+
+|   -blocks        | Print out block report.
+*------------------+---------------------------------------------+
+|   -locations     | Print out locations for every block.
+*------------------+---------------------------------------------+
+|   -racks         | Print out network topology for data-node locations.
+*------------------+---------------------------------------------+
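+
+   For example, to check a subtree (path is illustrative) and print the
+   files, blocks and block locations encountered:
+
++---+
+hadoop fsck /user/hadoop -files -blocks -locations
++---+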
+
+* <<<fetchdt>>>
+
+   Gets a delegation token from a NameNode. See {{fetchdt}} for more info.
+
+   Usage: <<<hadoop fetchdt [GENERIC_OPTIONS] [--webservice <namenode_http_addr>] <path> >>>
+
+*------------------------------+---------------------------------------------+
+|| COMMAND_OPTION              || Description
+*------------------------------+---------------------------------------------+
+| <fileName>                   | File name to store the token into.
+*------------------------------+---------------------------------------------+
+| --webservice <https_address> | Use HTTP protocol instead of RPC.
+*------------------------------+---------------------------------------------+
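+
+   For example, the following (with an illustrative NameNode HTTP address and
+   output file) fetches a token over HTTP and stores it in a local file:
+
++---+
+hadoop fetchdt --webservice http://nn.example.com:50070 /tmp/my.delegation.token
++---+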
+
+* <<<jar>>>
+
+   Runs a jar file. Users can bundle their Map Reduce code in a jar file and
+   execute it using this command.
+
+   Usage: <<<hadoop jar <jar> [mainClass] args...>>>
+
+   Streaming jobs are run via this command. Refer to the Streaming examples
+   for more information.
+
+   The word count example is also run using the jar command. Refer to the
+   Wordcount example for more information.
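+
+   A typical invocation of the word count program might look like the
+   following (the jar name and paths are illustrative):
+
++---+
+hadoop jar hadoop-examples.jar wordcount /user/hadoop/input /user/hadoop/output
++---+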
+
+* <<<job>>>
+
+   Command to interact with MapReduce jobs.
+
+   Usage: <<<hadoop job [GENERIC_OPTIONS] [-submit <job-file>] | [-status <job-id>] | [-counter <job-id> <group-name> <counter-name>] | [-kill <job-id>] | [-events <job-id> <from-event-#> <#-of-events>] | [-history [all] <jobOutputDir>] | [-list [all]] | [-kill-task <task-id>] | [-fail-task <task-id>] | [-set-priority <job-id> <priority>]>>>
+
+*------------------------------+---------------------------------------------+
+|| COMMAND_OPTION              || Description
+*------------------------------+---------------------------------------------+
+| -submit <job-file>           | Submits the job.
+*------------------------------+---------------------------------------------+
+| -status <job-id>             | Prints the map and reduce completion
+                               | percentage and all job counters.
+*------------------------------+---------------------------------------------+
+| -counter <job-id> <group-name> <counter-name> | Prints the counter value.
+*------------------------------+---------------------------------------------+
+| -kill <job-id>               | Kills the job.
+*------------------------------+---------------------------------------------+
+| -events <job-id> <from-event-#> <#-of-events> | Prints the events' details
+                               | received by jobtracker for the given range.
+*------------------------------+---------------------------------------------+
+| -history [all] <jobOutputDir> | Prints job details, failed and killed tip
+                               | details.  More details about the job such as
+                               | successful tasks and task attempts made for
+                               | each task can be viewed by specifying the [all]
+                               | option.
+*------------------------------+---------------------------------------------+
+| -list [all]                  | Displays jobs which are yet to complete.
+                               | <<<-list all>>> displays all jobs.
+*------------------------------+---------------------------------------------+
+| -kill-task <task-id>         | Kills the task. Killed tasks are NOT counted
+                               | against failed attempts.
+*------------------------------+---------------------------------------------+
+| -fail-task <task-id>         | Fails the task. Failed tasks are counted
+                               | against failed attempts.
+*------------------------------+---------------------------------------------+
+| -set-priority <job-id> <priority> | Changes the priority of the job. Allowed
+                               | priority values are VERY_HIGH, HIGH, NORMAL,
+                               | LOW, VERY_LOW
+*------------------------------+---------------------------------------------+
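+
+   For example (the job and task attempt IDs below are hypothetical):
+
++---+
+# print completion percentage and counters for a job
+hadoop job -status job_201212111725_0001
+# kill a single task attempt without counting it against failed attempts
+hadoop job -kill-task attempt_201212111725_0001_m_000000_0
++---+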
+
+* <<<pipes>>>
+
+   Runs a pipes job.
+
+   Usage: <<<hadoop pipes [-conf <path>] [-jobconf <key=value>, <key=value>,
+   ...] [-input <path>] [-output <path>] [-jar <jar file>] [-inputformat
+   <class>] [-map <class>] [-partitioner <class>] [-reduce <class>] [-writer
+   <class>] [-program <executable>] [-reduces <num>]>>>
+ 
+*----------------------------------------+------------------------------------+
+|| COMMAND_OPTION                        || Description
+*----------------------------------------+------------------------------------+
+| -conf <path>                           | Configuration for job
+*----------------------------------------+------------------------------------+
+| -jobconf <key=value>, <key=value>, ... | Add/override configuration for job
+*----------------------------------------+------------------------------------+
+| -input <path>                          | Input directory
+*----------------------------------------+------------------------------------+
+| -output <path>                         | Output directory
+*----------------------------------------+------------------------------------+
+| -jar <jar file>                        | Jar filename
+*----------------------------------------+------------------------------------+
+| -inputformat <class>                   | InputFormat class
+*----------------------------------------+------------------------------------+
+| -map <class>                           | Java Map class
+*----------------------------------------+------------------------------------+
+| -partitioner <class>                   | Java Partitioner
+*----------------------------------------+------------------------------------+
+| -reduce <class>                        | Java Reduce class
+*----------------------------------------+------------------------------------+
+| -writer <class>                        | Java RecordWriter
+*----------------------------------------+------------------------------------+
+| -program <executable>                  | Executable URI
+*----------------------------------------+------------------------------------+
+| -reduces <num>                         | Number of reduces
+*----------------------------------------+------------------------------------+
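+
+   A minimal invocation might look like the following sketch, assuming the
+   job configuration file and the compiled executable already exist at the
+   illustrative locations shown:
+
++---+
+hadoop pipes -conf word.xml -input /user/hadoop/in-dir -output /user/hadoop/out-dir -program /user/hadoop/bin/wordcount
++---+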
+
+* <<<queue>>>
+
+   Command to interact with and view job queue information.
+
+   Usage: <<<hadoop queue [-list] | [-info <job-queue-name> [-showJobs]] | [-showacls]>>>
+
+*-----------------+-----------------------------------------------------------+
+|| COMMAND_OPTION || Description
+*-----------------+-----------------------------------------------------------+
+| -list           | Gets the list of job queues configured in the system,
+                  | along with the scheduling information associated with them.
+*-----------------+-----------------------------------------------------------+
+| -info <job-queue-name> [-showJobs] | Displays the job queue information and
+                  | associated scheduling information of the particular job
+                  | queue. If the <<<-showJobs>>> option is present, a list of
+                  | jobs submitted to the particular job queue is displayed.
+*-----------------+-----------------------------------------------------------+
+| -showacls       | Displays the queue name and associated queue operations
+                  | allowed for the current user. The list consists of only
+                  | those queues to which the user has access.
+*-----------------+-----------------------------------------------------------+
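+
+   For example, to display the scheduling information and the submitted jobs
+   of a queue named <<<default>>> (queue name is illustrative):
+
++---+
+hadoop queue -info default -showJobs
++---+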
+
+* <<<version>>>
+
+   Prints the version.
+
+   Usage: <<<hadoop version>>>
+
+* <<<CLASSNAME>>>
+
+   The hadoop script can be used to invoke any class.
+
+   Usage: <<<hadoop CLASSNAME>>>
+
+   Runs the class named <<<CLASSNAME>>>.
+
+* <<<classpath>>>
+
+   Prints the class path needed to get the Hadoop jar and the required
+   libraries.
+
+   Usage: <<<hadoop classpath>>>
+
+Administration Commands
+
+   Commands useful for administrators of a Hadoop cluster.
+
+* <<<balancer>>>
+
+   Runs a cluster balancing utility. An administrator can simply press Ctrl-C
+   to stop the rebalancing process. See Rebalancer for more details.
+
+   Usage: <<<hadoop balancer [-threshold <threshold>]>>>
+
+*------------------------+-----------------------------------------------------------+
+|| COMMAND_OPTION        || Description
+*------------------------+-----------------------------------------------------------+
+| -threshold <threshold> | Percentage of disk capacity. This overrides the
+                         | default threshold.
+*------------------------+-----------------------------------------------------------+
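+
+   For example, to rebalance until every datanode's utilization is within 5%
+   of the cluster average:
+
++---+
+hadoop balancer -threshold 5
++---+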
+
+* <<<daemonlog>>>
+
+   Get/Set the log level for each daemon.
+
+   Usage: <<<hadoop daemonlog -getlevel <host:port> <name> >>>
+
+   Usage: <<<hadoop daemonlog -setlevel <host:port> <name> <level> >>>
+
+*------------------------------+-----------------------------------------------------------+
+|| COMMAND_OPTION              || Description
+*------------------------------+-----------------------------------------------------------+
+| -getlevel <host:port> <name> | Prints the log level of the daemon running at
+                               | <host:port>. This command internally connects
+                               | to http://<host:port>/logLevel?log=<name>
+*------------------------------+-----------------------------------------------------------+
+|   -setlevel <host:port> <name> <level> | Sets the log level of the daemon
+                               | running at <host:port>. This command internally
+                               | connects to http://<host:port>/logLevel?log=<name>
+*------------------------------+-----------------------------------------------------------+
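+
+   For example, to raise the log level of the NameNode daemon (host and port
+   are illustrative) and then read it back:
+
++---+
+hadoop daemonlog -setlevel nn.example.com:50070 org.apache.hadoop.hdfs.server.namenode.NameNode DEBUG
+hadoop daemonlog -getlevel nn.example.com:50070 org.apache.hadoop.hdfs.server.namenode.NameNode
++---+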
+
+* <<<datanode>>>
+
+   Runs an HDFS datanode.
+
+   Usage: <<<hadoop datanode [-rollback]>>>
+
+*-----------------+-----------------------------------------------------------+
+|| COMMAND_OPTION || Description
+*-----------------+-----------------------------------------------------------+
+| -rollback       | Rolls back the datanode to the previous version. This should
+                  | be used after stopping the datanode and distributing the old
+                  | hadoop version.
+*-----------------+-----------------------------------------------------------+
+
+* <<<dfsadmin>>>
+
+   Runs an HDFS dfsadmin client.
+
+   Usage: <<<hadoop dfsadmin [GENERIC_OPTIONS] [-report] [-safemode enter | leave | get | wait] [-refreshNodes] [-finalizeUpgrade] [-upgradeProgress status | details | force] [-metasave filename] [-setQuota <quota> <dirname>...<dirname>] [-clrQuota <dirname>...<dirname>] [-help [cmd]]>>>
+
+*-----------------+-----------------------------------------------------------+
+|| COMMAND_OPTION || Description
+*-----------------+-----------------------------------------------------------+
+| -report         | Reports basic filesystem information and statistics.
+*-----------------+-----------------------------------------------------------+
+| -safemode enter / leave / get / wait | Safe mode maintenance command. Safe
+                  | mode is a Namenode state in which it \
+                  | 1. does not accept changes to the name space (read-only) \
+                  | 2. does not replicate or delete blocks. \
+                  | Safe mode is entered automatically at Namenode startup, and
+                  | is left automatically when the configured minimum
+                  | percentage of blocks satisfies the minimum replication
+                  | condition. Safe mode can also be entered manually, but then
+                  | it can only be turned off manually as well.
+*-----------------+-----------------------------------------------------------+
+| -refreshNodes   | Re-read the hosts and exclude files to update the set of
+                  | Datanodes that are allowed to connect to the Namenode and
+                  | those that should be decommissioned or recommissioned.
+*-----------------+-----------------------------------------------------------+
+| -finalizeUpgrade| Finalize upgrade of HDFS. Datanodes delete their previous
+                  | version working directories, followed by Namenode doing the
+                  | same. This completes the upgrade process.
+*-----------------+-----------------------------------------------------------+
+| -upgradeProgress status / details / force | Request current distributed
+                  | upgrade status, a detailed status or force the upgrade to
+                  | proceed.
+*-----------------+-----------------------------------------------------------+
+| -metasave filename | Save Namenode's primary data structures to <filename> in
+                  | the directory specified by hadoop.log.dir property.
+                  | <filename> will contain one line for each of the following\
+                  | 1. Datanodes heart beating with Namenode\
+                  | 2. Blocks waiting to be replicated\
+                  | 3. Blocks currently being replicated\
+                  | 4. Blocks waiting to be deleted\
+*-----------------+-----------------------------------------------------------+
+| -setQuota <quota> <dirname>...<dirname> | Set the quota <quota> for each
+                  | directory <dirname>. The directory quota is a long integer
+                  | that puts a hard limit on the number of names in the
+                  | directory tree.  Best effort for the directory, with faults
+                  | reported if \
+                  | 1. N is not a positive integer, or \
+                  | 2. user is not an administrator, or \
+                  | 3. the directory does not exist or is a file, or \
+                  | 4. the directory would immediately exceed the new quota. \
+*-----------------+-----------------------------------------------------------+
+| -clrQuota <dirname>...<dirname> | Clear the quota for each directory
+                  | <dirname>.  Best effort for the directory, with fault
+                  | reported if \
+                  | 1. the directory does not exist or is a file, or \
+                  | 2. user is not an administrator.  It does not fault if the
+                  | directory has no quota.
+*-----------------+-----------------------------------------------------------+
+| -help [cmd]     | Displays help for the given command or all commands if none
+                  | is specified.
+*-----------------+-----------------------------------------------------------+
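+
+   For example (the directory name and quota value are illustrative):
+
++---+
+# print basic filesystem statistics
+hadoop dfsadmin -report
+# limit a directory tree to at most 100 names
+hadoop dfsadmin -setQuota 100 /user/hadoop/quota-dir
++---+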
+
+* <<<mradmin>>>
+
+   Runs the MR admin client.
+
+   Usage: <<<hadoop mradmin [ GENERIC_OPTIONS ] [-refreshQueueAcls]>>>
+
+*-------------------+-----------------------------------------------------------+
+|| COMMAND_OPTION   || Description
+*-------------------+-----------------------------------------------------------+
+| -refreshQueueAcls | Refresh the queue ACLs used by Hadoop, to check access
+                    | during submissions and administration of the job by the
+                    | user. The properties present in mapred-queue-acls.xml are
+                    | reloaded by the queue manager.
+*-------------------+-----------------------------------------------------------+
+
+* <<<jobtracker>>>
+
+   Runs the MapReduce JobTracker node.
+
+   Usage: <<<hadoop jobtracker [-dumpConfiguration]>>>
+
+*--------------------+-----------------------------------------------------------+
+|| COMMAND_OPTION    || Description
+*--------------------+-----------------------------------------------------------+
+| -dumpConfiguration | Dumps the configuration used by the JobTracker, along
+                     | with the queue configuration, in JSON format to standard
+                     | output, and then exits.
+*--------------------+-----------------------------------------------------------+
+
+* <<<namenode>>>
+
+   Runs the namenode. More info about the upgrade, rollback and finalize is
+   at Upgrade Rollback.
+
+   Usage: <<<hadoop namenode [-format] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint]>>>
+
+*--------------------+-----------------------------------------------------------+
+|| COMMAND_OPTION    || Description
+*--------------------+-----------------------------------------------------------+
+| -format            | Formats the namenode. It starts the namenode, formats
+                     | it and then shuts it down.
+*--------------------+-----------------------------------------------------------+
+| -upgrade           | Namenode should be started with the upgrade option after
+                     | the distribution of a new Hadoop version.
+*--------------------+-----------------------------------------------------------+
+| -rollback          | Rolls back the namenode to the previous version. This
+                     | should be used after stopping the cluster and
+                     | distributing the old hadoop version.
+*--------------------+-----------------------------------------------------------+
+| -finalize          | Finalize will remove the previous state of the file
+                     | system. The recent upgrade will become permanent and the
+                     | rollback option will no longer be available. After
+                     | finalization it shuts the namenode down.
+*--------------------+-----------------------------------------------------------+
+| -importCheckpoint  | Loads image from a checkpoint directory and saves it
+                     | into the current one. The checkpoint directory is read
+                     | from the property fs.checkpoint.dir
+*--------------------+-----------------------------------------------------------+
+
+* <<<secondarynamenode>>>
+
+   Runs the HDFS secondary namenode. See Secondary Namenode for more
+   info.
+
+   Usage: <<<hadoop secondarynamenode [-checkpoint [force]] | [-geteditsize]>>>
+
+*----------------------+-----------------------------------------------------------+
+|| COMMAND_OPTION      || Description
+*----------------------+-----------------------------------------------------------+
+| -checkpoint [-force] | Checkpoints the Secondary namenode if EditLog size
+                       | >= fs.checkpoint.size. If <<<-force>>> is used, a
+                       | checkpoint is performed irrespective of EditLog size.
+*----------------------+-----------------------------------------------------------+
+| -geteditsize         | Prints the EditLog size.
+*----------------------+-----------------------------------------------------------+
+
+* <<<tasktracker>>>
+
+   Runs a MapReduce TaskTracker node.
+
+   Usage: <<<hadoop tasktracker>>>

Added: hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/FileSystemShell.apt.vm
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/FileSystemShell.apt.vm?rev=1424459&view=auto
==============================================================================
--- hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/FileSystemShell.apt.vm (added)
+++ hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/FileSystemShell.apt.vm Thu Dec 20 13:41:43 2012
@@ -0,0 +1,418 @@
+~~ Licensed to the Apache Software Foundation (ASF) under one or more
+~~ contributor license agreements.  See the NOTICE file distributed with
+~~ this work for additional information regarding copyright ownership.
+~~ The ASF licenses this file to You under the Apache License, Version 2.0
+~~ (the "License"); you may not use this file except in compliance with
+~~ the License.  You may obtain a copy of the License at
+~~
+~~     http://www.apache.org/licenses/LICENSE-2.0
+~~
+~~ Unless required by applicable law or agreed to in writing, software
+~~ distributed under the License is distributed on an "AS IS" BASIS,
+~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+~~ See the License for the specific language governing permissions and
+~~ limitations under the License.
+
+  ---
+  File System Shell Guide
+  ---
+  ---
+  ${maven.build.timestamp}
+
+%{toc}
+
+Overview
+
+   The File System (FS) shell includes various shell-like commands that
+   directly interact with the Hadoop Distributed File System (HDFS) as well as
+   other file systems that Hadoop supports, such as Local FS, HFTP FS, S3 FS,
+   and others. The FS shell is invoked by:
+
++---
+bin/hadoop fs <args>
++---
+
+   All FS shell commands take path URIs as arguments. The URI format is
+   <<<scheme://authority/path>>>. For HDFS the scheme is <<<hdfs>>>, and for
+   the Local FS the scheme is <<<file>>>. The scheme and authority are
+   optional. If not specified, the default scheme specified in the
+   configuration is used. An HDFS file or directory such as /parent/child can
+   be specified as <<<hdfs://namenodehost/parent/child>>> or simply as
+   <<</parent/child>>> (given that your configuration is set to point to
+   <<<hdfs://namenodehost>>>).
+
+   Most of the commands in FS shell behave like corresponding Unix commands.
+   Differences are described with each of the commands. Error information is
+   sent to stderr and the output is sent to stdout.
+
+cat
+
+   Usage: <<<hdfs dfs -cat URI [URI ...]>>>
+
+   Copies source paths to stdout.
+
+   Example:
+
+     * <<<hdfs dfs -cat hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2>>>
+
+     * <<<hdfs dfs -cat file:///file3 /user/hadoop/file4>>>
+
+   Exit Code:
+
+   Returns 0 on success and -1 on error.
+
+chgrp
+
+   Usage: <<<hdfs dfs -chgrp [-R] GROUP URI [URI ...]>>>
+
+   Change group association of files. With -R, make the change recursively
+   through the directory structure. The user must be the owner of files, or
+   else a super-user. Additional information is in the
+   {{{betterurl}Permissions Guide}}.
+
+chmod
+
+   Usage: <<<hdfs dfs -chmod [-R] <MODE[,MODE]... | OCTALMODE> URI [URI ...]>>>
+
+   Change the permissions of files. With -R, make the change recursively
+   through the directory structure. The user must be the owner of the file, or
+   else a super-user. Additional information is in the 
+   {{{betterurl}Permissions Guide}}.
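+
+   Example (the mode and path are illustrative):
+
++---+
+# recursively set rwxr-xr-x on a directory tree
+hdfs dfs -chmod -R 755 /user/hadoop/dir1
++---+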
+
+chown
+
+   Usage: <<<hdfs dfs -chown [-R] [OWNER][:[GROUP]] URI [URI ]>>>
+
+   Change the owner of files. With -R, make the change recursively through the
+   directory structure. The user must be a super-user. Additional information
+   is in the {{{betterurl}Permissions Guide}}.
+
+copyFromLocal
+
+   Usage: <<<hdfs dfs -copyFromLocal <localsrc> URI>>>
+
+   Similar to the put command, except that the source is restricted to a local
+   file reference.
+
+copyToLocal
+
+   Usage: <<<hdfs dfs -copyToLocal [-ignorecrc] [-crc] URI <localdst> >>>
+
+   Similar to the get command, except that the destination is restricted to a
+   local file reference.
+
+count
+
+   Usage: <<<hdfs dfs -count [-q] <paths> >>>
+
+   Count the number of directories, files and bytes under the paths that match
+   the specified file pattern.  The output columns with -count are: DIR_COUNT,
+   FILE_COUNT, CONTENT_SIZE, FILE_NAME
+
+   The output columns with -count -q are: QUOTA, REMAINING_QUOTA, SPACE_QUOTA,
+   REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, FILE_NAME
+
+   Example:
+
+     * <<<hdfs dfs -count hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2>>>
+
+     * <<<hdfs dfs -count -q hdfs://nn1.example.com/file1>>>
+
+   Exit Code:
+
+   Returns 0 on success and -1 on error.
+
+cp
+
+   Usage: <<<hdfs dfs -cp URI [URI ...] <dest> >>>
+
+   Copy files from source to destination. This command allows multiple sources
+   as well in which case the destination must be a directory.
+
+   Example:
+
+     * <<<hdfs dfs -cp /user/hadoop/file1 /user/hadoop/file2>>>
+
+     * <<<hdfs dfs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir>>>
+
+   Exit Code:
+
+   Returns 0 on success and -1 on error.
+
+du
+
+   Usage: <<<hdfs dfs -du [-s] [-h] URI [URI ...]>>>
+
+   Displays sizes of files and directories contained in the given directory or
+   the length of a file in case it's just a file.
+
+   Options:
+
+     * The -s option will result in an aggregate summary of file lengths being
+       displayed, rather than the individual files.
+
+     * The -h option will format file sizes in a "human-readable" fashion
+       (e.g. 64.0m instead of 67108864).
+
+   Example:
+
+     * <<<hdfs dfs -du /user/hadoop/dir1 /user/hadoop/file1 hdfs://nn.example.com/user/hadoop/dir1>>>
+
+   Exit Code:
+   Returns 0 on success and -1 on error.
+
+dus
+
+   Usage: <<<hdfs dfs -dus <args> >>>
+
+   Displays a summary of file lengths. This is an alternate form of <<<hdfs dfs -du -s>>>.
+
+expunge
+
+   Usage: <<<hdfs dfs -expunge>>>
+
+   Empty the Trash. Refer to the {{{betterurl}HDFS Architecture Guide}} for
+   more information on the Trash feature.
+
+get
+
+   Usage: <<<hdfs dfs -get [-ignorecrc] [-crc] <src> <localdst> >>>
+
+   Copy files to the local file system. Files that fail the CRC check may be
+   copied with the -ignorecrc option. Files and CRCs may be copied using the
+   -crc option.
+
+   Example:
+
+     * <<<hdfs dfs -get /user/hadoop/file localfile>>>
+
+     * <<<hdfs dfs -get hdfs://nn.example.com/user/hadoop/file localfile>>>
+
+   Exit Code:
+
+   Returns 0 on success and -1 on error.
+
+getmerge
+
+   Usage: <<<hdfs dfs -getmerge <src> <localdst> [addnl]>>>
+
+   Takes a source directory and a destination file as input and concatenates
+   files in src into the destination local file. Optionally addnl can be set
+   to enable adding a newline character at the end of each file.
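+
+   Example (the paths are illustrative):
+
++---+
+# concatenate every file under /user/hadoop/output-dir into one local file
+hdfs dfs -getmerge /user/hadoop/output-dir localmerged.txt
++---+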
+
+ls
+
+   Usage: <<<hdfs dfs -ls <args> >>>
+
+   For a file returns stat on the file with the following format:
+
++---+
+permissions number_of_replicas userid groupid filesize modification_date modification_time filename
++---+
+
+   For a directory it returns the list of its direct children as in Unix.
+   A directory is listed as:
+
++---+
+permissions userid groupid modification_date modification_time dirname
++---+
+
+   Example:
+
+     * <<<hdfs dfs -ls /user/hadoop/file1>>>
+
+   Exit Code:
+
+   Returns 0 on success and -1 on error.
+
+lsr
+
+   Usage: <<<hdfs dfs -lsr <args> >>>
+
+   Recursive version of ls. Similar to Unix ls -R.
+
+mkdir
+
+   Usage: <<<hdfs dfs -mkdir [-p] <paths> >>>
+
+   Takes path URIs as arguments and creates directories.  With -p the behavior
+   is much like Unix mkdir -p, creating parent directories along the path.
+
+   Example:
+
+     * <<<hdfs dfs -mkdir /user/hadoop/dir1 /user/hadoop/dir2>>>
+
+     * <<<hdfs dfs -mkdir hdfs://nn1.example.com/user/hadoop/dir hdfs://nn2.example.com/user/hadoop/dir>>>
+
+   Exit Code:
+
+   Returns 0 on success and -1 on error.
+
+moveFromLocal
+
+   Usage: <<<hdfs dfs -moveFromLocal <localsrc> <dst> >>>
+
+   Similar to the put command, except that the source localsrc is deleted after
+   it's copied.
+
+moveToLocal
+
+   Usage: <<<hdfs dfs -moveToLocal [-crc] <src> <dst> >>>
+
+   Displays a "Not implemented yet" message.
+
+mv
+
+   Usage: <<<hdfs dfs -mv URI [URI ...] <dest> >>>
+
+   Moves files from source to destination. This command allows multiple sources
+   as well in which case the destination needs to be a directory. Moving files
+   across file systems is not permitted.
+
+   Example:
+
+     * <<<hdfs dfs -mv /user/hadoop/file1 /user/hadoop/file2>>>
+
+     * <<<hdfs dfs -mv hdfs://nn.example.com/file1 hdfs://nn.example.com/file2 hdfs://nn.example.com/file3 hdfs://nn.example.com/dir1>>>
+
+   Exit Code:
+
+   Returns 0 on success and -1 on error.
+
+put
+
+   Usage: <<<hdfs dfs -put <localsrc> ... <dst> >>>
+
+   Copy single src, or multiple srcs from local file system to the destination
+   file system. Also reads input from stdin and writes to destination file
+   system.
+
+     * <<<hdfs dfs -put localfile /user/hadoop/hadoopfile>>>
+
+     * <<<hdfs dfs -put localfile1 localfile2 /user/hadoop/hadoopdir>>>
+
+     * <<<hdfs dfs -put localfile hdfs://nn.example.com/hadoop/hadoopfile>>>
+
+     * <<<hdfs dfs -put - hdfs://nn.example.com/hadoop/hadoopfile>>>
+       Reads the input from stdin.
+
+   Exit Code:
+
+   Returns 0 on success and -1 on error.
+
+rm
+
+   Usage: <<<hdfs dfs -rm [-skipTrash] URI [URI ...]>>>
+
+   Delete files specified as args. Only deletes non-empty directories and files.
+   If the -skipTrash option is specified, the trash, if enabled, will be
+   bypassed and the specified file(s) deleted immediately. This can be useful
+   when it is necessary to delete files from an over-quota directory. Refer to
+   rmr for recursive deletes.
+
+   Example:
+
+     * <<<hdfs dfs -rm hdfs://nn.example.com/file /user/hadoop/emptydir>>>
+
+   Exit Code:
+
+   Returns 0 on success and -1 on error.
+
+rmr
+
+   Usage: <<<hdfs dfs -rmr [-skipTrash] URI [URI ...]>>>
+
+   Recursive version of delete. If the -skipTrash option is specified, the
+   trash, if enabled, will be bypassed and the specified file(s) deleted
+   immediately. This can be useful when it is necessary to delete files from an
+   over-quota directory.
+
+   Example:
+
+     * <<<hdfs dfs -rmr /user/hadoop/dir>>>
+
+     * <<<hdfs dfs -rmr hdfs://nn.example.com/user/hadoop/dir>>>
+
+   Exit Code:
+
+   Returns 0 on success and -1 on error.
+
+setrep
+
+   Usage: <<<hdfs dfs -setrep [-R] [-w] <rep> <path> >>>
+
+   Changes the replication factor of a file. The -R option is for recursively
+   increasing the replication factor of files within a directory.
+
+   Example:
+
+     * <<<hdfs dfs -setrep -w 3 -R /user/hadoop/dir1>>>
+
+   Exit Code:
+
+   Returns 0 on success and -1 on error.
+
+stat
+
+   Usage: <<<hdfs dfs -stat URI [URI ...]>>>
+
+   Returns the stat information on the path.
+
+   Example:
+
+     * <<<hdfs dfs -stat path>>>
+
+   Exit Code:
+   Returns 0 on success and -1 on error.
+
+tail
+
+   Usage: <<<hdfs dfs -tail [-f] URI>>>
+
+   Displays the last kilobyte of the file to stdout. The -f option can be used
+   as in Unix.
+
+   Example:
+
+     * <<<hdfs dfs -tail pathname>>>
+
+   Exit Code:
+   Returns 0 on success and -1 on error.
+
+test
+
+   Usage: <<<hdfs dfs -test -[ezd] URI>>>
+
+   Options:
+
+*----+------------+
+| -e | check to see if the file exists. Return 0 if true.
+*----+------------+
+| -z | check to see if the file is zero length. Return 0 if true.
+*----+------------+
+| -d | check to see if the path is a directory. Return 0 if true.
+*----+------------+
+
+   Example:
+
+     * <<<hdfs dfs -test -e filename>>>
+
+text
+
+   Usage: <<<hdfs dfs -text <src> >>>
+
+   Takes a source file and outputs the file in text format. The allowed formats
+   are zip and TextRecordInputStream.
+
+touchz
+
+   Usage: <<<hdfs dfs -touchz URI [URI ...]>>>
+
+   Create a file of zero length.
+
+   Example:
+
+     * <<<hdfs dfs -touchz pathname>>>
+
+   Exit Code:
+   Returns 0 on success and -1 on error.

Added: hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/HttpAuthentication.apt.vm
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/HttpAuthentication.apt.vm?rev=1424459&view=auto
==============================================================================
--- hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/HttpAuthentication.apt.vm (added)
+++ hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/HttpAuthentication.apt.vm Thu Dec 20 13:41:43 2012
@@ -0,0 +1,99 @@
+~~ Licensed under the Apache License, Version 2.0 (the "License");
+~~ you may not use this file except in compliance with the License.
+~~ You may obtain a copy of the License at
+~~
+~~   http://www.apache.org/licenses/LICENSE-2.0
+~~
+~~ Unless required by applicable law or agreed to in writing, software
+~~ distributed under the License is distributed on an "AS IS" BASIS,
+~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+~~ See the License for the specific language governing permissions and
+~~ limitations under the License. See accompanying LICENSE file.
+
+  ---
+  Authentication for Hadoop HTTP web-consoles
+  ---
+  ---
+  ${maven.build.timestamp}
+
+Authentication for Hadoop HTTP web-consoles
+
+%{toc|section=1|fromDepth=0}
+
+* Introduction
+
+   This document describes how to configure Hadoop HTTP web-consoles to
+   require user authentication.
+
+   By default Hadoop HTTP web-consoles (JobTracker, NameNode, TaskTrackers
+   and DataNodes) allow access without any form of authentication.
+
+   Similarly to Hadoop RPC, Hadoop HTTP web-consoles can be configured to
+   require Kerberos authentication using the HTTP SPNEGO protocol (supported
+   by browsers like Firefox and Internet Explorer).
+
+   In addition, Hadoop HTTP web-consoles support the equivalent of
+   Hadoop's Pseudo/Simple authentication. If this option is enabled, users
+   must specify their user name in the first browser interaction using the
+   user.name query string parameter. For example:
+   <<<http://localhost:50030/jobtracker.jsp?user.name=babu>>>.
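+
+   As a sketch, the consoles can also be exercised from the command line with
+   curl (the host, port and user name are illustrative; the second command
+   assumes a curl build with SPNEGO/GSS support and a valid Kerberos ticket):
+
++---+
+# pseudo/simple authentication: pass the user name as a query parameter
+curl "http://localhost:50030/jobtracker.jsp?user.name=babu"
+# kerberos SPNEGO authentication
+curl --negotiate -u : "http://localhost:50030/jobtracker.jsp"
++---+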
+
+   If a custom authentication mechanism is required for the HTTP
+   web-consoles, it is possible to implement a plugin to support the
+   alternate authentication mechanism (refer to Hadoop hadoop-auth for details
+   on writing an <<<AuthenticatorHandler>>>).
+
+   The next section describes how to configure Hadoop HTTP web-consoles to
+   require user authentication.
+
+* Configuration
+
+   The following properties should be in the <<<core-site.xml>>> of all the
+   nodes in the cluster.
+
+   <<<hadoop.http.filter.initializers>>>: add to this property the
+   <<<org.apache.hadoop.security.AuthenticationFilterInitializer>>> initializer
+   class.
+
+   <<<hadoop.http.authentication.type>>>: Defines authentication used for the
+   HTTP web-consoles. The supported values are: <<<simple>>> | <<<kerberos>>> |
+   <<<#AUTHENTICATION_HANDLER_CLASSNAME#>>>. The default value is <<<simple>>>.
+
+   <<<hadoop.http.authentication.token.validity>>>: Indicates how long (in
+   seconds) an authentication token is valid before it has to be renewed.
+   The default value is <<<36000>>>.
+
+   <<<hadoop.http.authentication.signature.secret.file>>>: The signature secret
+   file for signing the authentication tokens. If not set, a random secret is
+   generated at startup time. The same secret should be used for all nodes
+   in the cluster, JobTracker, NameNode, DataNode and TaskTracker. The
+   default value is <<<${user.home}/hadoop-http-auth-signature-secret>>>.
+   IMPORTANT: This file should be readable only by the Unix user running the
+   daemons.
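+
+   One possible way (a sketch, not required by Hadoop) to create such a
+   secret file with suitably restrictive permissions is:
+
++---+
+# generate a random secret readable only by the daemon user;
+# the same file must then be copied to every node in the cluster
+dd if=/dev/urandom bs=64 count=1 2>/dev/null | base64 > $HOME/hadoop-http-auth-signature-secret
+chmod 600 $HOME/hadoop-http-auth-signature-secret
++---+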
+
+   <<<hadoop.http.authentication.cookie.domain>>>: The domain to use for the
+   HTTP cookie that stores the authentication token. In order for
+   authentication to work correctly across all nodes in the cluster, the
+   domain must be correctly set. There is no default value; in that case the
+   HTTP cookie will not have a domain and will work only with the hostname
+   issuing the HTTP cookie.
+
+   IMPORTANT: when using IP addresses, browsers ignore cookies with domain
+   settings. For this setting to work properly all nodes in the cluster
+   must be configured to generate URLs with <<<hostname.domain>>> names in them.
+
+   <<<hadoop.http.authentication.simple.anonymous.allowed>>>: Indicates if
+   anonymous requests are allowed when using 'simple' authentication. The
+   default value is <<<true>>>.
+
+   <<<hadoop.http.authentication.kerberos.principal>>>: Indicates the Kerberos
+   principal to be used for HTTP endpoint when using 'kerberos'
+   authentication. The principal short name must be <<<HTTP>>> per Kerberos HTTP
+   SPNEGO specification. The default value is <<<HTTP/_HOST@$LOCALHOST>>>,
+   where <<<_HOST>>> -if present- is replaced with bind address of the HTTP
+   server.
+
+   <<<hadoop.http.authentication.kerberos.keytab>>>: Location of the keytab file
+   with the credentials for the Kerberos principal used for the HTTP
+   endpoint. The default value is <<<${user.home}/hadoop.keytab>>>.
+