Posted to common-commits@hadoop.apache.org by cm...@apache.org on 2015/02/25 01:03:50 UTC

[3/6] hadoop git commit: HADOOP-11495. Backport "convert site documentation from apt to markdown" to branch-2 (Masatake Iwasaki via Colin P. McCabe)

http://git-wip-us.apache.org/repos/asf/hadoop/blob/343cffb0/hadoop-common-project/hadoop-common/src/site/markdown/Compatibility.md
----------------------------------------------------------------------
diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/Compatibility.md b/hadoop-common-project/hadoop-common/src/site/markdown/Compatibility.md
new file mode 100644
index 0000000..c058021
--- /dev/null
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/Compatibility.md
@@ -0,0 +1,313 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+Apache Hadoop Compatibility
+===========================
+
+* [Apache Hadoop Compatibility](#Apache_Hadoop_Compatibility)
+    * [Purpose](#Purpose)
+    * [Compatibility types](#Compatibility_types)
+        * [Java API](#Java_API)
+            * [Use Cases](#Use_Cases)
+            * [Policy](#Policy)
+        * [Semantic compatibility](#Semantic_compatibility)
+            * [Policy](#Policy)
+        * [Wire compatibility](#Wire_compatibility)
+            * [Use Cases](#Use_Cases)
+            * [Policy](#Policy)
+        * [Java Binary compatibility for end-user applications i.e. Apache Hadoop ABI](#Java_Binary_compatibility_for_end-user_applications_i.e._Apache_Hadoop_ABI)
+            * [Use cases](#Use_cases)
+            * [Policy](#Policy)
+        * [REST APIs](#REST_APIs)
+            * [Policy](#Policy)
+        * [Metrics/JMX](#MetricsJMX)
+            * [Policy](#Policy)
+        * [File formats & Metadata](#File_formats__Metadata)
+            * [User-level file formats](#User-level_file_formats)
+                * [Policy](#Policy)
+            * [System-internal file formats](#System-internal_file_formats)
+                * [MapReduce](#MapReduce)
+                * [Policy](#Policy)
+                * [HDFS Metadata](#HDFS_Metadata)
+                * [Policy](#Policy)
+        * [Command Line Interface (CLI)](#Command_Line_Interface_CLI)
+            * [Policy](#Policy)
+        * [Web UI](#Web_UI)
+            * [Policy](#Policy)
+        * [Hadoop Configuration Files](#Hadoop_Configuration_Files)
+            * [Policy](#Policy)
+        * [Directory Structure](#Directory_Structure)
+            * [Policy](#Policy)
+        * [Java Classpath](#Java_Classpath)
+            * [Policy](#Policy)
+        * [Environment variables](#Environment_variables)
+            * [Policy](#Policy)
+        * [Build artifacts](#Build_artifacts)
+            * [Policy](#Policy)
+        * [Hardware/Software Requirements](#HardwareSoftware_Requirements)
+            * [Policies](#Policies)
+    * [References](#References)
+
+Purpose
+-------
+
+This document captures the compatibility goals of the Apache Hadoop project. The different types of compatibility between Hadoop releases that affect Hadoop developers, downstream projects, and end-users are enumerated. For each type of compatibility we:
+
+* describe the impact on downstream projects or end-users
+* where applicable, call out the policy adopted by the Hadoop developers when incompatible changes are permitted.
+
+Compatibility types
+-------------------
+
+### Java API
+
+Hadoop interfaces and classes are annotated to describe the intended audience and stability in order to maintain compatibility with previous releases. See [Hadoop Interface Classification](./InterfaceClassification.html) for details.
+
+* InterfaceAudience: captures the intended audience; possible values are Public (for end users and external projects), LimitedPrivate (for other Hadoop components, and closely related projects like YARN, MapReduce, HBase etc.), and Private (for intra-component use).
+* InterfaceStability: describes what types of interface changes are permitted. Possible values are Stable, Evolving, Unstable, and Deprecated.
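+
+As an illustrative sketch, an annotated class might look like the following. The annotation types come from the hadoop-annotations module; `MyPublicUtility` is a hypothetical class used only as an example, not an actual Hadoop API.
+
+    import org.apache.hadoop.classification.InterfaceAudience;
+    import org.apache.hadoop.classification.InterfaceStability;
+
+    // Hypothetical class shown only to illustrate the annotations.
+    @InterfaceAudience.Public
+    @InterfaceStability.Stable
+    public class MyPublicUtility {
+      // Members without their own annotations inherit the
+      // annotations of the enclosing class.
+      public void doSomething() {
+      }
+    }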
+
+#### Use Cases
+
+* Public-Stable API compatibility is required to ensure end-user programs and downstream projects continue to work without modification.
+* LimitedPrivate-Stable API compatibility is required to allow upgrade of individual components across minor releases.
+* Private-Stable API compatibility is required for rolling upgrades.
+
+#### Policy
+
+* Public-Stable APIs must be deprecated for at least one major release prior to their removal in a major release.
+* LimitedPrivate-Stable APIs can change across major releases, but not within a major release.
+* Private-Stable APIs can change across major releases, but not within a major release.
+* Classes not annotated are implicitly "Private". Class members not annotated inherit the annotations of the enclosing class.
+* Note: APIs generated from the proto files need to be compatible for rolling-upgrades. See the section on wire-compatibility for more details. The compatibility policies for APIs and wire-communication need to go hand-in-hand to address this.
+
+### Semantic compatibility
+
+Apache Hadoop strives to ensure that the behavior of APIs remains consistent over versions, though changes for correctness may result in changes in behavior. Tests and javadocs specify the API's behavior. The community is in the process of specifying some APIs more rigorously, and enhancing test suites to verify compliance with the specification, effectively creating a formal specification for the subset of behaviors that can be easily tested.
+
+#### Policy
+
+The behavior of an API may be changed to fix incorrect behavior; such a change must be accompanied by updating existing buggy tests or by adding tests in cases where there were none prior to the change.
+
+### Wire compatibility
+
+Wire compatibility concerns data being transmitted over the wire between Hadoop processes. Hadoop uses Protocol Buffers for most RPC communication. Preserving compatibility requires prohibiting modification as described below. Non-RPC communication should be considered as well, for example using HTTP to transfer an HDFS image as part of snapshotting or transferring MapTask output. The potential communications can be categorized as follows:
+
+* Client-Server: communication between Hadoop clients and servers (e.g., the HDFS client to NameNode protocol, or the YARN client to ResourceManager protocol).
+* Client-Server (Admin): It is worth distinguishing a subset of the Client-Server protocols used solely by administrative commands (e.g., the HAAdmin protocol) as these protocols only impact administrators, who can tolerate changes that end users (who use the general Client-Server protocols) cannot.
+* Server-Server: communication between servers (e.g., the protocol between the DataNode and NameNode, or NodeManager and ResourceManager)
+
+#### Use Cases
+
+* Client-Server compatibility is required to allow users to continue using the old clients even after upgrading the server (cluster) to a later version (or vice versa). For example, a Hadoop 2.1.0 client talking to a Hadoop 2.3.0 cluster.
+* Client-Server compatibility is also required to allow users to upgrade the client before upgrading the server (cluster). For example, a Hadoop 2.4.0 client talking to a Hadoop 2.3.0 cluster. This allows deployment of client-side bug fixes ahead of full cluster upgrades. Note that new cluster features invoked by new client APIs or shell commands will not be usable. YARN applications that attempt to use new APIs (including new fields in data structures) that have not yet been deployed to the cluster can expect link exceptions.
+* Client-Server compatibility is also required to allow upgrading individual components without upgrading others. For example, upgrade HDFS from version 2.1.0 to 2.2.0 without upgrading MapReduce.
+* Server-Server compatibility is required to allow mixed versions within an active cluster so the cluster may be upgraded without downtime in a rolling fashion.
+
+#### Policy
+
+* Both Client-Server and Server-Server compatibility is preserved within a major release. (Different policies for different categories are yet to be considered.)
+* Compatibility can be broken only at a major release, though breaking compatibility even at major releases has grave consequences and should be discussed in the Hadoop community.
+* Hadoop protocols are defined in .proto (ProtocolBuffers) files. Client-Server protocol and Server-Server protocol .proto files are marked as stable. When a .proto file is marked as stable it means that changes should be made in a compatible fashion as described below (a short sketch of handling a missing optional field follows these lists):
+    * The following changes are compatible and are allowed at any time:
+        * Add an optional field, with the expectation that the code deals with the field missing due to communication with an older version of the code.
+        * Add a new rpc/method to the service
+        * Add a new optional request to a Message
+        * Rename a field
+        * Rename a .proto file
+        * Change .proto annotations that affect code generation (e.g. name of java package)
+    * The following changes are incompatible but can be considered only at a major release
+        * Change the rpc/method name
+        * Change the rpc/method parameter type or return type
+        * Remove an rpc/method
+        * Change the service name
+        * Change the name of a Message
+        * Modify a field type in an incompatible way (as defined recursively)
+        * Change an optional field to required
+        * Add or delete a required field
+        * Delete an optional field as long as the optional field has reasonable defaults to allow deletions
+    * The following changes are incompatible and hence never allowed
+        * Change a field id
+        * Reuse an old field that was previously deleted.
+        * Field numbers are cheap; changing or reusing them is not a good idea.
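+
+As a rough sketch of the optional-field rule above, Java code that consumes a generated message should tolerate the field being absent when talking to an older peer. `FooRequestProto` and its `timeout` field are hypothetical names, not actual Hadoop protocol messages; `DEFAULT_TIMEOUT` and `process(...)` are placeholders.
+
+    // Hypothetical generated protobuf message with an optional "timeout"
+    // field that was added in a newer release.
+    void handle(FooRequestProto request) {
+      long timeout = DEFAULT_TIMEOUT;
+      if (request.hasTimeout()) {    // false when the sender is an older client
+        timeout = request.getTimeout();
+      }
+      process(request, timeout);
+    }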
+
+### Java Binary compatibility for end-user applications i.e. Apache Hadoop ABI
+
+As Apache Hadoop revisions are upgraded, end-users reasonably expect that their applications should continue to work without any modifications. This is fulfilled as a result of supporting API compatibility, Semantic compatibility and Wire compatibility.
+
+However, Apache Hadoop is a very complex, distributed system and services a very wide variety of use-cases. In particular, Apache Hadoop MapReduce exposes a very wide API, in the sense that end-users may make wide-ranging assumptions such as the layout of the local disk when their map/reduce tasks are executing, environment variables for their tasks, etc. In such cases, it becomes very hard to fully specify, and support, absolute compatibility.
+
+#### Use cases
+
+* Existing MapReduce applications, including jars of existing packaged end-user applications and projects such as Apache Pig, Apache Hive, Cascading etc. should work unmodified when pointed to an upgraded Apache Hadoop cluster within a major release.
+* Existing YARN applications, including jars of existing packaged end-user applications and projects such as Apache Tez etc. should work unmodified when pointed to an upgraded Apache Hadoop cluster within a major release.
+* Existing applications which transfer data in/out of HDFS, including jars of existing packaged end-user applications and frameworks such as Apache Flume, should work unmodified when pointed to an upgraded Apache Hadoop cluster within a major release.
+
+#### Policy
+
+* Existing MapReduce, YARN & HDFS applications and frameworks should work unmodified within a major release, i.e. the Apache Hadoop ABI is supported.
+* A very minor fraction of applications may be affected by changes to disk layouts etc.; the developer community will strive to minimize these changes and will not make them within a minor version. In more egregious cases, we will strongly consider reverting these breaking changes and invalidating offending releases if necessary.
+* In particular for MapReduce applications, the developer community will try its best to provide binary compatibility across major releases, e.g. for applications using org.apache.hadoop.mapred.
+* APIs are supported compatibly across hadoop-1.x and hadoop-2.x. See [Compatibility for MapReduce applications between hadoop-1.x and hadoop-2.x](../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html) for more details.
+
+### REST APIs
+
+REST API compatibility corresponds to both the request (URLs) and responses to each request (content, which may contain other URLs). Hadoop REST APIs are specifically meant for stable use by clients across releases, even major releases. The following are the exposed REST APIs:
+
+* [WebHDFS](../hadoop-hdfs/WebHDFS.html) - Stable
+* [ResourceManager](../../hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html)
+* [NodeManager](../../hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html)
+* [MR Application Master](../../hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html)
+* [History Server](../../hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html)
+
+#### Policy
+
+The APIs annotated stable in the text above preserve compatibility across at least one major release, and may be deprecated by a newer version of the REST API in a major release.
+
+### Metrics/JMX
+
+While the Metrics API compatibility is governed by Java API compatibility, the actual metrics exposed by Hadoop need to be compatible for users to be able to automate using them (scripts etc.). Adding additional metrics is compatible. Modifying (e.g., changing the unit of measurement) or removing existing metrics breaks compatibility. Similarly, changes to JMX MBean object names also break compatibility.
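+
+As a hedged illustration of why this matters, monitoring code often hard-codes MBean object names and attribute names; the specific name below follows the usual `Hadoop:service=...,name=...` pattern but is an assumption used for illustration, not a documented contract.
+
+    import java.lang.management.ManagementFactory;
+    import javax.management.MBeanServer;
+    import javax.management.ObjectName;
+
+    public class MetricReader {
+      public static void main(String[] args) throws Exception {
+        // In-process lookup; remote monitoring would go through a JMXConnector.
+        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
+        // A script that hard-codes an object name and attribute like this
+        // breaks if either is renamed.
+        ObjectName name = new ObjectName("Hadoop:service=NameNode,name=FSNamesystem");
+        Object capacity = server.getAttribute(name, "CapacityTotal");
+        System.out.println("CapacityTotal = " + capacity);
+      }
+    }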
+
+#### Policy
+
+Metrics should preserve compatibility within the major release.
+
+### File formats & Metadata
+
+User and system level data (including metadata) is stored in files of different formats. Changes to the metadata or the file formats used to store data/metadata can lead to incompatibilities between versions.
+
+#### User-level file formats
+
+Changes to formats that end-users use to store their data can prevent them from accessing the data in later releases, and hence it is highly important to keep those file-formats compatible. One can always add a "new" format improving upon an existing format. Examples of these formats include har, war, SequenceFileFormat etc.
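+
+As a minimal sketch of what relying on a user-level format looks like, the snippet below writes and re-reads a SequenceFile through its public API; an application written this way expects later Hadoop releases to still read the file it produced. The path is illustrative.
+
+    import org.apache.hadoop.conf.Configuration;
+    import org.apache.hadoop.fs.Path;
+    import org.apache.hadoop.io.IntWritable;
+    import org.apache.hadoop.io.SequenceFile;
+    import org.apache.hadoop.io.Text;
+
+    public class SequenceFileExample {
+      public static void main(String[] args) throws Exception {
+        Configuration conf = new Configuration();
+        Path path = new Path("/tmp/example.seq");   // illustrative path
+
+        // Write a couple of records in the SequenceFile format.
+        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
+            SequenceFile.Writer.file(path),
+            SequenceFile.Writer.keyClass(IntWritable.class),
+            SequenceFile.Writer.valueClass(Text.class))) {
+          writer.append(new IntWritable(1), new Text("one"));
+          writer.append(new IntWritable(2), new Text("two"));
+        }
+
+        // A later release is expected to still be able to read this file.
+        try (SequenceFile.Reader reader =
+            new SequenceFile.Reader(conf, SequenceFile.Reader.file(path))) {
+          IntWritable key = new IntWritable();
+          Text value = new Text();
+          while (reader.next(key, value)) {
+            System.out.println(key + "\t" + value);
+          }
+        }
+      }
+    }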
+
+##### Policy
+
+* Non-forward-compatible user-file format changes are restricted to major releases. When user-file formats change, new releases are expected to read existing formats, but may write data in formats incompatible with prior releases. Also, the community shall prefer to create a new format that programs must opt in to instead of making incompatible changes to existing formats.
+
+#### System-internal file formats
+
+Hadoop internal data is also stored in files and again changing these formats can lead to incompatibilities. While such changes are not as devastating as the user-level file formats, a policy on when the compatibility can be broken is important.
+
+##### MapReduce
+
+MapReduce uses formats like IFile to store MapReduce-specific data.
+
+##### Policy
+
+MapReduce-internal formats like IFile maintain compatibility within a major release. Changes to these formats can cause in-flight jobs to fail and hence we should ensure newer clients can fetch shuffle-data from old servers in a compatible manner.
+
+##### HDFS Metadata
+
+HDFS persists metadata (the image and edit logs) in a particular format. Incompatible changes to either the format or the metadata prevent subsequent releases from reading older metadata. Such incompatible changes might require an HDFS "upgrade" to convert the metadata to make it accessible. Some changes can require more than one such "upgrade".
+
+Depending on the degree of incompatibility in the changes, the following potential scenarios can arise:
+
+* Automatic: The image upgrades automatically, no need for an explicit "upgrade".
+* Direct: The image is upgradable, but might require one explicit release "upgrade".
+* Indirect: The image is upgradable, but might require upgrading to intermediate release(s) first.
+* Not upgradeable: The image is not upgradeable.
+
+##### Policy
+
+* A release upgrade must allow a cluster to roll back to the older version and its older disk format. The rollback needs to restore the original data, but is not required to restore the updated data.
+* HDFS metadata changes must be upgradeable via any of the upgrade paths - automatic, direct or indirect.
+* More detailed policies based on the kind of upgrade are yet to be considered.
+
+### Command Line Interface (CLI)
+
+The Hadoop command line programs may be used either directly via the system shell or via shell scripts. Changing the path of a command, removing or renaming command line options, changing the order of arguments, or changing the command return code or output breaks compatibility and may adversely affect users.
+
+#### Policy
+
+CLI commands are to be deprecated (warning when used) for one major release before they are removed or incompatibly modified in a subsequent major release.
+
+### Web UI
+
+Changes to the Web UI, particularly to the content and layout of web pages, could potentially interfere with attempts to screen-scrape the web pages for information.
+
+#### Policy
+
+Web pages are not meant to be scraped and hence incompatible changes to them are allowed at any time. Users are expected to use REST APIs to get any information.
+
+### Hadoop Configuration Files
+
+Users use (1) Hadoop-defined properties to configure and provide hints to Hadoop and (2) custom properties to pass information to jobs. Hence, compatibility of config properties is two-fold:
+
+* Modifying key-names, units of values, and default values of Hadoop-defined properties.
+* Custom configuration property keys should not conflict with the namespace of Hadoop-defined properties. Typically, users should avoid using prefixes used by Hadoop: hadoop, io, ipc, fs, net, file, ftp, s3, kfs, ha, dfs, mapred, mapreduce, yarn.
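+
+A minimal sketch of the two kinds of properties using the standard Configuration API; the `com.example.*` key is a hypothetical application-specific property chosen to stay outside the Hadoop-reserved prefixes.
+
+    import org.apache.hadoop.conf.Configuration;
+
+    public class ConfigExample {
+      public static void main(String[] args) {
+        Configuration conf = new Configuration();
+
+        // (1) A Hadoop-defined property; its key name, units and default
+        //     are covered by the policy below.
+        String defaultFs = conf.get("fs.defaultFS");
+
+        // (2) A custom property passed to a job; the hypothetical
+        //     "com.example." prefix avoids the Hadoop-reserved namespaces.
+        conf.set("com.example.myapp.input.validation", "strict");
+
+        System.out.println("fs.defaultFS = " + defaultFs);
+      }
+    }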
+
+#### Policy
+
+* Hadoop-defined properties are to be deprecated for at least one major release before being removed. Modifying units for existing properties is not allowed.
+* The default values of Hadoop-defined properties can be changed across minor/major releases, but will remain the same across point releases within a minor release.
+* Currently, there is NO explicit policy regarding when new prefixes can be added/removed, or regarding the list of prefixes to be avoided for custom configuration properties. However, as noted above, users should avoid using prefixes used by Hadoop: hadoop, io, ipc, fs, net, file, ftp, s3, kfs, ha, dfs, mapred, mapreduce, yarn.
+
+### Directory Structure
+
+Source code, artifacts (source and tests), user logs, configuration files, output, and job history are all stored on disk, either on the local file system or in HDFS. Changing the directory structure of these user-accessible files breaks compatibility, even in cases where the original path is preserved via symbolic links (for example, if the path is accessed by a servlet that is configured not to follow symbolic links).
+
+#### Policy
+
+* The layout of source code and build artifacts can change anytime, particularly so across major versions. Within a major version, the developers will attempt (no guarantees) to preserve the directory structure; however, individual files can be added/moved/deleted. The best way to ensure patches stay in sync with the code is to get them committed to the Apache source tree.
+* The directory structure of configuration files, user logs, and job history will be preserved across minor and point releases within a major release.
+
+### Java Classpath
+
+User applications built against Hadoop might add all Hadoop jars (including Hadoop's library dependencies) to the application's classpath. Adding new dependencies or updating the version of existing dependencies may interfere with those in applications' classpaths.
+
+#### Policy
+
+Currently, there is NO policy on when Hadoop's dependencies can change.
+
+### Environment variables
+
+Users and related projects often utilize the exported environment variables (e.g., HADOOP\_CONF\_DIR), therefore removing or renaming environment variables is an incompatible change.
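+
+A tiny hedged sketch of why this matters: launcher code and scripts often read these variables by name, so renaming one breaks callers like the hypothetical snippet below.
+
+    public class EnvCheck {
+      public static void main(String[] args) {
+        // Hypothetical launcher check that depends on the exported variable name.
+        String confDir = System.getenv("HADOOP_CONF_DIR");
+        if (confDir == null) {
+          throw new IllegalStateException("HADOOP_CONF_DIR is not set");
+        }
+        System.out.println("Using configuration from " + confDir);
+      }
+    }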
+
+#### Policy
+
+Currently, there is NO policy on when the environment variables can change. Developers try to limit changes to major releases.
+
+### Build artifacts
+
+Hadoop uses Maven for project management, and changing the artifacts can affect existing user workflows.
+
+#### Policy
+
+* Test artifacts: The test jars generated are strictly for internal use and are not expected to be used outside of Hadoop, similar to APIs annotated @Private, @Unstable.
+* Built artifacts: The hadoop-client artifact (maven groupId:artifactId) stays compatible within a major release, while the other artifacts can change in incompatible ways.
+
+### Hardware/Software Requirements
+
+To keep up with the latest advances in hardware, operating systems, JVMs, and other software, new Hadoop releases or some of their features might require higher versions of the same. For a specific environment, upgrading Hadoop might require upgrading other dependent software components.
+
+#### Policies
+
+* Hardware
+    * Architecture: The community has no plans to restrict Hadoop to specific architectures, but can have family-specific optimizations.
+    * Minimum resources: While there are no guarantees on the minimum resources required by Hadoop daemons, the community attempts to not increase requirements within a minor release.
+* Operating Systems: The community will attempt to maintain the same OS requirements (OS kernel versions) within a minor release. Currently GNU/Linux and Microsoft Windows are the OSes officially supported by the community while Apache Hadoop is known to work reasonably well on other OSes such as Apple MacOSX, Solaris etc.
+* The JVM requirements will not change across point releases within the same minor release except if the JVM version in question becomes unsupported. Minor/major releases might require later versions of JVM for some/all of the supported operating systems.
+* Other software: The community tries to maintain the minimum versions of additional software required by Hadoop. For example, ssh, kerberos etc.
+
+References
+----------
+
+Here are some relevant JIRAs and pages related to the topic:
+
+* The evolution of this document - [HADOOP-9517](https://issues.apache.org/jira/browse/HADOOP-9517)
+* Binary compatibility for MapReduce end-user applications between hadoop-1.x and hadoop-2.x - [MapReduce Compatibility between hadoop-1.x and hadoop-2.x](../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html)
+* Annotations for interfaces as per interface classification schedule - [HADOOP-7391](https://issues.apache.org/jira/browse/HADOOP-7391) [Hadoop Interface Classification](./InterfaceClassification.html)
+* Compatibility for Hadoop 1.x releases - [HADOOP-5071](https://issues.apache.org/jira/browse/HADOOP-5071)
+* The [Hadoop Roadmap](http://wiki.apache.org/hadoop/Roadmap) page that captures other release policies
+
+

http://git-wip-us.apache.org/repos/asf/hadoop/blob/343cffb0/hadoop-common-project/hadoop-common/src/site/markdown/DeprecatedProperties.md
----------------------------------------------------------------------
diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/DeprecatedProperties.md b/hadoop-common-project/hadoop-common/src/site/markdown/DeprecatedProperties.md
new file mode 100644
index 0000000..dae9928
--- /dev/null
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/DeprecatedProperties.md
@@ -0,0 +1,288 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+Deprecated Properties
+=====================
+
+The following table lists the configuration property names that are deprecated in this version of Hadoop, and their replacements.
+
+| **Deprecated property name** | **New property name** |
+|:---- |:---- |
+| create.empty.dir.if.nonexist | mapreduce.jobcontrol.createdir.ifnotexist |
+| dfs.access.time.precision | dfs.namenode.accesstime.precision |
+| dfs.backup.address | dfs.namenode.backup.address |
+| dfs.backup.http.address | dfs.namenode.backup.http-address |
+| dfs.balance.bandwidthPerSec | dfs.datanode.balance.bandwidthPerSec |
+| dfs.block.size | dfs.blocksize |
+| dfs.data.dir | dfs.datanode.data.dir |
+| dfs.datanode.max.xcievers | dfs.datanode.max.transfer.threads |
+| dfs.df.interval | fs.df.interval |
+| dfs.federation.nameservice.id | dfs.nameservice.id |
+| dfs.federation.nameservices | dfs.nameservices |
+| dfs.http.address | dfs.namenode.http-address |
+| dfs.https.address | dfs.namenode.https-address |
+| dfs.https.client.keystore.resource | dfs.client.https.keystore.resource |
+| dfs.https.need.client.auth | dfs.client.https.need-auth |
+| dfs.max.objects | dfs.namenode.max.objects |
+| dfs.max-repl-streams | dfs.namenode.replication.max-streams |
+| dfs.name.dir | dfs.namenode.name.dir |
+| dfs.name.dir.restore | dfs.namenode.name.dir.restore |
+| dfs.name.edits.dir | dfs.namenode.edits.dir |
+| dfs.permissions | dfs.permissions.enabled |
+| dfs.permissions.supergroup | dfs.permissions.superusergroup |
+| dfs.read.prefetch.size | dfs.client.read.prefetch.size |
+| dfs.replication.considerLoad | dfs.namenode.replication.considerLoad |
+| dfs.replication.interval | dfs.namenode.replication.interval |
+| dfs.replication.min | dfs.namenode.replication.min |
+| dfs.replication.pending.timeout.sec | dfs.namenode.replication.pending.timeout-sec |
+| dfs.safemode.extension | dfs.namenode.safemode.extension |
+| dfs.safemode.threshold.pct | dfs.namenode.safemode.threshold-pct |
+| dfs.secondary.http.address | dfs.namenode.secondary.http-address |
+| dfs.socket.timeout | dfs.client.socket-timeout |
+| dfs.umaskmode | fs.permissions.umask-mode |
+| dfs.write.packet.size | dfs.client-write-packet-size |
+| fs.checkpoint.dir | dfs.namenode.checkpoint.dir |
+| fs.checkpoint.edits.dir | dfs.namenode.checkpoint.edits.dir |
+| fs.checkpoint.period | dfs.namenode.checkpoint.period |
+| fs.default.name | fs.defaultFS |
+| hadoop.configured.node.mapping | net.topology.configured.node.mapping |
+| hadoop.job.history.location | mapreduce.jobtracker.jobhistory.location |
+| hadoop.native.lib | io.native.lib.available |
+| hadoop.net.static.resolutions | mapreduce.tasktracker.net.static.resolutions |
+| hadoop.pipes.command-file.keep | mapreduce.pipes.commandfile.preserve |
+| hadoop.pipes.executable.interpretor | mapreduce.pipes.executable.interpretor |
+| hadoop.pipes.executable | mapreduce.pipes.executable |
+| hadoop.pipes.java.mapper | mapreduce.pipes.isjavamapper |
+| hadoop.pipes.java.recordreader | mapreduce.pipes.isjavarecordreader |
+| hadoop.pipes.java.recordwriter | mapreduce.pipes.isjavarecordwriter |
+| hadoop.pipes.java.reducer | mapreduce.pipes.isjavareducer |
+| hadoop.pipes.partitioner | mapreduce.pipes.partitioner |
+| heartbeat.recheck.interval | dfs.namenode.heartbeat.recheck-interval |
+| io.bytes.per.checksum | dfs.bytes-per-checksum |
+| io.sort.factor | mapreduce.task.io.sort.factor |
+| io.sort.mb | mapreduce.task.io.sort.mb |
+| io.sort.spill.percent | mapreduce.map.sort.spill.percent |
+| jobclient.completion.poll.interval | mapreduce.client.completion.pollinterval |
+| jobclient.output.filter | mapreduce.client.output.filter |
+| jobclient.progress.monitor.poll.interval | mapreduce.client.progressmonitor.pollinterval |
+| job.end.notification.url | mapreduce.job.end-notification.url |
+| job.end.retry.attempts | mapreduce.job.end-notification.retry.attempts |
+| job.end.retry.interval | mapreduce.job.end-notification.retry.interval |
+| job.local.dir | mapreduce.job.local.dir |
+| keep.failed.task.files | mapreduce.task.files.preserve.failedtasks |
+| keep.task.files.pattern | mapreduce.task.files.preserve.filepattern |
+| key.value.separator.in.input.line | mapreduce.input.keyvaluelinerecordreader.key.value.separator |
+| local.cache.size | mapreduce.tasktracker.cache.local.size |
+| map.input.file | mapreduce.map.input.file |
+| map.input.length | mapreduce.map.input.length |
+| map.input.start | mapreduce.map.input.start |
+| map.output.key.field.separator | mapreduce.map.output.key.field.separator |
+| map.output.key.value.fields.spec | mapreduce.fieldsel.map.output.key.value.fields.spec |
+| mapred.acls.enabled | mapreduce.cluster.acls.enabled |
+| mapred.binary.partitioner.left.offset | mapreduce.partition.binarypartitioner.left.offset |
+| mapred.binary.partitioner.right.offset | mapreduce.partition.binarypartitioner.right.offset |
+| mapred.cache.archives | mapreduce.job.cache.archives |
+| mapred.cache.archives.timestamps | mapreduce.job.cache.archives.timestamps |
+| mapred.cache.files | mapreduce.job.cache.files |
+| mapred.cache.files.timestamps | mapreduce.job.cache.files.timestamps |
+| mapred.cache.localArchives | mapreduce.job.cache.local.archives |
+| mapred.cache.localFiles | mapreduce.job.cache.local.files |
+| mapred.child.tmp | mapreduce.task.tmp.dir |
+| mapred.cluster.average.blacklist.threshold | mapreduce.jobtracker.blacklist.average.threshold |
+| mapred.cluster.map.memory.mb | mapreduce.cluster.mapmemory.mb |
+| mapred.cluster.max.map.memory.mb | mapreduce.jobtracker.maxmapmemory.mb |
+| mapred.cluster.max.reduce.memory.mb | mapreduce.jobtracker.maxreducememory.mb |
+| mapred.cluster.reduce.memory.mb | mapreduce.cluster.reducememory.mb |
+| mapred.committer.job.setup.cleanup.needed | mapreduce.job.committer.setup.cleanup.needed |
+| mapred.compress.map.output | mapreduce.map.output.compress |
+| mapred.data.field.separator | mapreduce.fieldsel.data.field.separator |
+| mapred.debug.out.lines | mapreduce.task.debugout.lines |
+| mapred.healthChecker.interval | mapreduce.tasktracker.healthchecker.interval |
+| mapred.healthChecker.script.args | mapreduce.tasktracker.healthchecker.script.args |
+| mapred.healthChecker.script.path | mapreduce.tasktracker.healthchecker.script.path |
+| mapred.healthChecker.script.timeout | mapreduce.tasktracker.healthchecker.script.timeout |
+| mapred.heartbeats.in.second | mapreduce.jobtracker.heartbeats.in.second |
+| mapred.hosts.exclude | mapreduce.jobtracker.hosts.exclude.filename |
+| mapred.hosts | mapreduce.jobtracker.hosts.filename |
+| mapred.inmem.merge.threshold | mapreduce.reduce.merge.inmem.threshold |
+| mapred.input.dir.formats | mapreduce.input.multipleinputs.dir.formats |
+| mapred.input.dir.mappers | mapreduce.input.multipleinputs.dir.mappers |
+| mapred.input.dir | mapreduce.input.fileinputformat.inputdir |
+| mapred.input.pathFilter.class | mapreduce.input.pathFilter.class |
+| mapred.jar | mapreduce.job.jar |
+| mapred.job.classpath.archives | mapreduce.job.classpath.archives |
+| mapred.job.classpath.files | mapreduce.job.classpath.files |
+| mapred.job.id | mapreduce.job.id |
+| mapred.jobinit.threads | mapreduce.jobtracker.jobinit.threads |
+| mapred.job.map.memory.mb | mapreduce.map.memory.mb |
+| mapred.job.name | mapreduce.job.name |
+| mapred.job.priority | mapreduce.job.priority |
+| mapred.job.queue.name | mapreduce.job.queuename |
+| mapred.job.reduce.input.buffer.percent | mapreduce.reduce.input.buffer.percent |
+| mapred.job.reduce.markreset.buffer.percent | mapreduce.reduce.markreset.buffer.percent |
+| mapred.job.reduce.memory.mb | mapreduce.reduce.memory.mb |
+| mapred.job.reduce.total.mem.bytes | mapreduce.reduce.memory.totalbytes |
+| mapred.job.reuse.jvm.num.tasks | mapreduce.job.jvm.numtasks |
+| mapred.job.shuffle.input.buffer.percent | mapreduce.reduce.shuffle.input.buffer.percent |
+| mapred.job.shuffle.merge.percent | mapreduce.reduce.shuffle.merge.percent |
+| mapred.job.tracker.handler.count | mapreduce.jobtracker.handler.count |
+| mapred.job.tracker.history.completed.location | mapreduce.jobtracker.jobhistory.completed.location |
+| mapred.job.tracker.http.address | mapreduce.jobtracker.http.address |
+| mapred.jobtracker.instrumentation | mapreduce.jobtracker.instrumentation |
+| mapred.jobtracker.job.history.block.size | mapreduce.jobtracker.jobhistory.block.size |
+| mapred.job.tracker.jobhistory.lru.cache.size | mapreduce.jobtracker.jobhistory.lru.cache.size |
+| mapred.job.tracker | mapreduce.jobtracker.address |
+| mapred.jobtracker.maxtasks.per.job | mapreduce.jobtracker.maxtasks.perjob |
+| mapred.job.tracker.persist.jobstatus.active | mapreduce.jobtracker.persist.jobstatus.active |
+| mapred.job.tracker.persist.jobstatus.dir | mapreduce.jobtracker.persist.jobstatus.dir |
+| mapred.job.tracker.persist.jobstatus.hours | mapreduce.jobtracker.persist.jobstatus.hours |
+| mapred.jobtracker.restart.recover | mapreduce.jobtracker.restart.recover |
+| mapred.job.tracker.retiredjobs.cache.size | mapreduce.jobtracker.retiredjobs.cache.size |
+| mapred.job.tracker.retire.jobs | mapreduce.jobtracker.retirejobs |
+| mapred.jobtracker.taskalloc.capacitypad | mapreduce.jobtracker.taskscheduler.taskalloc.capacitypad |
+| mapred.jobtracker.taskScheduler | mapreduce.jobtracker.taskscheduler |
+| mapred.jobtracker.taskScheduler.maxRunningTasksPerJob | mapreduce.jobtracker.taskscheduler.maxrunningtasks.perjob |
+| mapred.join.expr | mapreduce.join.expr |
+| mapred.join.keycomparator | mapreduce.join.keycomparator |
+| mapred.lazy.output.format | mapreduce.output.lazyoutputformat.outputformat |
+| mapred.line.input.format.linespermap | mapreduce.input.lineinputformat.linespermap |
+| mapred.linerecordreader.maxlength | mapreduce.input.linerecordreader.line.maxlength |
+| mapred.local.dir | mapreduce.cluster.local.dir |
+| mapred.local.dir.minspacekill | mapreduce.tasktracker.local.dir.minspacekill |
+| mapred.local.dir.minspacestart | mapreduce.tasktracker.local.dir.minspacestart |
+| mapred.map.child.env | mapreduce.map.env |
+| mapred.map.child.java.opts | mapreduce.map.java.opts |
+| mapred.map.child.log.level | mapreduce.map.log.level |
+| mapred.map.max.attempts | mapreduce.map.maxattempts |
+| mapred.map.output.compression.codec | mapreduce.map.output.compress.codec |
+| mapred.mapoutput.key.class | mapreduce.map.output.key.class |
+| mapred.mapoutput.value.class | mapreduce.map.output.value.class |
+| mapred.mapper.regex.group | mapreduce.mapper.regexmapper..group |
+| mapred.mapper.regex | mapreduce.mapper.regex |
+| mapred.map.task.debug.script | mapreduce.map.debug.script |
+| mapred.map.tasks | mapreduce.job.maps |
+| mapred.map.tasks.speculative.execution | mapreduce.map.speculative |
+| mapred.max.map.failures.percent | mapreduce.map.failures.maxpercent |
+| mapred.max.reduce.failures.percent | mapreduce.reduce.failures.maxpercent |
+| mapred.max.split.size | mapreduce.input.fileinputformat.split.maxsize |
+| mapred.max.tracker.blacklists | mapreduce.jobtracker.tasktracker.maxblacklists |
+| mapred.max.tracker.failures | mapreduce.job.maxtaskfailures.per.tracker |
+| mapred.merge.recordsBeforeProgress | mapreduce.task.merge.progress.records |
+| mapred.min.split.size | mapreduce.input.fileinputformat.split.minsize |
+| mapred.min.split.size.per.node | mapreduce.input.fileinputformat.split.minsize.per.node |
+| mapred.min.split.size.per.rack | mapreduce.input.fileinputformat.split.minsize.per.rack |
+| mapred.output.compression.codec | mapreduce.output.fileoutputformat.compress.codec |
+| mapred.output.compression.type | mapreduce.output.fileoutputformat.compress.type |
+| mapred.output.compress | mapreduce.output.fileoutputformat.compress |
+| mapred.output.dir | mapreduce.output.fileoutputformat.outputdir |
+| mapred.output.key.class | mapreduce.job.output.key.class |
+| mapred.output.key.comparator.class | mapreduce.job.output.key.comparator.class |
+| mapred.output.value.class | mapreduce.job.output.value.class |
+| mapred.output.value.groupfn.class | mapreduce.job.output.group.comparator.class |
+| mapred.permissions.supergroup | mapreduce.cluster.permissions.supergroup |
+| mapred.pipes.user.inputformat | mapreduce.pipes.inputformat |
+| mapred.reduce.child.env | mapreduce.reduce.env |
+| mapred.reduce.child.java.opts | mapreduce.reduce.java.opts |
+| mapred.reduce.child.log.level | mapreduce.reduce.log.level |
+| mapred.reduce.max.attempts | mapreduce.reduce.maxattempts |
+| mapred.reduce.parallel.copies | mapreduce.reduce.shuffle.parallelcopies |
+| mapred.reduce.slowstart.completed.maps | mapreduce.job.reduce.slowstart.completedmaps |
+| mapred.reduce.task.debug.script | mapreduce.reduce.debug.script |
+| mapred.reduce.tasks | mapreduce.job.reduces |
+| mapred.reduce.tasks.speculative.execution | mapreduce.reduce.speculative |
+| mapred.seqbinary.output.key.class | mapreduce.output.seqbinaryoutputformat.key.class |
+| mapred.seqbinary.output.value.class | mapreduce.output.seqbinaryoutputformat.value.class |
+| mapred.shuffle.connect.timeout | mapreduce.reduce.shuffle.connect.timeout |
+| mapred.shuffle.read.timeout | mapreduce.reduce.shuffle.read.timeout |
+| mapred.skip.attempts.to.start.skipping | mapreduce.task.skip.start.attempts |
+| mapred.skip.map.auto.incr.proc.count | mapreduce.map.skip.proc-count.auto-incr |
+| mapred.skip.map.max.skip.records | mapreduce.map.skip.maxrecords |
+| mapred.skip.on | mapreduce.job.skiprecords |
+| mapred.skip.out.dir | mapreduce.job.skip.outdir |
+| mapred.skip.reduce.auto.incr.proc.count | mapreduce.reduce.skip.proc-count.auto-incr |
+| mapred.skip.reduce.max.skip.groups | mapreduce.reduce.skip.maxgroups |
+| mapred.speculative.execution.slowNodeThreshold | mapreduce.job.speculative.slownodethreshold |
+| mapred.speculative.execution.slowTaskThreshold | mapreduce.job.speculative.slowtaskthreshold |
+| mapred.speculative.execution.speculativeCap | mapreduce.job.speculative.speculativecap |
+| mapred.submit.replication | mapreduce.client.submit.file.replication |
+| mapred.system.dir | mapreduce.jobtracker.system.dir |
+| mapred.task.cache.levels | mapreduce.jobtracker.taskcache.levels |
+| mapred.task.id | mapreduce.task.attempt.id |
+| mapred.task.is.map | mapreduce.task.ismap |
+| mapred.task.partition | mapreduce.task.partition |
+| mapred.task.profile | mapreduce.task.profile |
+| mapred.task.profile.maps | mapreduce.task.profile.maps |
+| mapred.task.profile.params | mapreduce.task.profile.params |
+| mapred.task.profile.reduces | mapreduce.task.profile.reduces |
+| mapred.task.timeout | mapreduce.task.timeout |
+| mapred.tasktracker.dns.interface | mapreduce.tasktracker.dns.interface |
+| mapred.tasktracker.dns.nameserver | mapreduce.tasktracker.dns.nameserver |
+| mapred.tasktracker.events.batchsize | mapreduce.tasktracker.events.batchsize |
+| mapred.tasktracker.expiry.interval | mapreduce.jobtracker.expire.trackers.interval |
+| mapred.task.tracker.http.address | mapreduce.tasktracker.http.address |
+| mapred.tasktracker.indexcache.mb | mapreduce.tasktracker.indexcache.mb |
+| mapred.tasktracker.instrumentation | mapreduce.tasktracker.instrumentation |
+| mapred.tasktracker.map.tasks.maximum | mapreduce.tasktracker.map.tasks.maximum |
+| mapred.tasktracker.memory\_calculator\_plugin | mapreduce.tasktracker.resourcecalculatorplugin |
+| mapred.tasktracker.memorycalculatorplugin | mapreduce.tasktracker.resourcecalculatorplugin |
+| mapred.tasktracker.reduce.tasks.maximum | mapreduce.tasktracker.reduce.tasks.maximum |
+| mapred.task.tracker.report.address | mapreduce.tasktracker.report.address |
+| mapred.task.tracker.task-controller | mapreduce.tasktracker.taskcontroller |
+| mapred.tasktracker.taskmemorymanager.monitoring-interval | mapreduce.tasktracker.taskmemorymanager.monitoringinterval |
+| mapred.tasktracker.tasks.sleeptime-before-sigkill | mapreduce.tasktracker.tasks.sleeptimebeforesigkill |
+| mapred.temp.dir | mapreduce.cluster.temp.dir |
+| mapred.text.key.comparator.options | mapreduce.partition.keycomparator.options |
+| mapred.text.key.partitioner.options | mapreduce.partition.keypartitioner.options |
+| mapred.textoutputformat.separator | mapreduce.output.textoutputformat.separator |
+| mapred.tip.id | mapreduce.task.id |
+| mapreduce.combine.class | mapreduce.job.combine.class |
+| mapreduce.inputformat.class | mapreduce.job.inputformat.class |
+| mapreduce.job.counters.limit | mapreduce.job.counters.max |
+| mapreduce.jobtracker.permissions.supergroup | mapreduce.cluster.permissions.supergroup |
+| mapreduce.map.class | mapreduce.job.map.class |
+| mapreduce.outputformat.class | mapreduce.job.outputformat.class |
+| mapreduce.partitioner.class | mapreduce.job.partitioner.class |
+| mapreduce.reduce.class | mapreduce.job.reduce.class |
+| mapred.used.genericoptionsparser | mapreduce.client.genericoptionsparser.used |
+| mapred.userlog.limit.kb | mapreduce.task.userlog.limit.kb |
+| mapred.userlog.retain.hours | mapreduce.job.userlog.retain.hours |
+| mapred.working.dir | mapreduce.job.working.dir |
+| mapred.work.output.dir | mapreduce.task.output.dir |
+| min.num.spills.for.combine | mapreduce.map.combine.minspills |
+| reduce.output.key.value.fields.spec | mapreduce.fieldsel.reduce.output.key.value.fields.spec |
+| security.job.submission.protocol.acl | security.job.client.protocol.acl |
+| security.task.umbilical.protocol.acl | security.job.task.protocol.acl |
+| sequencefile.filter.class | mapreduce.input.sequencefileinputfilter.class |
+| sequencefile.filter.frequency | mapreduce.input.sequencefileinputfilter.frequency |
+| sequencefile.filter.regex | mapreduce.input.sequencefileinputfilter.regex |
+| session.id | dfs.metrics.session-id |
+| slave.host.name | dfs.datanode.hostname |
+| slave.host.name | mapreduce.tasktracker.host.name |
+| tasktracker.contention.tracking | mapreduce.tasktracker.contention.tracking |
+| tasktracker.http.threads | mapreduce.tasktracker.http.threads |
+| topology.node.switch.mapping.impl | net.topology.node.switch.mapping.impl |
+| topology.script.file.name | net.topology.script.file.name |
+| topology.script.number.args | net.topology.script.number.args |
+| user.name | mapreduce.job.user.name |
+| webinterface.private.actions | mapreduce.jobtracker.webinterface.trusted |
+| yarn.app.mapreduce.yarn.app.mapreduce.client-am.ipc.max-retries-on-timeouts | yarn.app.mapreduce.client-am.ipc.max-retries-on-timeouts |
+
+The following table lists additional changes to some configuration properties:
+
+| **Deprecated property name** | **New property name** |
+|:---- |:---- |
+| mapred.create.symlink | NONE - symlinking is always on |
+| mapreduce.job.cache.symlink.create | NONE - symlinking is always on |
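+
+As a hedged illustration of how these renames typically surface to code, Hadoop's Configuration class generally maps a deprecated key to its replacement at read time (logging a deprecation warning), so older configuration files keep working; the exact behavior can vary by property and release.
+
+    import org.apache.hadoop.conf.Configuration;
+
+    public class DeprecationExample {
+      public static void main(String[] args) {
+        Configuration conf = new Configuration();
+        // Setting the deprecated key...
+        conf.set("fs.default.name", "hdfs://namenodehost:8020");
+        // ...is expected to be visible under the replacement key as well.
+        System.out.println(conf.get("fs.defaultFS"));
+      }
+    }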
+
+

http://git-wip-us.apache.org/repos/asf/hadoop/blob/343cffb0/hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md
----------------------------------------------------------------------
diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md b/hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md
new file mode 100644
index 0000000..37f644d
--- /dev/null
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md
@@ -0,0 +1,710 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+* [Overview](#Overview)
+    * [appendToFile](#appendToFile)
+    * [cat](#cat)
+    * [checksum](#checksum)
+    * [chgrp](#chgrp)
+    * [chmod](#chmod)
+    * [chown](#chown)
+    * [copyFromLocal](#copyFromLocal)
+    * [copyToLocal](#copyToLocal)
+    * [count](#count)
+    * [cp](#cp)
+    * [createSnapshot](#createSnapshot)
+    * [deleteSnapshot](#deleteSnapshot)
+    * [df](#df)
+    * [du](#du)
+    * [dus](#dus)
+    * [expunge](#expunge)
+    * [find](#find)
+    * [get](#get)
+    * [getfacl](#getfacl)
+    * [getfattr](#getfattr)
+    * [getmerge](#getmerge)
+    * [help](#help)
+    * [ls](#ls)
+    * [lsr](#lsr)
+    * [mkdir](#mkdir)
+    * [moveFromLocal](#moveFromLocal)
+    * [moveToLocal](#moveToLocal)
+    * [mv](#mv)
+    * [put](#put)
+    * [renameSnapshot](#renameSnapshot)
+    * [rm](#rm)
+    * [rmdir](#rmdir)
+    * [rmr](#rmr)
+    * [setfacl](#setfacl)
+    * [setfattr](#setfattr)
+    * [setrep](#setrep)
+    * [stat](#stat)
+    * [tail](#tail)
+    * [test](#test)
+    * [text](#text)
+    * [touchz](#touchz)
+    * [truncate](#truncate)
+    * [usage](#usage)
+
+Overview
+========
+
+The File System (FS) shell includes various shell-like commands that directly interact with the Hadoop Distributed File System (HDFS) as well as other file systems that Hadoop supports, such as Local FS, HFTP FS, S3 FS, and others. The FS shell is invoked by:
+
+    bin/hadoop fs <args>
+
+All FS shell commands take path URIs as arguments. The URI format is `scheme://authority/path`. For HDFS the scheme is `hdfs`, and for the Local FS the scheme is `file`. The scheme and authority are optional. If not specified, the default scheme specified in the configuration is used. An HDFS file or directory such as /parent/child can be specified as `hdfs://namenodehost/parent/child` or simply as `/parent/child` (given that your configuration is set to point to `hdfs://namenodehost`).
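+
+The same resolution applies when addressing paths programmatically through the FileSystem API; a minimal sketch, assuming the configuration points at `hdfs://namenodehost`:
+
+    import org.apache.hadoop.conf.Configuration;
+    import org.apache.hadoop.fs.FileStatus;
+    import org.apache.hadoop.fs.FileSystem;
+    import org.apache.hadoop.fs.Path;
+
+    public class PathExample {
+      public static void main(String[] args) throws Exception {
+        Configuration conf = new Configuration();
+        // Resolves against the default file system from the configuration.
+        FileSystem fs = FileSystem.get(conf);
+
+        // "/parent/child" and "hdfs://namenodehost/parent/child" name the
+        // same directory when fs.defaultFS points at hdfs://namenodehost.
+        for (FileStatus status : fs.listStatus(new Path("/parent/child"))) {
+          System.out.println(status.getPath());
+        }
+      }
+    }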
+
+Most of the commands in FS shell behave like corresponding Unix commands. Differences are described with each of the commands. Error information is sent to stderr and the output is sent to stdout.
+
+If HDFS is being used, `hdfs dfs` is a synonym.
+
+See the [Commands Manual](./CommandsManual.html) for generic shell options.
+
+appendToFile
+------------
+
+Usage: `hadoop fs -appendToFile <localsrc> ... <dst> `
+
+Append single src, or multiple srcs from local file system to the destination file system. Also reads input from stdin and appends to destination file system.
+
+* `hadoop fs -appendToFile localfile /user/hadoop/hadoopfile`
+* `hadoop fs -appendToFile localfile1 localfile2 /user/hadoop/hadoopfile`
+* `hadoop fs -appendToFile localfile hdfs://nn.example.com/hadoop/hadoopfile`
+* `hadoop fs -appendToFile - hdfs://nn.example.com/hadoop/hadoopfile` Reads the input from stdin.
+
+Exit Code:
+
+Returns 0 on success and 1 on error.
+
+cat
+---
+
+Usage: `hadoop fs -cat URI [URI ...]`
+
+Copies source paths to stdout.
+
+Example:
+
+* `hadoop fs -cat hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2`
+* `hadoop fs -cat file:///file3 /user/hadoop/file4`
+
+Exit Code:
+
+Returns 0 on success and -1 on error.
+
+checksum
+--------
+
+Usage: `hadoop fs -checksum URI`
+
+Returns the checksum information of a file.
+
+Example:
+
+* `hadoop fs -checksum hdfs://nn1.example.com/file1`
+* `hadoop fs -checksum file:///etc/hosts`
+
+chgrp
+-----
+
+Usage: `hadoop fs -chgrp [-R] GROUP URI [URI ...]`
+
+Change group association of files. The user must be the owner of files, or else a super-user. Additional information is in the [Permissions Guide](../hadoop-hdfs/HdfsPermissionsGuide.html).
+
+Options
+
+* The -R option will make the change recursively through the directory structure.
+
+chmod
+-----
+
+Usage: `hadoop fs -chmod [-R] <MODE[,MODE]... | OCTALMODE> URI [URI ...]`
+
+Change the permissions of files. With -R, make the change recursively through the directory structure. The user must be the owner of the file, or else a super-user. Additional information is in the [Permissions Guide](../hadoop-hdfs/HdfsPermissionsGuide.html).
+
+Options
+
+* The -R option will make the change recursively through the directory structure.
+
+chown
+-----
+
+Usage: `hadoop fs -chown [-R] [OWNER][:[GROUP]] URI [URI ]`
+
+Change the owner of files. The user must be a super-user. Additional information is in the [Permissions Guide](../hadoop-hdfs/HdfsPermissionsGuide.html).
+
+Options
+
+* The -R option will make the change recursively through the directory structure.
+
+copyFromLocal
+-------------
+
+Usage: `hadoop fs -copyFromLocal <localsrc> URI`
+
+Similar to the put command, except that the source is restricted to a local file reference.
+
+Options:
+
+* The -f option will overwrite the destination if it already exists.
+
+copyToLocal
+-----------
+
+Usage: `hadoop fs -copyToLocal [-ignorecrc] [-crc] URI <localdst> `
+
+Similar to the get command, except that the destination is restricted to a local file reference.
+
+count
+-----
+
+Usage: `hadoop fs -count [-q] [-h] [-v] <paths> `
+
+Count the number of directories, files and bytes under the paths that match the specified file pattern. The output columns with -count are: DIR\_COUNT, FILE\_COUNT, CONTENT\_SIZE, PATHNAME
+
+The output columns with -count -q are: QUOTA, REMAINING\_QUOTA, SPACE\_QUOTA, REMAINING\_SPACE\_QUOTA, DIR\_COUNT, FILE\_COUNT, CONTENT\_SIZE, PATHNAME
+
+The -h option shows sizes in human readable format.
+
+The -v option displays a header line.
+
+Example:
+
+* `hadoop fs -count hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2`
+* `hadoop fs -count -q hdfs://nn1.example.com/file1`
+* `hadoop fs -count -q -h hdfs://nn1.example.com/file1`
+* `hdfs dfs -count -q -h -v hdfs://nn1.example.com/file1`
+
+Exit Code:
+
+Returns 0 on success and -1 on error.
+
+cp
+----
+
+Usage: `hadoop fs -cp [-f] [-p | -p[topax]] URI [URI ...] <dest> `
+
+Copy files from source to destination. This command allows multiple sources as well in which case the destination must be a directory.
+
+'raw.\*' namespace extended attributes are preserved if (1) the source and destination filesystems support them (HDFS only), and (2) all source and destination pathnames are in the /.reserved/raw hierarchy. Determination of whether raw.\* namespace xattrs are preserved is independent of the -p (preserve) flag.
+
+Options:
+
+* The -f option will overwrite the destination if it already exists.
+* The -p option will preserve file attributes [topax] (timestamps, ownership, permission, ACL, XAttr). If -p is specified with no *arg*, then it preserves timestamps, ownership and permission. If -pa is specified, then it preserves permission as well, because ACL is a super-set of permission. Determination of whether raw namespace extended attributes are preserved is independent of the -p flag.
+
+Example:
+
+* `hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2`
+* `hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir`
+
+Exit Code:
+
+Returns 0 on success and -1 on error.
+
+createSnapshot
+--------------
+
+See [HDFS Snapshots Guide](../hadoop-hdfs/HdfsSnapshots.html).
+
+deleteSnapshot
+--------------
+
+See [HDFS Snapshots Guide](../hadoop-hdfs/HdfsSnapshots.html).
+
+df
+----
+
+Usage: `hadoop fs -df [-h] URI [URI ...]`
+
+Displays free space.
+
+Options:
+
+* The -h option will format file sizes in a "human-readable" fashion (e.g. 64.0m instead of 67108864)
+
+Example:
+
+* `hadoop fs -df /user/hadoop/dir1`
+
+du
+----
+
+Usage: `hadoop fs -du [-s] [-h] URI [URI ...]`
+
+Displays sizes of files and directories contained in the given directory, or the length of a file in case it's just a file.
+
+Options:
+
+* The -s option will result in an aggregate summary of file lengths being displayed, rather than the individual files.
+* The -h option will format file sizes in a "human-readable" fashion (e.g. 64.0m instead of 67108864)
+
+Example:
+
+* `hadoop fs -du /user/hadoop/dir1 /user/hadoop/file1 hdfs://nn.example.com/user/hadoop/dir1`
+
+Exit Code: Returns 0 on success and -1 on error.
+
+dus
+---
+
+Usage: `hadoop fs -dus <args> `
+
+Displays a summary of file lengths.
+
+**Note:** This command is deprecated. Instead use `hadoop fs -du -s`.
+
+expunge
+-------
+
+Usage: `hadoop fs -expunge`
+
+Empty the Trash. Refer to the [HDFS Architecture Guide](../hadoop-hdfs/HdfsDesign.html) for more information on the Trash feature.
+
+find
+----
+
+Usage: `hadoop fs -find <path> ... <expression> ... `
+
+Finds all files that match the specified expression and applies selected actions to them. If no *path* is specified then defaults to the current working directory. If no expression is specified then defaults to -print.
+
+The following primary expressions are recognised:
+
+*   -name pattern<br />-iname pattern
+
+    Evaluates as true if the basename of the file matches the pattern using standard file system globbing. If -iname is used then the match is case insensitive.
+
+*   -print<br />-print0
+
+    Always evaluates to true. Causes the current pathname to be written to standard output. If the -print0 expression is used then an ASCII NULL character is appended.
+
+The following operators are recognised:
+
+* expression -a expression<br />expression -and expression<br />expression expression
+
+    Logical AND operator for joining two expressions. Returns true if both child expressions return true. Implied by the juxtaposition of two expressions and so does not need to be explicitly specified. The second expression will not be applied if the first fails.
+
+Example:
+
+`hadoop fs -find / -name test -print`
+
+Exit Code:
+
+Returns 0 on success and -1 on error.
+
+get
+---
+
+Usage: `hadoop fs -get [-ignorecrc] [-crc] <src> <localdst> `
+
+Copy files to the local file system. Files that fail the CRC check may be copied with the -ignorecrc option. Files and CRCs may be copied using the -crc option.
+
+Example:
+
+* `hadoop fs -get /user/hadoop/file localfile`
+* `hadoop fs -get hdfs://nn.example.com/user/hadoop/file localfile`
+
+Exit Code:
+
+Returns 0 on success and -1 on error.
+
+getfacl
+-------
+
+Usage: `hadoop fs -getfacl [-R] <path> `
+
+Displays the Access Control Lists (ACLs) of files and directories. If a directory has a default ACL, then getfacl also displays the default ACL.
+
+Options:
+
+* -R: List the ACLs of all files and directories recursively.
+* *path*: File or directory to list.
+
+Examples:
+
+* `hadoop fs -getfacl /file`
+* `hadoop fs -getfacl -R /dir`
+
+Exit Code:
+
+Returns 0 on success and non-zero on error.
+
+getfattr
+--------
+
+Usage: `hadoop fs -getfattr [-R] -n name | -d [-e en] <path> `
+
+Displays the extended attribute names and values (if any) for a file or directory.
+
+Options:
+
+* -R: Recursively list the attributes for all files and directories.
+* -n name: Dump the named extended attribute value.
+* -d: Dump all extended attribute values associated with pathname.
+* -e *encoding*: Encode values after retrieving them. Valid encodings are "text", "hex", and "base64". Values encoded as text strings are enclosed in double quotes ("), and values encoded as hexadecimal and base64 are prefixed with 0x and 0s, respectively.
+* *path*: The file or directory.
+
+Examples:
+
+* `hadoop fs -getfattr -d /file`
+* `hadoop fs -getfattr -R -n user.myAttr /dir`
+
+Exit Code:
+
+Returns 0 on success and non-zero on error.
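+
+Extended attributes can also be read programmatically; a hedged sketch using the FileSystem xattr API available in Hadoop 2.x, assuming `user.myAttr` has been set on `/file` (path and attribute name taken from the examples above).
+
+    import java.util.Map;
+
+    import org.apache.hadoop.conf.Configuration;
+    import org.apache.hadoop.fs.FileSystem;
+    import org.apache.hadoop.fs.Path;
+
+    public class XAttrExample {
+      public static void main(String[] args) throws Exception {
+        FileSystem fs = FileSystem.get(new Configuration());
+        Path file = new Path("/file");
+
+        // Roughly what "hadoop fs -getfattr -d /file" displays.
+        for (Map.Entry<String, byte[]> e : fs.getXAttrs(file).entrySet()) {
+          System.out.println(e.getKey() + "=" + new String(e.getValue(), "UTF-8"));
+        }
+
+        // Roughly what "-getfattr -n user.myAttr /file" returns.
+        byte[] value = fs.getXAttr(file, "user.myAttr");
+        System.out.println("user.myAttr=" + new String(value, "UTF-8"));
+      }
+    }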
+
+getmerge
+--------
+
+Usage: `hadoop fs -getmerge <src> <localdst> [addnl]`
+
+Takes a source directory and a destination file as input and concatenates files in src into the destination local file. Optionally addnl can be set to enable adding a newline character at the end of each file.
+
+help
+----
+
+Usage: `hadoop fs -help`
+
+Return usage output.
+
+ls
+----
+
+Usage: `hadoop fs -ls [-d] [-h] [-R] [-t] [-S] [-r] [-u] <args> `
+
+Options:
+
+* -d: Directories are listed as plain files.
+* -h: Format file sizes in a human-readable fashion (e.g. 64.0m instead of 67108864).
+* -R: Recursively list subdirectories encountered.
+* -t: Sort output by modification time (most recent first).
+* -S: Sort output by file size.
+* -r: Reverse the sort order.
+* -u: Use access time rather than modification time for display and sorting.  
+
+For a file ls returns stat on the file with the following format:
+
+    permissions number_of_replicas userid groupid filesize modification_date modification_time filename
+
+For a directory it returns the list of its direct children as in Unix. A directory is listed as:
+
+    permissions userid groupid modification_date modification_time dirname
+
+Files within a directory are ordered by filename by default.
+
+Example:
+
+* `hadoop fs -ls /user/hadoop/file1`
+
+Exit Code:
+
+Returns 0 on success and -1 on error.
+
+lsr
+---
+
+Usage: `hadoop fs -lsr <args> `
+
+Recursive version of ls.
+
+**Note:** This command is deprecated. Instead use `hadoop fs -ls -R`
+
+mkdir
+-----
+
+Usage: `hadoop fs -mkdir [-p] <paths> `
+
+Takes path URIs as argument and creates directories.
+
+Options:
+
+* The -p option behavior is much like Unix mkdir -p, creating parent directories along the path.
+
+Example:
+
+* `hadoop fs -mkdir /user/hadoop/dir1 /user/hadoop/dir2`
+* `hadoop fs -mkdir hdfs://nn1.example.com/user/hadoop/dir hdfs://nn2.example.com/user/hadoop/dir`
+
+Exit Code:
+
+Returns 0 on success and -1 on error.
+
+moveFromLocal
+-------------
+
+Usage: `hadoop fs -moveFromLocal <localsrc> <dst> `
+
+Similar to the put command, except that the source localsrc is deleted after it is copied.
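+
+Example (illustrative; the paths are hypothetical):
+
+* `hadoop fs -moveFromLocal localfile /user/hadoop/hadoopfile`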
+
+moveToLocal
+-----------
+
+Usage: `hadoop fs -moveToLocal [-crc] <src> <dst> `
+
+Displays a "Not implemented yet" message.
+
+mv
+----
+
+Usage: `hadoop fs -mv URI [URI ...] <dest> `
+
+Moves files from source to destination. This command allows multiple sources as well in which case the destination needs to be a directory. Moving files across file systems is not permitted.
+
+Example:
+
+* `hadoop fs -mv /user/hadoop/file1 /user/hadoop/file2`
+* `hadoop fs -mv hdfs://nn.example.com/file1 hdfs://nn.example.com/file2 hdfs://nn.example.com/file3 hdfs://nn.example.com/dir1`
+
+Exit Code:
+
+Returns 0 on success and -1 on error.
+
+put
+---
+
+Usage: `hadoop fs -put <localsrc> ... <dst> `
+
+Copy a single src, or multiple srcs, from the local file system to the destination file system. Also reads input from stdin and writes to the destination file system.
+
+* `hadoop fs -put localfile /user/hadoop/hadoopfile`
+* `hadoop fs -put localfile1 localfile2 /user/hadoop/hadoopdir`
+* `hadoop fs -put localfile hdfs://nn.example.com/hadoop/hadoopfile`
+* `hadoop fs -put - hdfs://nn.example.com/hadoop/hadoopfile` Reads the input from stdin.
+
+Exit Code:
+
+Returns 0 on success and -1 on error.
+
+renameSnapshot
+--------------
+
+See [HDFS Snapshots Guide](../hadoop-hdfs/HdfsSnapshots.html).
+
+rm
+----
+
+Usage: `hadoop fs -rm [-f] [-r |-R] [-skipTrash] URI [URI ...]`
+
+Delete files specified as args.
+
+Options:
+
+* The -f option will not display a diagnostic message or modify the exit status to reflect an error if the file does not exist.
+* The -R option deletes the directory and any content under it recursively.
+* The -r option is equivalent to -R.
+* The -skipTrash option will bypass trash, if enabled, and delete the specified file(s) immediately. This can be useful when it is necessary to delete files from an over-quota directory.
+
+Example:
+
+* `hadoop fs -rm hdfs://nn.example.com/file /user/hadoop/emptydir`
+
+Exit Code:
+
+Returns 0 on success and -1 on error.
+
+rmdir
+-----
+
+Usage: `hadoop fs -rmdir [--ignore-fail-on-non-empty] URI [URI ...]`
+
+Delete a directory.
+
+Options:
+
+* `--ignore-fail-on-non-empty`: When using wildcards, do not fail if a directory still contains files.
+
+Example:
+
+* `hadoop fs -rmdir /user/hadoop/emptydir`
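+
+As a further illustration of `--ignore-fail-on-non-empty` (the wildcard path is hypothetical):
+
+* `hadoop fs -rmdir --ignore-fail-on-non-empty /user/hadoop/dir*`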
+
+rmr
+---
+
+Usage: `hadoop fs -rmr [-skipTrash] URI [URI ...]`
+
+Recursive version of delete.
+
+**Note:** This command is deprecated. Instead use `hadoop fs -rm -r`
+
+setfacl
+-------
+
+Usage: `hadoop fs -setfacl [-R] [-b |-k -m |-x <acl_spec> <path>] |[--set <acl_spec> <path>] `
+
+Sets Access Control Lists (ACLs) of files and directories.
+
+Options:
+
+* -b: Remove all but the base ACL entries. The entries for user, group and others are retained for compatibility with permission bits.
+* -k: Remove the default ACL.
+* -R: Apply operations to all files and directories recursively.
+* -m: Modify ACL. New entries are added to the ACL, and existing entries are retained.
+* -x: Remove specified ACL entries. Other ACL entries are retained.
+* ``--set``: Fully replace the ACL, discarding all existing entries. The *acl\_spec* must include entries for user, group, and others for compatibility with permission bits.
+* *acl\_spec*: Comma separated list of ACL entries.
+* *path*: File or directory to modify.
+
+Examples:
+
+* `hadoop fs -setfacl -m user:hadoop:rw- /file`
+* `hadoop fs -setfacl -x user:hadoop /file`
+* `hadoop fs -setfacl -b /file`
+* `hadoop fs -setfacl -k /dir`
+* `hadoop fs -setfacl --set user::rw-,user:hadoop:rw-,group::r--,other::r-- /file`
+* `hadoop fs -setfacl -R -m user:hadoop:r-x /dir`
+* `hadoop fs -setfacl -m default:user:hadoop:r-x /dir`
+
+Exit Code:
+
+Returns 0 on success and non-zero on error.
+
+setfattr
+--------
+
+Usage: `hadoop fs -setfattr -n name [-v value] | -x name <path> `
+
+Sets an extended attribute name and value for a file or directory.
+
+Options:
+
+* -n name: The extended attribute name.
+* -v value: The extended attribute value. There are three different encoding methods for the value. If the argument is enclosed in double quotes, then the value is the string inside the quotes. If the argument is prefixed with 0x or 0X, then it is taken as a hexadecimal number. If the argument begins with 0s or 0S, then it is taken as a base64 encoding.
+* -x name: Remove the extended attribute.
+* *path*: The file or directory.
+
+Examples:
+
+* `hadoop fs -setfattr -n user.myAttr -v myValue /file`
+* `hadoop fs -setfattr -n user.noValue /file`
+* `hadoop fs -setfattr -x user.myAttr /file`
+
+Exit Code:
+
+Returns 0 on success and non-zero on error.
+
+setrep
+------
+
+Usage: `hadoop fs -setrep [-R] [-w] <numReplicas> <path> `
+
+Changes the replication factor of a file. If *path* is a directory then the command recursively changes the replication factor of all files under the directory tree rooted at *path*.
+
+Options:
+
+* The -w flag requests that the command wait for the replication to complete. This can potentially take a very long time.
+* The -R flag is accepted for backwards compatibility. It has no effect.
+
+Example:
+
+* `hadoop fs -setrep -w 3 /user/hadoop/dir1`
+
+Exit Code:
+
+Returns 0 on success and -1 on error.
+
+stat
+----
+
+Usage: `hadoop fs -stat [format] <path> ...`
+
+Print statistics about the file/directory at \<path\> in the specified format. Format accepts filesize in bytes (%b), type (%F), group name of owner (%g), name (%n), block size (%o), replication (%r), user name of owner (%u), and modification date (%y, %Y). %y shows UTC date as "yyyy-MM-dd HH:mm:ss" and %Y shows milliseconds since January 1, 1970 UTC. If the format is not specified, %y is used by default.
+
+Example:
+
+* `hadoop fs -stat "%F %u:%g %b %y %n" /file`
+
+Exit Code: Returns 0 on success and -1 on error.
+
+tail
+----
+
+Usage: `hadoop fs -tail [-f] URI`
+
+Displays the last kilobyte of the file to stdout.
+
+Options:
+
+* The -f option will output appended data as the file grows, as in Unix.
+
+Example:
+
+* `hadoop fs -tail pathname`
+
+Exit Code: Returns 0 on success and -1 on error.
+
+test
+----
+
+Usage: `hadoop fs -test -[defsz] URI`
+
+Options:
+
+* -d: if the path is a directory, return 0.
+* -e: if the path exists, return 0.
+* -f: if the path is a file, return 0.
+* -s: if the path is not empty, return 0.
+* -z: if the file is zero length, return 0.
+
+Example:
+
+* `hadoop fs -test -e filename`
+
+text
+----
+
+Usage: `hadoop fs -text <src> `
+
+Takes a source file and outputs the file in text format. The allowed formats are zip and TextRecordInputStream.
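+
+Example (illustrative; the path is hypothetical):
+
+* `hadoop fs -text /user/hadoop/archive.zip`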
+
+touchz
+------
+
+Usage: `hadoop fs -touchz URI [URI ...]`
+
+Create a file of zero length.
+
+Example:
+
+* `hadoop fs -touchz pathname`
+
+Exit Code: Returns 0 on success and -1 on error.
+
+truncate
+--------
+
+Usage: `hadoop fs -truncate [-w] <length> <paths>`
+
+Truncate all files that match the specified file pattern to the
+specified length.
+
+Options:
+
+* The `-w` flag requests that the command wait for block recovery
+  to complete, if necessary. Without the -w flag the file may remain
+  unclosed for some time while the recovery is in progress.
+  During this time the file cannot be reopened for append.
+
+Example:
+
+* `hadoop fs -truncate 55 /user/hadoop/file1 /user/hadoop/file2`
+* `hadoop fs -truncate -w 127 hdfs://nn1.example.com/user/hadoop/file1`
+
+usage
+-----
+
+Usage: `hadoop fs -usage command`
+
+Return the help for an individual command.
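+
+Example (illustrative):
+
+* `hadoop fs -usage ls`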

http://git-wip-us.apache.org/repos/asf/hadoop/blob/343cffb0/hadoop-common-project/hadoop-common/src/site/markdown/HttpAuthentication.md
----------------------------------------------------------------------
diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/HttpAuthentication.md b/hadoop-common-project/hadoop-common/src/site/markdown/HttpAuthentication.md
new file mode 100644
index 0000000..e0a2693
--- /dev/null
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/HttpAuthentication.md
@@ -0,0 +1,58 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+Authentication for Hadoop HTTP web-consoles
+===========================================
+
+* [Authentication for Hadoop HTTP web-consoles](#Authentication_for_Hadoop_HTTP_web-consoles)
+    * [Introduction](#Introduction)
+    * [Configuration](#Configuration)
+
+Introduction
+------------
+
+This document describes how to configure Hadoop HTTP web-consoles to require user authentication.
+
+By default Hadoop HTTP web-consoles (JobTracker, NameNode, TaskTrackers and DataNodes) allow access without any form of authentication.
+
+Similarly to Hadoop RPC, Hadoop HTTP web-consoles can be configured to require Kerberos authentication using the HTTP SPNEGO protocol (supported by browsers like Firefox and Internet Explorer).
+
+In addition, Hadoop HTTP web-consoles support the equivalent of Hadoop's Pseudo/Simple authentication. If this option is enabled, users must specify their user name in the first browser interaction using the user.name query string parameter. For example: `http://localhost:50030/jobtracker.jsp?user.name=babu`.
+
+If a custom authentication mechanism is required for the HTTP web-consoles, it is possible to implement a plugin to support the alternate authentication mechanism (refer to Hadoop hadoop-auth for details on writing an `AuthenticatorHandler`).
+
+The next section describes how to configure Hadoop HTTP web-consoles to require user authentication.
+
+Configuration
+-------------
+
+The following properties should be in the `core-site.xml` of all the nodes in the cluster.
+
+`hadoop.http.filter.initializers`: add to this property the `org.apache.hadoop.security.AuthenticationFilterInitializer` initializer class.
+
+`hadoop.http.authentication.type`: Defines the authentication used for the HTTP web-consoles. The supported values are: `simple` | `kerberos` | `#AUTHENTICATION_HANDLER_CLASSNAME#`. The default value is `simple`.
+
+`hadoop.http.authentication.token.validity`: Indicates how long (in seconds) an authentication token is valid before it has to be renewed. The default value is `36000`.
+
+`hadoop.http.authentication.signature.secret.file`: The signature secret file for signing the authentication tokens. The same secret should be used for all nodes in the cluster: JobTracker, NameNode, DataNode and TaskTracker. The default value is `$user.home/hadoop-http-auth-signature-secret`. IMPORTANT: This file should be readable only by the Unix user running the daemons.
+
+`hadoop.http.authentication.cookie.domain`: The domain to use for the HTTP cookie that stores the authentication token. In order for authentication to work correctly across all nodes in the cluster, the domain must be correctly set. There is no default value; in that case the HTTP cookie will not have a domain and will work only with the hostname issuing the HTTP cookie.
+
+IMPORTANT: when using IP addresses, browsers ignore cookies with domain settings. For this setting to work properly, all nodes in the cluster must be configured to generate URLs with `hostname.domain` names in them.
+
+`hadoop.http.authentication.simple.anonymous.allowed`: Indicates if anonymous requests are allowed when using 'simple' authentication. The default value is `true`.
+
+`hadoop.http.authentication.kerberos.principal`: Indicates the Kerberos principal to be used for the HTTP endpoint when using 'kerberos' authentication. The principal short name must be `HTTP` per the Kerberos HTTP SPNEGO specification. The default value is `HTTP/_HOST@$LOCALHOST`, where `_HOST`, if present, is replaced with the bind address of the HTTP server.
+
+`hadoop.http.authentication.kerberos.keytab`: Location of the keytab file with the credentials for the Kerberos principal used for the HTTP endpoint. The default value is `$user.home/hadoop.keytab`.
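+
+As an illustration only, a minimal `core-site.xml` fragment enabling `simple` authentication and disallowing anonymous access might look like the following sketch (the secret file location is an example value; adjust the settings for your cluster):
+
+    <!-- Hook the authentication filter into the HTTP web-consoles -->
+    <property>
+      <name>hadoop.http.filter.initializers</name>
+      <value>org.apache.hadoop.security.AuthenticationFilterInitializer</value>
+    </property>
+    <!-- Use Pseudo/Simple authentication -->
+    <property>
+      <name>hadoop.http.authentication.type</name>
+      <value>simple</value>
+    </property>
+    <!-- Example path; must be readable only by the daemon user -->
+    <property>
+      <name>hadoop.http.authentication.signature.secret.file</name>
+      <value>/etc/hadoop/hadoop-http-auth-signature-secret</value>
+    </property>
+    <!-- Require user.name to be supplied on the first interaction -->
+    <property>
+      <name>hadoop.http.authentication.simple.anonymous.allowed</name>
+      <value>false</value>
+    </property>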

http://git-wip-us.apache.org/repos/asf/hadoop/blob/343cffb0/hadoop-common-project/hadoop-common/src/site/markdown/InterfaceClassification.md
----------------------------------------------------------------------
diff --git a/hadoop-common-project/hadoop-common/src/site/markdown/InterfaceClassification.md b/hadoop-common-project/hadoop-common/src/site/markdown/InterfaceClassification.md
new file mode 100644
index 0000000..0392610
--- /dev/null
+++ b/hadoop-common-project/hadoop-common/src/site/markdown/InterfaceClassification.md
@@ -0,0 +1,105 @@
+<!---
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+Hadoop Interface Taxonomy: Audience and Stability Classification
+================================================================
+
+* [Hadoop Interface Taxonomy: Audience and Stability Classification](#Hadoop_Interface_Taxonomy:_Audience_and_Stability_Classification)
+    * [Motivation](#Motivation)
+    * [Interface Classification](#Interface_Classification)
+        * [Audience](#Audience)
+        * [Stability](#Stability)
+    * [How are the Classifications Recorded?](#How_are_the_Classifications_Recorded)
+    * [FAQ](#FAQ)
+
+Motivation
+----------
+
+The interface taxonomy classification provided here is for guidance to developers and users of interfaces. The classification guides a developer to declare the targeted audience or users of an interface and also its stability.
+
+* Benefits to the user of an interface: knows which interfaces to use or not use and their stability.
+* Benefits to the developer: prevents accidental changes to interfaces and hence accidental impact on users, other components, or systems. This is particularly useful in large systems with many developers who may not all have a shared state/history of the project.
+
+Interface Classification
+------------------------
+
+Hadoop adopts the following interface classification. This classification was derived from the [OpenSolaris taxonomy](http://www.opensolaris.org/os/community/arc/policies/interface-taxonomy/#Advice) and, to some extent, from the taxonomy used inside Yahoo. Interfaces have two main attributes: Audience and Stability.
+
+### Audience
+
+Audience denotes the potential consumers of the interface. While many interfaces are internal/private to the implementation, others are public/external interfaces meant for wider consumption by applications and/or clients. For example, in POSIX, libc is an external or public interface, while large parts of the kernel are internal or private interfaces. Also, some interfaces are targeted towards other specific subsystems.
+
+Identifying the audience of an interface helps define the impact of breaking it. For instance, it might be okay to break the compatibility of an interface whose audience is a small number of specific subsystems. On the other hand, it is probably not okay to break a protocol interface that millions of Internet users depend on.
+
+Hadoop uses the following kinds of audience in order of increasing/wider visibility:
+
+* Private:
+    * The interface is for internal use within the project (such as HDFS or MapReduce) and should not be used by applications or by other projects. It is subject to change at any time without notice. Most interfaces of a project are Private (also referred to as project-private).
+* Limited-Private:
+    * The interface is used by a specified set of projects or systems (typically closely related projects). Other projects or systems should not use the interface. Changes to the interface will be communicated/negotiated with the specified projects. For example, in the Hadoop project, some interfaces are LimitedPrivate{HDFS, MapReduce} in that they are private to the HDFS and MapReduce projects.
+* Public
+    * The interface is for general use by any application.
+
+Hadoop doesn't have a Company-Private classification, which is meant for APIs which are intended to be used by other projects within the company, since it doesn't apply to open source projects. Also, certain APIs are annotated as @VisibleForTesting (from com.google.common.annotations.VisibleForTesting); these are meant to be used strictly for unit tests and should be treated as "Private" APIs.
+
+### Stability
+
+Stability denotes how stable an interface is, as in when incompatible changes to the interface are allowed. Hadoop APIs have the following levels of stability.
+
+* Stable
+    * Can evolve while retaining compatibility for minor release boundaries; in other words, incompatible changes to APIs marked Stable are allowed only at major releases (i.e. at m.0).
+* Evolving
+    * Evolving, but incompatible changes are allowed at minor releases (i.e. m.x)
+* Unstable
+    * Incompatible changes to Unstable APIs are allowed any time. This usually makes sense for only private interfaces.
+    * However one may call this out for a supposedly public interface to highlight that it should not be used as an interface; for public interfaces, labeling it as Not-an-interface is probably more appropriate than "Unstable".
+        * Examples of publicly visible interfaces that are unstable (i.e. not-an-interface): GUI, CLIs whose output format will change
+* Deprecated
+    * APIs that could potentially be removed in the future and should not be used.
+
+How are the Classifications Recorded?
+-------------------------------------
+
+How will the classification be recorded for Hadoop APIs?
+
+* Each interface or class will have the audience and stability recorded using annotations in the org.apache.hadoop.classification package.
+* The javadoc generated by the maven target javadoc:javadoc lists only the public API.
+* One can derive the audience of java classes and java interfaces from the audience of the package in which they are contained. Hence it is useful to declare the audience of each java package as public or private (along with the private audience variations).
+
+FAQ
+---
+
+* Why aren’t the java scopes (private, package private and public) good enough?
+    * Java’s scoping is not very complete. One is often forced to make a class public in order for other internal components to use it. It does not have friends or sub-package-private like C++.
+* But I can easily access a private implementation interface if it is Java public. Where is the protection and control?
+    * The purpose of this is not providing absolute access control. Its purpose is to communicate to users and developers. One can access private implementation functions in libc; however if they change the internal implementation details, your application will break and you will have little sympathy from the folks who are supplying libc. If you use a non-public interface you understand the risks.
+* Why bother declaring the stability of a private interface? Aren’t private interfaces always unstable?
+    * Private interfaces are not always unstable. In the cases where they are stable they capture internal properties of the system and can communicate these properties to its internal users and to developers of the interface.
+        * e.g. In HDFS, NN-DN protocol is private but stable and can help implement rolling upgrades. It communicates that this interface should not be changed in incompatible ways even though it is private.
+        * e.g. In HDFS, FSImage stability can help provide more flexible roll backs.
+* What is the harm in applications using a private interface that is stable? How is it different than a public stable interface?
+    * While a private interface marked as stable is targeted to change only at major releases, it may break at other times if the providers of that interface are willing to change the internal users of that interface. Further, a public stable interface is less likely to break even at major releases (even though it is allowed to break compatibility) because the impact of the change is larger. If you use a private interface (regardless of its stability) you run the risk of incompatibility.
+* Why bother with Limited-private? Isn’t it giving special treatment to some projects? That is not fair.
+    * First, most interfaces should be public or private; actually, let us state it even more strongly: make it private unless you really want to expose it to the public for general use.
+    * Limited-private is for interfaces that are not intended for general use. They are exposed to related projects that need special hooks. Such a classification has a cost to both the supplier and consumer of the limited interface. Both will have to work together if ever there is a need to break the interface in the future; for example, the supplier and the consumers will have to work together to get coordinated releases of their respective projects. This should not be taken lightly; if you can get away with private then do so. If the interface is really for general use by all applications then make it public. But remember that making an interface public carries a huge responsibility. Sometimes Limited-private is just right.
+    * A good example of a limited-private interface is BlockLocations. This is a fairly low-level interface that we are willing to expose to MR and perhaps HBase. We are likely to change it down the road and at that time we will have to get a coordinated effort with the MR team to release matching releases. While MR and HDFS are always released in sync today, they may change down the road.
+    * If you have a limited-private interface with many projects listed then you are fooling yourself. It is practically public.
+    * It might be worth declaring a special audience classification called Hadoop-Private for the Hadoop family.
+* Let’s treat all private interfaces as Hadoop-private. What is the harm in projects in the Hadoop family having access to private classes?
+    * Do we want MR accessing class files that are implementation details inside HDFS? There used to be many such layer violations in the code that we have been cleaning up over the last few years. We don’t want such layer violations to creep back in by not separating between the major components like HDFS and MR.
+* Aren't all public interfaces stable?
+    * One may mark a public interface as evolving in its early days. Here one is promising to make an effort to make compatible changes but may need to break it at minor releases.
+    * One example of a public interface that is unstable is where one is providing an implementation of a standards-body based interface that is still under development. For example, many companies, in an attempt to be first to market, have provided implementations of a new NFS protocol even when the protocol was not fully completed by IETF. The implementor cannot evolve the interface in a fashion that causes the least disruption because the stability is controlled by the standards body. Hence it is appropriate to label the interface as unstable.
+
+