Posted to common-commits@hadoop.apache.org by ec...@apache.org on 2015/09/15 21:11:59 UTC
[14/50] [abbrv] hadoop git commit: HDFS-8974. Convert docs in xdoc
format to markdown. Contributed by Masatake Iwasaki.
HDFS-8974. Convert docs in xdoc format to markdown. Contributed by Masatake Iwasaki.
Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/7b5b2c58
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/7b5b2c58
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/7b5b2c58
Branch: refs/heads/HADOOP-11890
Commit: 7b5b2c5822ac722893ef5db753144f18d5056f5b
Parents: f153710
Author: Akira Ajisaka <aa...@apache.org>
Authored: Thu Sep 10 16:45:27 2015 +0900
Committer: Akira Ajisaka <aa...@apache.org>
Committed: Thu Sep 10 16:45:27 2015 +0900
----------------------------------------------------------------------
hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt | 3 +
.../src/site/markdown/HdfsRollingUpgrade.md | 293 +++++++++++++++++
.../src/site/markdown/HdfsSnapshots.md | 301 +++++++++++++++++
.../src/site/xdoc/HdfsRollingUpgrade.xml | 329 -------------------
.../hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml | 303 -----------------
5 files changed, 597 insertions(+), 632 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/hadoop/blob/7b5b2c58/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
index bbb6066..445c50f 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
+++ b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
@@ -911,6 +911,9 @@ Release 2.8.0 - UNRELEASED
HDFS-7116. Add a command to get the balancer bandwidth
(Rakesh R via vinayakumarb)
+ HDFS-8974. Convert docs in xdoc format to markdown.
+ (Masatake Iwasaki via aajisaka)
+
OPTIMIZATIONS
HDFS-8026. Trace FSOutputSummer#writeChecksumChunks rather than
http://git-wip-us.apache.org/repos/asf/hadoop/blob/7b5b2c58/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsRollingUpgrade.md
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsRollingUpgrade.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsRollingUpgrade.md
new file mode 100644
index 0000000..5415912
--- /dev/null
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsRollingUpgrade.md
@@ -0,0 +1,293 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+HDFS Rolling Upgrade
+====================
+
+* [Introduction](#Introduction)
+* [Upgrade](#Upgrade)
+ * [Upgrade without Downtime](#Upgrade_without_Downtime)
+ * [Upgrading Non-Federated Clusters](#Upgrading_Non-Federated_Clusters)
+ * [Upgrading Federated Clusters](#Upgrading_Federated_Clusters)
+ * [Upgrade with Downtime](#Upgrade_with_Downtime)
+ * [Upgrading Non-HA Clusters](#Upgrading_Non-HA_Clusters)
+* [Downgrade and Rollback](#Downgrade_and_Rollback)
+* [Downgrade](#Downgrade)
+* [Rollback](#Rollback)
+* [Commands and Startup Options for Rolling Upgrade](#Commands_and_Startup_Options_for_Rolling_Upgrade)
+ * [DFSAdmin Commands](#DFSAdmin_Commands)
+ * [dfsadmin -rollingUpgrade](#dfsadmin_-rollingUpgrade)
+ * [dfsadmin -getDatanodeInfo](#dfsadmin_-getDatanodeInfo)
+ * [dfsadmin -shutdownDatanode](#dfsadmin_-shutdownDatanode)
+ * [NameNode Startup Options](#NameNode_Startup_Options)
+ * [namenode -rollingUpgrade](#namenode_-rollingUpgrade)
+
+
+Introduction
+------------
+
+*HDFS rolling upgrade* allows upgrading individual HDFS daemons.
+For example, the datanodes can be upgraded independently of the namenodes.
+A namenode can be upgraded independently of the other namenodes.
+The namenodes can be upgraded independently of the datanodes and journal nodes.
+
+
+Upgrade
+-------
+
+In Hadoop v2, HDFS supports highly-available (HA) namenode services and wire compatibility.
+These two capabilities make it feasible to upgrade HDFS without incurring HDFS downtime.
+In order to upgrade an HDFS cluster without downtime, the cluster must be set up with HA.
+
+If the new software release enables a new feature, that feature may not work with the old software release after the upgrade.
+In such cases, the upgrade should be done by following these steps.
+
+1. Disable the new feature.
+2. Upgrade the cluster.
+3. Enable the new feature.
+
+Note that rolling upgrade is supported only from Hadoop-2.4.0 onwards.
+
+
+### Upgrade without Downtime
+
+In an HA cluster, there are two or more *NameNodes (NNs)*, many *DataNodes (DNs)*,
+a few *JournalNodes (JNs)* and a few *ZooKeeperNodes (ZKNs)*.
+*JNs* are relatively stable and, in most cases, do not need to be upgraded when upgrading HDFS.
+In the rolling upgrade procedure described here,
+only *NNs* and *DNs* are considered but *JNs* and *ZKNs* are not.
+Upgrading *JNs* and *ZKNs* may incur cluster downtime.
+
+#### Upgrading Non-Federated Clusters
+
+Suppose there are two namenodes *NN1* and *NN2*,
+where *NN1* and *NN2* are respectively in active and standby states.
+The following are the steps for upgrading an HA cluster:
+
+1. Prepare Rolling Upgrade
+ 1. Run "[`hdfs dfsadmin -rollingUpgrade prepare`](#dfsadmin_-rollingUpgrade)"
+ to create an fsimage for rollback.
+ 1. Run "[`hdfs dfsadmin -rollingUpgrade query`](#dfsadmin_-rollingUpgrade)"
+ to check the status of the rollback image.
+ Wait and re-run the command until
+ the "`Proceed with rolling upgrade`" message is shown.
+1. Upgrade Active and Standby *NNs*
+ 1. Shut down and upgrade *NN2*.
+ 1. Start *NN2* as standby with the
+ "[`-rollingUpgrade started`](#namenode_-rollingUpgrade)" option.
+ 1. Fail over from *NN1* to *NN2*
+ so that *NN2* becomes active and *NN1* becomes standby.
+ 1. Shut down and upgrade *NN1*.
+ 1. Start *NN1* as standby with the
+ "[`-rollingUpgrade started`](#namenode_-rollingUpgrade)" option.
+1. Upgrade *DNs*
+ 1. Choose a small subset of datanodes (e.g. all datanodes under a particular rack).
+ 1. Run "[`hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade`](#dfsadmin_-shutdownDatanode)"
+ to shut down one of the chosen datanodes.
+ 1. Run "[`hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT>`](#dfsadmin_-getDatanodeInfo)"
+ to check and wait for the datanode to shut down.
+ 1. Upgrade and restart the datanode.
+ 1. Perform the above steps for all the chosen datanodes in the subset in parallel.
+ 1. Repeat the above steps until all datanodes in the cluster are upgraded.
+1. Finalize Rolling Upgrade
+ 1. Run "[`hdfs dfsadmin -rollingUpgrade finalize`](#dfsadmin_-rollingUpgrade)"
+ to finalize the rolling upgrade.
+
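The "Prepare Rolling Upgrade" step in the procedure above amounts to issuing `prepare` once and then polling `query`. The following is a minimal sketch of that loop, under stated assumptions: the function name `prepare_rolling_upgrade` and the `POLL_SECONDS` variable are hypothetical and introduced only for illustration. Only the two `hdfs dfsadmin` commands and the `Proceed with rolling upgrade` message come from the text.

```shell
#!/bin/sh
# Sketch only: prepare_rolling_upgrade and POLL_SECONDS are hypothetical names,
# not part of HDFS.

prepare_rolling_upgrade() {
    # Ask the namenode to create the fsimage used for rollback.
    hdfs dfsadmin -rollingUpgrade prepare || return 1

    # Re-run the query until the rollback image is reported as ready.
    until hdfs dfsadmin -rollingUpgrade query | grep -q "Proceed with rolling upgrade"; do
        sleep "${POLL_SECONDS:-10}"
    done
}

# Usage (on a host with the hdfs client configured):
#   prepare_rolling_upgrade
```

The sketch has no timeout, so an operator would likely want to bound the loop or watch it interactively.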
+
+#### Upgrading Federated Clusters
+
+In a federated cluster, there are multiple namespaces
+and a pair of active and standby *NNs* for each namespace.
+The procedure for upgrading a federated cluster is similar to upgrading a non-federated cluster
+except that Step 1 and Step 4 are performed on each namespace
+and Step 2 is performed on each pair of active and standby *NNs*, i.e.
+
+1. Prepare Rolling Upgrade for Each Namespace
+1. Upgrade Active and Standby *NN* pairs for Each Namespace
+1. Upgrade *DNs*
+1. Finalize Rolling Upgrade for Each Namespace
+
+
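Performing Step 1 and Step 4 "on each namespace" can be sketched as a loop over the namespaces. This is a hedged sketch, not a tested procedure: it assumes each namespace can be addressed by passing its URI through the generic `-fs` option of `hdfs dfsadmin`, and the helper name `for_each_namespace` and the `hdfs://ns1`, `hdfs://ns2` URIs are placeholders for illustration.

```shell
#!/bin/sh
# Sketch only: for_each_namespace is a hypothetical helper, and selecting the
# namespace with the generic -fs option is an assumption about the deployment.

for_each_namespace() {
    action="$1"   # one of: query, prepare, finalize
    shift
    for ns in "$@"; do
        hdfs dfsadmin -fs "$ns" -rollingUpgrade "$action" || return 1
    done
}

# Usage (placeholder nameservice URIs):
#   for_each_namespace prepare  hdfs://ns1 hdfs://ns2
#   for_each_namespace finalize hdfs://ns1 hdfs://ns2
```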
+### Upgrade with Downtime
+
+For non-HA clusters,
+it is impossible to upgrade HDFS without downtime since it requires restarting the namenodes.
+However, datanodes can still be upgraded in a rolling manner.
+
+
+#### Upgrading Non-HA Clusters
+
+In a non-HA cluster, there is a *NameNode (NN)*, a *SecondaryNameNode (SNN)*
+and many *DataNodes (DNs)*.
+The procedure for upgrading a non-HA cluster is similar to upgrading an HA cluster
+except that Step 2 "Upgrade Active and Standby *NNs*" is replaced by the following:
+
+* Upgrade *NN* and *SNN*
+ 1. Shut down *SNN*.
+ 1. Shut down and upgrade *NN*.
+ 1. Start *NN* with the
+ "[`-rollingUpgrade started`](#namenode_-rollingUpgrade)" option.
+ 1. Upgrade and restart *SNN*.
+
+
+Downgrade and Rollback
+----------------------
+
+When the upgraded release is undesirable
+or, in some unlikely case, the upgrade fails (due to bugs in the newer release),
+administrators may choose to downgrade HDFS back to the pre-upgrade release,
+or roll back HDFS to the pre-upgrade release and the pre-upgrade state.
+
+Note that downgrade can be done in a rolling fashion but rollback cannot.
+Rollback requires cluster downtime.
+
+Note also that downgrade and rollback are possible only after a rolling upgrade is started and
+before the upgrade is terminated.
+An upgrade can be terminated by either finalize, downgrade or rollback.
+Therefore, it may not be possible to perform rollback after finalize or downgrade,
+or to perform downgrade after finalize.
+
+
+Downgrade
+---------
+
+*Downgrade* restores the software back to the pre-upgrade release
+and preserves the user data.
+Suppose time *T* is the rolling upgrade start time and the upgrade is terminated by downgrade.
+Then, the files created before or after *T* remain available in HDFS.
+The files deleted before or after *T* remain deleted in HDFS.
+
+A newer release is downgradable to the pre-upgrade release
+only if both the namenode layout version and the datanode layout version
+are not changed between these two releases.
+
+In an HA cluster,
+when a rolling upgrade from an old software release to a new software release is in progress,
+it is possible to downgrade, in a rolling fashion, the upgraded machines back to the old software release.
+As before, suppose *NN1* and *NN2* are respectively in active and standby states.
+Below are the steps for rolling downgrade without downtime:
+
+1. Downgrade *DNs*
+ 1. Choose a small subset of datanodes (e.g. all datanodes under a particular rack).
+ 1. Run "[`hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade`](#dfsadmin_-shutdownDatanode)"
+ to shut down one of the chosen datanodes.
+ 1. Run "[`hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT>`](#dfsadmin_-getDatanodeInfo)"
+ to check and wait for the datanode to shut down.
+ 1. Downgrade and restart the datanode.
+ 1. Perform the above steps for all the chosen datanodes in the subset in parallel.
+ 1. Repeat the above steps until all upgraded datanodes in the cluster are downgraded.
+1. Downgrade Active and Standby *NNs*
+ 1. Shut down and downgrade *NN2*.
+ 1. Start *NN2* as standby normally.
+ 1. Fail over from *NN1* to *NN2*
+ so that *NN2* becomes active and *NN1* becomes standby.
+ 1. Shut down and downgrade *NN1*.
+ 1. Start *NN1* as standby normally.
+1. Finalize Rolling Downgrade
+ 1. Run "[`hdfs dfsadmin -rollingUpgrade finalize`](#dfsadmin_-rollingUpgrade)"
+ to finalize the rolling downgrade.
+
+Note that the datanodes must be downgraded before downgrading the namenodes
+since protocols may be changed in a backward compatible manner but not forward compatible,
+i.e. old datanodes can talk to the new namenodes but not vice versa.
+
+
+Rollback
+--------
+
+*Rollback* restores the software back to the pre-upgrade release
+and also reverts the user data back to the pre-upgrade state.
+Suppose time *T* is the rolling upgrade start time and the upgrade is terminated by rollback.
+The files created before *T* remain available in HDFS but the files created after *T* become unavailable.
+The files deleted before *T* remain deleted in HDFS but the files deleted after *T* are restored.
+
+Rollback from a newer release to the pre-upgrade release is always supported.
+However, it cannot be done in a rolling fashion. It requires cluster downtime.
+Suppose *NN1* and *NN2* are respectively in active and standby states.
+Below are the steps for rollback:
+
+* Rollback HDFS
+ 1. Shut down all *NNs* and *DNs*.
+ 1. Restore the pre-upgrade release on all machines.
+ 1. Start *NN1* as Active with the
+ "[`-rollingUpgrade rollback`](#namenode_-rollingUpgrade)" option.
+ 1. Run `-bootstrapStandby` on *NN2* and start it normally as standby.
+ 1. Start *DNs* with the "`-rollback`" option.
+
+
+Commands and Startup Options for Rolling Upgrade
+------------------------------------------------
+
+### DFSAdmin Commands
+
+#### `dfsadmin -rollingUpgrade`
+
+ hdfs dfsadmin -rollingUpgrade <query|prepare|finalize>
+
+Execute a rolling upgrade action.
+
+* Options:
+
+ | --- | --- |
+ | `query` | Query the current rolling upgrade status. |
+ | `prepare` | Prepare a new rolling upgrade. |
+ | `finalize` | Finalize the current rolling upgrade. |
+
+
+#### `dfsadmin -getDatanodeInfo`
+
+ hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT>
+
+Get the information about the given datanode.
+This command can be used for checking if a datanode is alive
+like the Unix `ping` command.
+
+
+#### `dfsadmin -shutdownDatanode`
+
+ hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> [upgrade]
+
+Submit a shutdown request for the given datanode.
+If the optional `upgrade` argument is specified,
+clients accessing the datanode will be advised to wait for it to restart
+and the fast start-up mode will be enabled.
+When the restart does not happen in time, clients will time out and ignore the datanode.
+In such a case, the fast start-up mode will also be disabled.
+
+Note that the command does not wait for the datanode shutdown to complete.
+The "[`dfsadmin -getDatanodeInfo`](#dfsadmin_-getDatanodeInfo)"
+command can be used for checking if the datanode shutdown is completed.
+
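Because `-shutdownDatanode` returns before the datanode has stopped, a script has to poll with `-getDatanodeInfo` afterwards. The sketch below shows one way to combine the two commands. It assumes that `hdfs dfsadmin -getDatanodeInfo` exits with a non-zero status once the datanode is no longer reachable, and the function name `shutdown_datanode_for_upgrade` and the `POLL_SECONDS` variable are hypothetical.

```shell
#!/bin/sh
# Sketch only: shutdown_datanode_for_upgrade and POLL_SECONDS are hypothetical.
# Assumption: -getDatanodeInfo exits non-zero when the datanode is unreachable.

shutdown_datanode_for_upgrade() {
    dn="$1"   # DATANODE_HOST:IPC_PORT

    # Submit the shutdown request; the command does not wait for completion.
    hdfs dfsadmin -shutdownDatanode "$dn" upgrade || return 1

    # Poll until the datanode stops answering.
    while hdfs dfsadmin -getDatanodeInfo "$dn" >/dev/null 2>&1; do
        sleep "${POLL_SECONDS:-5}"
    done
}

# Usage:
#   shutdown_datanode_for_upgrade <DATANODE_HOST:IPC_PORT>
```

Once the function returns, the datanode software can be upgraded and the daemon restarted.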
+
+### NameNode Startup Options
+
+#### `namenode -rollingUpgrade`
+
+ hdfs namenode -rollingUpgrade <rollback|started>
+
+When a rolling upgrade is in progress,
+the `-rollingUpgrade` namenode startup option is used to specify
+various rolling upgrade options.
+
+* Options:
+
+ | --- | --- |
+ | `rollback` | Restores the namenode back to the pre-upgrade release but also reverts the user data back to the pre-upgrade state. |
+ | `started` | Specifies a rolling upgrade already started so that the namenode should allow image directories with different layout versions during startup. |
+
+**WARN: the downgrade option is obsolete.**
+It is not necessary to start the namenode with the downgrade option explicitly.
http://git-wip-us.apache.org/repos/asf/hadoop/blob/7b5b2c58/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsSnapshots.md
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsSnapshots.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsSnapshots.md
new file mode 100644
index 0000000..94a37cd
--- /dev/null
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsSnapshots.md
@@ -0,0 +1,301 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+HDFS Snapshots
+==============
+
+* [HDFS Snapshots](#HDFS_Snapshots)
+ * [Overview](#Overview)
+ * [Snapshottable Directories](#Snapshottable_Directories)
+ * [Snapshot Paths](#Snapshot_Paths)
+ * [Upgrading to a version of HDFS with snapshots](#Upgrading_to_a_version_of_HDFS_with_snapshots)
+ * [Snapshot Operations](#Snapshot_Operations)
+ * [Administrator Operations](#Administrator_Operations)
+ * [Allow Snapshots](#Allow_Snapshots)
+ * [Disallow Snapshots](#Disallow_Snapshots)
+ * [User Operations](#User_Operations)
+ * [Create Snapshots](#Create_Snapshots)
+ * [Delete Snapshots](#Delete_Snapshots)
+ * [Rename Snapshots](#Rename_Snapshots)
+ * [Get Snapshottable Directory Listing](#Get_Snapshottable_Directory_Listing)
+ * [Get Snapshots Difference Report](#Get_Snapshots_Difference_Report)
+
+
+Overview
+--------
+
+HDFS Snapshots are read-only point-in-time copies of the file system.
+Snapshots can be taken on a subtree of the file system or the entire file system.
+Some common use cases of snapshots are data backup, protection against user errors
+and disaster recovery.
+
+The implementation of HDFS Snapshots is efficient:
+
+
+* Snapshot creation is instantaneous:
+ the cost is *O(1)* excluding the inode lookup time.
+
+* Additional memory is used only when modifications are made relative to a snapshot:
+ memory usage is *O(M)*,
+ where *M* is the number of modified files/directories.
+
+* Blocks in datanodes are not copied:
+ the snapshot files record the block list and the file size.
+ There is no data copying.
+
+* Snapshots do not adversely affect regular HDFS operations:
+ modifications are recorded in reverse chronological order
+ so that the current data can be accessed directly.
+ The snapshot data is computed by subtracting the modifications
+ from the current data.
+
+
+### Snapshottable Directories
+
+Snapshots can be taken on any directory once the directory has been set as
+*snapshottable*.
+A snapshottable directory is able to accommodate 65,536 simultaneous snapshots.
+There is no limit on the number of snapshottable directories.
+Administrators may set any directory to be snapshottable.
+If there are snapshots in a snapshottable directory,
+the directory can be neither deleted nor renamed
+before all the snapshots are deleted.
+
+Nested snapshottable directories are currently not allowed.
+In other words, a directory cannot be set to snapshottable
+if one of its ancestors/descendants is a snapshottable directory.
+
+
+### Snapshot Paths
+
+For a snapshottable directory,
+the path component *".snapshot"* is used for accessing its snapshots.
+Suppose `/foo` is a snapshottable directory,
+`/foo/bar` is a file/directory in `/foo`,
+and `/foo` has a snapshot `s0`.
+Then, the path `/foo/.snapshot/s0/bar`
+refers to the snapshot copy of `/foo/bar`.
+The usual API and CLI can work with the ".snapshot" paths.
+The following are some examples.
+
+* Listing all the snapshots under a snapshottable directory:
+
+ hdfs dfs -ls /foo/.snapshot
+
+* Listing the files in snapshot `s0`:
+
+ hdfs dfs -ls /foo/.snapshot/s0
+
+* Copying a file from snapshot `s0`:
+
+ hdfs dfs -cp -ptopax /foo/.snapshot/s0/bar /tmp
+
+ Note that this example uses the preserve option to preserve
+ timestamps, ownership, permission, ACLs and XAttrs.
+
+
+Upgrading to a version of HDFS with snapshots
+---------------------------------------------
+
+The HDFS snapshot feature introduces a new reserved path name used to
+interact with snapshots: `.snapshot`. When upgrading from an
+older version of HDFS, existing paths named `.snapshot` need
+to first be renamed or deleted to avoid conflicting with the reserved path.
+See the upgrade section in
+[the HDFS user guide](HdfsUserGuide.html#Upgrade_and_Rollback)
+for more information.
+
+
+Snapshot Operations
+-------------------
+
+
+### Administrator Operations
+
+The operations described in this section require superuser privilege.
+
+
+#### Allow Snapshots
+
+
+Allow snapshots of a directory to be created.
+If the operation completes successfully, the directory becomes snapshottable.
+
+* Command:
+
+ hdfs dfsadmin -allowSnapshot <path>
+
+* Arguments:
+
+ | --- | --- |
+ | path | The path of the snapshottable directory. |
+
+See also the corresponding Java API
+`void allowSnapshot(Path path)` in `HdfsAdmin`.
+
+
+#### Disallow Snapshots
+
+Disallow snapshots of a directory from being created.
+All snapshots of the directory must be deleted before disallowing snapshots.
+
+* Command:
+
+ hdfs dfsadmin -disallowSnapshot <path>
+
+* Arguments:
+
+ | --- | --- |
+ | path | The path of the snapshottable directory. |
+
+See also the corresponding Java API
+`void disallowSnapshot(Path path)` in `HdfsAdmin`.
+
+
+### User Operations
+
+This section describes user operations.
+Note that the HDFS superuser can perform all the operations
+without satisfying the permission requirement in the individual operations.
+
+
+#### Create Snapshots
+
+Create a snapshot of a snapshottable directory.
+This operation requires owner privilege of the snapshottable directory.
+
+* Command:
+
+ hdfs dfs -createSnapshot <path> [<snapshotName>]
+
+* Arguments:
+
+ | --- | --- |
+ | path | The path of the snapshottable directory. |
+ | snapshotName | The snapshot name, which is an optional argument. When it is omitted, a default name is generated using a timestamp with the format `"'s'yyyyMMdd-HHmmss.SSS"`, e.g. `"s20130412-151029.033"`. |
+
+See also the corresponding Java API
+`Path createSnapshot(Path path)` and
+`Path createSnapshot(Path path, String snapshotName)`
+in [FileSystem](../../api/org/apache/hadoop/fs/FileSystem.html).
+The snapshot path is returned in these methods.
+
+
+#### Delete Snapshots
+
+Delete a snapshot from a snapshottable directory.
+This operation requires owner privilege of the snapshottable directory.
+
+* Command:
+
+ hdfs dfs -deleteSnapshot <path> <snapshotName>
+
+* Arguments:
+
+ | --- | --- |
+ | path | The path of the snapshottable directory. |
+ | snapshotName | The snapshot name. |
+
+See also the corresponding Java API
+`void deleteSnapshot(Path path, String snapshotName)`
+in [FileSystem](../../api/org/apache/hadoop/fs/FileSystem.html).
+
+
+#### Rename Snapshots
+
+Rename a snapshot.
+This operation requires owner privilege of the snapshottable directory.
+
+* Command:
+
+ hdfs dfs -renameSnapshot <path> <oldName> <newName>
+
+* Arguments:
+
+ | --- | --- |
+ | path | The path of the snapshottable directory. |
+ | oldName | The old snapshot name. |
+ | newName | The new snapshot name. |
+
+See also the corresponding Java API
+`void renameSnapshot(Path path, String oldName, String newName)`
+in [FileSystem](../../api/org/apache/hadoop/fs/FileSystem.html).
+
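The administrator and user operations described so far fit together into a simple lifecycle. The following sketch strings them in order for one directory. The function name `snapshot_lifecycle` and the snapshot names `s0` and `s1` are placeholders for illustration; the commands themselves are the ones listed in the sections above.

```shell
#!/bin/sh
# Sketch only: snapshot_lifecycle and the names s0/s1 are hypothetical.

snapshot_lifecycle() {
    dir="$1"   # directory to make snapshottable, e.g. /foo

    hdfs dfsadmin -allowSnapshot "$dir" &&        # superuser: enable snapshots
    hdfs dfs -createSnapshot "$dir" s0 &&         # owner: take snapshot s0
    hdfs dfs -renameSnapshot "$dir" s0 s1 &&      # owner: rename s0 to s1
    hdfs dfs -ls "$dir/.snapshot" &&              # list snapshots of the directory
    hdfs dfs -deleteSnapshot "$dir" s1 &&         # owner: delete the snapshot
    hdfs dfsadmin -disallowSnapshot "$dir"        # superuser: disable snapshots
}

# Usage:
#   snapshot_lifecycle /foo
```

The snapshot is deleted before `-disallowSnapshot` is called because all snapshots of a directory must be deleted before snapshots can be disallowed.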
+
+#### Get Snapshottable Directory Listing
+
+Get all the snapshottable directories where the current user has permission to take snapshots.
+
+* Command:
+
+ hdfs lsSnapshottableDir
+
+* Arguments: none
+
+See also the corresponding Java API
+`SnapshottableDirectoryStatus[] getSnapshottableDirectoryListing()`
+in `DistributedFileSystem`.
+
+
+#### Get Snapshots Difference Report
+
+Get the differences between two snapshots.
+This operation requires read access privilege for all files/directories in both snapshots.
+
+* Command:
+
+ hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>
+
+* Arguments:
+
+ | --- | --- |
+ | path | The path of the snapshottable directory. |
+ | fromSnapshot | The name of the starting snapshot. |
+ | toSnapshot | The name of the ending snapshot. |
+
+ Note that snapshotDiff can be used to get the difference report between two snapshots, or between
+ a snapshot and the current status of a directory. Users can use "." to represent the current status.
+
+* Results:
+
+ | --- | --- |
+ | \+ | The file/directory has been created. |
+ | \- | The file/directory has been deleted. |
+ | M | The file/directory has been modified. |
+ | R | The file/directory has been renamed. |
+
+A *RENAME* entry indicates a file/directory has been renamed but
+is still under the same snapshottable directory. A file/directory is
+reported as deleted if it was renamed to outside of the snapshottable directory.
+A file/directory renamed from outside of the snapshottable directory is
+reported as newly created.
+
+The snapshot difference report does not guarantee the same operation sequence.
+For example, if we rename the directory *"/foo"* to *"/foo2"*, and
+then append new data to the file *"/foo2/bar"*, the difference report will
+be:
+
+ R. /foo -> /foo2
+ M. /foo/bar
+
+I.e., the changes on the files/directories under a renamed directory are
+reported using the original path before the rename (*"/foo/bar"* in
+the above example).
+
+See also the corresponding Java API
+`SnapshotDiffReport getSnapshotDiffReport(Path path, String fromSnapshot, String toSnapshot)`
+in `DistributedFileSystem`.
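
As noted above, "." can stand for the current state of the directory in `snapshotDiff`. A minimal sketch of a helper that reports what changed since a given snapshot follows; the function name `diff_since_snapshot` is hypothetical, and `/foo` and `s0` are the example names used earlier in this document.

```shell
#!/bin/sh
# Sketch only: diff_since_snapshot is a hypothetical helper name.

diff_since_snapshot() {
    dir="$1"        # snapshottable directory
    snapshot="$2"   # name of the starting snapshot

    # "." as the ending snapshot means the current status of the directory.
    hdfs snapshotDiff "$dir" "$snapshot" .
}

# Usage:
#   diff_since_snapshot /foo s0
```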
http://git-wip-us.apache.org/repos/asf/hadoop/blob/7b5b2c58/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml b/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml
deleted file mode 100644
index f0b0ccf..0000000
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml
+++ /dev/null
@@ -1,329 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements. See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-<document xmlns="http://maven.apache.org/XDOC/2.0"
- xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
- xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
-
- <properties>
- <title>HDFS Rolling Upgrade</title>
- </properties>
-
- <body>
-
- <h1>HDFS Rolling Upgrade</h1>
- <macro name="toc">
- <param name="section" value="0"/>
- <param name="fromDepth" value="0"/>
- <param name="toDepth" value="4"/>
- </macro>
-
- <section name="Introduction" id="Introduction">
- <p>
- <em>HDFS rolling upgrade</em> allows upgrading individual HDFS daemons.
- For examples, the datanodes can be upgraded independent of the namenodes.
- A namenode can be upgraded independent of the other namenodes.
- The namenodes can be upgraded independent of datanods and journal nodes.
- </p>
- </section>
-
- <section name="Upgrade" id="Upgrade">
- <p>
- In Hadoop v2, HDFS supports highly-available (HA) namenode services and wire compatibility.
- These two capabilities make it feasible to upgrade HDFS without incurring HDFS downtime.
- In order to upgrade a HDFS cluster without downtime, the cluster must be setup with HA.
- </p>
- <p>
- If there is any new feature which is enabled in new software release, may not work with old software release after upgrade.
- In such cases upgrade should be done by following steps.
- </p>
- <ol>
- <li>Disable new feature.</li>
- <li>Upgrade the cluster.</li>
- <li>Enable the new feature.</li>
- </ol>
- <p>
- Note that rolling upgrade is supported only from Hadoop-2.4.0 onwards.
- </p>
- <subsection name="Upgrade without Downtime" id="UpgradeWithoutDowntime">
- <p>
- In a HA cluster, there are two or more <em>NameNodes (NNs)</em>, many <em>DataNodes (DNs)</em>,
- a few <em>JournalNodes (JNs)</em> and a few <em>ZooKeeperNodes (ZKNs)</em>.
- <em>JNs</em> is relatively stable and does not require upgrade when upgrading HDFS in most of the cases.
- In the rolling upgrade procedure described here,
- only <em>NNs</em> and <em>DNs</em> are considered but <em>JNs</em> and <em>ZKNs</em> are not.
- Upgrading <em>JNs</em> and <em>ZKNs</em> may incur cluster downtime.
- </p>
-
- <h4>Upgrading Non-Federated Clusters</h4>
- <p>
- Suppose there are two namenodes <em>NN1</em> and <em>NN2</em>,
- where <em>NN1</em> and <em>NN2</em> are respectively in active and standby states.
- The following are the steps for upgrading a HA cluster:
- </p>
- <ol>
- <li>Prepare Rolling Upgrade<ol>
- <li>Run "<code><a href="#dfsadmin_-rollingUpgrade">hdfs dfsadmin -rollingUpgrade prepare</a></code>"
- to create a fsimage for rollback.
- </li>
- <li>Run "<code><a href="#dfsadmin_-rollingUpgrade">hdfs dfsadmin -rollingUpgrade query</a></code>"
- to check the status of the rollback image.
- Wait and re-run the command until
- the "<tt>Proceed with rolling upgrade</tt>" message is shown.
- </li>
- </ol></li>
- <li>Upgrade Active and Standby <em>NNs</em><ol>
- <li>Shutdown and upgrade <em>NN2</em>.</li>
- <li>Start <em>NN2</em> as standby with the
- "<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade started</code></a>" option.</li>
- <li>Failover from <em>NN1</em> to <em>NN2</em>
- so that <em>NN2</em> becomes active and <em>NN1</em> becomes standby.</li>
- <li>Shutdown and upgrade <em>NN1</em>.</li>
- <li>Start <em>NN1</em> as standby with the
- "<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade started</code></a>" option.</li>
- </ol></li>
- <li>Upgrade <em>DNs</em><ol>
- <li>Choose a small subset of datanodes (e.g. all datanodes under a particular rack).</li>
- <ol>
- <li>Run "<code><a href="#dfsadmin_-shutdownDatanode">hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade</a></code>"
- to shutdown one of the chosen datanodes.</li>
- <li>Run "<code><a href="#dfsadmin_-getDatanodeInfo">hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT></a></code>"
- to check and wait for the datanode to shutdown.</li>
- <li>Upgrade and restart the datanode.</li>
- <li>Perform the above steps for all the chosen datanodes in the subset in parallel.</li>
- </ol>
- <li>Repeat the above steps until all datanodes in the cluster are upgraded.</li>
- </ol></li>
- <li>Finalize Rolling Upgrade<ul>
- <li>Run "<code><a href="#dfsadmin_-rollingUpgrade">hdfs dfsadmin -rollingUpgrade finalize</a></code>"
- to finalize the rolling upgrade.</li>
- </ul></li>
- </ol>
-
- <h4>Upgrading Federated Clusters</h4>
- <p>
- In a federated cluster, there are multiple namespaces
- and a pair of active and standby <em>NNs</em> for each namespace.
- The procedure for upgrading a federated cluster is similar to upgrading a non-federated cluster
- except that Step 1 and Step 4 are performed on each namespace
- and Step 2 is performed on each pair of active and standby <em>NNs</em>, i.e.
- </p>
- <ol>
- <li>Prepare Rolling Upgrade for Each Namespace</li>
- <li>Upgrade Active and Standby <em>NN</em> pairs for Each Namespace</li>
- <li>Upgrade <em>DNs</em></li>
- <li>Finalize Rolling Upgrade for Each Namespace</li>
- </ol>
-
- </subsection>
-
- <subsection name="Upgrade with Downtime" id="UpgradeWithDowntime">
- <p>
- For non-HA clusters,
- it is impossible to upgrade HDFS without downtime since it requires restarting the namenodes.
- However, datanodes can still be upgraded in a rolling manner.
- </p>
-
- <h4>Upgrading Non-HA Clusters</h4>
- <p>
- In a non-HA cluster, there are a <em>NameNode (NN)</em>, a <em>SecondaryNameNode (SNN)</em>
- and many <em>DataNodes (DNs)</em>.
- The procedure for upgrading a non-HA cluster is similar to upgrading a HA cluster
- except that Step 2 "Upgrade Active and Standby <em>NNs</em>" is changed to below:
- </p>
- <ul>
- <li>Upgrade <em>NN</em> and <em>SNN</em><ol>
- <li>Shutdown <em>SNN</em></li>
- <li>Shutdown and upgrade <em>NN</em>.</li>
- <li>Start <em>NN</em> with the
- "<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade started</code></a>" option.</li>
- <li>Upgrade and restart <em>SNN</em></li>
- </ol></li>
- </ul>
- </subsection>
- </section>
-
- <section name="Downgrade and Rollback" id="DowngradeAndRollback">
- <p>
- When the upgraded release is undesirable
- or, in some unlikely case, the upgrade fails (due to bugs in the newer release),
- administrators may choose to downgrade HDFS back to the pre-upgrade release,
- or rollback HDFS to the pre-upgrade release and the pre-upgrade state.
- </p>
- <p>
- Note that downgrade can be done in a rolling fashion but rollback cannot.
- Rollback requires cluster downtime.
- </p>
- <p>
- Note also that downgrade and rollback are possible only after a rolling upgrade is started and
- before the upgrade is terminated.
- An upgrade can be terminated by either finalize, downgrade or rollback.
- Therefore, it may not be possible to perform rollback after finalize or downgrade,
- or to perform downgrade after finalize.
- </p>
- </section>
-
- <section name="Downgrade" id="Downgrade">
- <p>
- <em>Downgrade</em> restores the software back to the pre-upgrade release
- and preserves the user data.
- Suppose time <em>T</em> is the rolling upgrade start time and the upgrade is terminated by downgrade.
- Then, the files created before or after <em>T</em> remain available in HDFS.
- The files deleted before or after <em>T</em> remain deleted in HDFS.
- </p>
- <p>
- A newer release is downgradable to the pre-upgrade release
- only if both the namenode layout version and the datanode layout version
- are not changed between these two releases.
- </p>
- <p>
- In a HA cluster,
- when a rolling upgrade from an old software release to a new software release is in progress,
- it is possible to downgrade, in a rolling fashion, the upgraded machines back to the old software release.
- Same as before, suppose <em>NN1</em> and <em>NN2</em> are respectively in active and standby states.
- Below are the steps for rolling downgrade without downtime:
- </p>
- <ol>
- <li>Downgrade <em>DNs</em><ol>
- <li>Choose a small subset of datanodes (e.g. all datanodes under a particular rack).</li>
- <ol>
- <li>Run "<code><a href="#dfsadmin_-shutdownDatanode">hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade</a></code>"
- to shutdown one of the chosen datanodes.</li>
- <li>Run "<code><a href="#dfsadmin_-getDatanodeInfo">hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT></a></code>"
- to check and wait for the datanode to shutdown.</li>
- <li>Downgrade and restart the datanode.</li>
- <li>Perform the above steps for all the chosen datanodes in the subset in parallel.</li>
- </ol>
- <li>Repeat the above steps until all upgraded datanodes in the cluster are downgraded.</li>
- </ol></li>
- <li>Downgrade Active and Standby <em>NNs</em><ol>
- <li>Shutdown and downgrade <em>NN2</em>.</li>
- <li>Start <em>NN2</em> as standby normally.
- </li>
- <li>Failover from <em>NN1</em> to <em>NN2</em>
- so that <em>NN2</em> becomes active and <em>NN1</em> becomes standby.</li>
- <li>Shutdown and downgrade <em>NN1</em>.</li>
- <li>Start <em>NN1</em> as standby normally.
- </li>
- </ol></li>
- <li>Finalize Rolling Downgrade<ul>
- <li>Run "<code><a href="#dfsadmin_-rollingUpgrade">hdfs dfsadmin -rollingUpgrade finalize</a></code>"
- to finalize the rolling downgrade.</li>
- </ul></li>
- </ol>
- <p>
- Note that the datanodes must be downgraded before downgrading the namenodes
- since protocols may be changed in a backward compatible manner but not forward compatible,
- i.e. old datanodes can talk to the new namenodes but not vice versa.
- </p>
- </section>
-
- <section name="Rollback" id="Rollback">
- <p>
- <em>Rollback</em> restores the software back to the pre-upgrade release
- but also reverts the user data back to the pre-upgrade state.
- Suppose time <em>T</em> is the rolling upgrade start time and the upgrade is terminated by rollback.
- The files created before <em>T</em> remain available in HDFS but the files created after <em>T</em> become unavailable.
- The files deleted before <em>T</em> remain deleted in HDFS but the files deleted after <em>T</em> are restored.
- </p>
- <p>
- Rollback from a newer release to the pre-upgrade release is always supported.
- However, it cannot be done in a rolling fashion. It requires cluster downtime.
- Suppose <em>NN1</em> and <em>NN2</em> are respectively in active and standby states.
- Below are the steps for rollback:
- </p>
- <ul>
- <li>Rollback HDFS<ol>
- <li>Shutdown all <em>NNs</em> and <em>DNs</em>.</li>
- <li>Restore the pre-upgrade release in all machines.</li>
- <li>Start <em>NN1</em> as Active with the
- "<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade rollback</code></a>" option.</li>
- <li>Run "<code>-bootstrapStandby</code>" on <em>NN2</em> and start it normally as standby.</li>
- <li>Start <em>DNs</em> with the "<code>-rollback</code>" option.</li>
- </ol></li>
- </ul>
-
- </section>
-
- <section name="Commands and Startup Options for Rolling Upgrade" id="dfsadminCommands">
-
- <subsection name="DFSAdmin Commands" id="dfsadminCommands">
- <h4><code>dfsadmin -rollingUpgrade</code></h4>
- <source>hdfs dfsadmin -rollingUpgrade <query|prepare|finalize></source>
- <p>
- Execute a rolling upgrade action.
- <ul><li>Options:<table>
- <tr><td><code>query</code></td><td>Query the current rolling upgrade status.</td></tr>
- <tr><td><code>prepare</code></td><td>Prepare a new rolling upgrade.</td></tr>
- <tr><td><code>finalize</code></td><td>Finalize the current rolling upgrade.</td></tr>
- </table></li></ul>
- </p>
-
- <h4><code>dfsadmin -getDatanodeInfo</code></h4>
- <source>hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT></source>
- <p>
- Get the information about the given datanode.
- This command can be used for checking if a datanode is alive
- like the Unix <code>ping</code> command.
- </p>
-
- <h4><code>dfsadmin -shutdownDatanode</code></h4>
- <source>hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> [upgrade]</source>
- <p>
- Submit a shutdown request for the given datanode.
- If the optional <code>upgrade</code> argument is specified,
- clients accessing the datanode will be advised to wait for it to restart
- and the fast start-up mode will be enabled.
- When the restart does not happen in time, clients will timeout and ignore the datanode.
- In such case, the fast start-up mode will also be disabled.
- </p>
- <p>
- Note that the command does not wait for the datanode shutdown to complete.
- The "<a href="#dfsadmin_-getDatanodeInfo">dfsadmin -getDatanodeInfo</a>"
- command can be used for checking if the datanode shutdown is completed.
- </p>
- </subsection>
-
- <subsection name="NameNode Startup Options" id="dfsadminCommands">
-
- <h4><code>namenode -rollingUpgrade</code></h4>
- <source>hdfs namenode -rollingUpgrade <rollback|started></source>
- <p>
- When a rolling upgrade is in progress,
- the <code>-rollingUpgrade</code> namenode startup option is used to specify
- various rolling upgrade options.
- </p>
- <ul><li>Options:<table>
- <tr><td><code>rollback</code></td>
- <td>Restores the namenode back to the pre-upgrade release
- but also reverts the user data back to the pre-upgrade state.</td>
- </tr>
- <tr><td><code>started</code></td>
- <td>Specifies a rolling upgrade already started
- so that the namenode should allow image directories
- with different layout versions during startup.</td>
- </tr>
- </table></li></ul>
- <p>
- <b>WARN: the downgrade option is obsolete.</b>
- It is not necessary to start the namenode with the downgrade option explicitly.
- </p>
- </subsection>
-
- </section>
- </body>
-</document>
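
The per-datanode steps in the removed document (submit a shutdown request with the `upgrade` argument, then poll with `-getDatanodeInfo` until the datanode stops answering) can be sketched as a small shell function. This is only an illustration under stated assumptions: `HDFS_CMD`, `MAX_TRIES`, `SLEEP_SECS` and the function name are hypothetical, and the sketch assumes that `hdfs dfsadmin -getDatanodeInfo` exits with a non-zero status once the datanode is no longer reachable.

```shell
# Hypothetical helper; HDFS_CMD, MAX_TRIES and SLEEP_SECS are illustrative
# names and are not part of HDFS.
HDFS_CMD="${HDFS_CMD:-hdfs}"

shutdown_datanode_for_upgrade() {
  local dn="$1"                          # DATANODE_HOST:IPC_PORT
  local max_tries="${MAX_TRIES:-30}"
  local sleep_secs="${SLEEP_SECS:-2}"
  local i=0

  # Submit the shutdown request. The command does not wait for the
  # shutdown to complete, so polling is needed afterwards.
  "$HDFS_CMD" dfsadmin -shutdownDatanode "$dn" upgrade || return 1

  # Assumption: -getDatanodeInfo fails (non-zero exit) once the datanode is down.
  while [ "$i" -lt "$max_tries" ]; do
    if ! "$HDFS_CMD" dfsadmin -getDatanodeInfo "$dn" >/dev/null 2>&1; then
      echo "datanode $dn has shut down"
      return 0
    fi
    sleep "$sleep_secs"
    i=$((i + 1))
  done

  echo "timed out waiting for $dn to shut down" >&2
  return 2
}
```

A caller would run `shutdown_datanode_for_upgrade <DATANODE_HOST:IPC_PORT>` for each datanode in the chosen subset, then upgrade and restart that datanode before moving on to the next subset.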
http://git-wip-us.apache.org/repos/asf/hadoop/blob/7b5b2c58/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml b/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml
deleted file mode 100644
index 330d00f..0000000
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml
+++ /dev/null
@@ -1,303 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements. See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
--->
-<document xmlns="http://maven.apache.org/XDOC/2.0"
- xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
- xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
-
- <properties>
- <title>HDFS Snapshots</title>
- </properties>
-
- <body>
-
- <h1>HDFS Snapshots</h1>
- <macro name="toc">
- <param name="section" value="0"/>
- <param name="fromDepth" value="0"/>
- <param name="toDepth" value="4"/>
- </macro>
-
- <section name="Overview" id="Overview">
- <p>
- HDFS Snapshots are read-only point-in-time copies of the file system.
- Snapshots can be taken on a subtree of the file system or the entire file system.
- Some common use cases of snapshots are data backup, protection against user errors
- and disaster recovery.
- </p>
-
- <p>
- The implementation of HDFS Snapshots is efficient:
- </p>
- <ul>
- <li>Snapshot creation is instantaneous:
- the cost is <em>O(1)</em> excluding the inode lookup time.</li>
- <li>Additional memory is used only when modifications are made relative to a snapshot:
- memory usage is <em>O(M)</em>,
- where <em>M</em> is the number of modified files/directories.</li>
- <li>Blocks in datanodes are not copied:
- the snapshot files record the block list and the file size.
- There is no data copying.</li>
- <li>Snapshots do not adversely affect regular HDFS operations:
- modifications are recorded in reverse chronological order
- so that the current data can be accessed directly.
- The snapshot data is computed by subtracting the modifications
- from the current data.</li>
- </ul>
-
- <subsection name="Snapshottable Directories" id="SnapshottableDirectories">
- <p>
- Snapshots can be taken on any directory once the directory has been set as
- <em>snapshottable</em>.
- A snapshottable directory is able to accommodate 65,536 simultaneous snapshots.
- There is no limit on the number of snapshottable directories.
- Administrators may set any directory to be snapshottable.
- If there are snapshots in a snapshottable directory,
- the directory can be neither deleted nor renamed
- before all the snapshots are deleted.
- </p>
-
- <p>
- Nested snapshottable directories are currently not allowed.
- In other words, a directory cannot be set to snapshottable
- if one of its ancestors/descendants is a snapshottable directory.
- </p>
-
- </subsection>
-
- <subsection name="Snapshot Paths" id="SnapshotPaths">
- <p>
- For a snapshottable directory,
- the path component <em>".snapshot"</em> is used for accessing its snapshots.
- Suppose <code>/foo</code> is a snapshottable directory,
- <code>/foo/bar</code> is a file/directory in <code>/foo</code>,
- and <code>/foo</code> has a snapshot <code>s0</code>.
- Then, the path <source>/foo/.snapshot/s0/bar</source>
- refers to the snapshot copy of <code>/foo/bar</code>.
- The usual API and CLI can work with the ".snapshot" paths.
- The following are some examples.
- </p>
- <ul>
- <li>Listing all the snapshots under a snapshottable directory:
- <source>hdfs dfs -ls /foo/.snapshot</source></li>
- <li>Listing the files in snapshot <code>s0</code>:
- <source>hdfs dfs -ls /foo/.snapshot/s0</source></li>
- <li>Copying a file from snapshot <code>s0</code>:
- <source>hdfs dfs -cp -ptopax /foo/.snapshot/s0/bar /tmp</source>
- <p>Note that this example uses the preserve option to preserve
- timestamps, ownership, permission, ACLs and XAttrs.</p></li>
- </ul>
- </subsection>
- </section>
-
- <section name="Upgrading to a version of HDFS with snapshots" id="Upgrade">
-
- <p>
- The HDFS snapshot feature introduces a new reserved path name used to
- interact with snapshots: <tt>.snapshot</tt>. When upgrading from an
- older version of HDFS, existing paths named <tt>.snapshot</tt> need
- to first be renamed or deleted to avoid conflicting with the reserved path.
- See the upgrade section in
- <a href="HdfsUserGuide.html#Upgrade_and_Rollback">the HDFS user guide</a>
- for more information. </p>
-
- </section>
-
- <section name="Snapshot Operations" id="SnapshotOperations">
- <subsection name="Administrator Operations" id="AdministratorOperations">
- <p>
- The operations described in this section require superuser privilege.
- </p>
-
- <h4>Allow Snapshots</h4>
- <p>
- Allow snapshots of a directory to be created.
- If the operation completes successfully, the directory becomes snapshottable.
- </p>
- <ul>
- <li>Command:
- <source>hdfs dfsadmin -allowSnapshot <path></source></li>
- <li>Arguments:<table>
- <tr><td>path</td><td>The path of the snapshottable directory.</td></tr>
- </table></li>
- </ul>
- <p>
- See also the corresponding Java API
- <code>void allowSnapshot(Path path)</code> in <code>HdfsAdmin</code>.
- </p>
-
- <h4>Disallow Snapshots</h4>
- <p>
- Disallow the creation of snapshots of a directory.
- All snapshots of the directory must be deleted before disallowing snapshots.
- </p>
- <ul>
- <li>Command:
- <source>hdfs dfsadmin -disallowSnapshot <path></source></li>
- <li>Arguments:<table>
- <tr><td>path</td><td>The path of the snapshottable directory.</td></tr>
- </table></li>
- </ul>
- <p>
- See also the corresponding Java API
- <code>void disallowSnapshot(Path path)</code> in <code>HdfsAdmin</code>.
- </p>
- </subsection>
-
- <subsection name="User Operations" id="UserOperations">
- <p>
- This section describes user operations.
- Note that HDFS superuser can perform all the operations
- without satisfying the permission requirement in the individual operations.
- </p>
-
- <h4>Create Snapshots</h4>
- <p>
- Create a snapshot of a snapshottable directory.
- This operation requires owner privilege of the snapshottable directory.
- </p>
- <ul>
- <li>Command:
- <source>hdfs dfs -createSnapshot <path> [<snapshotName>]</source></li>
- <li>Arguments:<table>
- <tr><td>path</td><td>The path of the snapshottable directory.</td></tr>
- <tr><td>snapshotName</td><td>
- The snapshot name, which is an optional argument.
- When it is omitted, a default name is generated using a timestamp with the format
- <code>"'s'yyyyMMdd-HHmmss.SSS"</code>, e.g. "s20130412-151029.033".
- </td></tr>
- </table></li>
- </ul>
- <p>
- See also the corresponding Java API
- <code>Path createSnapshot(Path path)</code> and
- <code>Path createSnapshot(Path path, String snapshotName)</code>
- in <a href="../../api/org/apache/hadoop/fs/FileSystem.html"><code>FileSystem</code></a>.
- The snapshot path is returned in these methods.
- </p>
-
- <h4>Delete Snapshots</h4>
- <p>
- Delete a snapshot from a snapshottable directory.
- This operation requires owner privilege of the snapshottable directory.
- </p>
- <ul>
- <li>Command:
- <source>hdfs dfs -deleteSnapshot <path> <snapshotName></source></li>
- <li>Arguments:<table>
- <tr><td>path</td><td>The path of the snapshottable directory.</td></tr>
- <tr><td>snapshotName</td><td>The snapshot name.</td></tr>
- </table></li>
- </ul>
- <p>
- See also the corresponding Java API
- <code>void deleteSnapshot(Path path, String snapshotName)</code>
- in <a href="../../api/org/apache/hadoop/fs/FileSystem.html"><code>FileSystem</code></a>.
- </p>
-
- <h4>Rename Snapshots</h4>
- <p>
- Rename a snapshot.
- This operation requires owner privilege of the snapshottable directory.
- </p>
- <ul>
- <li>Command:
- <source>hdfs dfs -renameSnapshot <path> <oldName> <newName></source></li>
- <li>Arguments:<table>
- <tr><td>path</td><td>The path of the snapshottable directory.</td></tr>
- <tr><td>oldName</td><td>The old snapshot name.</td></tr>
- <tr><td>newName</td><td>The new snapshot name.</td></tr>
- </table></li>
- </ul>
- <p>
- See also the corresponding Java API
- <code>void renameSnapshot(Path path, String oldName, String newName)</code>
- in <a href="../../api/org/apache/hadoop/fs/FileSystem.html"><code>FileSystem</code></a>.
- </p>
-
- <h4>Get Snapshottable Directory Listing</h4>
- <p>
- Get all the snapshottable directories where the current user has permission to take snapshots.
- </p>
- <ul>
- <li>Command:
- <source>hdfs lsSnapshottableDir</source></li>
- <li>Arguments: none</li>
- </ul>
- <p>
- See also the corresponding Java API
- <code>SnapshottableDirectoryStatus[] getSnapshottableDirectoryListing()</code>
- in <code>DistributedFileSystem</code>.
- </p>
-
- <h4>Get Snapshots Difference Report</h4>
- <p>
- Get the differences between two snapshots.
- This operation requires read access privilege for all files/directories in both snapshots.
- </p>
- <ul>
- <li>Command:
- <source>hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot></source></li>
- <li>Arguments:<table>
- <tr><td>path</td><td>The path of the snapshottable directory.</td></tr>
- <tr><td>fromSnapshot</td><td>The name of the starting snapshot.</td></tr>
- <tr><td>toSnapshot</td><td>The name of the ending snapshot.</td></tr>
- </table></li>
- <p>
- Note that snapshotDiff can be used to get the difference report between two snapshots, or between
- a snapshot and the current status of a directory. Users can use "." to represent the current status.
- </p>
- <li>Results:
- <table>
- <tr><td>+</td><td>The file/directory has been created.</td></tr>
- <tr><td>-</td><td>The file/directory has been deleted.</td></tr>
- <tr><td>M</td><td>The file/directory has been modified.</td></tr>
- <tr><td>R</td><td>The file/directory has been renamed.</td></tr>
- </table>
- </li>
- </ul>
- <p>
- A <em>RENAME</em> entry indicates a file/directory has been renamed but
- is still under the same snapshottable directory. A file/directory is
- reported as deleted if it was renamed to outside of the snapshottable directory.
- A file/directory renamed from outside of the snapshottable directory is
- reported as newly created.
- </p>
- <p>
- The snapshot difference report does not guarantee the same operation sequence.
- For example, if we rename the directory <em>"/foo"</em> to <em>"/foo2"</em>, and
- then append new data to the file <em>"/foo2/bar"</em>, the difference report will
- be:
- <source>
- R. /foo -> /foo2
- M. /foo/bar
- </source>
- I.e., the changes on the files/directories under a renamed directory are
- reported using the original path before the rename (<em>"/foo/bar"</em> in
- the above example).
- </p>
- <p>
- See also the corresponding Java API
- <code>SnapshotDiffReport getSnapshotDiffReport(Path path, String fromSnapshot, String toSnapshot)</code>
- in <code>DistributedFileSystem</code>.
- </p>
-
- </subsection>
- </section>
-
- </body>
-</document>
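
The snapshot commands described in the removed document can be chained into a short workflow: make a directory snapshottable, take a snapshot, then compare that snapshot with the current state using `.`. The following is a minimal sketch, not a recommended procedure; `HDFS_CMD` and the function name `snapshot_and_diff` are hypothetical, and the first step requires superuser privilege.

```shell
# Hypothetical helper; HDFS_CMD and snapshot_and_diff are illustrative names.
HDFS_CMD="${HDFS_CMD:-hdfs}"

snapshot_and_diff() {
  local dir="$1"     # path of the directory to snapshot, e.g. /foo
  local name="$2"    # snapshot name, e.g. s0

  # Administrator operation: make the directory snapshottable.
  "$HDFS_CMD" dfsadmin -allowSnapshot "$dir" || return 1

  # User operation: create the named snapshot.
  "$HDFS_CMD" dfs -createSnapshot "$dir" "$name" || return 1

  # Report differences between the snapshot and the current state (".").
  "$HDFS_CMD" snapshotDiff "$dir" "$name" .
}
```

With the paths used in the document, `snapshot_and_diff /foo s0` would leave the snapshot readable under `/foo/.snapshot/s0`.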