Posted to hdfs-commits@hadoop.apache.org by sz...@apache.org on 2013/04/22 21:13:19 UTC
svn commit: r1470665 - in
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs:
CHANGES.HDFS-2802.txt src/site/xdoc/ src/site/xdoc/HdfsSnapshots.xml
Author: szetszwo
Date: Mon Apr 22 19:13:18 2013
New Revision: 1470665
URL: http://svn.apache.org/r1470665
Log:
HDFS-4708. Add snapshot user documentation.
Added:
hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/
hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml
Modified:
hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt
Modified: hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt
URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt?rev=1470665&r1=1470664&r2=1470665&view=diff
==============================================================================
--- hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt (original)
+++ hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt Mon Apr 22 19:13:18 2013
@@ -258,3 +258,5 @@ Branch-2802 Snapshot (Unreleased)
HDFS-4717. Change the path parameter type of the snapshot methods in
HdfsAdmin from String to Path. (szetszwo)
+
+ HDFS-4708. Add snapshot user documentation. (szetszwo)
Added: hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml
URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml?rev=1470665&view=auto
==============================================================================
--- hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml (added)
+++ hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml Mon Apr 22 19:13:18 2013
@@ -0,0 +1,262 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<document xmlns="http://maven.apache.org/XDOC/2.0"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd">
+
+ <properties>
+ <title>HDFS Snapshots</title>
+ </properties>
+
+ <body>
+
+ <h1>HDFS Snapshots</h1>
+ <macro name="toc">
+ <param name="section" value="0"/>
+ <param name="fromDepth" value="0"/>
+ <param name="toDepth" value="4"/>
+ </macro>
+
+ <section name="Overview" id="Overview">
+ <p>
+ HDFS Snapshots are read-only point-in-time copies of the file system.
+ Snapshots can be taken on a subtree of the file system or the entire file system.
+ Some common use cases of snapshots are data backup, protection against user errors
+ and disaster recovery.
+ </p>
+
+ <p>
+ The implementation of HDFS Snapshots is efficient:
+ </p>
+ <ul>
+ <li>Snapshot creation is instantaneous:
+ the cost is <em>O(1)</em> excluding the inode lookup time.</li>
+ <li>Additional memory is used only when modifications are made relative to a snapshot:
+ memory usage is <em>O(M)</em>,
+ where <em>M</em> is the number of modified files/directories.</li>
+ <li>Blocks in datanodes are not copied:
+ the snapshot files record the block list and the file size.
+ There is no data copying.</li>
+ <li>Snapshots do not adversely affect regular HDFS operations:
+ modifications are recorded in reverse chronological order
+ so that the current data can be accessed directly.
+ The snapshot data is computed by subtracting the modifications
+ from the current data.</li>
+ </ul>
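+ <p>
+ As a rough illustration of the last point, the following is a toy in-memory model, not HDFS code.
+ The class <code>ToySnapshotDir</code> and its methods are hypothetical names introduced here.
+ It assumes a directory is a flat map from file name to content and keeps,
+ for each snapshot, only the old values of the entries modified afterwards.
+ </p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a directory whose snapshots are stored as undo records. Not HDFS code. */
final class ToySnapshotDir {
    /** Current state: file name -> content. Reading it never touches the undo records. */
    private final Map<String, String> current = new TreeMap<>();
    /** One undo record per snapshot, oldest first. A null value means "did not exist". */
    private final List<Map<String, String>> undo = new ArrayList<>();

    /** Constant-time in this model: appends an empty undo record and copies nothing. */
    int createSnapshot() {
        undo.add(new HashMap<>());
        return undo.size() - 1;
    }

    /** Writes a file (or deletes it when content is null), saving the old value once per snapshot. */
    void put(String name, String content) {
        if (!undo.isEmpty()) {
            Map<String, String> latest = undo.get(undo.size() - 1);
            if (!latest.containsKey(name)) {
                latest.put(name, current.get(name)); // null if the file did not exist
            }
        }
        if (content == null) {
            current.remove(name);
        } else {
            current.put(name, content);
        }
    }

    Map<String, String> readCurrent() {
        return new TreeMap<>(current);
    }

    /** Rebuilds a snapshot by undoing modifications on a copy of the current state, newest record first. */
    Map<String, String> readSnapshot(int id) {
        Map<String, String> view = new TreeMap<>(current);
        for (int i = undo.size() - 1; i >= id; i--) {
            for (Map.Entry<String, String> e : undo.get(i).entrySet()) {
                if (e.getValue() == null) {
                    view.remove(e.getKey());
                } else {
                    view.put(e.getKey(), e.getValue());
                }
            }
        }
        return view;
    }

    public static void main(String[] args) {
        ToySnapshotDir dir = new ToySnapshotDir();
        dir.put("a", "1");
        int s0 = dir.createSnapshot();
        dir.put("a", "2");
        dir.put("b", "x");
        System.out.println(dir.readCurrent());    // {a=2, b=x}
        System.out.println(dir.readSnapshot(s0)); // {a=1}
    }
}
```

+ <p>
+ In this sketch the extra memory grows with the number of modified entries only,
+ and the current state is always read directly without consulting any snapshot record.
+ </p>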
+
+ <subsection name="Snapshottable Directories" id="SnapshottableDirectories">
+ <p>
+ Snapshots can be taken on any directory once the directory has been set as
+ <em>snapshottable</em>.
+ A snapshottable directory can accommodate up to 65,536 simultaneous snapshots.
+ There is no limit on the number of snapshottable directories.
+ Administrators may set any directory to be snapshottable.
+ If there are snapshots in a snapshottable directory,
+ the directory can be neither deleted nor renamed
+ before all the snapshots are deleted.
+ </p>
+<!--
+ <p>
+ Nested snapshottable directories are currently not allowed.
+ In other words, a directory cannot be set to snapshottable
+ if one of its ancestors is a snapshottable directory.
+ </p>
+-->
+ </subsection>
+
+ <subsection name="Snapshot Paths" id="SnapshotPaths">
+ <p>
+ For a snapshottable directory,
+ the path component <em>".snapshot"</em> is used for accessing its snapshots.
+ Suppose <code>/foo</code> is a snapshottable directory,
+ <code>/foo/bar</code> is a file/directory in <code>/foo</code>,
+ and <code>/foo</code> has a snapshot <code>s0</code>.
+ Then, the path <source>/foo/.snapshot/s0/bar</source>
+ refers to the snapshot copy of <code>/foo/bar</code>.
+ The usual API and CLI can work with the ".snapshot" paths.
+ The following are some examples.
+ </p>
+ <ul>
+ <li>Listing all the snapshots under a snapshottable directory:
+ <source>hdfs dfs -ls /foo/.snapshot</source></li>
+ <li>Listing the files in snapshot <code>s0</code>:
+ <source>hdfs dfs -ls /foo/.snapshot/s0</source></li>
+ <li>Copying a file from snapshot <code>s0</code>:
+ <source>hdfs dfs -cp /foo/.snapshot/s0/bar /tmp</source></li>
+ </ul>
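+ <p>
+ The path convention above can be sketched with plain strings.
+ <code>SnapshotPaths</code> is a hypothetical helper introduced for illustration;
+ it is not part of the HDFS API and does not contact a file system.
+ </p>

```java
/** Hypothetical helper that builds ".snapshot" paths as plain strings. Not part of the HDFS API. */
final class SnapshotPaths {
    static final String DOT_SNAPSHOT = ".snapshot";

    /** For example, ("/foo", "s0", "bar") gives "/foo/.snapshot/s0/bar". */
    static String inSnapshot(String snapshottableDir, String snapshotName, String relativePath) {
        // Drop a trailing slash so that "/foo/" and "/foo" behave the same.
        String root = snapshottableDir.endsWith("/")
            ? snapshottableDir.substring(0, snapshottableDir.length() - 1)
            : snapshottableDir;
        String base = root + "/" + DOT_SNAPSHOT + "/" + snapshotName;
        return relativePath.isEmpty() ? base : base + "/" + relativePath;
    }

    /** Returns true if any component of the path is exactly the reserved name ".snapshot". */
    static boolean usesReservedName(String path) {
        for (String component : path.split("/")) {
            if (component.equals(DOT_SNAPSHOT)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(inSnapshot("/foo", "s0", "bar")); // /foo/.snapshot/s0/bar
        System.out.println(usesReservedName("/foo/.snapshot/s0")); // true
    }
}
```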
+ <p>
+ The name ".snapshot" is now a reserved file name in HDFS,
+ so users cannot create a file or directory named ".snapshot".
+ If a file or directory named ".snapshot" exists in an earlier version of HDFS,
+ it must be renamed before the upgrade; otherwise, the upgrade will fail.
+ </p>
+ </subsection>
+ </section>
+
+ <section name="Snapshot Operations" id="SnapshotOperations">
+ <subsection name="Administrator Operations" id="AdministratorOperations">
+ <p>
+ The operations described in this section require superuser privilege.
+ </p>
+
+ <h4>Allow Snapshots</h4>
+ <p>
+ Allow snapshots of a directory to be created.
+ If the operation completes successfully, the directory becomes snapshottable.
+ </p>
+ <ul>
+ <li>Command:
+ <source>hdfs dfsadmin -allowSnapshot &lt;path&gt;</source></li>
+ <li>Arguments:<table>
+ <tr><td>path</td><td>The path of the snapshottable directory.</td></tr>
+ </table></li>
+ </ul>
+ <p>
+ See also the corresponding Java API
+ <code>void allowSnapshot(Path path)</code> in <code>HdfsAdmin</code>.
+ </p>
+
+ <h4>Disallow Snapshots</h4>
+ <p>
+ Disallow snapshots of a directory from being created.
+ All snapshots of the directory must be deleted before disallowing snapshots.
+ </p>
+ <ul>
+ <li>Command:
+ <source>hdfs dfsadmin -disallowSnapshot &lt;path&gt;</source></li>
+ <li>Arguments:<table>
+ <tr><td>path</td><td>The path of the snapshottable directory.</td></tr>
+ </table></li>
+ </ul>
+ <p>
+ See also the corresponding Java API
+ <code>void disallowSnapshot(Path path)</code> in <code>HdfsAdmin</code>.
+ </p>
+ </subsection>
+
+ <subsection name="User Operations" id="UserOperations">
+ <p>
+ This section describes user operations.
+ Note that the HDFS superuser can perform all of these operations
+ without satisfying the permission requirements of the individual operations.
+ </p>
+
+ <h4>Create Snapshots</h4>
+ <p>
+ Create a snapshot of a snapshottable directory.
+ This operation requires owner privilege of the snapshottable directory.
+ </p>
+ <ul>
+ <li>Command:
+ <source>hdfs dfs -createSnapshot &lt;path&gt; [&lt;snapshotName&gt;]</source></li>
+ <li>Arguments:<table>
+ <tr><td>path</td><td>The path of the snapshottable directory.</td></tr>
+ <tr><td>snapshotName</td><td>
+ The snapshot name, which is an optional argument.
+ When it is omitted, a default name is generated using a timestamp with the format
+ <code>"'s'yyyyMMdd-HHmmss.SSS"</code>, e.g. "s20130412-151029.033".
+ </td></tr>
+ </table></li>
+ </ul>
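+ <p>
+ The default name pattern is in the style of <code>java.text.SimpleDateFormat</code>.
+ The following sketch shows how such a name could be produced from a timestamp.
+ <code>DefaultSnapshotName</code> is a hypothetical class, and passing the time zone
+ explicitly and using <code>Locale.ROOT</code> are assumptions made here;
+ the time zone HDFS itself uses is not stated in this document.
+ </p>

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical illustration of the default snapshot name pattern. Not HDFS code. */
final class DefaultSnapshotName {
    /** The quoted 's' is a literal prefix; the rest is date, time and milliseconds. */
    static final String PATTERN = "'s'yyyyMMdd-HHmmss.SSS";

    /** Formats the given instant in the given time zone, e.g. "s20130412-151029.033". */
    static String generate(Date when, TimeZone zone) {
        // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, Locale.ROOT);
        format.setTimeZone(zone);
        return format.format(when);
    }

    public static void main(String[] args) {
        // Prints a name derived from the current time in the JVM's default time zone.
        System.out.println(generate(new Date(), TimeZone.getDefault()));
    }
}
```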
+ <p>
+ See also the corresponding Java API
+ <code>Path createSnapshot(Path path)</code> and
+ <code>Path createSnapshot(Path path, String snapshotName)</code>
+ in <a href="../../api/org/apache/hadoop/fs/FileSystem.html"><code>FileSystem</code></a>.
+ Both methods return the path of the newly created snapshot.
+ </p>
+
+ <h4>Delete Snapshots</h4>
+ <p>
+ Delete a snapshot from a snapshottable directory.
+ This operation requires owner privilege of the snapshottable directory.
+ </p>
+ <ul>
+ <li>Command:
+ <source>hdfs dfs -deleteSnapshot &lt;path&gt; &lt;snapshotName&gt;</source></li>
+ <li>Arguments:<table>
+ <tr><td>path</td><td>The path of the snapshottable directory.</td></tr>
+ <tr><td>snapshotName</td><td>The snapshot name.</td></tr>
+ </table></li>
+ </ul>
+ <p>
+ See also the corresponding Java API
+ <code>void deleteSnapshot(Path path, String snapshotName)</code>
+ in <a href="../../api/org/apache/hadoop/fs/FileSystem.html"><code>FileSystem</code></a>.
+ </p>
+
+ <h4>Rename Snapshots</h4>
+ <p>
+ Rename a snapshot.
+ This operation requires owner privilege of the snapshottable directory.
+ </p>
+ <ul>
+ <li>Command:
+ <source>hdfs dfs -renameSnapshot &lt;path&gt; &lt;oldName&gt; &lt;newName&gt;</source></li>
+ <li>Arguments:<table>
+ <tr><td>path</td><td>The path of the snapshottable directory.</td></tr>
+ <tr><td>oldName</td><td>The old snapshot name.</td></tr>
+ <tr><td>newName</td><td>The new snapshot name.</td></tr>
+ </table></li>
+ </ul>
+ <p>
+ See also the corresponding Java API
+ <code>void renameSnapshot(Path path, String oldName, String newName)</code>
+ in <a href="../../api/org/apache/hadoop/fs/FileSystem.html"><code>FileSystem</code></a>.
+ </p>
+
+ <h4>Get Snapshottable Directory Listing</h4>
+ <p>
+ Get all the snapshottable directories where the current user has permission to take snapshots.
+ </p>
+ <ul>
+ <li>Command:
+ <source>hdfs lsSnapshottableDir</source></li>
+ <li>Arguments: none</li>
+ </ul>
+ <p>
+ See also the corresponding Java API
+ <code>SnapshottableDirectoryStatus[] getSnapshottableDirectoryListing()</code>
+ in <code>DistributedFileSystem</code>.
+ </p>
+
+ <h4>Get Snapshots Difference Report</h4>
+ <p>
+ Get the differences between two snapshots.
+ This operation requires read access privilege for all files/directories in both snapshots.
+ </p>
+ <ul>
+ <li>Command:
+ <source>hdfs snapshotDiff &lt;path&gt; &lt;fromSnapshot&gt; &lt;toSnapshot&gt;</source></li>
+ <li>Arguments:<table>
+ <tr><td>path</td><td>The path of the snapshottable directory.</td></tr>
+ <tr><td>fromSnapshot</td><td>The name of the starting snapshot.</td></tr>
+ <tr><td>toSnapshot</td><td>The name of the ending snapshot.</td></tr>
+ </table></li>
+ </ul>
+ <p>
+ See also the corresponding Java API
+ <code>SnapshotDiffReport getSnapshotDiffReport(Path path, String fromSnapshot, String toSnapshot)</code>
+ in <code>DistributedFileSystem</code>.
+ </p>
+
+ </subsection>
+ </section>
+
+ </body>
+</document>