Posted to hdfs-commits@hadoop.apache.org by om...@apache.org on 2009/09/18 03:58:18 UTC

svn commit: r816432 [2/2] - in /hadoop/hdfs/trunk: ./ src/docs/src/documentation/ src/docs/src/documentation/content/xdocs/ src/docs/src/documentation/resources/images/

Modified: hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_imageviewer.xml
URL: http://svn.apache.org/viewvc/hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_imageviewer.xml?rev=816432&r1=816431&r2=816432&view=diff
==============================================================================
--- hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_imageviewer.xml (original)
+++ hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_imageviewer.xml Fri Sep 18 01:58:17 2009
@@ -62,8 +62,8 @@
         an fsimage that does not contain any of these fields, the field's column will be included,
         but no data recorded. The default record delimiter is a tab, but this may be changed
         via the <code>-delimiter</code> command line argument. This processor is designed to
-        create output that is easily analyzed by other tools, such as <a href="http://hadoop.apache.org/pig/">Pig</a>. 
-        See the <a href="#analysis">Analysis</a> section
+        create output that is easily analyzed by other tools, such as <a href="http://hadoop.apache.org/pig/">Apache Pig</a>. 
+        See the <a href="#analysis">Analyzing Results</a> section
         for further information on using this processor to analyze the contents of fsimage files.</li>
         <li><strong>XML</strong> creates an XML document of the fsimage and includes all of the
           information within the fsimage, similar to the <code>lsr </code> processor. The output
@@ -125,53 +125,103 @@
       <section id="Example">
         <title>Example</title>
 
-          <p>Consider the following contrived namespace:</p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;&nbsp;-&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;21:17&nbsp;/anotherDir</code></p>
-          <p><code>-rw-r--r--&nbsp;&nbsp;&nbsp;3&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;286631664&nbsp;2009-03-16&nbsp;21:15&nbsp;/anotherDir/biggerfile</code></p>
-          <p><code>-rw-r--r--&nbsp;&nbsp;&nbsp;3&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;8754&nbsp;2009-03-16&nbsp;21:17&nbsp;/anotherDir/smallFile</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;&nbsp;-&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;21:11&nbsp;/mapredsystem</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;&nbsp;-&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;21:11&nbsp;/mapredsystem/theuser</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;&nbsp;-&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;21:11&nbsp;/mapredsystem/theuser/mapredsystem</code></p>
-          <p><code>drwx-wx-wx&nbsp;&nbsp;&nbsp;-&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;21:11&nbsp;/mapredsystem/theuser/mapredsystem/ip.redacted.com</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;&nbsp;-&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;21:12&nbsp;/one</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;&nbsp;-&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;21:12&nbsp;/one/two</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;&nbsp;-&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;21:16&nbsp;/user</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;&nbsp;-&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;21:19&nbsp;/user/theuser</code></p>
-          <p>Applying the Offline Image Processor against this file with default options would result in the following output:</p>
-          <p><code>machine:hadoop-0.21.0-dev&nbsp;theuser$&nbsp;bin/hdfs&nbsp;oiv&nbsp;-i&nbsp;fsimagedemo&nbsp;-o&nbsp;fsimage.txt</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;-&nbsp;&nbsp;&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;14:16&nbsp;/</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;-&nbsp;&nbsp;&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;14:17&nbsp;/anotherDir</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;-&nbsp;&nbsp;&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;14:11&nbsp;/mapredsystem</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;-&nbsp;&nbsp;&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;14:12&nbsp;/one</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;-&nbsp;&nbsp;&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;14:16&nbsp;/user</code></p>
-          <p><code>-rw-r--r--&nbsp;&nbsp;3&nbsp;&nbsp;&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;286631664&nbsp;2009-03-16&nbsp;14:15&nbsp;/anotherDir/biggerfile</code></p>
-          <p><code>-rw-r--r--&nbsp;&nbsp;3&nbsp;&nbsp;&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;8754&nbsp;2009-03-16&nbsp;14:17&nbsp;/anotherDir/smallFile</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;-&nbsp;&nbsp;&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;14:11&nbsp;/mapredsystem/theuser</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;-&nbsp;&nbsp;&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;14:11&nbsp;/mapredsystem/theuser/mapredsystem</code></p>
-          <p><code>drwx-wx-wx&nbsp;&nbsp;-&nbsp;&nbsp;&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;14:11&nbsp;/mapredsystem/theuser/mapredsystem/ip.redacted.com</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;-&nbsp;&nbsp;&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;14:12&nbsp;/one/two</code></p>
-          <p><code>drwxr-xr-x&nbsp;&nbsp;-&nbsp;&nbsp;&nbsp;theuser&nbsp;supergroup&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0&nbsp;2009-03-16&nbsp;14:19&nbsp;/user/theuser</code></p>
-          <p>Similarly, applying the Indented processor would generate output that begins with:</p>
-          <p><code>machine:hadoop-0.21.0-dev&nbsp;theuser$&nbsp;bin/hdfs&nbsp;oiv&nbsp;-i&nbsp;fsimagedemo&nbsp;-p&nbsp;Indented&nbsp;-o&nbsp;fsimage.txt</code></p>
-          <p><code>FSImage</code></p>
-          <p><code>&nbsp;&nbsp;ImageVersion&nbsp;=&nbsp;-19</code></p>
-          <p><code>&nbsp;&nbsp;NamespaceID&nbsp;=&nbsp;2109123098</code></p>
-          <p><code>&nbsp;&nbsp;GenerationStamp&nbsp;=&nbsp;1003</code></p>
-          <p><code>&nbsp;&nbsp;INodes&nbsp;[NumInodes&nbsp;=&nbsp;12]</code></p>
-          <p><code>&nbsp;&nbsp;&nbsp;&nbsp;Inode</code></p>
-          <p><code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;INodePath&nbsp;=&nbsp;</code></p>
-          <p><code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Replication&nbsp;=&nbsp;0</code></p>
-          <p><code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ModificationTime&nbsp;=&nbsp;2009-03-16&nbsp;14:16</code></p>
-          <p><code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;AccessTime&nbsp;=&nbsp;1969-12-31&nbsp;16:00</code></p>
-          <p><code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;BlockSize&nbsp;=&nbsp;0</code></p>
-          <p><code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Blocks&nbsp;[NumBlocks&nbsp;=&nbsp;-1]</code></p>
-          <p><code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;NSQuota&nbsp;=&nbsp;2147483647</code></p>
-          <p><code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;DSQuota&nbsp;=&nbsp;-1</code></p>
-          <p><code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Permissions</code></p>
-          <p><code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Username&nbsp;=&nbsp;theuser</code></p>
-          <p><code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;GroupName&nbsp;=&nbsp;supergroup</code></p>
-          <p><code>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;PermString&nbsp;=&nbsp;rwxr-xr-x</code></p>
-          <p><code>&hellip;remaining output omitted&hellip;</code></p>
+<p>Consider the following contrived namespace:</p>
+<source>
+drwxr-xr-x   - theuser supergroup          0 2009-03-16 21:17 /anotherDir 
+
+-rw-r--r--   3 theuser supergroup  286631664 2009-03-16 21:15 /anotherDir/biggerfile 
+
+-rw-r--r--   3 theuser supergroup       8754 2009-03-16 21:17 /anotherDir/smallFile 
+
+drwxr-xr-x   - theuser supergroup          0 2009-03-16 21:11 /mapredsystem 
+
+drwxr-xr-x   - theuser supergroup          0 2009-03-16 21:11 /mapredsystem/theuser 
+
+drwxr-xr-x   - theuser supergroup          0 2009-03-16 21:11 /mapredsystem/theuser/mapredsystem 
+
+drwx-wx-wx   - theuser supergroup          0 2009-03-16 21:11 /mapredsystem/theuser/mapredsystem/ip.redacted.com 
+
+drwxr-xr-x   - theuser supergroup          0 2009-03-16 21:12 /one 
+
+drwxr-xr-x   - theuser supergroup          0 2009-03-16 21:12 /one/two 
+
+drwxr-xr-x   - theuser supergroup          0 2009-03-16 21:16 /user 
+
+drwxr-xr-x   - theuser supergroup          0 2009-03-16 21:19 /user/theuser 
+</source>          
+
+<p>Applying the Offline Image Processor against this file with default options would result in the following output:</p>
+<source>
+machine:hadoop-0.21.0-dev theuser$ bin/hdfs oiv -i fsimagedemo -o fsimage.txt 
+
+drwxr-xr-x  -   theuser supergroup            0 2009-03-16 14:16 / 
+
+drwxr-xr-x  -   theuser supergroup            0 2009-03-16 14:17 /anotherDir 
+
+drwxr-xr-x  -   theuser supergroup            0 2009-03-16 14:11 /mapredsystem 
+
+drwxr-xr-x  -   theuser supergroup            0 2009-03-16 14:12 /one 
+
+drwxr-xr-x  -   theuser supergroup            0 2009-03-16 14:16 /user 
+
+-rw-r--r--  3   theuser supergroup    286631664 2009-03-16 14:15 /anotherDir/biggerfile 
+
+-rw-r--r--  3   theuser supergroup         8754 2009-03-16 14:17 /anotherDir/smallFile 
+
+drwxr-xr-x  -   theuser supergroup            0 2009-03-16 14:11 /mapredsystem/theuser 
+
+drwxr-xr-x  -   theuser supergroup            0 2009-03-16 14:11 /mapredsystem/theuser/mapredsystem 
+
+drwx-wx-wx  -   theuser supergroup            0 2009-03-16 14:11 /mapredsystem/theuser/mapredsystem/ip.redacted.com 
+
+drwxr-xr-x  -   theuser supergroup            0 2009-03-16 14:12 /one/two 
+
+drwxr-xr-x  -   theuser supergroup            0 2009-03-16 14:19 /user/theuser 
+</source>
+
+<p>Similarly, applying the Indented processor would generate output that begins with:</p>
+<source>
+machine:hadoop-0.21.0-dev theuser$ bin/hdfs oiv -i fsimagedemo -p Indented -o fsimage.txt 
+
+FSImage 
+
+  ImageVersion = -19 
+
+  NamespaceID = 2109123098 
+
+  GenerationStamp = 1003 
+
+  INodes [NumInodes = 12] 
+
+    Inode 
+
+      INodePath =  
+
+      Replication = 0 
+
+      ModificationTime = 2009-03-16 14:16 
+
+      AccessTime = 1969-12-31 16:00 
+
+      BlockSize = 0 
+
+      Blocks [NumBlocks = -1] 
+
+      NSQuota = 2147483647 
+
+      DSQuota = -1 
+
+      Permissions 
+
+        Username = theuser 
+
+        GroupName = supergroup 
+
+        PermString = rwxr-xr-x 
+
+…remaining output omitted…
+</source>          
+          
       </section> <!-- example-->
 
     </section>
@@ -210,7 +260,7 @@
     </section>
    
     <section id="analysis">
-      <title>Analyzing results of Offline Image Viewer</title>
+      <title>Analyzing Results</title>
       <p>The Offline Image Viewer makes it easy to gather large amounts of data about the hdfs namespace.
          This information can then be used to explore file system usage patterns or find
         specific files that match arbitrary criteria, along with other types of namespace analysis. The Delimited 
@@ -227,7 +277,7 @@
       <p>Each of the following scripts assumes you have generated an output file using the Delimited processor named
         <code>foo</code> and will be storing the results of the Pig analysis in a file named <code>results</code>.</p>
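+      <p>For example, assuming the <code>fsimagedemo</code> file from the example above, the <code>foo</code>
+      file might be generated by running the Delimited processor:</p>
+      <source>
+bin/hdfs oiv -i fsimagedemo -p Delimited -o foo
+      </source>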
       <section>
-      <title>Total number of files for each user</title>
+      <title>Total Number of Files for Each User</title>
       <p>This script processes each path within the namespace, groups them by the file owner and determines the total
       number of files each user owns.</p>
       <p><strong>numFilesOfEachUser.pig:</strong></p>
@@ -270,7 +320,8 @@
         <code>marge 2456</code><br/>
       </p>
       </section>
-      <section><title>Files that have never been accessed</title>
+      
+      <section><title>Files That Have Never Been Accessed</title>
       <p>This script finds files that were created but whose access times were never changed, meaning they were never opened or viewed.</p>
             <p><strong>neverAccessed.pig:</strong></p>
       <source>
@@ -306,7 +357,7 @@
       <p>This script can be run against pig with the following command and its output file's content will be a list of files that were created but never viewed afterwards.</p>
       <p><code>bin/pig -x local -param inputFile=../foo -param outputFile=../results ../neverAccessed.pig</code><br/></p>
       </section>
-      <section><title>Probable duplicated files based on file size</title>
+      <section><title>Probable Duplicated Files Based on File Size</title>
       <p>This script groups files together based on their size, drops any that are of less than 100mb and returns a list of the file size, number of files found and a tuple of the file paths.  This can be used to find likely duplicates within the filesystem namespace.</p>
       
             <p><strong>probableDuplicates.pig:</strong></p>
@@ -357,14 +408,16 @@
       <p>This script can be run against pig with the following command:</p>
       <p><code>bin/pig -x local -param inputFile=../foo -param outputFile=../results ../probableDuplicates.pig</code><br/></p>
       <p> The output file's content will be similar to that below:</p>
-      <p>
-        <code>1077288632  2 {(/user/tennant/work1/part-00501),(/user/tennant/work1/part-00993)}</code><br/>
-        <code>1077288664  4 {(/user/tennant/work0/part-00567),(/user/tennant/work0/part-03980),(/user/tennant/work1/part-00725),(/user/eccelston/output/part-03395)}</code><br/>
-        <code>1077288668  3 {(/user/tennant/work0/part-03705),(/user/tennant/work0/part-04242),(/user/tennant/work1/part-03839)}</code><br/>
-        <code>1077288698  2 {(/user/tennant/work0/part-00435),(/user/eccelston/output/part-01382)}</code><br/>
-        <code>1077288702  2 {(/user/tennant/work0/part-03864),(/user/eccelston/output/part-03234)}</code><br/>
-      </p>
-      <p>Each line includes the file size in bytes that was found to be duplicated, the number of duplicates found, and a list of the duplicated paths. Files less than 100MB are ignored, providing a reasonable likelihood that files of these exact sizes may be duplicates.</p>
+      
+<source>
+1077288632 2 {(/user/tennant/work1/part-00501),(/user/tennant/work1/part-00993)} 
+1077288664 4 {(/user/tennant/work0/part-00567),(/user/tennant/work0/part-03980),(/user/tennant/work1/part-00725),(/user/eccelston/output/part-03395)} 
+1077288668 3 {(/user/tennant/work0/part-03705),(/user/tennant/work0/part-04242),(/user/tennant/work1/part-03839)} 
+1077288698 2 {(/user/tennant/work0/part-00435),(/user/eccelston/output/part-01382)} 
+1077288702 2 {(/user/tennant/work0/part-03864),(/user/eccelston/output/part-03234)} 
+</source>      
+      <p>Each line includes the file size in bytes that was found to be duplicated, the number of duplicates found, and a list of the duplicated paths. 
+      Files less than 100MB are ignored, providing a reasonable likelihood that files of these exact sizes may be duplicates.</p>
       </section>
     </section>
 

Modified: hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml
URL: http://svn.apache.org/viewvc/hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml?rev=816432&r1=816431&r2=816432&view=diff
==============================================================================
--- hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml (original)
+++ hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml Fri Sep 18 01:58:17 2009
@@ -24,17 +24,33 @@
 
   <header>
     <title>
-      HDFS Permissions Guide
+      Permissions Guide
     </title>
   </header>
 
   <body>
     <section> <title>Overview</title>
       <p>
-		The Hadoop Distributed File System (HDFS) implements a permissions model for files and directories that shares much of the POSIX model. Each file and directory is associated with an <em>owner</em> and a <em>group</em>. The file or directory has separate permissions for the user that is the owner, for other users that are members of the group, and for all other users. For files, the <em>r</em> permission is required to read the file, and the <em>w</em> permission is required to write or append to the file. For directories, the <em>r</em> permission is required to list the contents of the directory, the <em>w</em> permission is required to create or delete files or directories, and the <em>x</em> permission is required to access a child of the directory. In contrast to the POSIX model, there are no <em>setuid</em> or <em>setgid</em> bits for files as there is no notion of executable files. For directories, there are no <em>setuid</em> or <em>setgid</em> bits directory as a simplification. The <em>Sticky bit</em> can be set on directories, preventing anyone except the superuser, directory owner or file owner from deleting or moving the files within the directory. Setting the sticky bit for a file has no effect. Collectively, the permissions of a file or directory are its <em>mode</em>. In general, Unix customs for representing and displaying modes will be used, including the use of octal numbers in this description. When a file or directory is created, its owner is the user identity of the client process, and its group is the group of the parent directory (the BSD rule).
+		The Hadoop Distributed File System (HDFS) implements a permissions model for files and directories that shares much of the POSIX model. 
+		Each file and directory is associated with an <em>owner</em> and a <em>group</em>. The file or directory has separate permissions for the 
+		user that is the owner, for other users that are members of the group, and for all other users. 
+		
+		For files, the <em>r</em> permission is required to read the file, and the <em>w</em> permission is required to write or append to the file. 
+		
+		For directories, the <em>r</em> permission is required to list the contents of the directory, the <em>w</em> permission is required to create 
+		or delete files or directories, and the <em>x</em> permission is required to access a child of the directory. 
+		</p>
+	 <p>	
+		In contrast to the POSIX model, there are no <em>setuid</em> or <em>setgid</em> bits for files as there is no notion of executable files. 
+		For directories, there are no <em>setuid</em> or <em>setgid</em> bits directory as a simplification. The <em>Sticky bit</em> can be set 
+		on directories, preventing anyone except the superuser, directory owner or file owner from deleting or moving the files within the directory. 
+		Setting the sticky bit for a file has no effect. Collectively, the permissions of a file or directory are its <em>mode</em>. In general, Unix 
+		customs for representing and displaying modes will be used, including the use of octal numbers in this description. When a file or directory 
+		is created, its owner is the user identity of the client process, and its group is the group of the parent directory (the BSD rule).
 	</p>
 	<p>
-		Each client process that accesses HDFS has a two-part identity composed of the <em>user name</em>, and <em>groups list</em>. Whenever HDFS must do a permissions check for a file or directory <code>foo</code> accessed by a client process,
+		Each client process that accesses HDFS has a two-part identity composed of the <em>user name</em>, and <em>groups list</em>. 
+		Whenever HDFS must do a permissions check for a file or directory <code>foo</code> accessed by a client process,
 	</p>
 	<ul>
 		<li>
@@ -67,22 +83,34 @@
 </ul>
 
 <p>
-In the future there will be other ways of establishing user identity (think Kerberos, LDAP, and others). There is no expectation that this first method is secure in protecting one user from impersonating another. This user identity mechanism combined with the permissions model allows a cooperative community to share file system resources in an organized fashion.
+In the future there will be other ways of establishing user identity (think Kerberos, LDAP, and others). There is no expectation that 
+this first method is secure in protecting one user from impersonating another. This user identity mechanism combined with the 
+permissions model allows a cooperative community to share file system resources in an organized fashion.
 </p>
 <p>
-In any case, the user identity mechanism is extrinsic to HDFS itself. There is no provision within HDFS for creating user identities, establishing groups, or processing user credentials.
+In any case, the user identity mechanism is extrinsic to HDFS itself. There is no provision within HDFS for creating user identities, 
+establishing groups, or processing user credentials.
 </p>
 </section>
 
 <section> <title>Understanding the Implementation</title>
 <p>
-Each file or directory operation passes the full path name to the name node, and the permissions checks are applied along the path for each operation. The client framework will implicitly associate the user identity with the connection to the name node, reducing the need for changes to the existing client API. It has always been the case that when one operation on a file succeeds, the operation might fail when repeated because the file, or some directory on the path, no longer exists. For instance, when the client first begins reading a file, it makes a first request to the name node to discover the location of the first blocks of the file. A second request made to find additional blocks may fail. On the other hand, deleting a file does not revoke access by a client that already knows the blocks of the file. With the addition of permissions, a client's access to a file may be withdrawn between requests. Again, changing permissions does not revoke the access of a client that already knows the file's blocks.
+Each file or directory operation passes the full path name to the name node, and the permissions checks are applied along the 
+path for each operation. The client framework will implicitly associate the user identity with the connection to the name node, 
+reducing the need for changes to the existing client API. It has always been the case that when one operation on a file succeeds, 
+the operation might fail when repeated because the file, or some directory on the path, no longer exists. For instance, when the 
+client first begins reading a file, it makes a first request to the name node to discover the location of the first blocks of the file. 
+A second request made to find additional blocks may fail. On the other hand, deleting a file does not revoke access by a client 
+that already knows the blocks of the file. With the addition of permissions, a client's access to a file may be withdrawn between 
+requests. Again, changing permissions does not revoke the access of a client that already knows the file's blocks.
 </p>
 <p>
-The map-reduce framework delegates the user identity by passing strings without special concern for confidentiality. The owner and group of a file or directory are stored as strings; there is no conversion from user and group identity numbers as is conventional in Unix.
+The MapReduce framework delegates the user identity by passing strings without special concern for confidentiality. The owner 
+and group of a file or directory are stored as strings; there is no conversion from user and group identity numbers as is conventional in Unix.
 </p>
 <p>
-The permissions features of this release did not require any changes to the behavior of data nodes. Blocks on the data nodes do not have any of the <em>Hadoop</em> ownership or permissions attributes associated with them.
+The permissions features of this release did not require any changes to the behavior of data nodes. Blocks on the data nodes 
+do not have any of the <em>Hadoop</em> ownership or permissions attributes associated with them.
 </p>
 </section>
      
@@ -93,7 +121,8 @@
 <p>New methods:</p>
 <ul>
 	<li>
-		<code>public FSDataOutputStream create(Path f, FsPermission permission, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress) throws IOException;</code>
+		<code>public FSDataOutputStream create(Path f, FsPermission permission, boolean overwrite, int bufferSize, short 
+		replication, long blockSize, Progressable progress) throws IOException;</code>
 	</li>
 	<li>
 		<code>public boolean mkdirs(Path f, FsPermission permission) throws IOException;</code>
@@ -105,84 +134,115 @@
 		<code>public void setOwner(Path p, String username, String groupname) throws IOException;</code>
 	</li>
 	<li>
-		<code>public FileStatus getFileStatus(Path f) throws IOException;</code> will additionally return the user, group and mode associated with the path.
+		<code>public FileStatus getFileStatus(Path f) throws IOException;</code> will additionally return the user, 
+		group and mode associated with the path.
 	</li>
 
 </ul>
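+<p>A minimal, illustrative sketch of how these methods might be called (the class name, paths, owner and group, 
+and the permission, buffer size, replication and block size values below are hypothetical):</p>
+<source>
+import java.io.IOException;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.permission.FsPermission;
+
+public class PermissionExample {
+  public static void main(String[] args) throws IOException {
+    Configuration conf = new Configuration();
+    FileSystem fs = FileSystem.get(conf);
+
+    // Create a file with an explicit permission (rw-r--r--);
+    // 4096, 3 and 67108864 are an illustrative buffer size, replication and block size.
+    FSDataOutputStream out = fs.create(new Path("/user/theuser/demo.txt"),
+        new FsPermission((short) 0644), true, 4096, (short) 3, 67108864L, null);
+    out.close();
+
+    // Create a directory with an explicit permission (rwxr-xr-x) and set its owner and group.
+    fs.mkdirs(new Path("/user/theuser/demoDir"), new FsPermission((short) 0755));
+    fs.setOwner(new Path("/user/theuser/demoDir"), "theuser", "supergroup");
+  }
+}
+</source>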
 <p>
-The mode of a new file or directory is restricted my the <code>umask</code> set as a configuration parameter. When the existing <code>create(path, &hellip;)</code> method (<em>without</em> the permission parameter) is used, the mode of the new file is <code>666&thinsp;&amp;&thinsp;^umask</code>. When the new <code>create(path, </code><em>permission</em><code>, &hellip;)</code> method (<em>with</em> the permission parameter <em>P</em>) is used, the mode of the new file is <code>P&thinsp;&amp;&thinsp;^umask&thinsp;&amp;&thinsp;666</code>. When a new directory is created with the existing <code>mkdirs(path)</code> method (<em>without</em> the permission parameter), the mode of the new directory is <code>777&thinsp;&amp;&thinsp;^umask</code>. When the new <code>mkdirs(path, </code><em>permission</em> <code>)</code> method (<em>with</em> the permission parameter <em>P</em>) is used, the mode of new directory is <code>P&thinsp;&amp;&thinsp;^umask&thinsp;&amp;&thinsp;777</code>. 
+The mode of a new file or directory is restricted by the <code>umask</code> set as a configuration parameter. 
+When the existing <code>create(path, &hellip;)</code> method (<em>without</em> the permission parameter) 
+is used, the mode of the new file is <code>666&thinsp;&amp;&thinsp;^umask</code>. When the 
+new <code>create(path, </code><em>permission</em><code>, &hellip;)</code> method 
+(<em>with</em> the permission parameter <em>P</em>) is used, the mode of the new file is 
+<code>P&thinsp;&amp;&thinsp;^umask&thinsp;&amp;&thinsp;666</code>. When a new directory is 
+created with the existing <code>mkdirs(path)</code> method (<em>without</em> the permission parameter), 
+the mode of the new directory is <code>777&thinsp;&amp;&thinsp;^umask</code>. When the 
+new <code>mkdirs(path, </code><em>permission</em> <code>)</code> method (<em>with</em> the 
+permission parameter <em>P</em>) is used, the mode of new directory is 
+<code>P&thinsp;&amp;&thinsp;^umask&thinsp;&amp;&thinsp;777</code>. 
 </p>
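+<p>
+As a worked example, assuming the default <code>dfs.umask</code> of <code>022</code> listed under 
+Configuration Parameters: a file created with the existing <code>create(path, &hellip;)</code> method receives 
+mode <code>666&thinsp;&amp;&thinsp;^022 = 644</code> (<code>rw-r--r--</code>), and a directory created with 
+the existing <code>mkdirs(path)</code> method receives <code>777&thinsp;&amp;&thinsp;^022 = 755</code> 
+(<code>rwxr-xr-x</code>).
+</p>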
 </section>
 
      
 <section> <title>Changes to the Application Shell</title>
 <p>New operations:</p>
-<dl>
-	<dt><code>chmod [-R]</code> <em>mode file &hellip;</em></dt>
-	<dd>
-		Only the owner of a file or the super-user is permitted to change the mode of a file.
-	</dd>
-	<dt><code>chgrp [-R]</code> <em>group file &hellip;</em></dt>
-	<dd>
-		The user invoking <code>chgrp</code> must belong to the specified group and be the owner of the file, or be the super-user.
-	</dd>
-	<dt><code>chown [-R]</code> <em>[owner][:[group]] file &hellip;</em></dt>
-	<dd>
-		The owner of a file may only be altered by a super-user.
-	</dd>
-	<dt><code>ls </code> <em>file &hellip;</em></dt><dd></dd>
-	<dt><code>lsr </code> <em>file &hellip;</em></dt>
-	<dd>
-		The output is reformatted to display the owner, group and mode.
-	</dd>
-</dl></section>
+<ul>
+	<li><code>chmod [-R]</code> <em>mode file &hellip;</em>
+	<br />Only the owner of a file or the super-user is permitted to change the mode of a file.
+    </li>
+    
+	<li><code>chgrp [-R]</code> <em>group file &hellip;</em>
+	<br />The user invoking <code>chgrp</code> must belong to the specified group and be the owner of the file, or be the super-user.
+    </li>
+    
+	<li><code>chown [-R]</code> <em>[owner][:[group]] file &hellip;</em>
+    <br />The owner of a file may only be altered by a super-user.
+    </li>
+	
+	<li><code>ls </code> <em>file &hellip;</em>
+	</li>
+
+	<li><code>lsr </code> <em>file &hellip;</em>
+    <br />The output is reformatted to display the owner, group and mode.
+	</li>
+</ul>
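+<p>For illustration only (the mode, owner, group and path shown are hypothetical), these operations are 
+issued through the file system shell, for example:</p>
+<source>
+bin/hdfs dfs -chmod -R 755 /user/theuser
+bin/hdfs dfs -chown theuser:supergroup /user/theuser
+</source>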
+</section>
 
      
 <section> <title>The Super-User</title>
 <p>
-	The super-user is the user with the same identity as name node process itself. Loosely, if you started the name node, then you are the super-user. The super-user can do anything in that permissions checks never fail for the super-user. There is no persistent notion of who <em>was</em> the super-user; when the name node is started the process identity determines who is the super-user <em>for now</em>. The HDFS super-user does not have to be the super-user of the name node host, nor is it necessary that all clusters have the same super-user. Also, an experimenter running HDFS on a personal workstation, conveniently becomes that installation's super-user without any configuration.
+	The super-user is the user with the same identity as the name node process itself. Loosely, if you started the name 
+	node, then you are the super-user. The super-user can do anything in that permissions checks never fail for the 
+	super-user. There is no persistent notion of who <em>was</em> the super-user; when the name node is started 
+	the process identity determines who is the super-user <em>for now</em>. The HDFS super-user does not have 
+	to be the super-user of the name node host, nor is it necessary that all clusters have the same super-user. Also, 
+	an experimenter running HDFS on a personal workstation conveniently becomes that installation's super-user 
+	without any configuration.
 	</p>
 	<p>
-	In addition, the administrator my identify a distinguished group using a configuration parameter. If set, members of this group are also super-users.
+	In addition, the administrator may identify a distinguished group using a configuration parameter. If set, members 
+	of this group are also super-users.
 </p>
 </section>
 
 <section> <title>The Web Server</title>
 <p>
-The identity of the web server is a configuration parameter. That is, the name node has no notion of the identity of the <em>real</em> user, but the web server behaves as if it has the identity (user and groups) of a user chosen by the administrator. Unless the chosen identity matches the super-user, parts of the name space may be invisible to the web server.</p>
+The identity of the web server is a configuration parameter. That is, the name node has no notion of the identity of 
+the <em>real</em> user, but the web server behaves as if it has the identity (user and groups) of a user chosen 
+by the administrator. Unless the chosen identity matches the super-user, parts of the name space may be invisible 
+to the web server.</p>
 </section>
 
 <section> <title>On-line Upgrade</title>
 <p>
-If a cluster starts with a version 0.15 data set (<code>fsimage</code>), all files and directories will have owner <em>O</em>, group <em>G</em>, and mode <em>M</em>, where <em>O</em> and <em>G</em> are the user and group identity of the super-user, and <em>M</em> is a configuration parameter. </p>
+If a cluster starts with a version 0.15 data set (<code>fsimage</code>), all files and directories will have 
+owner <em>O</em>, group <em>G</em>, and mode <em>M</em>, where <em>O</em> and <em>G</em> 
+are the user and group identity of the super-user, and <em>M</em> is a configuration parameter. </p>
 </section>
 
 <section> <title>Configuration Parameters</title>
-<dl>
-	<dt><code>dfs.permissions = true </code></dt>
-	<dd>
-		If <code>yes</code> use the permissions system as described here. If <code>no</code>, permission <em>checking</em> is turned off, but all other behavior is unchanged. Switching from one parameter value to the other does not change the mode, owner or group of files or directories.
-		<p>
-		</p>
-		Regardless of whether permissions are on or off, <code>chmod</code>, <code>chgrp</code> and <code>chown</code> <em>always</em> check permissions. These functions are only useful in the permissions context, and so there is no backwards compatibility issue. Furthermore, this allows administrators to reliably set owners and permissions in advance of turning on regular permissions checking.
-	</dd>
-	<dt><code>dfs.web.ugi = webuser,webgroup</code></dt>
-	<dd>
-		The user name to be used by the web server. Setting this to the name of the super-user allows any web client to see everything. Changing this to an otherwise unused identity allows web clients to see only those things visible using "other" permissions. Additional groups may be added to the comma-separated list.
-	</dd>
-	<dt><code>dfs.permissions.supergroup = supergroup</code></dt>
-	<dd>
-		The name of the group of super-users.
-	</dd>
-	<dt><code>dfs.upgrade.permission = 0777</code></dt>
-	<dd>
-		The choice of initial mode during upgrade. The <em>x</em> permission is <em>never</em> set for files. For configuration files, the decimal value <em>511<sub>10</sub></em> may be used.
-	</dd>
-	<dt><code>dfs.umask = 022</code></dt>
-	<dd>
-		The <code>umask</code> used when creating files and directories. For configuration files, the decimal value <em>18<sub>10</sub></em> may be used.
-	</dd>
-</dl>
+<ul>
+	<li><code>dfs.permissions = true </code>
+		<br />If <code>yes</code>, use the permissions system as described here. If <code>no</code>, permission 
+		<em>checking</em> is turned off, but all other behavior is unchanged. Switching from one parameter 
+		value to the other does not change the mode, owner or group of files or directories.
+		<br />Regardless of whether permissions are on or off, <code>chmod</code>, <code>chgrp</code> and 
+		<code>chown</code> <em>always</em> check permissions. These functions are only useful in the 
+		permissions context, and so there is no backwards compatibility issue. Furthermore, this allows 
+		administrators to reliably set owners and permissions in advance of turning on regular permissions checking.
+    </li>
+
+	<li><code>dfs.web.ugi = webuser,webgroup</code>
+	<br />The user name to be used by the web server. Setting this to the name of the super-user allows any 
+		web client to see everything. Changing this to an otherwise unused identity allows web clients to see 
+		only those things visible using "other" permissions. Additional groups may be added to the comma-separated list.
+    </li>
+    
+	<li><code>dfs.permissions.supergroup = supergroup</code>
+	<br />The name of the group of super-users.
+	</li>
+
+	<li><code>dfs.upgrade.permission = 0777</code>
+	<br />The choice of initial mode during upgrade. The <em>x</em> permission is <em>never</em> set for files. 
+		For configuration files, the decimal value <em>511<sub>10</sub></em> may be used.
+    </li>
+    
+	<li><code>dfs.umask = 022</code>
+    <br />The <code>umask</code> used when creating files and directories. For configuration files, the decimal 
+		value <em>18<sub>10</sub></em> may be used.
+	</li>
+</ul>
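+<p>
+As a minimal sketch (using the default values listed above), these parameters are set in the configuration, 
+for example in <code>hdfs-site.xml</code>:
+</p>
+<source>
+&lt;property&gt;
+  &lt;name&gt;dfs.permissions&lt;/name&gt;
+  &lt;value&gt;true&lt;/value&gt;
+&lt;/property&gt;
+&lt;property&gt;
+  &lt;name&gt;dfs.permissions.supergroup&lt;/name&gt;
+  &lt;value&gt;supergroup&lt;/value&gt;
+&lt;/property&gt;
+</source>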
 </section>
 
      

Modified: hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_quota_admin_guide.xml
URL: http://svn.apache.org/viewvc/hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_quota_admin_guide.xml?rev=816432&r1=816431&r2=816432&view=diff
==============================================================================
--- hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_quota_admin_guide.xml (original)
+++ hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_quota_admin_guide.xml Fri Sep 18 01:58:17 2009
@@ -20,13 +20,16 @@
 
 <document>
 
- <header> <title> HDFS Quotas Guide</title> </header>
+ <header> <title>Quotas Guide</title> </header>
 
  <body>
+ 
+ <section> <title>Overview</title>
 
- <p> The Hadoop Distributed File System (HDFS) allows the administrator to set quotas for the number of names used and the
+ <p> The Hadoop Distributed File System (HDFS) allows the <strong>administrator</strong> to set quotas for the number of names used and the
 amount of space used for individual directories. Name quotas and space quotas operate independently, but the administration and
 implementation of the two types of quotas are closely parallel. </p>
+</section>
 
 <section> <title>Name Quotas</title>
 

Modified: hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_user_guide.xml
URL: http://svn.apache.org/viewvc/hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_user_guide.xml?rev=816432&r1=816431&r2=816432&view=diff
==============================================================================
--- hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_user_guide.xml (original)
+++ hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/hdfs_user_guide.xml Fri Sep 18 01:58:17 2009
@@ -24,7 +24,7 @@
 
   <header>
     <title>
-      HDFS User Guide
+      HDFS Users Guide
     </title>
   </header>
 
@@ -32,9 +32,8 @@
     <section> <title>Purpose</title>
       <p>
  This document is a starting point for users working with
- Hadoop Distributed File System (HDFS) either as a part of a
- <a href="http://hadoop.apache.org/">Hadoop</a>
- cluster or as a stand-alone general purpose distributed file system.
+ Hadoop Distributed File System (HDFS) either as a part of a Hadoop cluster  
+ or as a stand-alone general purpose distributed file system.
  While HDFS is designed to "just work" in many environments, a working
  knowledge of HDFS helps greatly with configuration improvements and
  diagnostics on a specific cluster.
@@ -46,7 +45,7 @@
  HDFS is the primary distributed storage used by Hadoop applications. A
  HDFS cluster primarily consists of a NameNode that manages the
  file system metadata and DataNodes that store the actual data. The
- <a href="hdfs_design.html">HDFS Architecture</a> describes HDFS in detail. This user guide primarily deals with 
+ <a href="hdfs_design.html">HDFS Architecture Guide</a> describes HDFS in detail. This user guide primarily deals with 
  the interaction of users and administrators with HDFS clusters. 
  The <a href="images/hdfsarchitecture.gif">HDFS architecture diagram</a> depicts 
  basic interactions among NameNode, the DataNodes, and the clients. 
@@ -61,8 +60,7 @@
     <li>
     	Hadoop, including HDFS, is well suited for distributed storage
     	and distributed processing using commodity hardware. It is fault
-    	tolerant, scalable, and extremely simple to expand.
-    	<a href="mapred_tutorial.html">Map/Reduce</a>,
+    	tolerant, scalable, and extremely simple to expand. MapReduce, 
     	well known for its simplicity and applicability for large set of
     	distributed applications, is an integral part of Hadoop.
     </li>
@@ -134,18 +132,17 @@
     </li>
     </ul>
     
-    </section> <section> <title> Pre-requisites </title>
+    </section> <section> <title> Prerequisites </title>
     <p>
- 	The following documents describe installation and set up of a
- 	Hadoop cluster : 
+ 	The following documents describe how to install and set up a Hadoop cluster: 
     </p>
  	<ul>
  	<li>
- 		<a href="quickstart.html">Hadoop Quick Start</a>
+ 		<a href="http://hadoop.apache.org/common/docs/current/single_node_setup.html">Single Node Setup</a>
  		for first-time users.
  	</li>
  	<li>
- 		<a href="cluster_setup.html">Hadoop Cluster Setup</a>
+ 		<a href="http://hadoop.apache.org/common/docs/current/cluster_setup.html">Cluster Setup</a>
  		for large, distributed clusters.
  	</li>
     </ul>
@@ -173,14 +170,15 @@
       Hadoop includes various shell-like commands that directly
       interact with HDFS and other file systems that Hadoop supports.
       The command
-      <code>bin/hadoop fs -help</code>
+      <code>bin/hdfs dfs -help</code>
       lists the commands supported by Hadoop
       shell. Furthermore, the command
-      <code>bin/hadoop fs -help command-name</code>
+      <code>bin/hdfs dfs -help command-name</code>
       displays more detailed help for a command. These commands support
-      most of the normal files ystem operations like copying files,
+      most of the normal file system operations like copying files,
       changing file permissions, etc. It also supports a few HDFS
-      specific operations like changing replication of files.
+      specific operations like changing replication of files. 
+      For more information see <a href="http://hadoop.apache.org/common/docs/current/file_system_shell.html">File System Shell Guide</a>.
      </p>
 
    <section> <title> DFSAdmin Command </title>
@@ -223,17 +221,19 @@
     </li>
    	</ul>
    	<p>
-   	  For command usage, see <a href="commands_manual.html#dfsadmin">dfsadmin command</a>.
+   	  For command usage, see  
+   	  <a href="http://hadoop.apache.org/common/docs/current/commands_manual.html#dfsadmin">dfsadmin</a>.
    	</p>  
    </section>
    
    </section> 
 	<section> <title>Secondary NameNode</title>
-   <p>
-     The Secondary NameNode has been deprecated; considering using the 
-   <a href="hdfs_user_guide.html#Checkpoint+node">Checkpoint node</a> or 
-   <a href="hdfs_user_guide.html#Backup+node">Backup node</a> instead.
-   </p>
+   <note>
+   The Secondary NameNode has been deprecated. 
+   Instead, consider using the 
+   <a href="hdfs_user_guide.html#Checkpoint+node">Checkpoint Node</a> or 
+   <a href="hdfs_user_guide.html#Backup+node">Backup Node</a>.
+   </note>
    <p>	
      The NameNode stores modifications to the file system as a log
      appended to a native file system file, <code>edits</code>. 
@@ -277,10 +277,11 @@
      read by the primary NameNode if necessary.
    </p>
    <p>
-     For command usage, see <a href="commands_manual.html#secondarynamenode"><code>secondarynamenode</code> command</a>.
+     For command usage, see  
+     <a href="http://hadoop.apache.org/common/docs/current/commands_manual.html#secondarynamenode">secondarynamenode</a>.
    </p>
    
-   </section><section> <title> Checkpoint node </title>
+   </section><section> <title> Checkpoint Node </title>
    <p>NameNode persists its namespace using two files: <code>fsimage</code>,
       which is the latest checkpoint of the namespace and <code>edits</code>,
       a journal (log) of changes to the namespace since the checkpoint.
@@ -329,17 +330,17 @@
    </p>
    <p>Multiple checkpoint nodes may be specified in the cluster configuration file.</p>
    <p>
-     For command usage, see
-     <a href="commands_manual.html#namenode"><code>namenode</code> command</a>.
+     For command usage, see  
+     <a href="http://hadoop.apache.org/common/docs/current/commands_manual.html#namenode">namenode</a>.
    </p>
    </section>
 
-   <section> <title> Backup node </title>
+   <section> <title> Backup Node </title>
    <p>	
     The Backup node provides the same checkpointing functionality as the 
     Checkpoint node, as well as maintaining an in-memory, up-to-date copy of the
     file system namespace that is always synchronized with the active NameNode state.
-    Along with accepting a journal stream of filesystem edits from 
+    Along with accepting a journal stream of file system edits from 
     the NameNode and persisting this to disk, the Backup node also applies 
     those edits into its own copy of the namespace in memory, thus creating 
     a backup of the namespace.
@@ -384,12 +385,12 @@
     For a complete discussion of the motivation behind the creation of the 
     Backup node and Checkpoint node, see 
     <a href="https://issues.apache.org/jira/browse/HADOOP-4539">HADOOP-4539</a>.
-    For command usage, see 
-    <a href="commands_manual.html#namenode"><code>namenode</code> command</a>.
+    For command usage, see  
+     <a href="http://hadoop.apache.org/common/docs/current/commands_manual.html#namenode">namenode</a>.
    </p>
    </section>
 
-   <section> <title> Import checkpoint </title>
+   <section> <title> Import Checkpoint </title>
    <p>
      The latest checkpoint can be imported to the NameNode if
      all other copies of the image and the edits files are lost.
@@ -418,8 +419,8 @@
      consistent, but does not modify it in any way.
    </p>
    <p>
-     For command usage, see
-     <a href="commands_manual.html#namenode"><code>namenode</code> command</a>.
+     For command usage, see  
+      <a href="http://hadoop.apache.org/common/docs/current/commands_manual.html#namenode">namenode</a>.
    </p>
    </section>
 
@@ -461,7 +462,8 @@
       <a href="http://issues.apache.org/jira/browse/HADOOP-1652">HADOOP-1652</a>.
     </p>
     <p>
-     For command usage, see <a href="commands_manual.html#balancer">balancer command</a>.
+     For command usage, see  
+     <a href="http://hadoop.apache.org/common/docs/current/commands_manual.html#balancer">balancer</a>.
    </p>
     
    </section> <section> <title> Rack Awareness </title>
@@ -512,7 +514,8 @@
       <code>fsck</code> ignores open files but provides an option to select all files during reporting.
       The HDFS <code>fsck</code> command is not a
       Hadoop shell command. It can be run as '<code>bin/hadoop fsck</code>'.
-      For command usage, see <a href="commands_manual.html#fsck"><code>fsck</code> command</a>. 
+      For command usage, see  
+      <a href="http://hadoop.apache.org/common/docs/current/commands_manual.html#fsck">fsck</a>.
       <code>fsck</code> can be run on the whole file system or on a subset of files.
      </p>
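+      <p>For example (the target path and the reporting options shown are illustrative), a subtree can be 
+      checked with:</p>
+      <source>
+bin/hadoop fsck /user/theuser -files -blocks
+      </source>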
      
@@ -527,7 +530,7 @@
       of Hadoop and rollback the cluster to the state it was in 
       before
       the upgrade. HDFS upgrade is described in more detail in 
-      <a href="http://wiki.apache.org/hadoop/Hadoop%20Upgrade">upgrade wiki</a>.
+      the <a href="http://wiki.apache.org/hadoop/Hadoop%20Upgrade">Hadoop Upgrade</a> Wiki page.
       HDFS can have one such backup at a time. Before upgrading,
       administrators need to remove existing backup using <code>bin/hadoop
       dfsadmin -finalizeUpgrade</code> command. The following
@@ -571,13 +574,13 @@
       treated as the superuser for HDFS. Future versions of HDFS will
       support network authentication protocols like Kerberos for user
       authentication and encryption of data transfers. The details are discussed in the 
-      <a href="hdfs_permissions_guide.html">HDFS Admin Guide: Permissions</a>.
+      <a href="hdfs_permissions_guide.html">Permissions Guide</a>.
      </p>
      
    </section> <section> <title> Scalability </title>
      <p>
-      Hadoop currently runs on clusters with thousands of nodes.
-      <a href="http://wiki.apache.org/hadoop/PoweredBy">Powered By Hadoop</a>
+      Hadoop currently runs on clusters with thousands of nodes. The  
+      <a href="http://wiki.apache.org/hadoop/PoweredBy">PoweredBy</a> Wiki page 
       lists some of the organizations that deploy Hadoop on large
       clusters. HDFS has one NameNode for each cluster. Currently
       the total memory available on NameNode is the primary scalability
@@ -585,8 +588,8 @@
       files stored in HDFS helps with increasing cluster size without
       increasing memory requirements on NameNode.
    
-      The default configuration may not suite very large clustes.
-      <a href="http://wiki.apache.org/hadoop/FAQ">Hadoop FAQ</a> page lists
+      The default configuration may not suit very large clusters. The 
+      <a href="http://wiki.apache.org/hadoop/FAQ">FAQ</a> Wiki page lists
       suggested configuration improvements for large Hadoop clusters.
      </p>
      
@@ -599,15 +602,16 @@
       </p>
       <ul>
       <li>
-        <a href="http://hadoop.apache.org/">Hadoop Home Page</a>: The start page for everything Hadoop.
+        <a href="http://hadoop.apache.org/">Hadoop Site</a>: The home page for the Apache Hadoop site.
       </li>
       <li>
-        <a href="http://wiki.apache.org/hadoop/FrontPage">Hadoop Wiki</a>
-        : Front page for Hadoop Wiki documentation. Unlike this
-        guide which is part of Hadoop source tree, Hadoop Wiki is
+        <a href="http://wiki.apache.org/hadoop/FrontPage">Hadoop Wiki</a>:
+        The home page (FrontPage) for the Hadoop Wiki. Unlike the released documentation, 
+        which is part of the Hadoop source tree, the Hadoop Wiki is
         regularly edited by Hadoop Community.
       </li>
-      <li> <a href="http://wiki.apache.org/hadoop/FAQ">FAQ</a> from Hadoop Wiki.
+      <li> <a href="http://wiki.apache.org/hadoop/FAQ">FAQ</a>: 
+      The FAQ Wiki page.
       </li>
       <li>
         Hadoop <a href="http://hadoop.apache.org/core/docs/current/api/">
@@ -623,7 +627,7 @@
          description of most of the configuration variables available.
       </li>
       <li>
-        <a href="commands_manual.html">Hadoop Command Guide</a>: commands usage.
+        <a href="http://hadoop.apache.org/common/docs/current/commands_manual.html">Hadoop Commands Guide</a>: Hadoop commands usage.
       </li>
       </ul>
      </section>

Modified: hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/index.xml
URL: http://svn.apache.org/viewvc/hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/index.xml?rev=816432&r1=816431&r2=816432&view=diff
==============================================================================
--- hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/index.xml (original)
+++ hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/index.xml Fri Sep 18 01:58:17 2009
@@ -19,19 +19,28 @@
 <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
 
 <document>
-  
   <header>
     <title>Overview</title>
   </header>
-  
   <body>
+  
   <p>
-  The Hadoop Documentation provides the information you need to get started using Hadoop, the Hadoop Distributed File System (HDFS), and Hadoop on Demand (HOD).
-  </p><p>
-Begin with the <a href="quickstart.html">Hadoop Quick Start</a> which shows you how to set up a single-node Hadoop installation. Then move on to the <a href="cluster_setup.html">Hadoop Cluster Setup</a> to learn how to set up a multi-node Hadoop installation. Once your Hadoop installation is in place, try out the <a href="mapred_tutorial.html">Hadoop Map/Reduce Tutorial</a>. 
-  </p><p>
-If you have more questions, you can ask on the <a href="ext:lists">Hadoop Core Mailing Lists</a> or browse the <a href="ext:archive">Mailing List Archives</a>.
-    </p>
-  </body>
+  The HDFS Documentation provides the information you need to get started using the Hadoop Distributed File System. 
+  Begin with the <a href="hdfs_user_guide.html">HDFS Users Guide</a> to obtain an overview of the system and then
+  move on to the <a href="hdfs_design.html">HDFS Architecture Guide</a> for more detailed information.
+  </p>
   
+  <p>
+   HDFS commonly works in tandem with a cluster environment and MapReduce applications. 
+   For information about Hadoop clusters (single or multi node) see the 
+ <a href="http://hadoop.apache.org/common/docs/current/index.html">Hadoop Common Documentation</a>.
+   For information about MapReduce see the 
+ <a href="http://hadoop.apache.org/mapreduce/docs/current/index.html">MapReduce Documentation</a>.
+  </p>   
+  
+<p>
+If you have more questions, you can ask on the <a href="ext:lists">HDFS Mailing Lists</a> or browse the <a href="ext:archive">Mailing List Archives</a>.
+</p>
+
+</body>
 </document>

Modified: hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/libhdfs.xml
URL: http://svn.apache.org/viewvc/hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/libhdfs.xml?rev=816432&r1=816431&r2=816432&view=diff
==============================================================================
--- hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/libhdfs.xml (original)
+++ hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/libhdfs.xml Fri Sep 18 01:58:17 2009
@@ -19,20 +19,22 @@
 <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN"
           "http://forrest.apache.org/dtd/document-v20.dtd">
 
-
 <document>
 <header>
-<title>C API to HDFS: libhdfs</title>
+<title>C API libhdfs</title>
 <meta name="http-equiv">Content-Type</meta>
 <meta name="content">text/html;</meta>
 <meta name="charset">utf-8</meta>
 </header>
 <body>
 <section>
-<title>C API to HDFS: libhdfs</title>
+<title>Overview</title>
 
 <p>
-libhdfs is a JNI based C api for Hadoop's DFS. It provides C apis to a subset of the HDFS APIs to manipulate DFS files and the filesystem. libhdfs is part of the hadoop distribution and comes pre-compiled in ${HADOOP_HOME}/libhdfs/libhdfs.so .
+libhdfs is a JNI-based C API for Hadoop's Distributed File System (HDFS).
+It provides C APIs to a subset of the HDFS APIs to manipulate HDFS files and
+the filesystem. libhdfs is part of the Hadoop distribution and comes 
+pre-compiled in ${HADOOP_HOME}/libhdfs/libhdfs.so.
 </p>
 
 </section>
@@ -47,7 +49,7 @@
 </p>
 </section>
 <section>
-<title>A sample program</title>
+<title>A Sample Program</title>
 
 <source>
 #include "hdfs.h" 
@@ -69,29 +71,40 @@
     }
    hdfsCloseFile(fs, writeFile);
 }
-
 </source>
 </section>
 
 <section>
-<title>How to link with the library</title>
+<title>How To Link With The Library</title>
 <p>
-See the Makefile for hdfs_test.c in the libhdfs source directory (${HADOOP_HOME}/src/c++/libhdfs/Makefile) or something like:
+See the Makefile for hdfs_test.c in the libhdfs source directory (${HADOOP_HOME}/src/c++/libhdfs/Makefile) or something like:<br />
 gcc above_sample.c -I${HADOOP_HOME}/src/c++/libhdfs -L${HADOOP_HOME}/libhdfs -lhdfs -o above_sample
 </p>
 </section>
 <section>
-<title>Common problems</title>
+<title>Common Problems</title>
 <p>
-The most common problem is the CLASSPATH is not set properly when calling a program that uses libhdfs. Make sure you set it to all the hadoop jars needed to run Hadoop itself. Currently, there is no way to programmatically generate the classpath, but a good bet is to include all the jar files in ${HADOOP_HOME} and ${HADOOP_HOME}/lib as well as the right configuration directory containing hdfs-site.xml
+The most common problem is that the CLASSPATH is not set properly when calling a program that uses libhdfs. 
+Make sure you set it to all the Hadoop jars needed to run Hadoop itself. Currently, there is no way to 
+programmatically generate the classpath, but a good bet is to include all the jar files in ${HADOOP_HOME} 
+and ${HADOOP_HOME}/lib as well as the right configuration directory containing hdfs-site.xml.
 </p>
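+<p>A rough sketch of one way to build such a classpath, assuming the configuration directory is 
+${HADOOP_HOME}/conf:</p>
+<source>
+CLASSPATH=${HADOOP_HOME}/conf
+for f in ${HADOOP_HOME}/*.jar ${HADOOP_HOME}/lib/*.jar; do
+  CLASSPATH=${CLASSPATH}:$f
+done
+export CLASSPATH
+</source>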
 </section>
 <section>
-<title>libhdfs is thread safe</title>
-<p>Concurrency and Hadoop FS "handles" - the hadoop FS implementation includes a FS handle cache which caches based on the URI of the namenode along with the user connecting. So, all calls to hdfsConnect will return the same handle but calls to hdfsConnectAsUser with different users will return different handles.  But, since HDFS client handles are completely thread safe, this has no bearing on concurrency. 
-</p>
-<p>Concurrency and libhdfs/JNI - the libhdfs calls to JNI should always be creating thread local storage, so (in theory), libhdfs should be as thread safe as the underlying calls to the Hadoop FS.
-</p>
+<title>Thread Safety</title>
+<p>libhdfs is thread safe.</p>
+<ul>
+<li>Concurrency and Hadoop FS "handles" 
+<br />The Hadoop FS implementation includes an FS handle cache which caches based on the URI of the 
+namenode along with the user connecting. So, all calls to hdfsConnect will return the same handle, but 
+calls to hdfsConnectAsUser with different users will return different handles. However, since HDFS client 
+handles are completely thread safe, this has no bearing on concurrency. 
+</li>
+<li>Concurrency and libhdfs/JNI 
+<br />The libhdfs calls to JNI should always create thread-local storage, so (in theory) libhdfs 
+should be as thread safe as the underlying calls to the Hadoop FS.
+</li>
+</ul>
 </section>
 </body>
 </document>
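
To complement the write-only sample program in the libhdfs.xml diff above, here is a minimal
read-side sketch. It is hedged: it assumes the hdfs.h header shipped under
${HADOOP_HOME}/src/c++/libhdfs, a CLASSPATH set as described in the Common Problems section,
and a purely illustrative path /tmp/testfile.txt. The calls hdfsOpenFile, hdfsRead and
hdfsDisconnect are standard libhdfs entry points not visible in the truncated sample; the
program can be compiled with the same gcc invocation shown in the linking section.

    #include "hdfs.h"
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
        const char *readPath = "/tmp/testfile.txt";   /* illustrative path only */

        /* Connect to the namenode named in the loaded Hadoop configuration. */
        hdfsFS fs = hdfsConnect("default", 0);
        if (!fs) {
            fprintf(stderr, "Failed to connect to HDFS!\n");
            exit(-1);
        }

        /* bufferSize, replication and blocksize of 0 mean "use the configured defaults". */
        hdfsFile readFile = hdfsOpenFile(fs, readPath, O_RDONLY, 0, 0, 0);
        if (!readFile) {
            fprintf(stderr, "Failed to open %s for reading!\n", readPath);
            exit(-1);
        }

        char buffer[256];
        tSize numRead = hdfsRead(fs, readFile, buffer, sizeof(buffer) - 1);
        if (numRead >= 0) {
            buffer[numRead] = '\0';
            printf("Read %d bytes\n", (int) numRead);
        }

        hdfsCloseFile(fs, readFile);
        hdfsDisconnect(fs);
        return 0;
    }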
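
The handle-caching behavior described in the Thread Safety section can be illustrated with a
small sketch. This is an assumption-laden example, not part of the committed documentation: it
assumes POSIX threads are available on the build host (link with -lpthread in addition to the
flags above), that hdfsFS is an opaque pointer as declared in hdfs.h, and that both threads
connect to the same default namenode as the same user, in which case they are expected to
observe the same cached handle.

    #include "hdfs.h"
    #include <pthread.h>
    #include <stdio.h>

    /* Each thread connects to the namenode named in the loaded configuration. */
    static void *connect_once(void *unused) {
        hdfsFS fs = hdfsConnect("default", 0);
        return fs;   /* hdfsFS is an opaque pointer, so it can be returned directly */
    }

    int main(void) {
        pthread_t t1, t2;
        void *fs1 = NULL;
        void *fs2 = NULL;

        pthread_create(&t1, NULL, connect_once, NULL);
        pthread_create(&t2, NULL, connect_once, NULL);
        pthread_join(t1, &fs1);
        pthread_join(t2, &fs2);

        /* The FS handle cache keys on namenode URI plus connecting user, so both
           threads are expected to receive the same underlying handle. */
        printf("Handles %s\n", (fs1 == fs2) ? "match" : "differ");
        return 0;
    }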

Modified: hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/site.xml
URL: http://svn.apache.org/viewvc/hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/site.xml?rev=816432&r1=816431&r2=816432&view=diff
==============================================================================
--- hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/site.xml (original)
+++ hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/site.xml Fri Sep 18 01:58:17 2009
@@ -33,44 +33,20 @@
 <site label="Hadoop" href="" xmlns="http://apache.org/forrest/linkmap/1.0">
   
    <docs label="Getting Started"> 
-		<overview   				label="Overview" 					href="index.html" />
-		<quickstart 				label="Quick Start"        		href="quickstart.html" />
-		<setup     					label="Cluster Setup"      		href="cluster_setup.html" />
-		<mapred    				label="Map/Reduce Tutorial" 	href="mapred_tutorial.html" />
-  </docs>	
-		
- <docs label="Programming Guides">
-		<commands 				label="Commands"     					href="commands_manual.html" />
-		<distcp    					label="DistCp"       						href="distcp.html" />
-		<native_lib    				label="Native Libraries" 					href="native_libraries.html" />
-		<streaming 				label="Streaming"          				href="streaming.html" />
-		<fair_scheduler 			label="Fair Scheduler" 					href="fair_scheduler.html"/>
-        <hdfsproxy 			label="HDFS Proxy" 					href="hdfsproxy.html"/>
-		<cap_scheduler 		label="Capacity Scheduler" 			href="capacity_scheduler.html"/>
-		<SLA					 	label="Service Level Authorization" 	href="service_level_auth.html"/>
-		<vaidya    					label="Vaidya" 								href="vaidya.html"/>
-		<archives  				label="Archives"     						href="hadoop_archives.html"/>
+     <hdfsproxy 			label="HDFS Proxy" 					href="hdfsproxy.html"/>
+     <hdfs_user      				label="User Guide"    							href="hdfs_user_guide.html" />
+     <hdfs_arch     				label="Architecture"  								href="hdfs_design.html" />	
    </docs>
-   
-   <docs label="HDFS">
-		<hdfs_user      				label="User Guide"    							href="hdfs_user_guide.html" />
-		<hdfs_arch     				label="Architecture"  								href="hdfs_design.html" />	
-		<hdfs_fs       	 				label="File System Shell Guide"     		href="hdfs_shell.html" />
-		<hdfs_perm      				label="Permissions Guide"    					href="hdfs_permissions_guide.html" />
-		<hdfs_quotas     			label="Quotas Guide" 							href="hdfs_quota_admin_guide.html" />
-		<hdfs_SLG        			label="Synthetic Load Generator Guide"  href="SLG_user_guide.html" />
-		<hdfs_imageviewer						label="Offline Image Viewer Guide"	href="hdfs_imageviewer.html" />
-		<hdfs_libhdfs   				label="C API libhdfs"         						href="libhdfs.html" /> 
-                <docs label="Testing">
-                    <faultinject_framework              label="Fault Injection"                                                     href="faultinject_framework.html" />
-                </docs>
-   </docs> 
-   
-   <docs label="HOD">
-		<hod_user 	label="User Guide" 	href="hod_user_guide.html"/>
-		<hod_admin 	label="Admin Guide" 	href="hod_admin_guide.html"/>
-		<hod_config 	label="Config Guide" 	href="hod_config_guide.html"/> 
-   </docs> 
+   <docs label="Guides">
+      <hdfs_perm      				label="Permissions Guide"    					href="hdfs_permissions_guide.html" />
+      <hdfs_quotas     			label="Quotas Guide" 							href="hdfs_quota_admin_guide.html" />
+      <hdfs_SLG        			label="Synthetic Load Generator Guide"  href="SLG_user_guide.html" />
+      <hdfs_imageviewer						label="Offline Image Viewer Guide"	href="hdfs_imageviewer.html" />
+      <hdfs_libhdfs   				label="C API libhdfs"         						href="libhdfs.html" /> 
+    </docs>
+    <docs label="Testing">
+      <faultinject_framework              label="Fault Injection"                                                     href="faultinject_framework.html" />
+    </docs>
    
    <docs label="Miscellaneous"> 
 		<api       	label="API Docs"           href="ext:api/index" />
@@ -82,19 +58,20 @@
    </docs> 
    
   <external-refs>
-    <site      href="http://hadoop.apache.org/core/"/>
-    <lists     href="http://hadoop.apache.org/core/mailing_lists.html"/>
-    <archive   href="http://mail-archives.apache.org/mod_mbox/hadoop-core-commits/"/>
-    <releases  href="http://hadoop.apache.org/core/releases.html">
-      <download href="#Download" />
+    <site      href="http://hadoop.apache.org/hdfs/"/>
+    <lists     href="http://hadoop.apache.org/hdfs/mailing_lists.html"/>
+    <archive   href="http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-commits/"/>
+    <releases  href="http://hadoop.apache.org/hdfs/releases.html">
+              <download href="#Download" />
     </releases>
-    <jira      href="http://hadoop.apache.org/core/issue_tracking.html"/>
-    <wiki      href="http://wiki.apache.org/hadoop/" />
-    <faq       href="http://wiki.apache.org/hadoop/FAQ" />
-    <hadoop-default href="http://hadoop.apache.org/core/docs/current/hadoop-default.html" />
-    <core-default href="http://hadoop.apache.org/core/docs/current/core-default.html" />
-    <hdfs-default href="http://hadoop.apache.org/core/docs/current/hdfs-default.html" />
-    <mapred-default href="http://hadoop.apache.org/core/docs/current/mapred-default.html" />
+    <jira      href="http://hadoop.apache.org/hdfs/issue_tracking.html"/>
+    <wiki      href="http://wiki.apache.org/hadoop/HDFS" />
+    <faq       href="http://wiki.apache.org/hadoop/HDFS/FAQ" />
+    
+    <common-default href="http://hadoop.apache.org/common/docs/current/common-default.html" />
+    <hdfs-default href="http://hadoop.apache.org/hdfs/docs/current/hdfs-default.html" />
+    <mapred-default href="http://hadoop.apache.org/mapreduce/docs/current/mapred-default.html" />
+    
     <zlib      href="http://www.zlib.net/" />
     <gzip      href="http://www.gzip.org/" />
     <bzip      href="http://www.bzip.org/" />

Modified: hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/tabs.xml
URL: http://svn.apache.org/viewvc/hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/tabs.xml?rev=816432&r1=816431&r2=816432&view=diff
==============================================================================
--- hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/tabs.xml (original)
+++ hadoop/hdfs/trunk/src/docs/src/documentation/content/xdocs/tabs.xml Fri Sep 18 01:58:17 2009
@@ -30,8 +30,8 @@
     directory (ends in '/'), in which case /index.html will be added
   -->
 
-  <tab label="Project" href="http://hadoop.apache.org/core/" />
-  <tab label="Wiki" href="http://wiki.apache.org/hadoop" />
-  <tab label="Hadoop 0.21 Documentation" dir="" />  
+  <tab label="Project" href="http://hadoop.apache.org/hdfs/" />
+  <tab label="Wiki" href="http://wiki.apache.org/hadoop/hdfs" />
+  <tab label="HDFS 0.21 Documentation" dir="" />  
   
 </tabs>

Added: hadoop/hdfs/trunk/src/docs/src/documentation/resources/images/hdfs-logo.jpg
URL: http://svn.apache.org/viewvc/hadoop/hdfs/trunk/src/docs/src/documentation/resources/images/hdfs-logo.jpg?rev=816432&view=auto
==============================================================================
Binary file - no diff available.

Propchange: hadoop/hdfs/trunk/src/docs/src/documentation/resources/images/hdfs-logo.jpg
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Modified: hadoop/hdfs/trunk/src/docs/src/documentation/skinconf.xml
URL: http://svn.apache.org/viewvc/hadoop/hdfs/trunk/src/docs/src/documentation/skinconf.xml?rev=816432&r1=816431&r2=816432&view=diff
==============================================================================
--- hadoop/hdfs/trunk/src/docs/src/documentation/skinconf.xml (original)
+++ hadoop/hdfs/trunk/src/docs/src/documentation/skinconf.xml Fri Sep 18 01:58:17 2009
@@ -67,8 +67,8 @@
   <!-- project logo -->
   <project-name>Hadoop</project-name>
   <project-description>Scalable Computing Platform</project-description>
-  <project-url>http://hadoop.apache.org/core/</project-url>
-  <project-logo>images/core-logo.gif</project-logo>
+  <project-url>http://hadoop.apache.org/hdfs/</project-url>
+  <project-logo>images/hdfs-logo.jpg</project-logo>
 
   <!-- group logo -->
   <group-name>Hadoop</group-name>
@@ -146,13 +146,13 @@
     <!--Headers -->
 	#content h1 {
 	  margin-bottom: .5em;
-	  font-size: 200%; color: black;
+	  font-size: 185%; color: black;
 	  font-family: arial;
 	}  
-    h2, .h3 { font-size: 195%; color: black; font-family: arial; }
-	h3, .h4 { font-size: 140%; color: black; font-family: arial; margin-bottom: 0.5em; }
+    h2, .h3 { font-size: 175%; color: black; font-family: arial; }
+	h3, .h4 { font-size: 135%; color: black; font-family: arial; margin-bottom: 0.5em; }
 	h4, .h5 { font-size: 125%; color: black;  font-style: italic; font-weight: bold; font-family: arial; }
-	h5, h6 { font-size: 110%; color: #363636; font-weight: bold; } 
+	h5, h6 { font-size: 110%; color: #363636; font-weight: bold; }    
    
    <!--Code Background -->
     pre.code {