Posted to common-commits@hadoop.apache.org by om...@apache.org on 2011/03/04 05:07:36 UTC

svn commit: r1077365 [5/5] - in /hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation: ./ content/xdocs/ resources/images/

Modified: hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/libhdfs.xml
URL: http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/libhdfs.xml?rev=1077365&r1=1077364&r2=1077365&view=diff
==============================================================================
--- hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/libhdfs.xml (original)
+++ hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/libhdfs.xml Fri Mar  4 04:07:36 2011
@@ -1,18 +1,19 @@
 <?xml version="1.0"?>
 <!--
-  Copyright 2002-2004 The Apache Software Foundation
-
-  Licensed under the Apache License, Version 2.0 (the "License");
-  you may not use this file except in compliance with the License.
-  You may obtain a copy of the License at
-
-      http://www.apache.org/licenses/LICENSE-2.0
-
-  Unless required by applicable law or agreed to in writing, software
-  distributed under the License is distributed on an "AS IS" BASIS,
-  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-  See the License for the specific language governing permissions and
-  limitations under the License.
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
 -->
 
 <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN"
@@ -21,17 +22,20 @@
 
 <document>
 <header>
-<title>C API to HDFS: libhdfs</title>
+<title>C API libhdfs</title>
 <meta name="http-equiv">Content-Type</meta>
 <meta name="content">text/html;</meta>
 <meta name="charset">utf-8</meta>
 </header>
 <body>
 <section>
-<title>C API to HDFS: libhdfs</title>
+<title>Overview</title>
 
 <p>
-libhdfs is a JNI based C api for Hadoop's DFS. It provides C apis to a subset of the HDFS APIs to manipulate DFS files and the filesystem. libhdfs is part of the hadoop distribution and comes pre-compiled in ${HADOOP_HOME}/libhdfs/libhdfs.so .
+libhdfs is a JNI based C API for Hadoop's Distributed File System (HDFS).
+It provides C APIs to a subset of the HDFS APIs to manipulate HDFS files and
+the filesystem. libhdfs is part of the Hadoop distribution and comes 
+pre-compiled in ${HADOOP_HOME}/libhdfs/libhdfs.so.
 </p>
 
 </section>
@@ -46,7 +50,7 @@ The header file for libhdfs describes ea
 </p>
 </section>
 <section>
-<title>A sample program</title>
+<title>A Sample Program</title>
 
 <source>
 #include "hdfs.h" 
@@ -73,24 +77,36 @@ int main(int argc, char **argv) {
 </section>
 
 <section>
-<title>How to link with the library</title>
+<title>How To Link With The Library</title>
 <p>
-See the Makefile for hdfs_test.c in the libhdfs source directory (${HADOOP_HOME}/src/c++/libhdfs/Makefile) or something like:
+See the Makefile for hdfs_test.c in the libhdfs source directory (${HADOOP_HOME}/src/c++/libhdfs/Makefile) or something like:<br />
 gcc above_sample.c -I${HADOOP_HOME}/src/c++/libhdfs -L${HADOOP_HOME}/libhdfs -lhdfs -o above_sample
 </p>
 </section>
 <section>
-<title>Common problems</title>
+<title>Common Problems</title>
 <p>
-The most common problem is the CLASSPATH is not set properly when calling a program that uses libhdfs. Make sure you set it to all the hadoop jars needed to run Hadoop itself. Currently, there is no way to programmatically generate the classpath, but a good bet is to include all the jar files in ${HADOOP_HOME} and ${HADOOP_HOME}/lib as well as the right configuration directory containing hdfs-site.xml
+The most common problem is that the CLASSPATH is not set properly when calling a program that uses libhdfs. 
+Make sure you set it to all the Hadoop jars needed to run Hadoop itself. Currently, there is no way to 
+programmatically generate the classpath, but a good bet is to include all the jar files in ${HADOOP_HOME} 
+and ${HADOOP_HOME}/lib as well as the right configuration directory containing hdfs-site.xml.
 </p>
 </section>
 <section>
-<title>libhdfs is thread safe</title>
-<p>Concurrency and Hadoop FS "handles" - the hadoop FS implementation includes a FS handle cache which caches based on the URI of the namenode along with the user connecting. So, all calls to hdfsConnect will return the same handle but calls to hdfsConnectAsUser with different users will return different handles.  But, since HDFS client handles are completely thread safe, this has no bearing on concurrency. 
-</p>
-<p>Concurrency and libhdfs/JNI - the libhdfs calls to JNI should always be creating thread local storage, so (in theory), libhdfs should be as thread safe as the underlying calls to the Hadoop FS.
-</p>
+<title>Thread Safe</title>
+<p>libhdfs is thread safe</p>
+<ul>
+<li>Concurrency and Hadoop FS "handles" 
+<br />The Hadoop FS implementation includes a FS handle cache which caches based on the URI of the 
+namenode along with the user connecting. So, all calls to hdfsConnect will return the same handle but 
+calls to hdfsConnectAsUser with different users will return different handles.  But, since HDFS client 
+handles are completely thread safe, this has no bearing on concurrency. 
+</li>
+<li>Concurrency and libhdfs/JNI 
+<br />The libhdfs calls to JNI should always be creating thread local storage, so (in theory), libhdfs 
+should be as thread safe as the underlying calls to the Hadoop FS.
+</li>
+</ul>
 </section>
 </body>
 </document>

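For reference, the libhdfs usage described in the hunks above can be sketched end to end. The C program below is a minimal sketch built only on the public hdfs.h calls (hdfsConnect, hdfsOpenFile, hdfsWrite, hdfsFlush, hdfsCloseFile, hdfsDisconnect); the "default" filesystem alias, the /tmp/testfile.txt path, and the message text are illustrative assumptions, not part of the committed page.

    #include "hdfs.h"     /* ${HADOOP_HOME}/src/c++/libhdfs/hdfs.h */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(int argc, char **argv) {
        /* "default" resolves the namenode from the configuration on the CLASSPATH. */
        hdfsFS fs = hdfsConnect("default", 0);
        if (!fs) {
            fprintf(stderr, "failed to connect to HDFS\n");
            exit(1);
        }
        const char *path = "/tmp/testfile.txt";   /* illustrative path */
        hdfsFile out = hdfsOpenFile(fs, path, O_WRONLY | O_CREAT, 0, 0, 0);
        if (!out) {
            fprintf(stderr, "failed to open %s for writing\n", path);
            exit(1);
        }
        const char *msg = "hello from libhdfs\n";
        hdfsWrite(fs, out, (void *) msg, strlen(msg));
        hdfsFlush(fs, out);
        hdfsCloseFile(fs, out);
        hdfsDisconnect(fs);
        return 0;
    }

Compile it with the gcc line from the "How To Link With The Library" section and run it with CLASSPATH set as the "Common Problems" section advises.
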
Modified: hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml
URL: http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml?rev=1077365&r1=1077364&r2=1077365&view=diff
==============================================================================
--- hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml (original)
+++ hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml Fri Mar  4 04:07:36 2011
@@ -20,7 +20,7 @@
 <document>
   
   <header>
-    <title>Map/Reduce Tutorial</title>
+    <title>MapReduce Tutorial</title>
   </header>
   
   <body>
@@ -29,21 +29,21 @@
       <title>Purpose</title>
       
       <p>This document comprehensively describes all user-facing facets of the 
-      Hadoop Map/Reduce framework and serves as a tutorial.
+      Hadoop MapReduce framework and serves as a tutorial.
       </p>
     </section>
     
     <section>
-      <title>Pre-requisites</title>
+      <title>Prerequisites</title>
       
       <p>Ensure that Hadoop is installed, configured and is running. More
       details:</p> 
       <ul>
         <li>
-          <a href="quickstart.html">Hadoop Quick Start</a> for first-time users.
+          <a href="single_node_setup.html">Single Node Setup</a> for first-time users.
         </li>
         <li>
-          <a href="cluster_setup.html">Hadoop Cluster Setup</a> for large, 
+          <a href="cluster_setup.html">Cluster Setup</a> for large, 
           distributed clusters.
         </li>
       </ul>
@@ -52,12 +52,12 @@
     <section>
       <title>Overview</title>
       
-      <p>Hadoop Map/Reduce is a software framework for easily writing 
+      <p>Hadoop MapReduce is a software framework for easily writing 
       applications which process vast amounts of data (multi-terabyte data-sets) 
       in-parallel on large clusters (thousands of nodes) of commodity 
       hardware in a reliable, fault-tolerant manner.</p>
       
-      <p>A Map/Reduce <em>job</em> usually splits the input data-set into 
+      <p>A MapReduce <em>job</em> usually splits the input data-set into 
       independent chunks which are processed by the <em>map tasks</em> in a
       completely parallel manner. The framework sorts the outputs of the maps, 
       which are then input to the <em>reduce tasks</em>. Typically both the 
@@ -66,13 +66,13 @@
       tasks.</p>
       
       <p>Typically the compute nodes and the storage nodes are the same, that is, 
-      the Map/Reduce framework and the Hadoop Distributed File System (see <a href="hdfs_design.html">HDFS Architecture </a>) 
+      the MapReduce framework and the Hadoop Distributed File System (see <a href="hdfs_design.html">HDFS Architecture Guide</a>) 
       are running on the same set of nodes. This configuration
       allows the framework to effectively schedule tasks on the nodes where data 
       is already present, resulting in very high aggregate bandwidth across the 
       cluster.</p>
       
-      <p>The Map/Reduce framework consists of a single master 
+      <p>The MapReduce framework consists of a single master 
       <code>JobTracker</code> and one slave <code>TaskTracker</code> per 
       cluster-node. The master is responsible for scheduling the jobs' component 
       tasks on the slaves, monitoring them and re-executing the failed tasks. The 
@@ -89,7 +89,7 @@
       information to the job-client.</p>
       
       <p>Although the Hadoop framework is implemented in Java<sup>TM</sup>, 
-      Map/Reduce applications need not be written in Java.</p>
+      MapReduce applications need not be written in Java.</p>
       <ul>
         <li>
           <a href="ext:api/org/apache/hadoop/streaming/package-summary">
@@ -100,7 +100,7 @@
         <li>
           <a href="ext:api/org/apache/hadoop/mapred/pipes/package-summary">
           Hadoop Pipes</a> is a <a href="http://www.swig.org/">SWIG</a>-
-          compatible <em>C++ API</em> to implement Map/Reduce applications (non 
+          compatible <em>C++ API</em> to implement MapReduce applications (non 
           JNI<sup>TM</sup> based).
         </li>
       </ul>
@@ -109,7 +109,7 @@
     <section>
       <title>Inputs and Outputs</title>
 
-      <p>The Map/Reduce framework operates exclusively on 
+      <p>The MapReduce framework operates exclusively on 
       <code>&lt;key, value&gt;</code> pairs, that is, the framework views the 
       input to the job as a set of <code>&lt;key, value&gt;</code> pairs and 
       produces a set of <code>&lt;key, value&gt;</code> pairs as the output of 
@@ -123,7 +123,7 @@
       WritableComparable</a> interface to facilitate sorting by the framework.
       </p>
 
-      <p>Input and Output types of a Map/Reduce job:</p>
+      <p>Input and Output types of a MapReduce job:</p>
       <p>
         (input) <code>&lt;k1, v1&gt;</code> 
         -&gt; 
@@ -144,14 +144,14 @@
     <section>
       <title>Example: WordCount v1.0</title>
       
-      <p>Before we jump into the details, lets walk through an example Map/Reduce 
+      <p>Before we jump into the details, lets walk through an example MapReduce 
       application to get a flavour for how they work.</p>
       
       <p><code>WordCount</code> is a simple application that counts the number of
       occurences of each word in a given input set.</p>
       
       <p>This works with a local-standalone, pseudo-distributed or fully-distributed 
-      Hadoop installation(see <a href="quickstart.html"> Hadoop Quick Start</a>).</p>
+      Hadoop installation (<a href="single_node_setup.html">Single Node Setup</a>).</p>
       
       <section>
         <title>Source Code</title>
@@ -608,7 +608,7 @@
         as arguments that are unzipped/unjarred and a link with name of the
         jar/zip are created in the current working directory of tasks. More
         details about the command line options are available at 
-        <a href="commands_manual.html"> Hadoop Command Guide.</a></p>
+        <a href="commands_manual.html">Commands Guide.</a></p>
         
         <p>Running <code>wordcount</code> example with 
         <code>-libjars</code> and <code>-files</code>:<br/>
@@ -696,10 +696,10 @@
     </section>
     
     <section>
-      <title>Map/Reduce - User Interfaces</title>
+      <title>MapReduce - User Interfaces</title>
       
       <p>This section provides a reasonable amount of detail on every user-facing 
-      aspect of the Map/Reduce framwork. This should help users implement, 
+      aspect of the MapReduce framework. This should help users implement, 
       configure and tune their jobs in a fine-grained manner. However, please 
       note that the javadoc for each class/interface remains the most 
       comprehensive documentation available; this is only meant to be a tutorial.
@@ -738,7 +738,7 @@
           to be of the same type as the input records. A given input pair may 
           map to zero or many output pairs.</p> 
  
-          <p>The Hadoop Map/Reduce framework spawns one map task for each 
+          <p>The Hadoop MapReduce framework spawns one map task for each 
           <code>InputSplit</code> generated by the <code>InputFormat</code> for 
           the job.</p>
           
@@ -949,7 +949,7 @@
           <title>Reporter</title>
         
           <p><a href="ext:api/org/apache/hadoop/mapred/reporter">
-          Reporter</a> is a facility for Map/Reduce applications to report 
+          Reporter</a> is a facility for MapReduce applications to report 
           progress, set application-level status messages and update 
           <code>Counters</code>.</p>
  
@@ -972,12 +972,12 @@
         
           <p><a href="ext:api/org/apache/hadoop/mapred/outputcollector">
           OutputCollector</a> is a generalization of the facility provided by
-          the Map/Reduce framework to collect data output by the 
+          the MapReduce framework to collect data output by the 
           <code>Mapper</code> or the <code>Reducer</code> (either the 
           intermediate outputs or the output of the job).</p>
         </section>
       
-        <p>Hadoop Map/Reduce comes bundled with a 
+        <p>Hadoop MapReduce comes bundled with a 
         <a href="ext:api/org/apache/hadoop/mapred/lib/package-summary">
         library</a> of generally useful mappers, reducers, and partitioners.</p>
       </section>
@@ -986,10 +986,10 @@
         <title>Job Configuration</title>
         
         <p><a href="ext:api/org/apache/hadoop/mapred/jobconf">
-        JobConf</a> represents a Map/Reduce job configuration.</p>
+        JobConf</a> represents a MapReduce job configuration.</p>
  
         <p><code>JobConf</code> is the primary interface for a user to describe
-        a Map/Reduce job to the Hadoop framework for execution. The framework 
+        a MapReduce job to the Hadoop framework for execution. The framework 
         tries to faithfully execute the job as described by <code>JobConf</code>, 
         however:</p> 
         <ul>
@@ -1057,7 +1057,7 @@
         <code>-Djava.library.path=&lt;&gt;</code> etc. If the 
         <code>mapred.{map|reduce}.child.java.opts</code> parameters contains the 
         symbol <em>@taskid@</em> it is interpolated with value of 
-        <code>taskid</code> of the Map/Reduce task.</p>
+        <code>taskid</code> of the MapReduce task.</p>
         
         <p>Here is an example with multiple arguments and substitutions, 
         showing jvm GC logging, and start of a passwordless JVM JMX agent so that
@@ -1110,7 +1110,7 @@
         for configuring the launched child tasks from task tracker. Configuring 
         the memory options for daemons is documented in 
         <a href="cluster_setup.html#Configuring+the+Environment+of+the+Hadoop+Daemons">
-        cluster_setup.html </a></p>
+       Configuring the Environment of the Hadoop Daemons</a>.</p>
         
         <p>The memory available to some parts of the framework is also
         configurable. In map and reduce tasks, performance may be influenced
@@ -1460,7 +1460,7 @@
         with the <code>JobTracker</code>.</p>
  
         <p><code>JobClient</code> provides facilities to submit jobs, track their 
-        progress, access component-tasks' reports and logs, get the Map/Reduce 
+        progress, access component-tasks' reports and logs, get the MapReduce 
         cluster's status information and so on.</p>
  
         <p>The job submission process involves:</p>
@@ -1472,7 +1472,7 @@
             <code>DistributedCache</code> of the job, if necessary.
           </li>
           <li>
-            Copying the job's jar and configuration to the Map/Reduce system 
+            Copying the job's jar and configuration to the MapReduce system 
             directory on the <code>FileSystem</code>.
           </li>
           <li>
@@ -1512,7 +1512,7 @@
           <code>mapreduce.cluster.job-authorization-enabled</code> is set to
           true. When enabled, access control checks are done by the JobTracker
           and the TaskTracker before allowing users to view
-          job details or to modify a job using Map/Reduce APIs,
+          job details or to modify a job using MapReduce APIs,
           CLI or web user interfaces.</p>
           
           <p>A job submitter can specify access control lists for viewing or
@@ -1563,8 +1563,8 @@
         <section>
           <title>Job Control</title>
  
-          <p>Users may need to chain Map/Reduce jobs to accomplish complex
-          tasks which cannot be done via a single Map/Reduce job. This is fairly
+          <p>Users may need to chain MapReduce jobs to accomplish complex
+          tasks which cannot be done via a single MapReduce job. This is fairly
           easy since the output of the job typically goes to distributed 
           file-system, and the output, in turn, can be used as the input for the 
           next job.</p>
@@ -1675,10 +1675,10 @@
         <title>Job Input</title>
         
         <p><a href="ext:api/org/apache/hadoop/mapred/inputformat">
-        InputFormat</a> describes the input-specification for a Map/Reduce job.
+        InputFormat</a> describes the input-specification for a MapReduce job.
         </p> 
  
-        <p>The Map/Reduce framework relies on the <code>InputFormat</code> of 
+        <p>The MapReduce framework relies on the <code>InputFormat</code> of 
         the job to:</p>
         <ol>
           <li>Validate the input-specification of the job.</li>
@@ -1757,10 +1757,10 @@
         <title>Job Output</title>
         
         <p><a href="ext:api/org/apache/hadoop/mapred/outputformat">
-        OutputFormat</a> describes the output-specification for a Map/Reduce 
+        OutputFormat</a> describes the output-specification for a MapReduce 
         job.</p>
 
-        <p>The Map/Reduce framework relies on the <code>OutputFormat</code> of 
+        <p>The MapReduce framework relies on the <code>OutputFormat</code> of 
         the job to:</p>
         <ol>
           <li>
@@ -1782,9 +1782,9 @@
         
         <p><a href="ext:api/org/apache/hadoop/mapred/outputcommitter">
         OutputCommitter</a> describes the commit of task output for a 
-        Map/Reduce job.</p>
+        MapReduce job.</p>
 
-        <p>The Map/Reduce framework relies on the <code>OutputCommitter</code>
+        <p>The MapReduce framework relies on the <code>OutputCommitter</code>
         of the job to:</p>
         <ol>
           <li>
@@ -1842,7 +1842,7 @@
           (using the attemptid, say <code>attempt_200709221812_0001_m_000000_0</code>), 
           not just per task.</p> 
  
-          <p>To avoid these issues the Map/Reduce framework, when the 
+          <p>To avoid these issues the MapReduce framework, when the 
           <code>OutputCommitter</code> is <code>FileOutputCommitter</code>, 
           maintains a special 
           <code>${mapred.output.dir}/_temporary/_${taskid}</code> sub-directory
@@ -1866,10 +1866,10 @@
           <p>Note: The value of <code>${mapred.work.output.dir}</code> during 
           execution of a particular task-attempt is actually 
           <code>${mapred.output.dir}/_temporary/_{$taskid}</code>, and this value is 
-          set by the Map/Reduce framework. So, just create any side-files in the 
+          set by the MapReduce framework. So, just create any side-files in the 
           path  returned by
           <a href="ext:api/org/apache/hadoop/mapred/fileoutputformat/getworkoutputpath">
-          FileOutputFormat.getWorkOutputPath() </a>from Map/Reduce 
+          FileOutputFormat.getWorkOutputPath() </a>from MapReduce 
           task to take advantage of this feature.</p>
           
           <p>The entire discussion holds true for maps of jobs with 
@@ -1918,7 +1918,7 @@
           <title>Counters</title>
           
           <p><code>Counters</code> represent global counters, defined either by 
-          the Map/Reduce framework or applications. Each <code>Counter</code> can 
+          the MapReduce framework or applications. Each <code>Counter</code> can 
           be of any <code>Enum</code> type. Counters of a particular 
           <code>Enum</code> are bunched into groups of type 
           <code>Counters.Group</code>.</p>
@@ -1942,7 +1942,7 @@
           files efficiently.</p>
  
           <p><code>DistributedCache</code> is a facility provided by the 
-          Map/Reduce framework to cache files (text, archives, jars and so on) 
+          MapReduce framework to cache files (text, archives, jars and so on) 
           needed by applications.</p>
  
           <p>Applications specify the files to be cached via urls (hdfs://)
@@ -2049,7 +2049,7 @@
           interface supports the handling of generic Hadoop command-line options.
           </p>
           
-          <p><code>Tool</code> is the standard for any Map/Reduce tool or 
+          <p><code>Tool</code> is the standard for any MapReduce tool or 
           application. The application should delegate the handling of 
           standard command-line options to 
           <a href="ext:api/org/apache/hadoop/util/genericoptionsparser">
@@ -2082,7 +2082,7 @@
           <title>IsolationRunner</title>
           
           <p><a href="ext:api/org/apache/hadoop/mapred/isolationrunner">
-          IsolationRunner</a> is a utility to help debug Map/Reduce programs.</p>
+          IsolationRunner</a> is a utility to help debug MapReduce programs.</p>
           
           <p>To use the <code>IsolationRunner</code>, first set 
           <code>keep.failed.task.files</code> to <code>true</code> 
@@ -2122,7 +2122,7 @@
           <p>Once user configures that profiling is needed, she/he can use
           the configuration property 
           <code>mapred.task.profile.{maps|reduces}</code> to set the ranges
-          of Map/Reduce tasks to profile. The value can be set using the api 
+          of MapReduce tasks to profile. The value can be set using the api 
           <a href="ext:api/org/apache/hadoop/mapred/jobconf/setprofiletaskrange">
           JobConf.setProfileTaskRange(boolean,String)</a>.
           By default, the specified range is <code>0-2</code>.</p>
@@ -2143,8 +2143,8 @@
         
         <section>
           <title>Debugging</title>
-          <p>The Map/Reduce framework provides a facility to run user-provided 
-          scripts for debugging. When a Map/Reduce task fails, a user can run 
+          <p>The MapReduce framework provides a facility to run user-provided 
+          scripts for debugging. When a MapReduce task fails, a user can run 
           a debug script, to process task logs for example. The script is 
           given access to the task's stdout and stderr outputs, syslog and 
           jobconf. The output from the debug script's stdout and stderr is 
@@ -2177,7 +2177,7 @@
             
           <p>The arguments to the script are the task's stdout, stderr, 
           syslog and jobconf files. The debug command, run on the node where
-          the Map/Reduce task failed, is: <br/>
+          the MapReduce task failed, is: <br/>
           <code> $script $stdout $stderr $syslog $jobconf </code> </p> 
 
           <p> Pipes programs have the c++ program name as a fifth argument
@@ -2197,14 +2197,14 @@
           <title>JobControl</title>
           
           <p><a href="ext:api/org/apache/hadoop/mapred/jobcontrol/package-summary">
-          JobControl</a> is a utility which encapsulates a set of Map/Reduce jobs
+          JobControl</a> is a utility which encapsulates a set of MapReduce jobs
           and their dependencies.</p>
         </section>
         
         <section>
           <title>Data Compression</title>
           
-          <p>Hadoop Map/Reduce provides facilities for the application-writer to
+          <p>Hadoop MapReduce provides facilities for the application-writer to
           specify compression for both intermediate map-outputs and the
           job-outputs i.e. output of the reduces. It also comes bundled with
           <a href="ext:api/org/apache/hadoop/io/compress/compressioncodec">
@@ -2333,12 +2333,12 @@
       <title>Example: WordCount v2.0</title>
       
       <p>Here is a more complete <code>WordCount</code> which uses many of the
-      features provided by the Map/Reduce framework we discussed so far.</p>
+      features provided by the MapReduce framework we discussed so far.</p>
       
       <p>This needs the HDFS to be up and running, especially for the 
       <code>DistributedCache</code>-related features. Hence it only works with a 
-      <a href="quickstart.html#SingleNodeSetup">pseudo-distributed</a> or
-      <a href="quickstart.html#Fully-Distributed+Operation">fully-distributed</a> 
+      <a href="single_node_setup.html#SingleNodeSetup">pseudo-distributed</a> or
+      <a href="single_node_setup.html#Fully-Distributed+Operation">fully-distributed</a>
       Hadoop installation.</p>      
       
       <section>
@@ -3285,7 +3285,7 @@
         <title>Highlights</title>
         
         <p>The second version of <code>WordCount</code> improves upon the 
-        previous one by using some features offered by the Map/Reduce framework:
+        previous one by using some features offered by the MapReduce framework:
         </p>
         <ul>
           <li>

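The tutorial hunks above refer throughout to the JobConf/JobClient/Tool driver pattern of the old org.apache.hadoop.mapred API without a compact driver in the visible context. The Java sketch below is a minimal, self-contained illustration of that pattern using the bundled IdentityMapper/IdentityReducer; the class name "MinimalDriver", the job name, and the argument positions are placeholder assumptions.

    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.IdentityMapper;
    import org.apache.hadoop.mapred.lib.IdentityReducer;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    /* Tool + ToolRunner let GenericOptionsParser handle -conf, -D, -files and
     * -libjars; JobClient submits the described JobConf to the JobTracker. */
    public class MinimalDriver extends Configured implements Tool {
      public int run(String[] args) throws Exception {
        JobConf conf = new JobConf(getConf(), MinimalDriver.class);
        conf.setJobName("minimal-driver");
        // With the default TextInputFormat, keys are LongWritable offsets and
        // values are Text lines; the identity mapper/reducer pass them through.
        conf.setOutputKeyClass(LongWritable.class);
        conf.setOutputValueClass(Text.class);
        conf.setMapperClass(IdentityMapper.class);
        conf.setReducerClass(IdentityReducer.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);   // blocks until the job completes
        return 0;
      }
      public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new MinimalDriver(), args));
      }
    }

Chaining MapReduce jobs, as the Job Control section describes, amounts to calling JobClient.runJob once per JobConf, pointing each job's input at the previous job's output directory.
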
Modified: hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/native_libraries.xml
URL: http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/native_libraries.xml?rev=1077365&r1=1077364&r2=1077365&view=diff
==============================================================================
--- hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/native_libraries.xml (original)
+++ hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/native_libraries.xml Fri Mar  4 04:07:36 2011
@@ -1,10 +1,11 @@
 <?xml version="1.0"?>
 <!--
-  Copyright 2002-2004 The Apache Software Foundation
-
-  Licensed under the Apache License, Version 2.0 (the "License");
-  you may not use this file except in compliance with the License.
-  You may obtain a copy of the License at
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
 
       http://www.apache.org/licenses/LICENSE-2.0
 
@@ -25,114 +26,114 @@
   
   <body>
   
+  <section>
+  <title>Overview</title>
+  
+<p>This guide describes the native hadoop library and includes a small discussion about native shared libraries.</p>
+
+      <p><strong>Note:</strong> Depending on your environment, the term "native libraries" <em>could</em> 
+      refer to all *.so's you need to compile; and, the term "native compression" <em>could</em> refer to all *.so's 
+      you need to compile that are specifically related to compression.
+      Currently, however, this document only addresses the native hadoop library (<em>libhadoop.so</em>).</p>
+  
+  </section>
+  
     <section>
-      <title>Purpose</title>
-      
-      <p>Hadoop has native implementations of certain components for reasons of 
-      both performance and non-availability of Java implementations. These 
-      components are available in a single, dynamically-linked, native library. 
-      On the *nix platform it is <em>libhadoop.so</em>. This document describes 
-      the usage and details on how to build the native libraries.</p>
-    </section>
-    
-    <section>
-      <title>Components</title>
-      
-      <p>Hadoop currently has the following 
-      <a href="ext:api/org/apache/hadoop/io/compress/compressioncodec">
-      compression codecs</a> as the native components:</p>
-      <ul>
-        <li><a href="ext:zlib">zlib</a></li>
-        <li><a href="ext:gzip">gzip</a></li>
-        <li><a href="ext:bzip">bzip2</a></li>
-      </ul>
+      <title>Native Hadoop Library </title>
       
-      <p>Of the above, the availability of native hadoop libraries is imperative 
-      for the gzip and bzip2 compression codecs to work.</p>
-    </section>
-
+      <p>Hadoop has native implementations of certain components, both for 
+      performance reasons and because Java implementations are not available. These 
+      components are available in a single, dynamically-linked native library called
+       the native hadoop library. On the *nix platforms the library is named <em>libhadoop.so</em>. </p>
+   
     <section>
       <title>Usage</title>
       
-      <p>It is fairly simple to use the native hadoop libraries:</p>
+      <p>It is fairly easy to use the native hadoop library:</p>
 
-      <ul>
+      <ol>
+              <li>
+          Review the <a href="#Components">components</a>.
+        </li>
         <li>
-          Take a look at the 
-          <a href="#Supported+Platforms">supported platforms</a>.
+          Review the <a href="#Supported+Platforms">supported platforms</a>.
         </li>
         <li>
-          Either <a href="ext:releases/download">download</a> the pre-built 
-          32-bit i386-Linux native hadoop libraries (available as part of hadoop 
-          distribution in <code>lib/native</code> directory) or 
-          <a href="#Building+Native+Hadoop+Libraries">build</a> them yourself.
+          Either <a href="#Download">download</a> a hadoop release, which will 
+          include a pre-built version of the native hadoop library, or
+          <a href="#Build">build</a> your own version of the 
+          native hadoop library. Whether you download or build, the name for the library is 
+          the same: <em>libhadoop.so</em>
         </li>
         <li>
-          Make sure you have any of or all of <strong>&gt;zlib-1.2</strong>,
-          <strong>&gt;gzip-1.2</strong>, and <strong>&gt;bzip2-1.0</strong>
-          packages for your platform installed; 
-          depending on your needs.
+          Install the compression codec development packages 
+          (<strong>&gt;zlib-1.2</strong>, <strong>&gt;gzip-1.2</strong>):
+          <ul>
+              <li>If you download the library, install one or more development packages - 
+              whichever compression codecs you want to use with your deployment.</li>
+              <li>If you build the library, it is <strong>mandatory</strong> 
+              to install both development packages.</li>
+          </ul>
         </li>
-      </ul>
-      
-      <p>The <code>bin/hadoop</code> script ensures that the native hadoop 
-      library is on the library path via the system property 
-      <em>-Djava.library.path=&lt;path&gt;</em>.</p>
-
-      <p>To check everything went alright check the hadoop log files for:</p>
-      
-      <p>
-        <code>
-          DEBUG util.NativeCodeLoader - Trying to load the custom-built 
-          native-hadoop library... 
-        </code><br/>
-        <code>
-          INFO  util.NativeCodeLoader - Loaded the native-hadoop library
-        </code>
-      </p>
-
-      <p>If something goes wrong, then:</p>
-      <p>
-        <code>
-          INFO util.NativeCodeLoader - Unable to load native-hadoop library for 
-          your platform... using builtin-java classes where applicable
-        </code>
+         <li>
+          Check the <a href="#Runtime">runtime</a> log files.
+        </li>
+      </ol>
+     </section>
+    <section>
+      <title>Components</title>
+     <p>The native hadoop library includes two components, the zlib and gzip 
+      <a href="http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/compress/CompressionCodec.html"> 
+      compression codecs</a>:
       </p>
+      <ul>
+        <li><a href="ext:zlib">zlib</a></li>
+        <li><a href="ext:gzip">gzip</a></li>
+      </ul>
+      <p>The native hadoop library is imperative for gzip to work.</p>
     </section>
     
     <section>
       <title>Supported Platforms</title>
       
-      <p>Hadoop native library is supported only on *nix platforms only.
-      Unfortunately it is known not to work on <a href="ext:cygwin">Cygwin</a> 
-      and <a href="ext:osx">Mac OS X</a> and has mainly been used on the 
-      GNU/Linux platform.</p>
+      <p>The native hadoop library is supported on *nix platforms only.
+      The library does not work with <a href="ext:cygwin">Cygwin</a> 
+      or the <a href="ext:osx">Mac OS X</a> platform.</p>
 
-      <p>It has been tested on the following GNU/Linux distributions:</p>
+      <p>The native hadoop library is mainly used on the GNU/Linux platform and 
+      has been tested on these distributions:</p>
       <ul>
         <li>
-          <a href="http://www.redhat.com/rhel/">RHEL4</a>/<a href="http://fedora.redhat.com/">Fedora</a>
+          <a href="http://www.redhat.com/rhel/">RHEL4</a>/<a href="http://fedoraproject.org/">Fedora</a>
         </li>
         <li><a href="http://www.ubuntu.com/">Ubuntu</a></li>
         <li><a href="http://www.gentoo.org/">Gentoo</a></li>
       </ul>
 
-      <p>On all the above platforms a 32/64 bit Hadoop native library will work 
+      <p>On all the above distributions a 32/64 bit native hadoop library will work 
       with a respective 32/64 bit jvm.</p>
     </section>
     
     <section>
-      <title>Building Native Hadoop Libraries</title>
+      <title>Download</title>
+      
+      <p>The pre-built 32-bit i386-Linux native hadoop library is available as part of the 
+      hadoop distribution and is located in the <code>lib/native</code> directory. You can download the 
+      hadoop distribution from <a href="ext:releases/download">Hadoop Common Releases</a>.</p>
+      
+      <p>Be sure to install the zlib and/or gzip development packages - whichever compression 
+      codecs you want to use with your deployment.</p>
+     </section>    
+    
+    <section>
+      <title>Build</title>
       
-      <p>Hadoop native library is written in 
-      <a href="http://en.wikipedia.org/wiki/ANSI_C">ANSI C</a> and built using 
-      the GNU autotools-chain (autoconf, autoheader, automake, autoscan, libtool). 
-      This means it should be straight-forward to build them on any platform with 
-      a standards compliant C compiler and the GNU autotools-chain. 
-      See <a href="#Supported+Platforms">supported platforms</a>.</p>
+      <p>The native hadoop library is written in <a href="http://en.wikipedia.org/wiki/ANSI_C">ANSI C</a> 
+      and is built using the GNU autotools-chain (autoconf, autoheader, automake, autoscan, libtool). 
+      This means it should be straight-forward to build the library on any platform with a standards-compliant 
+      C compiler and the GNU autotools-chain (see the <a href="#Supported+Platforms">supported platforms</a>).</p>
 
-      <p>In particular the various packages you would need on the target 
-      platform are:</p>
+      <p>The packages you need to install on the target platform are:</p>
       <ul>
         <li>
           C compiler (e.g. <a href="http://gcc.gnu.org/">GNU C Compiler</a>)
@@ -148,52 +149,69 @@
         </li>
       </ul>
 
-      <p>Once you have the pre-requisites use the standard <code>build.xml</code> 
-      and pass along the <code>compile.native</code> flag (set to 
-      <code>true</code>) to build the native hadoop library:</p>
+      <p>Once you have installed the prerequisite packages, use the standard hadoop <code>build.xml</code> 
+      file and pass along the <code>compile.native</code> flag (set to <code>true</code>) to build the native hadoop library:</p>
 
       <p><code>$ ant -Dcompile.native=true &lt;target&gt;</code></p>
 
-      <p>The native hadoop library is not built by default since not everyone is 
-      interested in building them.</p>
-
-      <p>You should see the newly-built native hadoop library in:</p>
+      <p>You should see the newly-built library in:</p>
 
       <p><code>$ build/native/&lt;platform&gt;/lib</code></p>
 
-      <p>where &lt;platform&gt; is combination of the system-properties: 
-      <code>${os.name}-${os.arch}-${sun.arch.data.model}</code>; for e.g. 
-      Linux-i386-32.</p>
-
-      <section>
-        <title>Notes</title>
-        
+      <p>where &lt;<code>platform</code>&gt; is a combination of the system-properties: 
+      <code>${os.name}-${os.arch}-${sun.arch.data.model}</code> (for example, Linux-i386-32).</p>
+
+      <p>Please note the following:</p>
         <ul>
           <li>
-            It is <strong>mandatory</strong> to have the 
-            zlib, gzip, and bzip2
-            development packages on the target platform for building the 
-            native hadoop library; however for deployment it is sufficient to 
-            install one of them if you wish to use only one of them.
+            It is <strong>mandatory</strong> to install both the zlib and gzip
+            development packages on the target platform in order to build the 
+            native hadoop library; however, for deployment it is sufficient to 
+            install just one package if you wish to use only one codec.
           </li>
           <li>
-            It is necessary to have the correct 32/64 libraries of both zlib 
-            depending on the 32/64 bit jvm for the target platform for 
-            building/deployment of the native hadoop library.
+            It is necessary to have the correct 32/64 libraries for zlib,  
+            depending on the 32/64 bit jvm for the target platform, in order to 
+            build and deploy the native hadoop library.
           </li>
         </ul>
-      </section>
     </section>
+    
+     <section>
+      <title>Runtime</title>
+      <p>The <code>bin/hadoop</code> script ensures that the native hadoop
+      library is on the library path via the system property: <br/>
+      <em>-Djava.library.path=&lt;path&gt;</em></p>
+
+      <p>During runtime, check the hadoop log files for your MapReduce tasks.</p>
+      
+      <ul>
+         <li>If everything is all right, then:<br/><br/>
+          <code> DEBUG util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...  </code><br/>
+          <code> INFO  util.NativeCodeLoader - Loaded the native-hadoop library </code><br/>
+         </li>
+         
+         <li>If something goes wrong, then:<br/><br/>
+         <code>
+          INFO util.NativeCodeLoader - Unable to load native-hadoop library for 
+          your platform... using builtin-java classes where applicable
+        </code>
+         
+         </li>
+      </ul>
+    </section>
+     </section>
+    
     <section>
-      <title> Loading native libraries through DistributedCache </title>
-      <p>User can load native shared libraries through  
-      <a href="mapred_tutorial.html#DistributedCache">DistributedCache</a>
-      for <em>distributing</em> and <em>symlinking</em> the library files</p>
+      <title>Native Shared Libraries</title>
+      <p>You can load <strong>any</strong> native shared library using  
+      <a href="mapred_tutorial.html#DistributedCache">DistributedCache</a> 
+      for <em>distributing</em> and <em>symlinking</em> the library files.</p>
       
-      <p>Here is an example, describing how to distribute the library and
-      load it from map/reduce task. </p>
+      <p>This example shows you how to distribute a shared library, <code>mylib.so</code>, 
+      and load it from a MapReduce task.</p>
       <ol>
-      <li> First copy the library to the HDFS. <br/>
+      <li> First copy the library to the HDFS: <br/>
       <code>bin/hadoop fs -copyFromLocal mylib.so.1 /libraries/mylib.so.1</code>
       </li>
       <li> The job launching program should contain the following: <br/>
@@ -201,10 +219,13 @@
       <code> DistributedCache.addCacheFile("hdfs://host:port/libraries/mylib.so.1#mylib.so", conf);
       </code>
       </li>
-      <li> The map/reduce task can contain: <br/>
+      <li> The MapReduce task can contain: <br/>
       <code> System.loadLibrary("mylib.so"); </code>
       </li>
       </ol>
+      
+     <p><br/><strong>Note:</strong> If you downloaded or built the native hadoop library, you don’t need to use DistributedCache to 
+     make the library available to your MapReduce tasks.</p>
     </section>
   </body>
   

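The DistributedCache steps added above (copy the library to HDFS, register it in the launching program, System.loadLibrary in the task) can be gathered in one place. The Java sketch below shows the launching-program side; the hdfs://host:port address and /libraries path are the placeholders from the doc's own example, and DistributedCache.createSymlink is included on the assumption that the #mylib.so fragment should materialize as a symlink in each task's working directory.

    import java.net.URI;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.mapred.JobConf;

    public class NativeLibJobSetup {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(NativeLibJobSetup.class);
        // Ask the framework to create symlinks for cached files in each task's cwd.
        DistributedCache.createSymlink(conf);
        // addCacheFile takes a java.net.URI; the #mylib.so fragment names the symlink.
        DistributedCache.addCacheFile(
            new URI("hdfs://host:port/libraries/mylib.so.1#mylib.so"), conf);
        // ... set the mapper/reducer, input and output paths, then submit the job;
        // inside a task, call System.loadLibrary("mylib.so") as shown above.
      }
    }
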
Modified: hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/service_level_auth.xml
URL: http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/service_level_auth.xml?rev=1077365&r1=1077364&r2=1077365&view=diff
==============================================================================
--- hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/service_level_auth.xml (original)
+++ hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/service_level_auth.xml Fri Mar  4 04:07:36 2011
@@ -1,10 +1,11 @@
 <?xml version="1.0"?>
 <!--
-  Copyright 2002-2004 The Apache Software Foundation
-
-  Licensed under the Apache License, Version 2.0 (the "License");
-  you may not use this file except in compliance with the License.
-  You may obtain a copy of the License at
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
 
       http://www.apache.org/licenses/LICENSE-2.0
 
@@ -33,17 +34,15 @@
     </section>
     
     <section>
-      <title>Pre-requisites</title>
+      <title>Prerequisites</title>
       
-      <p>Ensure that Hadoop is installed, configured and setup correctly. More
-      details:</p> 
+      <p>Make sure Hadoop is installed, configured and set up correctly. For more information see:</p> 
       <ul>
         <li>
-          <a href="quickstart.html">Hadoop Quick Start</a> for first-time users.
+          <a href="single_node_setup.html">Single Node Setup</a> for first-time users.
         </li>
         <li>
-          <a href="cluster_setup.html">Hadoop Cluster Setup</a> for large, 
-          distributed clusters.
+          <a href="cluster_setup.html">Cluster Setup</a> for large, distributed clusters.
         </li>
       </ul>
     </section>
@@ -54,7 +53,7 @@
       <p>Service Level Authorization is the initial authorization mechanism to
       ensure clients connecting to a particular Hadoop <em>service</em> have the
       necessary, pre-configured, permissions and are authorized to access the given
-      service. For e.g. a Map/Reduce cluster can use this mechanism to allow a
+      service. For example, a MapReduce cluster can use this mechanism to allow a
       configured list of users/groups to submit jobs.</p>
       
       <p>The <code>${HADOOP_CONF_DIR}/hadoop-policy.xml</code> configuration file 
@@ -197,33 +196,33 @@
         <title>Examples</title>
         
         <p>Allow only users <code>alice</code>, <code>bob</code> and users in the 
-        <code>mapreduce</code> group to submit jobs to the Map/Reduce cluster:</p>
+        <code>mapreduce</code> group to submit jobs to the MapReduce cluster:</p>
         
-        <table>
-          <tr><td>&nbsp;&nbsp;&lt;property&gt;</td></tr>
-            <tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&lt;name&gt;security.job.submission.protocol.acl&lt;/name&gt;</td></tr>
-            <tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&lt;value&gt;alice,bob mapreduce&lt;/value&gt;</td></tr>
-          <tr><td>&nbsp;&nbsp;&lt;/property&gt;</td></tr>
-        </table>
+<source>
+&lt;property&gt;
+     &lt;name&gt;security.job.submission.protocol.acl&lt;/name&gt;
+     &lt;value&gt;alice,bob mapreduce&lt;/value&gt;
+&lt;/property&gt;
+</source>        
         
         <p></p><p>Allow only DataNodes running as the users who belong to the 
         group <code>datanodes</code> to communicate with the NameNode:</p> 
-        
-        <table>
-          <tr><td>&nbsp;&nbsp;&lt;property&gt;</td></tr>
-            <tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&lt;name&gt;security.datanode.protocol.acl&lt;/name&gt;</td></tr>
-            <tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&lt;value&gt; datanodes&lt;/value&gt;</td></tr>
-          <tr><td>&nbsp;&nbsp;&lt;/property&gt;</td></tr>
-        </table>
+ 
+<source>
+&lt;property&gt;
+     &lt;name&gt;security.datanode.protocol.acl&lt;/name&gt;
+     &lt;value&gt;datanodes&lt;/value&gt;
+&lt;/property&gt;
+</source>        
         
         <p></p><p>Allow any user to talk to the HDFS cluster as a DFSClient:</p>
-        
-        <table>
-          <tr><td>&nbsp;&nbsp;&lt;property&gt;</td></tr>
-            <tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&lt;name&gt;security.client.protocol.acl&lt;/name&gt;</td></tr>
-            <tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&lt;value&gt;*&lt;/value&gt;</td></tr>
-          <tr><td>&nbsp;&nbsp;&lt;/property&gt;</td></tr>
-        </table>
+
+<source>
+&lt;property&gt;
+     &lt;name&gt;security.client.protocol.acl&lt;/name&gt;
+     &lt;value&gt;*&lt;/value&gt;
+&lt;/property&gt;
+</source>        
         
       </section>
     </section>

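After editing ${HADOOP_CONF_DIR}/hadoop-policy.xml as in the examples above, the NameNode's service-level ACLs can be reloaded without a restart; this is a usage sketch of the stock dfsadmin refresh command, not something introduced by this patch:

    $ bin/hadoop dfsadmin -refreshServiceAcl
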
Added: hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/single_node_setup.xml
URL: http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/single_node_setup.xml?rev=1077365&view=auto
==============================================================================
--- hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/single_node_setup.xml (added)
+++ hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/single_node_setup.xml Fri Mar  4 04:07:36 2011
@@ -0,0 +1,293 @@
+<?xml version="1.0"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
+
+<document>
+  
+  <header>
+    <title>Single Node Setup</title>
+  </header>
+  
+  <body>
+  
+    <section>
+      <title>Purpose</title>
+      
+      <p>This document describes how to set up and configure a single-node Hadoop
+      installation so that you can quickly perform simple operations using Hadoop
+      MapReduce and the Hadoop Distributed File System (HDFS).</p>
+      
+    </section>
+    
+    <section id="PreReqs">
+      <title>Prerequisites</title>
+      
+      <section>
+        <title>Supported Platforms</title>
+        
+        <ul>
+          <li>
+            GNU/Linux is supported as a development and production platform. 
+            Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes.
+          </li>
+          <li>
+            Win32 is supported as a <em>development platform</em>. Distributed 
+            operation has not been well tested on Win32, so it is not 
+            supported as a <em>production platform</em>.
+          </li>
+        </ul>        
+      </section>
+      
+      <section>
+        <title>Required Software</title>
+        <p>Required software for Linux and Windows includes:</p>
+        <ol>
+          <li>
+            Java<sup>TM</sup> 1.6.x, preferably from Sun, must be installed.
+          </li>
+          <li>
+            <strong>ssh</strong> must be installed and <strong>sshd</strong> must 
+            be running to use the Hadoop scripts that manage remote Hadoop 
+            daemons.
+          </li>
+        </ol>
+        <p>Additional requirements for Windows include:</p>
+        <ol>
+          <li>
+            <a href="http://www.cygwin.com/">Cygwin</a> - Required for shell 
+            support in addition to the required software above. 
+          </li>
+        </ol>
+      </section>
+
+      <section>
+        <title>Installing Software</title>
+          
+        <p>If your cluster doesn't have the requisite software you will need to
+        install it.</p>
+          
+        <p>For example on Ubuntu Linux:</p>
+        <p>
+          <code>$ sudo apt-get install ssh</code><br/>
+          <code>$ sudo apt-get install rsync</code>
+        </p>
+          
+        <p>On Windows, if you did not install the required software when you 
+        installed cygwin, start the cygwin installer and select the packages:</p>
+        <ul>
+          <li>openssh - the <em>Net</em> category</li>
+        </ul>
+      </section>
+      
+    </section>
+    
+    <section>
+      <title>Download</title>
+      
+      <p>
+        To get a Hadoop distribution, download a recent 
+        <a href="ext:releases">stable release</a> from one of the Apache Download
+        Mirrors.
+      </p>
+    </section>
+
+    <section>
+      <title>Prepare to Start the Hadoop Cluster</title>
+      <p>
+        Unpack the downloaded Hadoop distribution. In the distribution, edit the
+        file <code>conf/hadoop-env.sh</code> to define at least 
+        <code>JAVA_HOME</code> to be the root of your Java installation.
+      </p>
+
+	  <p>
+	    Try the following command:<br/>
+        <code>$ bin/hadoop</code><br/>
+        This will display the usage documentation for the <strong>hadoop</strong> 
+        script.
+      </p>
+      
+      <p>Now you are ready to start your Hadoop cluster in one of the three supported
+      modes:
+      </p>
+      <ul>
+        <li>Local (Standalone) Mode</li>
+        <li>Pseudo-Distributed Mode</li>
+        <li>Fully-Distributed Mode</li>
+      </ul>
+    </section>
+    
+    <section id="Local">
+      <title>Standalone Operation</title>
+      
+      <p>By default, Hadoop is configured to run in a non-distributed 
+      mode, as a single Java process. This is useful for debugging.</p>
+      
+      <p>
+        The following example copies the unpacked <code>conf</code> directory to 
+        use as input and then finds and displays every match of the given regular 
+        expression. Output is written to the given <code>output</code> directory.
+        <br/>
+        <code>$ mkdir input</code><br/>
+        <code>$ cp conf/*.xml input</code><br/>
+        <code>
+          $ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
+        </code><br/>
+        <code>$ cat output/*</code>
+      </p>
+    </section>
+    
+    <section id="PseudoDistributed">
+      <title>Pseudo-Distributed Operation</title>
+
+	  <p>Hadoop can also be run on a single-node in a pseudo-distributed mode 
+	  where each Hadoop daemon runs in a separate Java process.</p>
+	  
+      <section>
+        <title>Configuration</title>
+        <p>Use the following:
+        <br/><br/>
+        <code>conf/core-site.xml</code>:</p>
+        
+        <source>
+&lt;configuration&gt;
+     &lt;property&gt;
+         &lt;name&gt;fs.default.name&lt;/name&gt;
+         &lt;value&gt;hdfs://localhost:9000&lt;/value&gt;
+     &lt;/property&gt;
+&lt;/configuration&gt;
+</source>
+      
+        <p><br/><code>conf/hdfs-site.xml</code>:</p>  
+<source>
+&lt;configuration&gt;
+     &lt;property&gt;
+         &lt;name&gt;dfs.replication&lt;/name&gt;
+         &lt;value&gt;1&lt;/value&gt;
+     &lt;/property&gt;
+&lt;/configuration&gt;
+</source>        
+        
+      
+        <p><br/><code>conf/mapred-site.xml</code>:</p>
+<source>
+&lt;configuration&gt;
+     &lt;property&gt;
+         &lt;name&gt;mapred.job.tracker&lt;/name&gt;
+         &lt;value&gt;localhost:9001&lt;/value&gt;
+     &lt;/property&gt;
+&lt;/configuration&gt;
+</source>        
+        
+        
+        
+      </section>
+
+      <section>
+        <title>Setup passphraseless <em>ssh</em></title>
+        
+        <p>
+          Now check that you can ssh to the localhost without a passphrase:<br/>
+          <code>$ ssh localhost</code>
+        </p>
+        
+        <p>
+          If you cannot ssh to localhost without a passphrase, execute the 
+          following commands:<br/>
+   		  <code>$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa</code><br/>
+		  <code>$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys</code>
+		</p>
+      </section>
+    
+      <section>
+        <title>Execution</title>
+        
+        <p>
+          Format a new distributed-filesystem:<br/>
+          <code>$ bin/hadoop namenode -format</code>
+        </p>
+
+		<p>
+		  Start the hadoop daemons:<br/>
+          <code>$ bin/start-all.sh</code>
+        </p>
+
+        <p>The hadoop daemon log output is written to the 
+        <code>${HADOOP_LOG_DIR}</code> directory (defaults to 
+        <code>${HADOOP_HOME}/logs</code>).</p>
+
+        <p>Browse the web interface for the NameNode and the JobTracker; by
+        default they are available at:</p>
+        <ul>
+          <li>
+            <code>NameNode</code> - 
+            <a href="http://localhost:50070/">http://localhost:50070/</a>
+          </li>
+          <li>
+            <code>JobTracker</code> - 
+            <a href="http://localhost:50030/">http://localhost:50030/</a>
+          </li>
+        </ul>
+        
+        <p>
+          Copy the input files into the distributed filesystem:<br/>
+		  <code>$ bin/hadoop fs -put conf input</code>
+		</p>
+		
+        <p>
+          Run some of the examples provided:<br/>
+          <code>
+            $ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
+          </code>
+        </p>
+        
+        <p>Examine the output files:</p>
+        <p>
+          Copy the output files from the distributed filesystem to the local 
+          filesystem and examine them:<br/>
+          <code>$ bin/hadoop fs -get output output</code><br/>
+          <code>$ cat output/*</code>
+        </p>
+        <p> or </p>
+        <p>
+          View the output files on the distributed filesystem:<br/>
+          <code>$ bin/hadoop fs -cat output/*</code>
+        </p>
+
+        <p>
+          When you're done, stop the daemons with:<br/>
+          <code>$ bin/stop-all.sh</code>
+        </p>
+      </section>
+    </section>
+    
+    <section id="FullyDistributed">
+      <title>Fully-Distributed Operation</title>
+      
+      <p>For information on setting up fully-distributed, non-trivial clusters,
+      see <a href="cluster_setup.html">Cluster Setup</a>.</p>
+    </section>
+    
+    <p>
+      <em>Java and JNI are trademarks or registered trademarks of 
+      Sun Microsystems, Inc. in the United States and other countries.</em>
+    </p>
+    
+  </body>
+  
+</document>

Modified: hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/site.xml
URL: http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/site.xml?rev=1077365&r1=1077364&r2=1077365&view=diff
==============================================================================
--- hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/site.xml (original)
+++ hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/site.xml Fri Mar  4 04:07:36 2011
@@ -31,52 +31,50 @@ See http://forrest.apache.org/docs/linki
 
 <site label="Hadoop" href="" xmlns="http://apache.org/forrest/linkmap/1.0">
   
-   <docs label="Getting Started"> 
-		<overview   				label="Overview" 					href="index.html" />
-		<quickstart 				label="Quick Start"        		href="quickstart.html" />
-		<setup     					label="Cluster Setup"      		href="cluster_setup.html" />
-		<mapred    				label="Map/Reduce Tutorial" 	href="mapred_tutorial.html" />
-  </docs>	
-		
- <docs label="Programming Guides">
-		<commands 				label="Commands"     					href="commands_manual.html" />
-		<distcp    					label="DistCp"       						href="distcp.html" />
-		<native_lib    				label="Native Libraries" 					href="native_libraries.html" />
-		<streaming 				label="Streaming"          				href="streaming.html" />
-		<fair_scheduler 			label="Fair Scheduler" 					href="fair_scheduler.html"/>
-		<cap_scheduler 		label="Capacity Scheduler" 			href="capacity_scheduler.html"/>
-		<SLA					 	label="Service Level Authorization" 	href="service_level_auth.html"/>
-		<vaidya    					label="Vaidya" 								href="vaidya.html"/>
-		<archives  				label="Archives"     						href="hadoop_archives.html"/>
- 		<gridmix  				label="Gridmix"     href="gridmix.html"/>
-		<sec_impersonation			label="Secure Impersonation" 			href="Secure_Impersonation.html"/>
-   </docs>
-   
-   <docs label="HDFS">
-		<hdfs_user      				label="User Guide"    							href="hdfs_user_guide.html" />
-		<hdfs_arch     				label="Architecture"  								href="hdfs_design.html" />	
-		<hdfs_fs       	 				label="File System Shell Guide"     		href="hdfs_shell.html" />
-		<hdfs_perm      				label="Permissions Guide"    					href="hdfs_permissions_guide.html" />
-		<hdfs_quotas     			label="Quotas Guide" 							href="hdfs_quota_admin_guide.html" />
-		<hdfs_SLG        			label="Synthetic Load Generator Guide"  href="SLG_user_guide.html" />
-		<hdfs_libhdfs   				label="C API libhdfs"         						href="libhdfs.html" /> 
-   </docs> 
-   
-   <docs label="HOD">
-		<hod_user 	label="User Guide" 	href="hod_user_guide.html"/>
-		<hod_admin 	label="Admin Guide" 	href="hod_admin_guide.html"/>
-		<hod_config 	label="Config Guide" 	href="hod_config_guide.html"/> 
-   </docs> 
-   
-   <docs label="Miscellaneous"> 
-		<api       	label="API Docs"           href="ext:api/index" />
-		<jdiff     	label="API Changes"      href="ext:jdiff/changes" />
-		<wiki      	label="Wiki"               	href="ext:wiki" />
-		<faq       	label="FAQ"                	href="ext:faq" />
-		<relnotes  label="Release Notes" 	href="ext:relnotes" />
-		<changes	label="Change Log"       href="ext:changes" />
-   </docs> 
-   
+  <docs label="Getting Started">
+    <overview label="Overview"  href="index.html" />  
+    <single   label="Single Node Setup"  href="single_node_setup.html" />
+    <cluster  label="Cluster Setup"  href="cluster_setup.html" />
+  </docs>  
+       
+  <docs label="MapReduce">
+    <mapred     label="MapReduce Tutorial"   href="mapred_tutorial.html" />
+    <streaming  label="Hadoop Streaming"  href="streaming.html" />
+    <commands label="Hadoop Commands"  href="commands_manual.html" />
+    <distcp         label="DistCp"  href="distcp.html" />
+    <vaidya         label="Vaidya"  href="vaidya.html"/>
+    <archives     label="Hadoop Archives" href="hadoop_archives.html"/>
+    <gridmix       label="Gridmix"  href="gridmix.html"/>
+    <cap_scheduler  label="Capacity Scheduler" href="capacity_scheduler.html"/>
+    <fair_scheduler    label="Fair Scheduler"  href="fair_scheduler.html"/>
+    <hod_scheduler  label="HOD Scheduler"  href="hod_scheduler.html"/>
+  </docs>
+  
+  <docs label="HDFS">
+    <hdfs_user          label="HDFS Users"  href="hdfs_user_guide.html" />
+    <hdfs_arch          label="HDFS Architecture" href="hdfs_design.html" />
+    <hdfs_perm        label="Permissions" href="hdfs_permissions_guide.html" />
+    <hdfs_quotas      label="Quotas" href="hdfs_quota_admin_guide.html" />
+    <hdfs_SLG         label="Synthetic Load Generator"  href="SLG_user_guide.html" />
+    <hdfs_libhdfs       label="C API libhdfs" href="libhdfs.html" />
+  </docs>
+  
+  <docs label="Common"> 
+    <fsshell       label="File System Shell" href="file_system_shell.html" />
+    <SLA      label="Service Level Authorization" href="service_level_auth.html"/>
+    <native_lib   label="Native Libraries" href="native_libraries.html" />
+  </docs>    
+  
+  <docs label="Miscellaneous"> 
+    <sec_impersonation label="Secure Impersonation" href="Secure_Impersonation.html"/>
+    <api         label="API Docs"           href="ext:api/index" />
+    <jdiff       label="API Changes"      href="ext:jdiff/changes" />
+    <wiki        label="Wiki"                 href="ext:wiki" />
+    <faq         label="FAQ"                  href="ext:faq" />
+    <relnotes  label="Release Notes"   href="ext:relnotes" />
+    <changes  label="Change Log"       href="ext:changes" />
+  </docs> 
+
   <external-refs>
     <site      href="http://hadoop.apache.org/core/"/>
     <lists     href="http://hadoop.apache.org/core/mailing_lists.html"/>

Modified: hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/streaming.xml
URL: http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/streaming.xml?rev=1077365&r1=1077364&r2=1077365&view=diff
==============================================================================
--- hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/streaming.xml (original)
+++ hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/streaming.xml Fri Mar  4 04:07:36 2011
@@ -552,6 +552,8 @@ if __name__ == "__main__":
 </source>
 </section>
 
+
+<!-- QUESTION -->
 <section>
 <title>Hadoop Field Selection Class</title>
 <p>
@@ -789,6 +791,17 @@ For example, mapred.job.id becomes mapre
 </p>
 </section>
 
+
+<!-- QUESTION -->
+<section>
+<title>How do I get the JobConf variables in a streaming job's mapper/reducer?</title>
+<p>
+See <a href="mapred_tutorial.html#Configured+Parameters">Configured Parameters</a>. 
+During the execution of a streaming job, the names of the "mapred" parameters are transformed: the dots ( . ) become underscores ( _ ).
+For example, mapred.job.id becomes mapred_job_id and mapred.jar becomes mapred_jar. In your code, use the parameter names with the underscores.
+</p>
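+<p>
+As a minimal sketch, a Python streaming mapper could read one of these values from 
+its environment (the variable is only present when the framework sets it, so a 
+default is used here):
+</p>
+<source>
+#!/usr/bin/env python
+import os
+import sys
+
+# JobConf names are exposed as environment variables with the dots
+# replaced by underscores, e.g. mapred.job.id becomes mapred_job_id.
+job_id = os.environ.get("mapred_job_id", "unknown")
+
+for line in sys.stdin:
+    # Illustrative only: tag each input line with the job id.
+    sys.stdout.write("%s\t%s" % (job_id, line))
+</source>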
+</section>
+
 </section>
 </body>
 </document>

Modified: hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/vaidya.xml
URL: http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/vaidya.xml?rev=1077365&r1=1077364&r2=1077365&view=diff
==============================================================================
--- hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/vaidya.xml (original)
+++ hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/content/xdocs/vaidya.xml Fri Mar  4 04:07:36 2011
@@ -40,7 +40,7 @@
     </section>
     
     <section>
-      <title>Pre-requisites</title>
+      <title>Prerequisites</title>
       
       <p>Ensure that Hadoop is installed and configured. More details:</p> 
       <ul>
@@ -58,11 +58,11 @@
       
       <p>Hadoop Vaidya (Vaidya in Sanskrit language means "one who knows", or "a physician") 
 	    is a rule based performance diagnostic tool for 
-        Map/Reduce jobs. It performs a post execution analysis of map/reduce 
+        MapReduce jobs. It performs a post-execution analysis of a MapReduce 
         job by parsing and collecting execution statistics through job history 
         and job configuration files. It runs a set of predefined tests/rules 
         against job execution statistics to diagnose various performance problems. 
-        Each test rule detects a specific performance problem with the Map/Reduce job and provides 
+        Each test rule detects a specific performance problem with the MapReduce job and provides 
         a targeted advice to the user. This tool generates an XML report based on 
         the evaluation results of individual test rules.
       </p>
@@ -74,9 +74,9 @@
 	 
 	 <p> This section describes main concepts and terminology involved with Hadoop Vaidya,</p>
 		<ul>
-			<li> <em>PostExPerformanceDiagnoser</em>: This class extends the base Diagnoser class and acts as a driver for post execution performance analysis of Map/Reduce Jobs. 
+			<li> <em>PostExPerformanceDiagnoser</em>: This class extends the base Diagnoser class and acts as a driver for post execution performance analysis of MapReduce Jobs. 
                        It detects performance inefficiencies by executing a set of performance diagnosis rules against the job execution statistics.</li>
-			<li> <em>Job Statistics</em>: This includes the job configuration information (job.xml) and various counters logged by Map/Reduce job as a part of the job history log
+			<li> <em>Job Statistics</em>: This includes the job configuration information (job.xml) and various counters logged by the MapReduce job as part of the job history log
 		           file. The counters are parsed and collected into the Job Statistics data structures, which contains global job level aggregate counters and 
 			     a set of counters for each Map and Reduce task.</li>
 			<li> <em>Diagnostic Test/Rule</em>: This is a program logic that detects the inefficiency of M/R job based on the job statistics. The
@@ -140,8 +140,7 @@
     <section>
 		<title>How to Write and Execute your own Tests</title>
 		<p>Writing and executing your own test rules is not very hard. You can take a look at Hadoop Vaidya source code for existing set of tests. 
-		   The source code is at this <a href="http://svn.apache.org/viewvc/hadoop/core/trunk/src/contrib/vaidya/src/java/org/apache/hadoop/vaidya/">hadoop svn repository location</a>
-		   . The default set of tests are under <code>"postexdiagnosis/tests/"</code> folder.</p>
+		   The source code is at this <a href="http://svn.apache.org/viewvc/hadoop/core/trunk/src/contrib/vaidya/src/java/org/apache/hadoop/vaidya/">hadoop svn repository location</a>. The default set of tests is under the <code>"postexdiagnosis/tests/"</code> folder.</p>
 		<ul>
 		  <li>Writing a test class for your new test case should extend the <code>org.apache.hadoop.vaidya.DiagnosticTest</code> class and 
 		       it should override following three methods from the base class, 

Added: hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/resources/images/hadoop-logo-2.gif
URL: http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/resources/images/hadoop-logo-2.gif?rev=1077365&view=auto
==============================================================================
Binary file - no diff available.

Propchange: hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/resources/images/hadoop-logo-2.gif
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Modified: hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/skinconf.xml
URL: http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/skinconf.xml?rev=1077365&r1=1077364&r2=1077365&view=diff
==============================================================================
--- hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/skinconf.xml (original)
+++ hadoop/common/branches/branch-0.20-security-patches/src/docs/src/documentation/skinconf.xml Fri Mar  4 04:07:36 2011
@@ -67,7 +67,7 @@ which will be used to configure the chos
   <project-name>Hadoop</project-name>
   <project-description>Scalable Computing Platform</project-description>
   <project-url>http://hadoop.apache.org/core/</project-url>
-  <project-logo>images/core-logo.gif</project-logo>
+  <project-logo>images/hadoop-logo-2.gif</project-logo>
 
   <!-- group logo -->
   <group-name>Hadoop</group-name>