Posted to commits@hbase.apache.org by en...@apache.org on 2013/02/14 23:55:17 UTC

svn commit: r1446379 - /hbase/trunk/src/docbkx/configuration.xml

Author: enis
Date: Thu Feb 14 22:55:17 2013
New Revision: 1446379

URL: http://svn.apache.org/r1446379
Log:
HBASE-7834. Document Hadoop version support matrix in the book

Modified:
    hbase/trunk/src/docbkx/configuration.xml

Modified: hbase/trunk/src/docbkx/configuration.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/configuration.xml?rev=1446379&r1=1446378&r2=1446379&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/configuration.xml (original)
+++ hbase/trunk/src/docbkx/configuration.xml Thu Feb 14 22:55:17 2013
@@ -221,18 +221,48 @@ to ensure well-formedness of your docume
         xlink:href="http://hadoop.apache.org">Hadoop</link><indexterm>
             <primary>Hadoop</primary>
           </indexterm></title>
-         <note><title>Please read all of this section</title>
-         <para>Please read this section to the end.  Up front we
-         wade through the weeds of Hadoop versions.  Later we talk of what you must do in HBase
-         to make it work w/ a particular Hadoop version.</para>
-         </note>
-
-          <para>
-        HBase will lose data unless it is running on an HDFS that has a durable
-        <code>sync</code> implementation. Hadoop 0.20.2, Hadoop 0.20.203.0, and Hadoop 0.20.204.0
-	DO NOT have this attribute.
-        Currently only Hadoop versions 0.20.205.x or any release in excess of this
-        version -- this includes hadoop 1.0.0 -- have a working, durable sync
+         <para>Selecting a Hadoop version is critical for your HBase deployment. The table below shows which versions of Hadoop are supported by various HBase versions. Based on your version of HBase, select the most appropriate version of Hadoop. We are not in the Hadoop distro selection business. You can use Hadoop distributions from Apache, or learn about vendor distributions of Hadoop at <link xlink:href="http://wiki.apache.org/hadoop/Distributions%20and%20Commercial%20Support"/>.</para>
+         <para>
+	     <table>
+		 <title>Hadoop version support matrix</title>
+		 <tgroup cols='4' align='left' colsep='1' rowsep='1'><colspec colname='c1' align='left'/><colspec colname='c2' align='center'/><colspec colname='c3' align='center'/><colspec colname='c4' align='center'/>
+         <thead>
+	     <row><entry>               </entry><entry>HBase-0.92.x</entry><entry>HBase-0.94.x</entry><entry>HBase-0.96</entry></row>
+	     </thead><tbody>
+         <row><entry>Hadoop-0.20.205</entry><entry>S</entry>          <entry>S</entry>           <entry>X</entry></row>
+         <row><entry>Hadoop-0.22.x  </entry><entry>S</entry>          <entry>S</entry>           <entry>X</entry></row>
+         <row><entry>Hadoop-1.0.x   </entry><entry>S</entry>          <entry>S</entry>           <entry>S</entry></row>
+         <row><entry>Hadoop-1.1.x   </entry><entry>NT</entry>         <entry>S</entry>           <entry>S</entry></row>
+         <row><entry>Hadoop-0.23.x  </entry><entry>X</entry>          <entry>S</entry>           <entry>NT</entry></row>
+         <row><entry>Hadoop-2.x     </entry><entry>X</entry>          <entry>S</entry>           <entry>S</entry></row>
+		 </tbody></tgroup></table>
+
+        Where
+		<simplelist type='vert' columns='1'>
+		<member>S = supported and tested,</member>
+		<member>X = not supported,</member>
+		<member>NT = should run, but not sufficiently tested.</member>
+		</simplelist>
+        </para>
+        <para>
+	Because HBase depends on Hadoop, it bundles an instance of the Hadoop jar under its <filename>lib</filename> directory. The bundled jar is ONLY for use in standalone mode. In distributed mode, it is <emphasis>critical</emphasis> that the version of Hadoop running on your cluster match the version bundled with HBase. Replace the Hadoop jar found in the HBase <filename>lib</filename> directory with the Hadoop jar you are running on your cluster to avoid version mismatch issues. Make sure you replace the jar in HBase everywhere on your cluster. Hadoop version mismatch issues have various manifestations, but often the cluster simply appears to hang.
+    </para>
+    <section xml:id="hadoop.hbase-0.94">
+	<title>Apache HBase 0.92 and 0.94</title>
+	<para>HBase 0.92 and 0.94 can work with Hadoop versions 0.20.205, 0.22.x, 1.0.x, and 1.1.x. HBase 0.94 can additionally work with Hadoop 0.23.x and 2.x, but you may have to recompile the code using the specific Maven profile (see the top-level <filename>pom.xml</filename>).</para>
+   </section>
+
+    <section xml:id="hadoop.hbase-0.96">
+	<title>Apache HBase 0.96</title>
+	<para>Apache HBase 0.96.x requires Apache Hadoop 1.0.x at a minimum, and it runs equally well on Hadoop 2.0.
+	It will no longer run properly on older Hadoop versions such as 0.20.205 or branch-0.20-append. Do not move to Apache HBase 0.96.x if you cannot upgrade your Hadoop<footnote><para>See <link xlink:href="http://search-hadoop.com/m/7vFVx4EsUb2">HBase, mail # dev - DISCUSS: Have hbase require at least hadoop 1.0.0 in hbase 0.96.0?</link></para></footnote>.</para>
+   </section>
+
+    <section xml:id="hadoop.older.versions">
+	<title>Hadoop versions 0.20.x - 1.x</title>
+	<para>
+     HBase will lose data unless it is running on an HDFS that has a durable
+        <code>sync</code> implementation.  DO NOT use Hadoop 0.20.2, Hadoop 0.20.203.0, or Hadoop 0.20.204.0, which DO NOT have this attribute. Currently only Hadoop versions 0.20.205.x and later releases (including hadoop-1.0.0) have a working, durable sync
           <footnote>
           <para>The Cloudera blog post <link xlink:href="http://www.cloudera.com/blog/2012/01/an-update-on-apache-hadoop-1-0/">An update on Apache Hadoop 1.0</link>
           by Charles Zedlewski has a nice exposition on how all the Hadoop versions relate.
@@ -252,73 +282,13 @@ to ensure well-formedness of your docume
         </programlisting>
         You will have to restart your cluster after making this edit.  Ignore the chicken-little
         comment you'll find in the <filename>hdfs-default.xml</filename> in the
-        description for the <varname>dfs.support.append</varname> configuration; it says it is not enabled because there
-        are <quote>... bugs in the 'append code' and is not supported in any production
-        cluster.</quote>. This comment is stale, from another era, and while I'm sure there
-        are bugs, the sync/append code has been running
-        in production at large scale deploys and is on
-        by default in the offerings of hadoop by commercial vendors
-        <footnote><para>Until recently only the
-        <link xlink:href="http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-append/">branch-0.20-append</link>
-        branch had a working sync but no official release was ever made from this branch.
-        You had to build it yourself. Michael Noll wrote a detailed blog,
-        <link xlink:href="http://www.michael-noll.com/blog/2011/04/14/building-an-hadoop-0-20-x-version-for-hbase-0-90-2/">Building
-        an Hadoop 0.20.x version for Apache HBase 0.90.2</link>, on how to build an
-    Hadoop from branch-0.20-append.  Recommended.</para></footnote>
-    <footnote><para>Praveen Kumar has written
-            a complimentary article,
-            <link xlink:href="http://praveen.kumar.in/2011/06/20/building-hadoop-and-hbase-for-hbase-maven-application-development/">Building Hadoop and HBase for HBase Maven application development</link>.
-</para></footnote><footnote>Cloudera have <varname>dfs.support.append</varname> set to true by default.</footnote>.
-        Please use the most up-to-date Hadoop possible.</para>
-   <note><title>Apache HBase 0.96.0 requires Apache Hadoop 1.0.0 at a minimum</title>
-   <para>As of Apache HBase 0.96.x, Apache Hadoop 1.0.x at least is required.  We will no
-   longer run properly on older Hadoops such as <filename>0.20.205</filename> or <filename>branch-0.20-append</filename>.
-   Do not move to Apache HBase 0.96.x if you cannot upgrade your Hadoop<footnote><para>See <link xlink:href="http://search-hadoop.com/m/7vFVx4EsUb2">HBase, mail # dev - DISCUSS: Have hbase require at least hadoop 1.0.0 in hbase 0.96.0?</link></para></footnote>.</para>
-   <para>Apache HBase 0.96.0 runs on Apache Hadoop 2.0.
-   </para>
-   </note>
-
-<para>Or use the
-    <link xlink:href="http://www.cloudera.com/">Cloudera</link> or
-    <link xlink:href="http://www.mapr.com/">MapR</link> distributions.
-    Cloudera' <link xlink:href="http://archive.cloudera.com/docs/">CDH3</link>
-    is Apache Hadoop 0.20.x plus patches including all of the
-    <link xlink:href="http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-append/">branch-0.20-append</link>
-    additions needed to add a durable sync. Use the released, most recent version of CDH3.  In CDH, append
-    support is enabled by default so you do not need to make the above mentioned edits to
-    <filename>hdfs-site.xml</filename> or to <filename>hbase-site.xml</filename>.</para>
-    <para>
-    <link xlink:href="http://www.mapr.com/">MapR</link>
-    includes a commercial, reimplementation of HDFS.
-    It has a durable sync as well as some other interesting features that are not
-    yet in Apache Hadoop.  Their <link xlink:href="http://www.mapr.com/products/mapr-editions/m3-edition">M3</link>
-    product is free to use and unlimited.
-    </para>
-
-        <para>Because HBase depends on Hadoop, it bundles an instance of the
-        Hadoop jar under its <filename>lib</filename> directory. The bundled jar is ONLY for use in standalone mode.
-        In distributed mode, it is <emphasis>critical</emphasis> that the version of Hadoop that is out
-        on your cluster match what is under HBase.  Replace the hadoop jar found in the HBase
-        <filename>lib</filename> directory with the hadoop jar you are running on
-        your cluster to avoid version mismatch issues. Make sure you
-        replace the jar in HBase everywhere on your cluster.  Hadoop version
-        mismatch issues have various manifestations but often all looks like
-        its hung up.</para>
-    <note xml:id="bigtop"><title>Packaging and Apache BigTop</title>
-        <para><link xlink:href="http://bigtop.apache.org">Apache Bigtop</link>
-            is an umbrella for packaging and tests of the Apache Hadoop
-            ecosystem, including Apache HBase. Bigtop performs testing at various
-            levels (packaging, platform, runtime, upgrade, etc...), developed by a
-            community, with a focus on the system as a whole, rather than individual
-            projects. We recommend installing Apache HBase packages as provided by a
-            Bigtop release rather than rolling your own piecemeal integration of
-            various component releases.</para>
-    </note>
-
+        description for the <varname>dfs.support.append</varname> configuration.
+     </para>
+     </section>
        <section xml:id="hadoop.security">
           <title>Apache HBase on Secure Hadoop</title>
           <para>Apache HBase will run on any Hadoop 0.20.x that incorporates Hadoop
-          security features -- e.g. Y! 0.20S or CDH3B3 -- as long as you do as
+          security features as long as you do as
           suggested above and replace the Hadoop jar that ships with HBase
           with the secure version.  If you want to read more about how to setup
           Secure HBase, see <xref linkend="hbase.secure.configuration" />.</para>