You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hbase.apache.org by je...@apache.org on 2015/05/06 06:26:34 UTC

hbase git commit: HBASE-13251 Correct HBase, MapReduce, and the CLASSPATH section in HBase Ref Guide (li xiang)

Repository: hbase
Updated Branches:
  refs/heads/master 2e132db85 -> 664b2e4f1


HBASE-13251 Correct HBase, MapReduce, and the CLASSPATH section in HBase Ref Guide (li xiang)


Project: http://git-wip-us.apache.org/repos/asf/hbase/repo
Commit: http://git-wip-us.apache.org/repos/asf/hbase/commit/664b2e4f
Tree: http://git-wip-us.apache.org/repos/asf/hbase/tree/664b2e4f
Diff: http://git-wip-us.apache.org/repos/asf/hbase/diff/664b2e4f

Branch: refs/heads/master
Commit: 664b2e4f11a06af2bc6d4876a3d6ed270b28e898
Parents: 2e132db
Author: Jerry He <je...@apache.org>
Authored: Tue May 5 21:25:06 2015 -0700
Committer: Jerry He <je...@apache.org>
Committed: Tue May 5 21:25:06 2015 -0700

----------------------------------------------------------------------
 .../apache/hadoop/hbase/util/ByteStringer.java  |  2 +-
 src/main/asciidoc/_chapters/mapreduce.adoc      | 27 ++++++++++++++------
 2 files changed, 20 insertions(+), 9 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hbase/blob/664b2e4f/hbase-protocol/src/main/java/org/apache/hadoop/hbase/util/ByteStringer.java
----------------------------------------------------------------------
diff --git a/hbase-protocol/src/main/java/org/apache/hadoop/hbase/util/ByteStringer.java b/hbase-protocol/src/main/java/org/apache/hadoop/hbase/util/ByteStringer.java
index 5b10b83..afa9297 100644
--- a/hbase-protocol/src/main/java/org/apache/hadoop/hbase/util/ByteStringer.java
+++ b/hbase-protocol/src/main/java/org/apache/hadoop/hbase/util/ByteStringer.java
@@ -25,7 +25,7 @@ import com.google.protobuf.ByteString;
 import com.google.protobuf.HBaseZeroCopyByteString;
 
 /**
- * Hack to workaround HBASE-1304 issue that keeps bubbling up when a mapreduce context.
+ * Hack to workaround HBASE-10304 issue that keeps bubbling up when a mapreduce context.
  */
 @InterfaceAudience.Private
 public class ByteStringer {

http://git-wip-us.apache.org/repos/asf/hbase/blob/664b2e4f/src/main/asciidoc/_chapters/mapreduce.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/mapreduce.adoc b/src/main/asciidoc/_chapters/mapreduce.adoc
index a008a4f..2a42af2 100644
--- a/src/main/asciidoc/_chapters/mapreduce.adoc
+++ b/src/main/asciidoc/_chapters/mapreduce.adoc
@@ -51,27 +51,38 @@ In the notes below, we refer to o.a.h.h.mapreduce but replace with the o.a.h.h.m
 
 By default, MapReduce jobs deployed to a MapReduce cluster do not have access to either the HBase configuration under `$HBASE_CONF_DIR` or the HBase classes.
 
-To give the MapReduce jobs the access they need, you could add _hbase-site.xml_ to the _$HADOOP_HOME/conf/_ directory and add the HBase JARs to the _HADOOP_HOME/conf/_ directory, then copy these changes across your cluster.
-You could add _hbase-site.xml_ to _$HADOOP_HOME/conf_ and add HBase jars to the _$HADOOP_HOME/lib_ directory.
-You would then need to copy these changes across your cluster or edit _$HADOOP_HOMEconf/hadoop-env.sh_ and add them to the `HADOOP_CLASSPATH` variable.
+To give the MapReduce jobs the access they need, you could add _hbase-site.xml_ to _$HADOOP_HOME/conf_ and add HBase jars to the _$HADOOP_HOME/lib_ directory.
+You would then need to copy these changes across your cluster. Or you can edit _$HADOOP_HOME/conf/hadoop-env.sh_ and add them to the `HADOOP_CLASSPATH` variable.
 However, this approach is not recommended because it will pollute your Hadoop install with HBase references.
 It also requires you to restart the Hadoop cluster before Hadoop can use the HBase data.
 
+The recommended approach is to let HBase add its dependency jars itself and use `HADOOP_CLASSPATH` or `-libjars`.
+
 Since HBase 0.90.x, HBase adds its dependency JARs to the job configuration itself.
 The dependencies only need to be available on the local `CLASSPATH`.
-The following example runs the bundled HBase link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html[RowCounter] MapReduce job against a table named `usertable` If you have not set the environment variables expected in the command (the parts prefixed by a `$` sign and curly braces), you can use the actual system paths instead.
+The following example runs the bundled HBase link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html[RowCounter] MapReduce job against a table named `usertable`.
+If you have not set the environment variables expected in the command (the parts prefixed by a `$` sign and surrounded by curly braces), you can use the actual system paths instead.
 Be sure to use the correct version of the HBase JAR for your system.
-The backticks (``` symbols) cause ths shell to execute the sub-commands, setting the `CLASSPATH` as part of the command.
+The backticks (``` symbols) cause ths shell to execute the sub-commands, setting the output of `hbase classpath` (the command to dump HBase CLASSPATH) to `HADOOP_CLASSPATH`.
 This example assumes you use a BASH-compatible shell.
 
 [source,bash]
 ----
-$ HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server-VERSION.jar rowcounter usertable
+$ HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/lib/hbase-server-VERSION.jar rowcounter usertable
 ----
 
 When the command runs, internally, the HBase JAR finds the dependencies it needs for ZooKeeper, Guava, and its other dependencies on the passed `HADOOP_CLASSPATH` and adds the JARs to the MapReduce job configuration.
 See the source at `TableMapReduceUtil#addDependencyJars(org.apache.hadoop.mapreduce.Job)` for how this is done.
 
+The command `hbase mapredcp` can also help you dump the CLASSPATH entries required by MapReduce, which are the same jars `TableMapReduceUtil#addDependencyJars` would add.
+You can add them together with HBase conf directory to `HADOOP_CLASSPATH`.
+For jobs that do not package their dependencies or call `TableMapReduceUtil#addDependencyJars`, the following command structure is necessary:
+
+[source,bash]
+----
+$ HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase mapredcp`:${HBASE_HOME}/conf hadoop jar MyApp.jar MyJobMainClass -libjars $(${HBASE_HOME}/bin/hbase mapredcp | tr ':' ',') ...
+----
+
 [NOTE]
 ====
 The example may not work if you are running HBase from its build directory rather than an installed location.
@@ -85,11 +96,11 @@ If this occurs, try modifying the command as follows, so that it uses the HBase
 
 [source,bash]
 ----
-$ HADOOP_CLASSPATH=${HBASE_HOME}/hbase-server/target/hbase-server-VERSION-SNAPSHOT.jar:`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-server/target/hbase-server-VERSION-SNAPSHOT.jar rowcounter usertable
+$ HADOOP_CLASSPATH=${HBASE_BUILD_HOME}/hbase-server/target/hbase-server-VERSION-SNAPSHOT.jar:`${HBASE_BUILD_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_BUILD_HOME}/hbase-server/target/hbase-server-VERSION-SNAPSHOT.jar rowcounter usertable
 ----
 ====
 
-.Notice to MapReduce users of HBase 0.96.1 and above
+.Notice to MapReduce users of HBase between 0.96.1 and 0.98.4
 [CAUTION]
 ====
 Some MapReduce jobs that use HBase fail to launch.