Posted to user@hbase.apache.org by Hari Sreekumar <hs...@clickable.com> on 2010/11/20 22:33:16 UTC

ClassNotFoundException while running some HBase m/r jobs

Hi,

I am getting this exception while running m/r jobs on HBase:

10/11/21 02:53:01 INFO input.FileInputFormat: Total input paths to process :
1
10/11/21 02:53:01 INFO mapred.JobClient: Running job: job_201011210240_0002
10/11/21 02:53:02 INFO mapred.JobClient:  map 0% reduce 0%
10/11/21 02:53:08 INFO mapred.JobClient: Task Id :
attempt_201011210240_0002_m_000036_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException:
org.apache.hadoop.hbase.mapreduce.TableOutputFormat
        at
org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
        at
org.apache.hadoop.mapreduce.JobContext.getOutputFormatClass(JobContext.java:193)
        at org.apache.hadoop.mapred.Task.initialize(Task.java:413)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:288)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException:
org.apache.hadoop.hbase.mapreduce.TableOutputFormat
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at
org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
        at
org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
        ... 4 more

What could be the probable reasons for this? I have made sure that
hbase-0.20.6.jar, which contains this particular class, is included in the
classpath. In fact, non-m/r jobs work fine; e.g., I successfully ran a jar
file that uses HBaseAdmin to create some tables. Here is part of the
output from those jobs:

10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
environment:java.vendor=Sun Microsystems Inc.
10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
environment:java.home=/usr/java/jdk1.6.0_22/jre
10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
environment:java.class.path=/opt/hadoop/bin/../conf:/usr/java/jdk1.6.0_22/lib/tools.jar:/opt/hadoop/bin/..
:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/com
mons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-a
pi-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/..
/lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.
jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar:/opt/hbase/hbase-0.20.6.jar:/opt/hbase/hbase-0.20.6-test.jar:/opt/hbase/conf:/opt/hbase/lib/zookeeper-3.2.2.jar
10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
environment:java.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64
10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
environment:java.io.tmpdir=/tmp
10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
environment:java.compiler=<NA>

As you can see, /opt/hbase/hbase-0.20.6.jar is included in the classpath.
What else could it be?

thanks,
hari

RE: ClassNotFoundException while running some HBase m/r jobs

Posted by Adam Portley <a_...@hotmail.com>.
If the jar is included in HADOOP_CLASSPATH in your config but you are still getting ClassNotFoundException, try printing out the map task environment. An easy way is to run a Hadoop streaming job with the mapper set to printenv (or something similar), so that each task writes its environment to the output path. If the environment does not match the config, it often means that the TaskTrackers were not restarted after the config was edited.
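The diagnostic above might be sketched like this (the streaming jar path is an assumption for a typical Hadoop 0.20 layout; adjust for your installation):

```shell
# Run printenv as the mapper so each map task dumps its environment
# into the output directory (no reducers needed).
$HADOOP_HOME/bin/hadoop jar \
    $HADOOP_HOME/contrib/streaming/hadoop-0.20.2-streaming.jar \
    -D mapred.reduce.tasks=0 \
    -input /tmp/any-small-file \
    -output /tmp/task-env \
    -mapper /usr/bin/printenv

# Inspect what the task JVMs actually saw:
$HADOOP_HOME/bin/hadoop fs -cat '/tmp/task-env/part-*' | grep -i classpath
```

If the CLASSPATH printed by the tasks lacks the HBase jar while hadoop-env.sh has it, the TaskTrackers are running with a stale config.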
 

Re: ClassNotFoundException while running some HBase m/r jobs

Posted by Hari Sreekumar <hs...@clickable.com>.
Hi Ted,
    I tried doing the same thing with ant, and it worked! Thanks, guys!
    But what I now have is a 26 MB fat jar file, since I included all the jars
that were in the classpathref (shown in the ant script elsewhere in this
thread). Is there any other way to get it to work? What is the root cause of
this problem? The solution works but looks very unclean. Ideally, the jar
files should get found, right? Or is it meant to be used this way by design?

thanks,
hari
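The lighter alternative suggested elsewhere in this thread, bundling only the extra dependency jars in a lib/ directory inside the job jar (which the MapReduce framework adds to each task's classpath), might be sketched like this; all paths and jar names here are assumptions:

```shell
# Sketch: instead of merging every class into one fat jar, place the
# dependency jars under lib/ inside the job jar itself.
mkdir -p build/jar/lib
cp -r build/classes/. build/jar/              # compiled job classes
cp /opt/hbase/hbase-0.20.6.jar \
   /opt/hbase/lib/zookeeper-3.2.2.jar build/jar/lib/
jar cf MRJobs.jar -C build/jar .
```

This keeps the job jar small relative to a fully repackaged fat jar, since only the jars missing from the cluster classpath need to ride along.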

On Sun, Nov 21, 2010 at 8:55 PM, Ted Yu <yu...@gmail.com> wrote:

> We package hbase class files (using maven) into our jar. E.g.
> [hadoop@us01-ciqps1-name01 sims]$ jar tvf lib/flow-3.0.0.54-294813.jar |
> grep hbase
> ...
>  6743 Sat Jul 03 09:17:38 GMT 2010
> org/apache/hadoop/hbase/thrift/ThriftUtilities.class
>  24472 Sat Jul 03 09:17:38 GMT 2010
> org/apache/hadoop/hbase/thrift/ThriftServer$HBaseHandler.class
>  3897 Sat Jul 03 09:17:38 GMT 2010
> org/apache/hadoop/hbase/thrift/ThriftServer.class
>   565 Sat Jul 03 07:16:26 GMT 2010
> org/apache/hadoop/hbase/TableNotFoundException.class
>  2306 Sat Jul 03 07:16:26 GMT 2010
> org/apache/hadoop/hbase/HStoreKey$StoreKeyComparator.class
>   722 Sat Jul 03 07:16:22 GMT 2010
> org/apache/hadoop/hbase/DoNotRetryIOException.class
>
> FYI

Re: ClassNotFoundException while running some HBase m/r jobs

Posted by Ted Yu <yu...@gmail.com>.
We package hbase class files (using maven) into our jar. E.g.
[hadoop@us01-ciqps1-name01 sims]$ jar tvf lib/flow-3.0.0.54-294813.jar |
grep hbase
...
  6743 Sat Jul 03 09:17:38 GMT 2010
org/apache/hadoop/hbase/thrift/ThriftUtilities.class
 24472 Sat Jul 03 09:17:38 GMT 2010
org/apache/hadoop/hbase/thrift/ThriftServer$HBaseHandler.class
  3897 Sat Jul 03 09:17:38 GMT 2010
org/apache/hadoop/hbase/thrift/ThriftServer.class
   565 Sat Jul 03 07:16:26 GMT 2010
org/apache/hadoop/hbase/TableNotFoundException.class
  2306 Sat Jul 03 07:16:26 GMT 2010
org/apache/hadoop/hbase/HStoreKey$StoreKeyComparator.class
   722 Sat Jul 03 07:16:22 GMT 2010
org/apache/hadoop/hbase/DoNotRetryIOException.class

FYI

On Sun, Nov 21, 2010 at 7:18 AM, Hari Sreekumar <hs...@clickable.com>wrote:

> Hi Ted,
>
> Sure.. I use this command:
> $HADOOP_HOME/bin/hadoop jar ~/MRJobs.jar BulkUpload /tmp/customerData.dat
>
> /tmp/customerData.dat is the argument (the text file from which data is to
> be uploaded) and BulkUpload is the class name.
>
> thanks,
> hari
>
> On Sun, Nov 21, 2010 at 8:35 PM, Ted Yu <yu...@gmail.com> wrote:
>
> > Can you show us the command which you use to launch the M/R job?
> >
> > Thanks
> >
> > On Sun, Nov 21, 2010 at 5:26 AM, Hari Sreekumar
> > <hsreekumar@clickable.com> wrote:
> >
> > > Hey Lars,
> > >   You mean copying all required jar files to the lib/ folder in each
> > > jar? Is it worth the redundancy? I'll check if it works if I do that.
> > > Currently, I am using ant to build my jar file with these instructions
> > > to include files:
> > >
> > > <path id="classpath">
> > >        <fileset dir="${lib.dir}" includes="**/*.jar"/>
> > >        <fileset dir="${env.HADOOP_HOME}" includes="*.jar"/>
> > >        <fileset dir="${env.HBASE_HOME}" includes="*.jar"/>
> > >        <fileset dir="${env.HADOOP_HOME}/lib" includes="**/*.jar"/>
> > >        <fileset dir="${env.HBASE_HOME}/lib" includes="**/*.jar"/>
> > >    </path>
> > >
> > >    <target name="compile" depends="clean">
> > >        <mkdir dir="${build.dir}"/>
> > >        <javac srcdir="${src.dir}" destdir="${build.dir}"
> > >               classpathref="classpath"/>
> > >        <copy todir="${build.dir}">
> > >            <fileset dir="${input.dir}" includes="*.*"/>
> > >        </copy>
> > >    </target>
> > >
> > > I'll try copying all the jars into lib and including only the lib
> > > folder now.
> > >
> > > thanks,
> > > hari
> > >
> > > On Sun, Nov 21, 2010 at 5:32 PM, Lars George <la...@gmail.com>
> > > wrote:
> > >
> > > > Hi Hari,
> > > >
> > > > I would try the "fat" jar approach. It is much easier to maintain,
> > > > as each job jar contains its required dependencies. Adding them to
> > > > the nodes and config becomes a maintenance nightmare very quickly.
> > > > I personally use Maven to build my job jars, with a Maven package
> > > > plugin that has a custom package descriptor which, upon building
> > > > the project, wraps everything up for me in one fell swoop.
> > > >
> > > > Lars
> > > >
> > > > On Sun, Nov 21, 2010 at 8:17 AM, Hari Sreekumar
> > > > <hs...@clickable.com> wrote:
> > > >
> > > > > Hi Lars,
> > > > >
> > > > > I tried copying the conf to all nodes and copying the jar; it is
> > > > > still giving the same error. The weird thing is that tasks on the
> > > > > master node are also failing with the same error, even though all
> > > > > my files are available on the master. I am sure I'm missing
> > > > > something basic here, but I am unable to pinpoint the exact
> > > > > problem.
> > > > >
> > > > > hari
> > > > >
> > > > > On Sun, Nov 21, 2010 at 3:11 AM, Lars George
> > > > > <lars.george@gmail.com> wrote:
> > > > >
> > > > > > Hi Hari,
> > > > > >
> > > > > > This is most certainly a classpath issue. You either have to
> > > > > > add the jar on all TaskTracker servers and add it to the
> > > > > > HADOOP_CLASSPATH line in hadoop-env.sh (and copy it to all
> > > > > > servers again *and* restart the TaskTracker process!), or put
> > > > > > the jar into a /lib directory inside the job jar.
> > > > > >
> > > > > > Lars
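The hadoop-env.sh route described above might look like the line below; the exact jar set is an assumption based on the classpath shown earlier in this thread, and the line must be replicated to every TaskTracker node, followed by a TaskTracker restart:

```shell
# Sketch for hadoop-env.sh on each TaskTracker node (jar versions are
# assumptions); prepends the HBase jar, its conf dir, and ZooKeeper.
export HADOOP_CLASSPATH=/opt/hbase/hbase-0.20.6.jar:/opt/hbase/conf:/opt/hbase/lib/zookeeper-3.2.2.jar:$HADOOP_CLASSPATH
```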
> > > > >>
> > > > >> On Nov 20, 2010, at 22:33, Hari Sreekumar <
> hsreekumar@clickable.com
> > >
> > > > >> wrote:
> > > > >>
> > > > >> > Hi,
> > > > >> >
> > > > >> > I am getting this exception while running m/r jobs on HBase:
> > > > >> >
> > > > >> > 10/11/21 02:53:01 INFO input.FileInputFormat: Total input paths
> to
> > > > >> process :
> > > > >> > 1
> > > > >> > 10/11/21 02:53:01 INFO mapred.JobClient: Running job:
> > > > >> job_201011210240_0002
> > > > >> > 10/11/21 02:53:02 INFO mapred.JobClient:  map 0% reduce 0%
> > > > >> > 10/11/21 02:53:08 INFO mapred.JobClient: Task Id :
> > > > >> > attempt_201011210240_0002_m_000036_0, Status : FAILED
> > > > >> > java.lang.RuntimeException: java.lang.ClassNotFoundException:
> > > > >> > org.apache.hadoop.hbase.mapreduce.TableOutputFormat
> > > > >> >        at
> > > > >> >
> > > org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
> > > > >> >        at
> > > > >> >
> > > > >>
> > > >
> > >
> >
> org.apache.hadoop.mapreduce.JobContext.getOutputFormatClass(JobContext.java:193)
> > > > >> >        at
> org.apache.hadoop.mapred.Task.initialize(Task.java:413)
> > > > >> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:288)
> > > > >> >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > > > >> > Caused by: java.lang.ClassNotFoundException:
> > > > >> > org.apache.hadoop.hbase.mapreduce.TableOutputFormat
> > > > >> >        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> > > > >> >        at java.security.AccessController.doPrivileged(Native
> > Method)
> > > > >> >        at
> > java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> > > > >> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
> > > > >> >        at
> > > > sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> > > > >> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
> > > > >> >        at java.lang.Class.forName0(Native Method)
> > > > >> >        at java.lang.Class.forName(Class.java:247)
> > > > >> >        at
> > > > >> >
> > > > >>
> > > >
> > >
> >
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
> > > > >> >        at
> > > > >> >
> > > org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
> > > > >> >        ... 4 more
> > > > >> >
> > > > >> > What could be the probable reasons for this? I have made sure that
> > > > >> > hbase-0.20.6.jar, which contains this particular class, is included in
> > > > >> > the class path. In fact, if I run non-m/r jobs, it works fine. E.g., I
> > > > >> > ran a jar file successfully that uses HBaseAdmin to create some tables.
> > > > >> > Here is a part of the output from these jobs:
> > > > >> >
> > > > >> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> > > > >> > environment:java.vendor=Sun Microsystems Inc.
> > > > >> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> > > > >> > environment:java.home=/usr/java/jdk1.6.0_22/jre
> > > > >> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> > > > >> >
> > > > >>
> > > >
> > >
> >
> environment:java.class.path=/opt/hadoop/bin/../conf:/usr/java/jdk1.6.0_22/lib/tools.jar:/opt/hadoop/bin/..
> > > > >> >
> > > > >>
> > > >
> > >
> >
> :/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/com
> > > > >> >
> > > > >>
> > > >
> > >
> >
> mons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-a
> > > > >> >
> > > > >>
> > > >
> > >
> >
> pi-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/..
> > > > >> >
> > > > >>
> > > >
> > >
> >
> /lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.
> > > > >> >
> > > > >>
> > > >
> > >
> >
> jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar:/opt/hbase/hbase-0.20.6.jar:/opt/hbase/hbase-0.20.6-test.jar:/opt/hbase/conf:/opt/hbase/lib/zookeeper-3.2.2.jar
> > > > >> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> > > > >> >
> > > > >>
> > > >
> > >
> >
> environment:java.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64
> > > > >> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> > > > >> > environment:java.io.tmpdir=/tmp
> > > > >> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> > > > >> > environment:java.compiler=<NA>
> > > > >> >
> > > > >> > As you can see, /opt/hbase/hbase-0.20.6.jar is included in the
> > > > classpath.
> > > > >> > What else could be it?
> > > > >> >
> > > > >> > thanks,
> > > > >> > hari
> > > > >>
> > > > >
> > > >
> > >
> >
>

Re: ClassNotFoundException while running some HBase m/r jobs

Posted by Hari Sreekumar <hs...@clickable.com>.
Hi Ted,

Sure. I use this command:
$HADOOP_HOME/bin/hadoop jar ~/MRJobs.jar BulkUpload /tmp/customerData.dat

/tmp/customerData.dat is the argument (the text file from which the data is
to be uploaded), and BulkUpload is the main class name.
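In case it is relevant, this is roughly how I set up the classpath in the launching shell (a sketch - the /opt/hbase paths are the ones from my classpath dump earlier in the thread, and this only affects the client side; task JVMs on other nodes still need the jar via hadoop-env.sh or the job jar's lib/ directory):

```shell
# Sketch: export HADOOP_CLASSPATH before launching, so the hadoop script
# picks up the HBase classes on the client side. Paths assume the
# /opt/hbase layout shown in this thread.
HBASE_HOME=/opt/hbase
CP="$HBASE_HOME/conf"
for jar in "$HBASE_HOME"/hbase-*.jar "$HBASE_HOME"/lib/zookeeper-*.jar; do
  # Only add jars that actually exist (the glob stays literal otherwise).
  if [ -e "$jar" ]; then
    CP="$CP:$jar"
  fi
done
export HADOOP_CLASSPATH="$CP"
echo "$HADOOP_CLASSPATH"
```

after which the job is launched with the same `$HADOOP_HOME/bin/hadoop jar ~/MRJobs.jar BulkUpload ...` command as above.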

thanks,
hari

On Sun, Nov 21, 2010 at 8:35 PM, Ted Yu <yu...@gmail.com> wrote:

> Can you show us the command which you use to launch the M/R job ?
>
> Thanks
>
> [rest of the quoted thread trimmed - the full messages appear elsewhere on
> this page]

Re: ClassNotFoundException while running some HBase m/r jobs

Posted by Ted Yu <yu...@gmail.com>.
Can you show us the command which you use to launch the M/R job ?

Thanks

On Sun, Nov 21, 2010 at 5:26 AM, Hari Sreekumar <hs...@clickable.com>wrote:

> [quoted reply trimmed - Hari's full message, including the ant build file,
> appears elsewhere on this page]

Re: ClassNotFoundException while running some HBase m/r jobs

Posted by Hari Sreekumar <hs...@clickable.com>.
Hey Lars,
   You mean copying all the required jar files into the lib/ folder of each
job jar? Is it worth the redundancy? I'll check whether it works if I do
that. Currently, I am using ant to build my jar file, with these
instructions to include files:

<path id="classpath">
        <fileset dir="${lib.dir}" includes="**/*.jar"/>
        <fileset dir="${env.HADOOP_HOME}" includes="*.jar"/>
        <fileset dir="${env.HBASE_HOME}" includes="*.jar"/>
        <fileset dir="${env.HADOOP_HOME}/lib" includes="**/*.jar"/>
        <fileset dir="${env.HBASE_HOME}/lib" includes="**/*.jar"/>
    </path>

    <target name="compile" depends="clean">
        <mkdir dir="${build.dir}"/>
        <javac srcdir="${src.dir}" destdir="${build.dir}"
classpathref="classpath"/>
        <copy todir="${build.dir}">
            <fileset dir="${input.dir}" includes="*.*"/>
        </copy>
    </target>

I'll try copying all the jars into lib/ and including only the lib folder
now.
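Something like the following target is what I have in mind (a rough sketch, untested - the target name, destfile, and jar patterns are my guesses, building on the path definitions above). Jars placed under lib/ inside the job jar end up on each task's classpath:

```xml
<!-- Hypothetical "jar" target: bundles the compiled classes plus the HBase
     and ZooKeeper jars under lib/ inside the job jar, as Lars suggests.
     Names and include patterns are assumptions, not a tested build file. -->
<target name="jar" depends="compile">
    <jar destfile="MRJobs.jar">
        <fileset dir="${build.dir}"/>
        <zipfileset dir="${env.HBASE_HOME}" includes="hbase-*.jar" prefix="lib"/>
        <zipfileset dir="${env.HBASE_HOME}/lib" includes="zookeeper-*.jar" prefix="lib"/>
    </jar>
</target>
```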

thanks,
hari

On Sun, Nov 21, 2010 at 5:32 PM, Lars George <la...@gmail.com> wrote:

> [quoted replies trimmed - the full messages appear elsewhere on this page]
>

Re: ClassNotFoundException while running some HBase m/r jobs

Posted by Lars George <la...@gmail.com>.
Hi Hari,

I would try the "fat" jar approach. It is much easier to maintain, as each
job jar contains its required dependencies. Adding the dependencies to the
nodes and the config becomes a maintenance nightmare very quickly. I
personally use Maven to build my job jars, together with the Maven "Package"
plugin and a custom package descriptor which - upon building the project -
wraps everything up for me in one fell swoop.

Lars

On Sun, Nov 21, 2010 at 8:17 AM, Hari Sreekumar
<hs...@clickable.com> wrote:
> Hi Lars,
>
>        I tried copying conf to all nodes and copying jar, it is still
> giving the same error. Weird thing is that tasks on the master node are also
> failing with the same error, even though all my files are available on
> master. I am sure I'm missing something basic here, but unable to pinpoint
> the exact problem.
>
> hari
>
> On Sun, Nov 21, 2010 at 3:11 AM, Lars George <la...@gmail.com> wrote:
>
>> Hi Hari,
>>
>> This is most certainly a classpath issue. You either have to add the jar to
>> all TaskTracker servers and add it into the hadoop-env.sh in the
>> HADOOP_CLASSPATH line (and copy it to all servers again *and* restart the
>> TaskTracker process!) or put the jar into the job jar into a /lib directory.
>>
>> Lars
>>
>> On Nov 20, 2010, at 22:33, Hari Sreekumar <hs...@clickable.com>
>> wrote:
>>
>> > Hi,
>> >
>> > I am getting this exception while running m/r jobs on HBase:
>> >
>> > 10/11/21 02:53:01 INFO input.FileInputFormat: Total input paths to
>> process :
>> > 1
>> > 10/11/21 02:53:01 INFO mapred.JobClient: Running job:
>> job_201011210240_0002
>> > 10/11/21 02:53:02 INFO mapred.JobClient:  map 0% reduce 0%
>> > 10/11/21 02:53:08 INFO mapred.JobClient: Task Id :
>> > attempt_201011210240_0002_m_000036_0, Status : FAILED
>> > java.lang.RuntimeException: java.lang.ClassNotFoundException:
>> > org.apache.hadoop.hbase.mapreduce.TableOutputFormat
>> >        at
>> > org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
>> >        at
>> >
>> org.apache.hadoop.mapreduce.JobContext.getOutputFormatClass(JobContext.java:193)
>> >        at org.apache.hadoop.mapred.Task.initialize(Task.java:413)
>> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:288)
>> >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> > Caused by: java.lang.ClassNotFoundException:
>> > org.apache.hadoop.hbase.mapreduce.TableOutputFormat
>> >        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>> >        at java.security.AccessController.doPrivileged(Native Method)
>> >        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>> >        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>> >        at java.lang.Class.forName0(Native Method)
>> >        at java.lang.Class.forName(Class.java:247)
>> >        at
>> >
>> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
>> >        at
>> > org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
>> >        ... 4 more
>> >
>> > What could be the probable reasons for this? I have made sure that
>> > hbase-0.20.6.jar, which contains this particular class, is included in
>> the
>> > class path. In fact, if I run non-m/r jobs, it works fine. e.g, I ran a
>> jar
>> > file successfully that uses HAdmin to create some tables. Here is a part
>> of
>> > the output from these jobs:
>> >
>> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
>> > environment:java.vendor=Sun Microsystems Inc.
>> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
>> > environment:java.home=/usr/java/jdk1.6.0_22/jre
>> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
>> > environment:java.class.path=/opt/hadoop/bin/../conf:/usr/java/jdk1.6.0_22/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar:/opt/hbase/hbase-0.20.6.jar:/opt/hbase/hbase-0.20.6-test.jar:/opt/hbase/conf:/opt/hbase/lib/zookeeper-3.2.2.jar
>> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
>> > environment:java.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64
>> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
>> > environment:java.io.tmpdir=/tmp
>> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
>> > environment:java.compiler=<NA>
>> >
>> > As you can see, /opt/hbase/hbase-0.20.6.jar is included in the classpath.
>> > What else could it be?
>> >
>> > thanks,
>> > hari
>>
>

Re: ClassNotFoundException while running some HBase m/r jobs

Posted by Hari Sreekumar <hs...@clickable.com>.
Hi Lars,

        I tried copying the conf directory to all nodes and copying the jar
as well, but I am still getting the same error. The weird thing is that tasks
on the master node are also failing with the same error, even though all the
files are available on the master. I am sure I'm missing something basic
here, but I am unable to pinpoint the exact problem.
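For anyone hitting the same wall: a third option (not discussed in this thread, so treat it as a sketch) is Hadoop's generic `-libjars` flag, which ships the listed jars to each task's classpath without touching the cluster. It only works if the job's driver delegates argument parsing to `GenericOptionsParser`, e.g. via `ToolRunner`. `myjob.jar` and `com.example.MyJob` below are placeholder names; the jar paths are the ones from this thread.

```shell
# Ship HBase and its ZooKeeper dependency with the job submission.
# Hadoop copies the listed jars into the distributed cache and adds
# them to every task's classpath. Requires the driver to be run via
# ToolRunner/GenericOptionsParser so -libjars is recognized.
hadoop jar myjob.jar com.example.MyJob \
    -libjars /opt/hbase/hbase-0.20.6.jar,/opt/hbase/lib/zookeeper-3.2.2.jar \
    <job arguments>
```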

hari

On Sun, Nov 21, 2010 at 3:11 AM, Lars George <la...@gmail.com> wrote:

> Hi Hari,
>
> This is most certainly a classpath issue. You either have to add the jar to
> all TaskTracker servers and reference it on the HADOOP_CLASSPATH line in
> hadoop-env.sh (and copy that file to all servers again *and* restart the
> TaskTracker process!), or put the jar into a /lib directory inside the job jar.
>
> Lars
>
> On Nov 20, 2010, at 22:33, Hari Sreekumar <hs...@clickable.com>
> wrote:
>
> > Hi,
> >
> > I am getting this exception while running m/r jobs on HBase:
> >
> > 10/11/21 02:53:01 INFO input.FileInputFormat: Total input paths to process :
> > 1
> > 10/11/21 02:53:01 INFO mapred.JobClient: Running job: job_201011210240_0002
> > 10/11/21 02:53:02 INFO mapred.JobClient:  map 0% reduce 0%
> > 10/11/21 02:53:08 INFO mapred.JobClient: Task Id :
> > attempt_201011210240_0002_m_000036_0, Status : FAILED
> > java.lang.RuntimeException: java.lang.ClassNotFoundException:
> > org.apache.hadoop.hbase.mapreduce.TableOutputFormat
> >        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
> >        at org.apache.hadoop.mapreduce.JobContext.getOutputFormatClass(JobContext.java:193)
> >        at org.apache.hadoop.mapred.Task.initialize(Task.java:413)
> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:288)
> >        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> > Caused by: java.lang.ClassNotFoundException:
> > org.apache.hadoop.hbase.mapreduce.TableOutputFormat
> >        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> >        at java.security.AccessController.doPrivileged(Native Method)
> >        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
> >        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
> >        at java.lang.Class.forName0(Native Method)
> >        at java.lang.Class.forName(Class.java:247)
> >        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
> >        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
> >        ... 4 more
> >
> > What could be the probable reasons for this? I have made sure that
> > hbase-0.20.6.jar, which contains this particular class, is included in the
> > class path. In fact, if I run non-m/r jobs, it works fine. E.g., I ran a jar
> > file successfully that uses HAdmin to create some tables. Here is a part of
> > the output from these jobs:
> >
> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> > environment:java.vendor=Sun Microsystems Inc.
> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> > environment:java.home=/usr/java/jdk1.6.0_22/jre
> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> > environment:java.class.path=/opt/hadoop/bin/../conf:/usr/java/jdk1.6.0_22/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar:/opt/hbase/hbase-0.20.6.jar:/opt/hbase/hbase-0.20.6-test.jar:/opt/hbase/conf:/opt/hbase/lib/zookeeper-3.2.2.jar
> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> > environment:java.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64
> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> > environment:java.io.tmpdir=/tmp
> > 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> > environment:java.compiler=<NA>
> >
> > As you can see, /opt/hbase/hbase-0.20.6.jar is included in the classpath.
> > What else could it be?
> >
> > thanks,
> > hari
>

Re: ClassNotFoundException while running some HBase m/r jobs

Posted by Lars George <la...@gmail.com>.
Hi Hari,

This is most certainly a classpath issue. You either have to add the jar to all TaskTracker servers and reference it on the HADOOP_CLASSPATH line in hadoop-env.sh (and copy that file to all servers again *and* restart the TaskTracker process!), or put the jar into a /lib directory inside the job jar.
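Concretely, the two options look like this (a sketch: the HBase paths are the ones from this thread, while `myjob.jar` and the layout shown for it are placeholders):

```shell
# Option 1: in conf/hadoop-env.sh on EVERY TaskTracker node, extend
# HADOOP_CLASSPATH, then restart each TaskTracker so it picks this up:
export HADOOP_CLASSPATH=/opt/hbase/hbase-0.20.6.jar:/opt/hbase/conf:/opt/hbase/lib/zookeeper-3.2.2.jar:$HADOOP_CLASSPATH

# Option 2: bundle the dependency inside the job jar itself; Hadoop
# unpacks the job jar on each task node and adds everything under
# lib/ to the task classpath, so no cluster-side change is needed:
#
#   myjob.jar
#     com/example/MyJob.class        <- your job classes
#     lib/hbase-0.20.6.jar           <- bundled dependency
#
# e.g. appended to an existing jar with:
#   jar uf myjob.jar lib/hbase-0.20.6.jar
```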

Lars

On Nov 20, 2010, at 22:33, Hari Sreekumar <hs...@clickable.com> wrote:

> Hi,
> 
> I am getting this exception while running m/r jobs on HBase:
> 
> 10/11/21 02:53:01 INFO input.FileInputFormat: Total input paths to process :
> 1
> 10/11/21 02:53:01 INFO mapred.JobClient: Running job: job_201011210240_0002
> 10/11/21 02:53:02 INFO mapred.JobClient:  map 0% reduce 0%
> 10/11/21 02:53:08 INFO mapred.JobClient: Task Id :
> attempt_201011210240_0002_m_000036_0, Status : FAILED
> java.lang.RuntimeException: java.lang.ClassNotFoundException:
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat
>        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
>        at org.apache.hadoop.mapreduce.JobContext.getOutputFormatClass(JobContext.java:193)
>        at org.apache.hadoop.mapred.Task.initialize(Task.java:413)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:288)
>        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>        at java.lang.Class.forName0(Native Method)
>        at java.lang.Class.forName(Class.java:247)
>        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
>        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
>        ... 4 more
> 
> What could be the probable reasons for this? I have made sure that
> hbase-0.20.6.jar, which contains this particular class, is included in the
> class path. In fact, if I run non-m/r jobs, it works fine. E.g., I ran a jar
> file successfully that uses HAdmin to create some tables. Here is a part of
> the output from these jobs:
> 
> 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> environment:java.vendor=Sun Microsystems Inc.
> 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> environment:java.home=/usr/java/jdk1.6.0_22/jre
> 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> environment:java.class.path=/opt/hadoop/bin/../conf:/usr/java/jdk1.6.0_22/lib/tools.jar:/opt/hadoop/bin/..:/opt/hadoop/bin/../hadoop-0.20.2-core.jar:/opt/hadoop/bin/../lib/commons-cli-1.2.jar:/opt/hadoop/bin/../lib/commons-codec-1.3.jar:/opt/hadoop/bin/../lib/commons-el-1.0.jar:/opt/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/bin/../lib/commons-logging-1.0.4.jar:/opt/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/bin/../lib/commons-net-1.4.1.jar:/opt/hadoop/bin/../lib/core-3.1.1.jar:/opt/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/bin/../lib/jets3t-0.6.1.jar:/opt/hadoop/bin/../lib/jetty-6.1.14.jar:/opt/hadoop/bin/../lib/jetty-util-6.1.14.jar:/opt/hadoop/bin/../lib/junit-3.8.1.jar:/opt/hadoop/bin/../lib/kfs-0.2.2.jar:/opt/hadoop/bin/../lib/log4j-1.2.15.jar:/opt/hadoop/bin/../lib/mockito-all-1.8.0.jar:/opt/hadoop/bin/../lib/oro-2.0.8.jar:/opt/hadoop/bin/../lib/servlet-api-2.5-6.1.14.jar:/opt/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/bin/../lib/xmlenc-0.52.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar:/opt/hbase/hbase-0.20.6.jar:/opt/hbase/hbase-0.20.6-test.jar:/opt/hbase/conf:/opt/hbase/lib/zookeeper-3.2.2.jar
> 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> environment:java.library.path=/opt/hadoop/bin/../lib/native/Linux-amd64-64
> 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> environment:java.io.tmpdir=/tmp
> 10/11/21 02:49:24 INFO zookeeper.ZooKeeper: Client
> environment:java.compiler=<NA>
> 
> As you can see, /opt/hbase/hbase-0.20.6.jar is included in the classpath.
> What else could it be?
> 
> thanks,
> hari