Posted to hdfs-user@hadoop.apache.org by Alex Kozlov <al...@cloudera.com> on 2010/07/29 01:34:45 UTC

Re: error:Caused by: java.lang.ClassNotFoundException: com.hadoop.compression.lzo.LzopCodec

Hi Alex,

There seems to be a problem with your configuration.  Can you check on
which node attempt_201007202234_0001_m_000000_0 was run?  Go to that
machine and check the config files and the lib subdirectory (for the
presence of the correct configuration and of hadoop-lzo-0.4.4.jar).
Restart the TT and, using 'ps -aef | grep -i tasktracker', check that
hadoop-lzo-0.4.4.jar is in its classpath.
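
For example, something along these lines (a rough sketch; the daemon
script path assumes a standard tarball install under $HADOOP_HOME):

  # on the node that ran the failed attempt
  ls $HADOOP_HOME/lib/hadoop-lzo-0.4.4.jar                    # jar deployed?
  $HADOOP_HOME/bin/hadoop-daemon.sh stop tasktracker
  $HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker
  ps -aef | grep -i tasktracker | grep hadoop-lzo-0.4.4.jar   # prints the TT line only if the jar is on its classpath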

I have a strong suspicion you have stray config files:
com.hadoop.compression.lzo.LzopCodec is not mentioned in the ones you
provided...
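
A quick way to look for strays (a sketch; /etc/hadoop is just a guess at
an alternate conf location, adjust to your layout):

  grep -rl io.compression.codecs $HADOOP_HOME/conf /etc/hadoop 2>/dev/null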

Alex K

On Wed, Jul 28, 2010 at 7:42 AM, Alex Luya <al...@gmail.com> wrote:

> Hello:
>    I got the source code from http://github.com/kevinweil/hadoop-lzo,
> compiled it successfully, and then:
> 1. copied hadoop-lzo-0.4.4.jar to $HADOOP_HOME/lib on each master and
> slave
> 2. copied all files under ../Linux-amd64-64/lib to
> $HADOOP_HOME/lib/native/Linux-amd64-64 on each master and slave
> 3. uploaded a file test.lzo to HDFS
> 4. ran: hadoop jar $HADOOP_HOME/lib/hadoop-lzo-0.4.4.jar
> com.hadoop.compression.lzo.DistributedLzoIndexer test.lzo to test it
>
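> In shell terms, steps 1 and 2 were roughly (a sketch; hostnames are
> placeholders, and $HADOOP_HOME is assumed to be the same path on every
> node):
>
>   for h in master slave1 slave2; do
>     scp hadoop-lzo-0.4.4.jar $h:$HADOOP_HOME/lib/
>     scp ../Linux-amd64-64/lib/* $h:$HADOOP_HOME/lib/native/Linux-amd64-64/
>   done
>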
> got errors:
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 10/07/20 22:37:37 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
> 10/07/20 22:37:37 INFO lzo.LzoCodec: Successfully loaded & initialized
> native-lzo library [hadoop-lzo rev 5c25e0073d3dae9ace4bd9eba72e4dc43650c646]
> ##########^_^^_^^_^^_^^_^^_^##################
> (I think this means the native library was loaded successfully)
> ################################
> 10/07/20 22:37:37 INFO lzo.DistributedLzoIndexer: Adding LZO file
> target.lzo to indexing list (no index currently exists)
> ...
> attempt_201007202234_0001_m_000000_0, Status : FAILED
> java.lang.IllegalArgumentException: Compression codec
> com.hadoop.compression.lzo.LzopCodec not found.
>        at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:96)
>        at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:134)
>        at com.hadoop.mapreduce.LzoSplitRecordReader.initialize(LzoSplitRecordReader.java:48)
>        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:418)
>        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:620)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>        at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: java.lang.ClassNotFoundException:
> com.hadoop.compression.lzo.LzopCodec
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>        at java.lang.Class.forName0(Native Method)
>        at java.lang.Class.forName(Class.java:247)
>        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
>        at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:89)
>        ... 6 more
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>
> There are installation instructions at
> http://github.com/kevinweil/hadoop-lzo; they say some further
> configuration is needed:
>
> Once the libs are built and installed, you may want to add them to the
> class paths and library paths. That is, in hadoop-env.sh, set
>
>   (1)export HADOOP_CLASSPATH=/path/to/your/hadoop-lzo-lib.jar
>
> Question: I have already copied hadoop-lzo-0.4.4.jar to
> $HADOOP_HOME/lib; do I still need to set this entry?  Actually, after I
> added this (as one line):
> export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HBASE_HOME/hbase-0.20.4.jar:$HABSE_HOME/config:$ZOOKEEPER_HOME/zookeeper-3.3.1.jar:$HADOOP_HOME/lib/hadoop-lzo-0.4.4.jar
> and redid steps 1-4 above, I got the same problem as before.  So: how
> can I get hadoop to load hadoop-lzo-0.4.4.jar?
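> (One more check that seems worth doing on my side: verify the codec
> class is really inside the deployed jar, e.g.
>
>   unzip -l $HADOOP_HOME/lib/hadoop-lzo-0.4.4.jar | grep LzopCodec
>
> and double-check variable spellings; $HABSE_HOME above looks like a
> typo for $HBASE_HOME, though it is not the LZO entry.)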
>
>
>    (2) export JAVA_LIBRARY_PATH=/path/to/hadoop-lzo-native-libs:/path/to/standard-hadoop-native-libs
>    Note that there seems to be a bug in /path/to/hadoop/bin/hadoop; comment
> out the line
>    (3) JAVA_LIBRARY_PATH=''
>
>
> Question: since the native library was loaded successfully, are
> operations (2) and (3) still needed?
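> (For reference, with my layout (2) would presumably read
>
>   export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native/Linux-amd64-64
>
> which is the line I currently have commented out in hadoop-env.sh
> below.)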
>
>
> -----------------------------------------------
> I am using hadoop 0.20.2
> core-site.xml
>
> -----------------------------------------------------------------------------
> <configuration>
>        <property>
>                <name>fs.default.name</name>
>                <value>hdfs://hadoop:8020</value>
>        </property>
>        <property>
>                <name>hadoop.tmp.dir</name>
>                <value>/home/hadoop/tmp</value>
>        </property>
>
>        <property>
>                <name>io.compression.codecs</name>
>
>
> <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec
>                </value>
>        </property>
>        <property>
>                <name>io.compression.codec.lzo.class</name>
>                <value>com.hadoop.compression.lzo.LzoCodec</value>
>        </property>
> </configuration>
>
>
> -----------------------------------------------------------------------------
> mapred-site.xml
>
> -----------------------------------------------------------------------------
> <configuration>
>        <property>
>                <name>mapred.job.tracker</name>
>                <value>AlexLuya:9001</value>
>        </property>
>        <property>
>                <name>mapred.tasktracker.reduce.tasks.maximum</name>
>                <value>1</value>
>        </property>
>        <property>
>                <name>mapred.tasktracker.map.tasks.maximum</name>
>                <value>1</value>
>        </property>
>        <property>
>                <name>mapred.local.dir</name>
>                <value>/home/alex/hadoop/mapred/local</value>
>        </property>
>        <property>
>                <name>mapred.system.dir</name>
>                <value>/tmp/hadoop/mapred/system</value>
>        </property>
>        <property>
>                <name>mapreduce.map.output.compress</name>
>                <value>true</value>
>        </property>
>        <property>
>                <name>mapreduce.map.output.compress.codec</name>
>                <value>com.hadoop.compression.lzo.LzoCodec</value>
>        </property>
> </configuration>
>
>
> -----------------------------------------------------------------------------
> hadoop-env.sh
>
> -----------------------------------------------------------------------------
> # Set Hadoop-specific environment variables here.
>
> # The only required environment variable is JAVA_HOME.  All others are
> # optional.  When running a distributed configuration it is best to
> # set JAVA_HOME in this file, so that it is correctly defined on
> # remote nodes.
>
> # The java implementation to use.  Required.
> export JAVA_HOME=/usr/local/hadoop/jdk1.6.0_20
>
> # Extra Java CLASSPATH elements.  Optional.
> # export HADOOP_CLASSPATH=
>
> # The maximum amount of heap to use, in MB. Default is 1000.
> export HADOOP_HEAPSIZE=200
>
> # Extra Java runtime options.  Empty by default.
> #export HADOOP_OPTS=-server
>
> export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HBASE_HOME/hbase-0.20.4.jar:$HABSE_HOME/config:$ZOOKEEPER_HOME/zookeeper-3.3.1.jar:$HADOOP_HOME/lib/hadoop-lzo-0.4.4.jar
> #export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native/Linux-amd64-64
>
> # Command specific options appended to HADOOP_OPTS when specified
> export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote
> $HADOOP_NAMENODE_OPTS"
> export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote
> $HADOOP_SECONDARYNAMENODE_OPTS"
> export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote
> $HADOOP_DATANODE_OPTS"
> export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote
> $HADOOP_BALANCER_OPTS"
> export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote
> $HADOOP_JOBTRACKER_OPTS"
> # export HADOOP_TASKTRACKER_OPTS=
> # The following applies to multiple commands (fs, dfs, fsck, distcp etc)
> # export HADOOP_CLIENT_OPTS
>
> # Extra ssh options.  Empty by default.
> # export HADOOP_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR"
>
> # Where log files are stored.  $HADOOP_HOME/logs by default.
> # export HADOOP_LOG_DIR=${HADOOP_HOME}/logs
>
> # File naming remote slave hosts.  $HADOOP_HOME/conf/slaves by default.
> # export HADOOP_SLAVES=${HADOOP_HOME}/conf/slaves
>
> # host:path where hadoop code should be rsync'd from.  Unset by default.
> # export HADOOP_MASTER=master:/home/$USER/src/hadoop
>
> # Seconds to sleep between slave commands.  Unset by default.  This
> # can be useful in large clusters, where, e.g., slave rsyncs can
> # otherwise arrive faster than the master can service them.
> # export HADOOP_SLAVE_SLEEP=0.1
>
> # The directory where pid files are stored. /tmp by default.
> # export HADOOP_PID_DIR=/var/hadoop/pids
>
> # A string representing this instance of hadoop. $USER by default.
> #export HADOOP_IDENT_STRING=$USER
>
> # The scheduling priority for daemon processes.  See 'man nice'.
> # export HADOOP_NICENESS=10
>

Re: error:Caused by: java.lang.ClassNotFoundException: com.hadoop.compression.lzo.LzopCodec

Posted by Ted Yu <yu...@gmail.com>.
Yes.
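
The two paths point at the same jar: the launcher scripts build the
classpath relative to their own bin directory, so 'bin/..' simply
resolves to the install root.  A quick way to confirm (a sketch,
assuming GNU coreutils):

  readlink -f /usr/local/hadoop/hadoop-0.20.2/bin/../lib/hadoop-lzo-0.4.4.jar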

On Thu, Jul 29, 2010 at 7:57 AM, Alex Luya <al...@gmail.com> wrote:

> Hi,
>
>    Running 'ps -aef | grep -i tasktracker', I got this:
>
> [...]
>
> see this:
> :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/hadoop-lzo-0.4.4.jar
>
> "hadoop-lzo-0.4.4.jar" got presented,but question is why directory
> structure
> will like this "/bin/../lib/hadoop-lzo-0.4.4.jar",it should be
> "/usr/local/hadoop/hadoop-0.20.2/lib/hadoop-lzo-0.4.4.jar",is this normal?
>
>
> On Thursday, July 29, 2010 07:34:45 am Alex Kozlov wrote:
> > [...]

Re: error:Caused by: java.lang.ClassNotFoundException: com.hadoop.compression.lzo.LzopCodec

Posted by Alex Luya <al...@gmail.com>.
Hi,

Running 'ps -aef | grep -i tasktracker', I got this:
-------------------------------------------------------------------------------------------------
alex      2425     1  0 22:34 ?        00:00:05 /usr/local/hadoop/jdk1.6.0_20/bin/java -Xmx200m
    -Dhadoop.log.dir=/usr/local/hadoop/hadoop-0.20.2/bin/../logs
    -Dhadoop.log.file=hadoop-alex-tasktracker-Hadoop-03.log
    -Dhadoop.home.dir=/usr/local/hadoop/hadoop-0.20.2/bin/..
    -Dhadoop.id.str=alex
    -Dhadoop.root.logger=INFO,DRFA
    -Djava.library.path=/usr/local/hadoop/hadoop-0.20.2/bin/../lib/native/Linux-amd64-64
    -Dhadoop.policy.file=hadoop-policy.xml
    -classpath
      /usr/local/hadoop/hadoop-0.20.2/bin/../conf
      :/usr/local/hadoop/jdk1.6.0_20/lib/tools.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/..
      :/usr/local/hadoop/hadoop-0.20.2/bin/../hadoop-0.20.2-core.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/commons-cli-1.2.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/commons-codec-1.3.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/commons-el-1.0.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/commons-httpclient-3.0.1.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/commons-logging-1.0.4.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/commons-logging-api-1.0.4.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/commons-net-1.4.1.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/core-3.1.1.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/hadoop-lzo-0.4.4.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/hsqldb-1.8.0.10.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/jasper-compiler-5.5.12.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/jasper-runtime-5.5.12.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/jets3t-0.6.1.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/jetty-6.1.14.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/jetty-util-6.1.14.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/junit-3.8.1.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/kfs-0.2.2.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/log4j-1.2.15.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/mockito-all-1.8.0.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/oro-2.0.8.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/servlet-api-2.5-6.1.14.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/slf4j-api-1.4.3.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/slf4j-log4j12-1.4.3.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/xmlenc-0.52.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/jsp-2.1/jsp-2.1.jar
      :/usr/local/hadoop/hadoop-0.20.2/bin/../lib/jsp-2.1/jsp-api-2.1.jar
    org.apache.hadoop.mapred.TaskTracker
alex      2609  1231  0 22:51 pts/0    00:00:00 grep --color=auto -i tasktracker
-----------------------------------------------------------------------------------------------------------------------------------------------------


see this:
:/usr/local/hadoop/hadoop-0.20.2/bin/../lib/hadoop-lzo-0.4.4.jar

"hadoop-lzo-0.4.4.jar" got presented,but question is why directory structure 
will like this "/bin/../lib/hadoop-lzo-0.4.4.jar",it should be 
"/usr/local/hadoop/hadoop-0.20.2/lib/hadoop-lzo-0.4.4.jar",is this normal?


On Thursday, July 29, 2010 07:34:45 am Alex Kozlov wrote:
> [...]