Posted to common-user@hadoop.apache.org by Alex Luya <al...@gmail.com> on 2010/07/24 09:40:29 UTC

Lzo question

Hello:
    I got the source code from http://github.com/kevinweil/hadoop-lzo, compiled
it successfully, and then:
1. copied hadoop-lzo-0.4.4.jar to $HADOOP_HOME/lib on each master and
slave
2. copied all files under the directory ../Linux-amd64-64 to
$HADOOP_HOME/lib/native/Linux-amd64-64 on each master and slave
3. uploaded a file test.lzo to HDFS
4. then ran: hadoop jar $HADOOP_HOME/lib/hadoop-lzo-0.4.4.jar
com.hadoop.compression.lzo.DistributedLzoIndexer test.lzo to test it
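For reference, the exact commands were roughly the following; the source paths are
from my build tree and may differ on yours (the copies were repeated on every master
and slave, the upload only once):

$ cp hadoop-lzo-0.4.4.jar $HADOOP_HOME/lib/
$ cp ../Linux-amd64-64/* $HADOOP_HOME/lib/native/Linux-amd64-64/
$ hadoop fs -put test.lzo test.lzo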

got errors:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10/07/20 22:37:37 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
10/07/20 22:37:37 INFO lzo.LzoCodec: Successfully loaded & initialized native-
lzo library [hadoop-lzo rev 5c25e0073d3dae9ace4bd9eba72e4dc43650c646]
##########^_^^_^^_^^_^^_^^_^##################(I think this means the native
libraries were loaded successfully)################################
10/07/20 22:37:37 INFO lzo.DistributedLzoIndexer: Adding LZO file target.lzo
to indexing list (no index currently exists)
10/07/20 22:37:37 WARN mapred.JobClient: Use GenericOptionsParser for parsing 
the arguments. Applications should implement Tool for the same.
10/07/20 22:37:38 INFO input.FileInputFormat: Total input paths to process : 1
10/07/20 22:37:38 INFO mapred.JobClient: Running job: job_201007202234_0001
10/07/20 22:37:39 INFO mapred.JobClient:  map 0% reduce 0%
10/07/20 22:37:48 INFO mapred.JobClient: Task Id : 
attempt_201007202234_0001_m_000000_0, Status : FAILED
java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzopCodec not found.
        at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:96)
        at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:134)
        at com.hadoop.mapreduce.LzoSplitRecordReader.initialize(LzoSplitRecordReader.java:48)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:418)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:620)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: com.hadoop.compression.lzo.LzopCodec
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
        at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:89)
        ... 6 more
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 
There are installation instructions at this
link: http://github.com/kevinweil/hadoop-lzo; they say some further configuration is
needed:

Once the libs are built and installed, you may want to add them to the class 
paths and library paths. That is, in hadoop-env.sh, set

   (1)export HADOOP_CLASSPATH=/path/to/your/hadoop-lzo-lib.jar

Question: I have already copied hadoop-lzo-0.4.4.jar to $HADOOP_HOME/lib,
so do I still need to set this entry? Actually, after I added this:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HBASE_HOME/hbase-0.20.4.jar:
$HBASE_HOME/config:$ZOOKEEPER_HOME/zookeeper-3.3.1.jar:$HADOOP_HOME/lib
/hadoop-lzo-0.4.4.jar
and redid steps 1-4 above, I got the same problem as before, so how can I
get hadoop to load hadoop-lzo-0.4.4.jar?
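(A minimal version of that hadoop-env.sh entry, keeping only the lzo jar and dropping
the hbase/zookeeper pieces of my setup, would be something like:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HADOOP_HOME/lib/hadoop-lzo-0.4.4.jar

I haven't verified whether this minimal form alone behaves any differently.)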


    (2) export JAVA_LIBRARY_PATH=/path/to/hadoop-lzo-native-libs:/path/to/standard-hadoop-native-libs
    Note that there seems to be a bug in /path/to/hadoop/bin/hadoop; comment
out the line
    (3) JAVA_LIBRARY_PATH=''


Question: since the native library loaded successfully, are operations (2) and (3)
still needed?
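(If (2) and (3) do still matter, I guess the hadoop-env.sh entry for my layout would
look roughly like this; the path is just my assumption based on where I copied the
native files:

export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native/Linux-amd64-64

and the JAVA_LIBRARY_PATH='' line in $HADOOP_HOME/bin/hadoop would be commented out,
as the README says.)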


-----------------------------------------------
I am using hadoop 0.20.2
core-site.xml
-----------------------------------------------------------------------------
<configuration>
	<property>
		<name>fs.default.name</name>
		<value>hdfs://hadoop:8020</value>
	</property>
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/home/hadoop/tmp</value>
	</property>

	<property>
		<name>io.compression.codecs</name>
		<value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
	</property>
	<property>
		<name>io.compression.codec.lzo.class</name>
		<value>com.hadoop.compression.lzo.LzoCodec</value>
	</property>
</configuration>

-----------------------------------------------------------------------------
mapred-site.xml
-----------------------------------------------------------------------------
<configuration>
	<property>
		<name>mapred.job.tracker</name>
		<value>AlexLuya:9001</value>
	</property>
	<property>
		<name>mapred.tasktracker.reduce.tasks.maximum</name>
		<value>1</value>
	</property>
	<property>
		<name>mapred.tasktracker.map.tasks.maximum</name>
		<value>1</value>
	</property>
	<property>
		<name>mapred.local.dir</name>
		<value>/home/alex/hadoop/mapred/local</value>
	</property>
	<property>
		<name>mapred.system.dir</name>
		<value>/tmp/hadoop/mapred/system</value>
	</property>
	<property>
		<name>mapreduce.map.output.compress</name>
		<value>true</value>
	</property>
	<property>
		<name>mapreduce.map.output.compress.codec</name>
		<value>com.hadoop.compression.lzo.LzoCodec</value>
	</property>
</configuration>

-----------------------------------------------------------------------------

Re: Lzo question

Posted by Alex Luya <al...@gmail.com>.
Thanks:
           I followed your suggestion, but the result is the same as before. I think the
native libraries are OK; the problem is that the lzo jar (the codec classes) can't be
loaded.
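
To rule things out, I plan to check on each node that the codec class really is inside
the jar and that the jar is where I think it is, roughly like this (just a sanity check,
not a fix):

$ jar tf $HADOOP_HOME/lib/hadoop-lzo-0.4.4.jar | grep LzopCodec
$ ls -l $HADOOP_HOME/lib/hadoop-lzo-0.4.4.jar

and, if I read the README correctly, there is also a non-distributed indexer that runs
without map-reduce, which should show whether the client-side classpath is fine and the
failure is only inside the map tasks:

$ hadoop jar $HADOOP_HOME/lib/hadoop-lzo-0.4.4.jar com.hadoop.compression.lzo.LzoIndexer test.lzo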


Re: Lzo question

Posted by Jamie Cockrill <ja...@gmail.com>.
Alex,

This isn't exactly what you're doing, but I recently got LZO working
with HBase, another project in the hadoop family, using these
instructions:

http://wiki.apache.org/hadoop/UsingLzoCompression

The part where I think you might be having an issue is your second
step: "2, Copy all files under directory /Linux-amd64-64 to
directory..."

The instructions above state the following:

----------
Copy the results into the hbase lib directory:

$ cp build/hadoop-gpl-compression-0.1.0-dev.jar hbase/lib/
$ cp build/native/Linux-amd64-64/lib/libgplcompression.*
hbase/lib/native/Linux-amd64-64/

Note there is an extra 'lib' level in the build, which is not present
in the hbase/lib/native/ tree.
----------

The important bit is the extra lib directory in the build, when
copying into the hadoop/lib/native structure. I don't know if that'll
necessarily solve your problem. Also, I have noticed in the past that
certain jar files don't get picked up straight away and that you have
to restart job-trackers and task-trackers for them to be used in
map-reduce jobs.
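
For 0.20 that restart is roughly the following on the jobtracker node, assuming the
standard bin/ scripts shipped with Hadoop:

$ $HADOOP_HOME/bin/stop-mapred.sh
$ $HADOOP_HOME/bin/start-mapred.sh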

Good luck!

Thanks,

Jamie


