Posted to common-user@hadoop.apache.org by Yingnan Ma <ma...@gmail.com> on 2012/05/23 12:29:57 UTC

Hadoop LZO compression

 Hi,

I ran into a problem after installing LZO. Pig scripts and streaming scripts still run, and when I check those jobs through the JobTracker, it shows:

mapred.compress.map.output = true
io.compression.codecs =
org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.DefaultCodec,
com.hadoop.compression.lzo.LzoCodec,
com.hadoop.compression.lzo.LzopCodec

io.compression.codec.lzo.class = com.hadoop.compression.lzo.LzoCodec

So both Pig and streaming jobs do run.
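For reference, the settings shown above would typically be declared in the cluster's core-site.xml / mapred-site.xml along these lines (a sketch only; which file each property belongs in can vary by distribution):

```xml
<!-- Sketch of the configuration entries reported by the JobTracker.
     Adjust file placement (core-site.xml vs. mapred-site.xml) for your install. -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
```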

However, the jobs log error notices such as:


2012-05-23 17:32:57,052 [Thread-6] ERROR
com.hadoop.compression.lzo.GPLNativeCodeLoader - Could not load native
gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1738)
        at java.lang.Runtime.loadLibrary0(Runtime.java:823)
        at java.lang.System.loadLibrary(System.java:1028)
        at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
        at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
        at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:89)
        at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:134)
        at org.apache.hadoop.mapreduce.lib.input.TextInputFormat.isSplitable(TextInputFormat.java:46)
        at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:254)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:268)
        at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:944)
        at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:961)
        at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
        at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
        at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
        at java.lang.Thread.run(Thread.java:662)
2012-05-23 17:32:57,053 [Thread-6] ERROR
com.hadoop.compression.lzo.LzoCodec - Cannot load native-lzo without
native-hadoop
2012-05-23 17:32:57,070 [Thread-6] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total
input paths (combined) to process : 1
2012-05-23 17:32:58,419 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- HadoopJobId: job_201204051731_1249
2012-05-23 17:32:58,419 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- More information at:
http://hdjt:50030/jobdetails.jsp?jobid=job_201204051731_1249
2012-05-23 17:33:14,526 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 50% complete
2012-05-23 17:33:18,134 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2012-05-23 17:33:18,136 [main] INFO
org.apache.pig.tools.pigstats.PigStats - Script Statistics:

HadoopVersion   PigVersion      UserId  StartedAt               FinishedAt              Features
0.20.2-cdh3u0   0.8.0-cdh3u0    root    2012-05-23 17:32:54     2012-05-23 17:33:18     FILTER

Success!

Job Stats (time in seconds):
JobId   Maps    Reduces MaxMapTime      MinMapTime      AvgMapTime      MaxReduceTime   MinReduceTime   AvgReduceTime   Alias   Feature MAP_ONLY        Outputs
job_201204051731_1249   1       0       10      10      10      0       0       A,B     MAP_ONLY        hdfs://hdmaster:54310/tmp/temp-1842686846/tmp-2027515206,

This confuses me: if there were a real problem, I would expect the job to fail, yet it completes successfully. Can anyone explain what is going on? Thank you for your help!

Best Regards

Malone


2012-05-23

Re: Hadoop LZO compression

Posted by Harsh J <ha...@cloudera.com>.
Hey Malone,

We've already received your earlier mail. I've answered your question
a short while ago over that thread:
http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201205.mbox/%3cCAOcnVr1SsAf8HCbVBESeU_a-w8AUQySwDdrN+ovDu0sYpUv0uw@mail.gmail.com%3e
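For anyone else hitting the same UnsatisfiedLinkError: it generally means the JVM cannot find the native libgplcompression library on java.library.path. A quick way to check whether the library is where Hadoop will look (the directory layout below is an assumption typical of CDH3-era Linux installs; adjust HADOOP_HOME and the platform subdirectory for your cluster):

```shell
# check_native_dir DIR -> prints "found" if libgplcompression.* exists in DIR,
# otherwise prints "missing".
check_native_dir() {
  if ls "$1"/libgplcompression.* >/dev/null 2>&1; then
    echo "found"
  else
    echo "missing"
  fi
}

# Typical native-library location on a CDH3-era install (illustrative path):
check_native_dir "${HADOOP_HOME:-/usr/lib/hadoop}/lib/native/Linux-amd64-64"
```

If the result is "missing", copying the built libgplcompression.* files into that directory on every node (or otherwise putting their directory on java.library.path for task JVMs) is the usual remedy; note that without the native bindings the jobs can still succeed, because Hadoop falls back to the non-LZO codecs, which matches the "errors but still works" behaviour described above.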

On Wed, May 23, 2012 at 3:59 PM, Yingnan Ma <ma...@gmail.com> wrote:
> [original message quoted in full; snipped]



-- 
Harsh J