Posted to user@flume.apache.org by Eran Kutner <er...@gigya.com> on 2011/08/02 11:16:09 UTC

Can't get LZO codec to work

Hi,
I'm trying to enable the LZO codec in Flume.
I'm using Flume 0.9.4 from CDH3.
I placed the hadoop-lzo.jar file in Flume's lib directory.
I added the native directory to the Java library path in flume-env.sh.
I verified, using "flume classpath", that the codec jar file is indeed
included in the Flume classpath.

This command is working fine:
java -classpath "/usr/lib/flume/lib/*" \
  -Djava.library.path=/usr/lib/hadoop/lib/native/Linux-amd64-64 \
  com.hadoop.compression.lzo.LzoIndexer /tmp/test.lzo


However, when I try to enable LzopCodec I get this error:
11/08/02 04:45:35 INFO connector.DirectDriver: Connector logicalNode collector1-44 exited with error: Unsupported compression codec LzopCodec.  Please choose from: [None, DefaultCodec, GzipCodec, BZip2Codec, DeflateCodec, SnappyCodec]
java.lang.IllegalArgumentException: Unsupported compression codec LzopCodec.  Please choose from: [None, DefaultCodec, GzipCodec, BZip2Codec, DeflateCodec, SnappyCodec]
        at com.cloudera.flume.handlers.hdfs.CustomDfsSink.getCodec(CustomDfsSink.java:180)
        at com.cloudera.flume.handlers.hdfs.CustomDfsSink.open(CustomDfsSink.java:109)
        at com.cloudera.flume.handlers.hdfs.EscapedCustomDfsSink.openWriter(EscapedCustomDfsSink.java:104)
        at com.cloudera.flume.handlers.hdfs.EscapedCustomDfsSink.append(EscapedCustomDfsSink.java:119)
        at com.cloudera.flume.core.CompositeSink.append(CompositeSink.java:61)
        at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
        at com.cloudera.flume.handlers.rolling.RollSink.synchronousAppend(RollSink.java:234)
        at com.cloudera.flume.handlers.rolling.RollSink$1.call(RollSink.java:183)
        at com.cloudera.flume.handlers.rolling.RollSink$1.call(RollSink.java:181)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)


Note that LzoCodec isn't even listed among the available codecs. Any
idea why?

-eran

Re: Can't get LZO codec to work

Posted by Eran Kutner <er...@gigya.com>.
Thanks Harsh, that did the trick. I didn't realize Flume was reading the
Hadoop configuration, and it's not mentioned anywhere.

However, now I have a different problem. I'm trying to write compressed
sequence files to be read by Hadoop, but the file produced by Flume is
not valid.
After looking into it, it seems that Flume is compressing the entire file,
not just the data. This means the file doesn't have a valid sequence file
header, so Hadoop can't read it.
It's not an LZO problem: I tried it with the gzip codec and got the same
result. After I ran gunzip on the file, it became a valid sequence file with
the "SEQ" header, etc.
Am I doing something wrong?
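
For anyone wanting to reproduce the check, here is a minimal sketch that looks at the first bytes of a file and reports whether it starts with the "SEQ" sequence file header or with the gzip magic bytes (0x1f 0x8b). The class and method names are hypothetical and not part of Flume or Hadoop; it only inspects the leading bytes and does not validate the rest of the file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Flume or Hadoop.
final class FileHeaderCheck {

    /** Classifies a file by its leading bytes only. */
    static String classify(byte[] head) {
        // Hadoop sequence files begin with the ASCII bytes 'S', 'E', 'Q'.
        if (head.length >= 3 && head[0] == 'S' && head[1] == 'E' && head[2] == 'Q') {
            return "sequence file";
        }
        // gzip streams begin with the magic bytes 0x1f 0x8b.
        if (head.length >= 2 && (head[0] & 0xff) == 0x1f && (head[1] & 0xff) == 0x8b) {
            return "gzip stream";
        }
        return "unknown";
    }

    /** Reads up to n bytes from the start of the stream. */
    static byte[] readHead(InputStream in, int n) throws IOException {
        byte[] buf = new byte[n];
        int off = 0;
        while (off < n) {
            int r = in.read(buf, off, n - off);
            if (r < 0) {
                break; // end of stream before n bytes were available
            }
            off += r;
        }
        return Arrays.copyOf(buf, off);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: FileHeaderCheck <local file>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(args[0] + ": " + classify(readHead(in, 3)));
        }
    }
}
```

A file that reports "gzip stream" before gunzip and "sequence file" after it matches the behaviour described above, where the whole file is compressed rather than the data inside it.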

-eran



On Tue, Aug 2, 2011 at 16:12, Harsh J <ha...@cloudera.com> wrote:

> Does your HADOOP_CONF_DIR/core-site.xml have its io.compression.codecs
> property configured to include LZO? Can you ensure your Flume picks up
> the same configuration? You could also try to make it <final> to
> enforce it and prevent overriding.
>
> --
> Harsh J
>

Re: Can't get LZO codec to work

Posted by Harsh J <ha...@cloudera.com>.
Does your HADOOP_CONF_DIR/core-site.xml have its io.compression.codecs
property configured to include LZO? Can you ensure your Flume picks up
the same configuration? You could also try to make it <final> to
enforce it and prevent overriding.
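
One way to check the first question is to read the core-site.xml that Flume is expected to pick up and see whether io.compression.codecs mentions a class from the com.hadoop.compression.lzo package. The sketch below uses only the JDK's XML parser; the class and method names are hypothetical, and it assumes the usual Hadoop layout of <property> elements with <name> and <value> children. It does not tell you which configuration file Flume actually loads.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of Flume or Hadoop.
final class CodecConfigCheck {

    /** Returns the value of the named property in a Hadoop-style XML config, or null if absent. */
    static String propertyValue(InputStream xml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            NodeList names = p.getElementsByTagName("name");
            NodeList values = p.getElementsByTagName("value");
            if (names.getLength() > 0 && values.getLength() > 0
                    && name.equals(names.item(0).getTextContent().trim())) {
                return values.item(0).getTextContent().trim();
            }
        }
        return null;
    }

    /** True if the comma-separated codec list names a class in the hadoop-lzo package. */
    static boolean listsLzo(String codecs) {
        if (codecs == null) {
            return false;
        }
        for (String c : codecs.split(",")) {
            if (c.trim().startsWith("com.hadoop.compression.lzo.")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: CodecConfigCheck <path to core-site.xml>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            String codecs = propertyValue(in, "io.compression.codecs");
            System.out.println("io.compression.codecs = " + codecs);
            System.out.println("LZO listed: " + listsLzo(codecs));
        }
    }
}
```

If the property is missing or has no LZO entry in the file under HADOOP_CONF_DIR, that would be consistent with the codec not appearing in the error message's list.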




-- 
Harsh J