Posted to user@flume.apache.org by Eran Kutner <er...@gigya.com> on 2012/08/03 16:33:25 UTC

Can't use snappy codec

Hi,
I'm trying to use the Snappy codec but keep getting "native snappy library
not available" errors.
I'm using CDH4, but I replaced the Flume 1.1 JARs included with that
distribution with the Flume 1.2 JARs.
I've tried everything I can think of, including symlinking the Hadoop native
library under the flume-ng/lib/ directory, but nothing helps.
Any idea how to resolve this?

This is the error:
2012-08-03 10:23:30,598 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2012-08-03 10:23:35,670 WARN hdfs.HDFSEventSink: HDFS IO error
java.io.IOException: java.lang.RuntimeException: native snappy library not available
        at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:202)
        at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:48)
        at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:155)
        at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:152)
        at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:125)
        at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:152)
        at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:307)
        at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:717)
        at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:714)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: native snappy library not available
        at org.apache.hadoop.io.compress.SnappyCodec.createCompressor(SnappyCodec.java:135)
        at org.apache.hadoop.io.compress.SnappyCodec.createOutputStream(SnappyCodec.java:84)
        at org.apache.flume.sink.hdfs.HDFSCompressedDataStream.open(HDFSCompressedDataStream.java:70)
        at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:195)
        ... 13 more

And my sink configuration:
flume05.sinks.hdfsSink.type = hdfs
#flume05.sinks.hdfsSink.type = logger
flume05.sinks.hdfsSink.channel = memoryChannel
flume05.sinks.hdfsSink.hdfs.path=hdfs://hadoop2-m1:8020/test-events/%Y-%m-%d
flume05.sinks.hdfsSink.hdfs.filePrefix=raw-events.avro
flume05.sinks.hdfsSink.hdfs.rollInterval=60
flume05.sinks.hdfsSink.hdfs.rollCount=0
flume05.sinks.hdfsSink.hdfs.rollSize=0
flume05.sinks.hdfsSink.hdfs.fileType=CompressedStream
flume05.sinks.hdfsSink.hdfs.codeC=snappy
flume05.sinks.hdfsSink.hdfs.writeFormat=Text
flume05.sinks.hdfsSink.hdfs.batchSize=1000
flume05.sinks.hdfsSink.serializer = avro_event
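
For reference, the native libraries themselves are present on the machine,
under the standard CDH4 location, so presumably the problem is just that
the Flume JVM can't find them:

ls /usr/lib/hadoop/lib/native/
# libhadoop.so (with Snappy support) lives here; the NativeCodeLoader
# warning above suggests the Flume JVM's java.library.path doesn't
# include this directory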

Thanks.

-eran

Re: Can't use snappy codec

Posted by Patrick Wendell <pw...@gmail.com>.
Oh, I see - so the current version "just works" for you. I
misinterpreted your last email, thinking you had done something
custom to fix it.

The script must have been fixed in an earlier JIRA. Thanks for your
thorough explanation.

- Patrick

Re: Can't use snappy codec

Posted by Eran Kutner <er...@gigya.com>.
Hi Patrick,

With the new script it's working OK.
The old script (the one that comes with CDH4) had a problem.

This code in the old script:
    local HADOOP_JAVA_LIBRARY_PATH=$(HADOOP_CLASSPATH="$FLUME_CLASSPATH" \
        ${HADOOP_IN_PATH} org.apache.flume.tools.GetJavaProperty \
        java.library.path 2>/dev/null)

would set HADOOP_JAVA_LIBRARY_PATH to
"java.library.path=//usr/lib/hadoop/lib/native",
which would end up setting FLUME_JAVA_LIBRARY_PATH to
":java.library.path=//usr/lib/hadoop/lib/native".
That would then be used to start the process with
"-Djava.library.path=:java.library.path=//usr/lib/hadoop/lib/native", which
is obviously wrong.

The new script has code to strip the extra "java.library.path=" prefix
returned by the above command.
So there are two possibilities: either the implementation of
org.apache.flume.tools.GetJavaProperty changed, so it now returns the extra
parameter name and the script had to be updated to remove it, or the
original script included in CDH4 had a bug that was fixed in 1.2.
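
In shell terms, the cleanup amounts to something like this (my paraphrase
of the effect, not the actual code from the 1.2 script):

# strip the "java.library.path=" prefix that GetJavaProperty prints,
# leaving just the bare path:
HADOOP_JAVA_LIBRARY_PATH=${HADOOP_JAVA_LIBRARY_PATH#java.library.path=}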

-eran

Re: Can't use snappy codec

Posted by Patrick Wendell <pw...@gmail.com>.
Hey Eran,

So the flume-ng script works by trying to figure out what library path
Hadoop is using and then replicating it for Flume. If
HADOOP_JAVA_LIBRARY_PATH is set, it will use that. Otherwise it
tries to infer the path from what the hadoop script itself
determines.
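
Roughly, the logic is the following (a simplified sketch, not the literal
script source):

# simplified sketch of the flume-ng script's library-path selection
if [ -n "${HADOOP_JAVA_LIBRARY_PATH}" ]; then
  FLUME_JAVA_LIBRARY_PATH="${HADOOP_JAVA_LIBRARY_PATH}"
else
  # ask the hadoop launcher which java.library.path it would use
  FLUME_JAVA_LIBRARY_PATH=$(${HADOOP_IN_PATH} org.apache.flume.tools.GetJavaProperty \
      java.library.path 2>/dev/null)
fi
# the result is then passed to the Flume JVM via -Djava.library.path=...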

What is the path getting set to in your case, and how does that differ
from expectations? Just trying to figure out what the bug is.

- Patrick

Re: Can't use snappy codec

Posted by Eran Kutner <er...@gigya.com>.
Thanks Patrick, that helped me figure out the problem. It looks like a
bug in the "flume-ng" script provided with CDH4: it was mangling the
java.library.path.
I copied the script included in the Flume 1.2.0 distribution and it now
works OK.

Thanks for your help.

-eran

Re: Can't use snappy codec

Posted by Patrick Wendell <pw...@gmail.com>.
Hey Eran,

You need to make sure the Flume JVM gets passed
-Djava.library.path=XXX with the correct path to the directory where
your native Snappy libraries are located.

You can set this by adding the option directly to the flume-ng runner script.
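
For example (the path below is just illustrative; use wherever your native
libraries actually live):

# e.g. in the flume-ng script itself, or exported from conf/flume-env.sh:
JAVA_OPTS="$JAVA_OPTS -Djava.library.path=/usr/lib/hadoop/lib/native"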

- Patrick