Posted to common-issues@hadoop.apache.org by "Harsh J (JIRA)" <ji...@apache.org> on 2012/07/10 21:38:35 UTC
[jira] [Commented] (HADOOP-8423) MapFile.Reader.get() crashes jvm or throws EOFException on Snappy or LZO block-compressed data

    [ https://issues.apache.org/jira/browse/HADOOP-8423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410741#comment-13410741 ] 

Harsh J commented on HADOOP-8423:
---------------------------------

bq. The issue was that BlockDecompressorStream wasn't resetting its own state when resetState() was called. So, when reseeking in the SequenceFile.Reader, it would get "out of sync" - and be at the beginning of a block but think it was in the middle of a block. So, the codec got invalid data fed to it.

+1. I applied the patch minus the fix (i.e. just the test) and the test fails; with the fix applied, it passes.

Committing shortly.
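
To illustrate the "out of sync" failure described in the quoted explanation above, here is a minimal toy sketch. It is not Hadoop's BlockDecompressorStream and not the committed patch: the class ToyBlockReader, the [4-byte length][payload] layout, and the resetClearsBlockState flag are all hypothetical names invented for this example. It assumes non-empty blocks and only models the bookkeeping problem: a reader that remembers "bytes left in the current block" across a reseek will treat the next block's length header as payload.

```java
import java.nio.ByteBuffer;

public class BlockResetSketch {
    /** Toy reader over blocks laid out as [4-byte length][payload]. Hypothetical; not Hadoop's class. */
    static class ToyBlockReader {
        private final ByteBuffer in;
        private final boolean resetClearsBlockState; // true models the fixed behaviour
        private int remainingInBlock = 0;            // payload bytes left in the current block

        ToyBlockReader(byte[] data, boolean resetClearsBlockState) {
            this.in = ByteBuffer.wrap(data);
            this.resetClearsBlockState = resetClearsBlockState;
        }

        /** Returns the next payload byte, or -1 when no further block header is available. */
        int read() {
            if (remainingInBlock == 0) {
                if (in.remaining() < 4) {
                    return -1;
                }
                remainingInBlock = in.getInt(); // read the block's length header
            }
            remainingInBlock--;
            return in.get() & 0xFF;
        }

        /** Repositions the underlying data, as a reseek in the enclosing reader would. */
        void seek(int pos) {
            in.position(pos);
        }

        /** In the buggy variant this leaves remainingInBlock untouched. */
        void resetState() {
            if (resetClearsBlockState) {
                remainingInBlock = 0;
            }
        }
    }

    /** Two blocks: payload {10, 11, 12} at offset 0 and payload {20, 21} at offset 7. */
    static byte[] twoBlocks() {
        ByteBuffer b = ByteBuffer.allocate(4 + 3 + 4 + 2);
        b.putInt(3).put(new byte[] {10, 11, 12});
        b.putInt(2).put(new byte[] {20, 21});
        return b.array();
    }

    /** Reads one byte of block 0, reseeks to the start of block 1, resets, then reads again. */
    static int firstByteAfterReseek(boolean fixed) {
        ToyBlockReader r = new ToyBlockReader(twoBlocks(), fixed);
        r.read();       // consumes 10; the reader still expects 2 more bytes of block 0
        r.seek(7);      // offset of block 1's length header
        r.resetState();
        return r.read();
    }

    public static void main(String[] args) {
        System.out.println("without reset: " + firstByteAfterReseek(false));
        System.out.println("with reset:    " + firstByteAfterReseek(true));
    }
}
```

In this toy, the variant that does not clear its state returns 0 (the first byte of the length header, misread as payload), while the variant that clears it returns 20, the first real payload byte of the second block. With a real codec, that misread header data is what gets handed to the decompressor.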
                
> MapFile.Reader.get() crashes jvm or throws EOFException on Snappy or LZO block-compressed data
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8423
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8423
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.2
>         Environment: Linux 2.6.32.23-0.3-default #1 SMP 2010-10-07 14:57:45 +0200 x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Jason B
>            Assignee: Todd Lipcon
>         Attachments: MapFileCodecTest.java, hadoop-8423.txt
>
>
> I am using Cloudera distribution cdh3u1.
> When trying out native codecs such as Snappy or LZO for better
> decompression performance, I ran into issues with random access
> using the MapFile.Reader.get(key, value) method.
> The first call to MapFile.Reader.get() works, but the second call fails.
> I also get different exceptions depending on the number of entries
> in the map file.
> With LzoCodec and a 10-record file, the JVM aborts.
> DefaultCodec, on the other hand, works fine in all cases, as does
> record compression with the native codecs.
> I created a simple test program (attached) that creates map files
> locally with sizes of 10 and 100 records for three codecs: Default,
> Snappy, and LZO.
> (The test requires the corresponding native libraries to be available.)
> A summary of the problems is given below:
> Map Size: 100
> Compression: RECORD
> ==================
> DefaultCodec:  OK
> SnappyCodec: OK
> LzoCodec: OK
> Map Size: 10
> Compression: RECORD
> ==================
> DefaultCodec:  OK
> SnappyCodec: OK
> LzoCodec: OK
> Map Size: 100
> Compression: BLOCK
> ================
> DefaultCodec:  OK
> SnappyCodec: java.io.EOFException  at
> org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:114)
> LzoCodec: java.io.EOFException at
> org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:114)
> Map Size: 10
> Compression: BLOCK
> ==================
> DefaultCodec:  OK
> SnappyCodec: java.lang.NoClassDefFoundError: Ljava/lang/InternalError
> at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressBytesDirect(Native Method)
> LzoCodec:
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00002b068ffcbc00, pid=6385, tid=47304763508496
> #
> # JRE version: 6.0_21-b07
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (17.0-b17 mixed mode linux-amd64 )
> # Problematic frame:
> # C  [liblzo2.so.2+0x13c00]  lzo1x_decompress+0x1a0
> #
