Posted to dev@hive.apache.org by "Mithun Radhakrishnan (JIRA)" <ji...@apache.org> on 2015/02/02 20:06:35 UTC

[jira] [Commented] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.

    [ https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301693#comment-14301693 ] 

Mithun Radhakrishnan commented on HIVE-9471:
--------------------------------------------

Hey, [~prasanth_j]. Does this patch look alright now?

> Bad seek in uncompressed ORC, at row-group boundary.
> ----------------------------------------------------
>
>                 Key: HIVE-9471
>                 URL: https://issues.apache.org/jira/browse/HIVE-9471
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats, Serializers/Deserializers
>    Affects Versions: 0.14.0
>            Reporter: Mithun Radhakrishnan
>            Assignee: Mithun Radhakrishnan
>         Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive
>
>
> Under at least one specific condition, using index-filters in ORC causes a bad seek into the ORC row-group.
> {code:title=stacktrace}
> java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data
> 	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
> 	at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
> 	at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
> 	at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
> 	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
> ...
> Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data
> 	at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112)
> 	at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96)
> 	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310)
> 	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596)
> 	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337)
> 	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852)
> {code}
> I'll attach scripts to reproduce the problem.


