Posted to dev@hive.apache.org by "Sergey Zadoroshnyak (JIRA)" <ji...@apache.org> on 2016/08/09 14:23:20 UTC

[jira] [Created] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays

Sergey Zadoroshnyak created HIVE-14483:
------------------------------------------

             Summary:  java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
                 Key: HIVE-14483
                 URL: https://issues.apache.org/jira/browse/HIVE-14483
             Project: Hive
          Issue Type: Bug
          Components: ORC
    Affects Versions: 2.1.0
            Reporter: Sergey Zadoroshnyak
            Assignee: Owen O'Malley
            Priority: Critical
             Fix For: 2.2.0


Error message:

Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
at org.apache.orc.impl.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:369)
at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1231)
at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1268)
at org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1368)
at org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1212)
at org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902)
at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1737)
at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:89)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:230)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:205)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
... 22 more


How to reproduce?
Configure a StringTreeReader that wraps a StringDirectTreeReader as its TreeReader (DIRECT or DIRECT_V2 column encoding).

Set batchSize = 1026;

Invoke the method nextVector(ColumnVector previousVector, boolean[] isNull, final int batchSize).

scratchlcv is a LongColumnVector whose long[] vector has length 1024.

nextVector executes BytesColumnVectorUtil.readOrcByteArrays(stream, lengths, scratchlcv, result, batchSize);

As a result, the method commonReadByteArrays(stream, lengths, scratchlcv,
            result, (int) batchSize) throws ArrayIndexOutOfBoundsException, because batchSize is larger than the length of scratchlcv.vector.
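
The failure mode can be illustrated without ORC at all. The sketch below is only a stand-in: ScratchOverflowDemo, nextVector and overflows are hypothetical names, not the real org.apache.orc classes, and the loop merely assumes that the lengths reader writes batchSize values into the array it is handed.

```java
// Hypothetical stand-in for the reader path; NOT the real ORC code.
class ScratchOverflowDemo {
    // Length of the scratch array, as stated in the report.
    static final int SCRATCH_LENGTH = 1024;

    // Mimics an integer reader that writes batchSize values
    // into a caller-supplied array without checking its length.
    static void nextVector(long[] data, int batchSize) {
        for (int i = 0; i < batchSize; i++) {
            data[i] = i; // fails once i reaches data.length
        }
    }

    // Returns true if reading batchSize values overflows the scratch array.
    static boolean overflows(int batchSize) {
        long[] scratch = new long[SCRATCH_LENGTH];
        try {
            nextVector(scratch, batchSize);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("batchSize 1024 overflows: " + overflows(1024));
        System.out.println("batchSize 1026 overflows: " + overflows(1026));
    }
}
```

In this sketch a batch of 1024 fits and a batch of 1026 does not, which matches the index 1024 reported in the exception message.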


If we use StringDictionaryTreeReader, there is no exception, because it calls scratchlcv.ensureSize((int) batchSize, false) before reader.nextVector(scratchlcv, scratchlcv.vector, batchSize);

These changes were made for Hive 2.1.0 by Owen O'Malley in commit https://github.com/apache/hive/commit/0ac424f0a17b341efe299da167791112e4a953e9#diff-a1cec556fb2db4b69a1a4127a6908177R1467 for task https://issues.apache.org/jira/browse/HIVE-12159

How to fix?
Add only one line:

scratchlcv.ensureSize((int) batchSize, false);

in the method org.apache.orc.impl.TreeReaderFactory#BytesColumnVectorUtil#commonReadByteArrays(InStream stream, IntegerReader lengths,
        LongColumnVector scratchlcv,
        BytesColumnVector result, final int batchSize), before the invocation lengths.nextVector(scratchlcv, scratchlcv.vector, batchSize);
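
A minimal sketch of the proposed ordering, under stated assumptions: ScratchVector is a hypothetical stand-in for LongColumnVector, its ensureSize only approximates the behaviour the fix relies on (grow the backing array when the requested size exceeds it), and readLengths is an invented helper, not a method in TreeReaderFactory.

```java
// Hypothetical stand-ins; NOT the real ORC or Hive classes.
class EnsureSizeSketch {
    // Stand-in for LongColumnVector with a 1024-element backing array.
    static class ScratchVector {
        long[] vector = new long[1024];

        // Grows the backing array if it is smaller than size.
        // With preserveData == false the old contents are discarded.
        void ensureSize(int size, boolean preserveData) {
            if (size > vector.length) {
                long[] old = vector;
                vector = new long[size];
                if (preserveData) {
                    System.arraycopy(old, 0, vector, 0, old.length);
                }
            }
        }
    }

    // Stand-in for the lengths reader: writes batchSize values into data.
    static void nextVector(long[] data, int batchSize) {
        for (int i = 0; i < batchSize; i++) {
            data[i] = i;
        }
    }

    // Resize first, then read; returns the resulting array length.
    static int readLengths(ScratchVector scratch, int batchSize) {
        scratch.ensureSize(batchSize, false);
        // scratch.vector is evaluated after the resize, so the new array is used.
        nextVector(scratch.vector, batchSize);
        return scratch.vector.length;
    }

    public static void main(String[] args) {
        System.out.println(readLengths(new ScratchVector(), 1026));
    }
}
```

The point of the sketch is the ordering: the resize has to happen before scratch.vector is passed on, otherwise the reader still receives the old 1024-element array.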

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)