Posted to issues@hive.apache.org by "László Bodor (Jira)" <ji...@apache.org> on 2019/11/27 09:30:00 UTC

[jira] [Updated] (HIVE-22551) BytesColumnVector initBuffer should clean vector and length consistently

     [ https://issues.apache.org/jira/browse/HIVE-22551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

László Bodor updated HIVE-22551:
--------------------------------
    Description: 
VectorExtractRow relies on vector[i] and length[i] being consistent within a BytesColumnVector; otherwise it throws an exception:
https://github.com/apache/hive/blob/edc53cc0d95e983c371a224943dd866210f0c65c/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExtractRow.java#L274

In one scenario, only vector[i] is cleaned when the column vector is reused, leaving length[i] stale, and the exception below can then be thrown. The issue was reproduced using [LlapDump|https://github.com/apache/hive/blob/master/llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapDump.java] with String columns (longer than 16 chars):
{code}
19/10/17 15:55:49 ERROR llap.LlapArrowRowRecordReader: Failed to fetch Arrow batch
java.lang.RuntimeException: STRING entry: batchIndex 45
at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.BytesReadError(VectorExtractRow.java:488)
at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:294)
at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:193)
at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:483)
at org.apache.hadoop.hive.ql.io.arrow.Deserializer.deserialize(Deserializer.java:125)
at org.apache.hadoop.hive.ql.io.arrow.ArrowColumnarBatchSerDe.deserialize(ArrowColumnarBatchSerDe.java:284)
at org.apache.hadoop.hive.llap.LlapArrowRowRecordReader.next(LlapArrowRowRecordReader.java:75)
at org.apache.hadoop.hive.llap.LlapArrowRowRecordReader.next(LlapArrowRowRecordReader.java:41)
at datareader.LlapDump.main(LlapDump.java:124)
{code}
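
To make the failure mode concrete, here is a minimal, self-contained sketch of the inconsistency described above. It does not use the real Hive classes: SimpleBytesVector, resetVectorOnly, resetConsistently and extract are hypothetical names, and the check in extract is only an assumed simplification of what VectorExtractRow does. The point is just that clearing the byte references while leaving stale lengths behind makes a reader see a null buffer with a positive length.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical, simplified stand-in for BytesColumnVector (not the Hive class). */
final class SimpleBytesVector {
    final byte[][] vector;
    final int[] length;

    SimpleBytesVector(int size) {
        vector = new byte[size][];
        length = new int[size];
    }

    void set(int i, byte[] value) {
        vector[i] = value;
        length[i] = value.length;
    }

    /** Inconsistent reset: drops the references but keeps the old lengths. */
    void resetVectorOnly() {
        Arrays.fill(vector, null);
    }

    /** Consistent reset: clears references and lengths together. */
    void resetConsistently() {
        Arrays.fill(vector, null);
        Arrays.fill(length, 0);
    }
}

final class InitBufferSketch {
    /** Assumed reader-side check: a null buffer with a positive length is an error. */
    static String extract(SimpleBytesVector v, int i) {
        byte[] bytes = v.vector[i];
        int len = v.length[i];
        if (bytes == null) {
            if (len > 0) {
                throw new RuntimeException("STRING entry: batchIndex " + i);
            }
            return "";
        }
        return new String(bytes, 0, len, StandardCharsets.UTF_8);
    }

    /** Fills one entry, resets the vector, then reports whether reading it fails. */
    static boolean failsAfterReset(boolean consistentReset) {
        SimpleBytesVector v = new SimpleBytesVector(64);
        v.set(45, "a string over 16 chars".getBytes(StandardCharsets.UTF_8));
        if (consistentReset) {
            v.resetConsistently();
        } else {
            v.resetVectorOnly();
        }
        try {
            extract(v, 45);
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("vector-only reset fails: " + failsAfterReset(false));
        System.out.println("consistent reset fails:  " + failsAfterReset(true));
    }
}
```

In this sketch, the reader-side check stays as it is and the reset is what changes, which mirrors the direction the issue title suggests for initBuffer.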

> BytesColumnVector initBuffer should clean vector and length consistently 
> -------------------------------------------------------------------------
>
>                 Key: HIVE-22551
>                 URL: https://issues.apache.org/jira/browse/HIVE-22551
>             Project: Hive
>          Issue Type: Bug
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major


