You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "zhihai xu (JIRA)" <ji...@apache.org> on 2016/08/18 00:12:20 UTC
[jira] [Created] (HIVE-14564) Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

zhihai xu created HIVE-14564:
--------------------------------

             Summary: Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.
                 Key: HIVE-14564
                 URL: https://issues.apache.org/jira/browse/HIVE-14564
             Project: Hive
          Issue Type: Bug
          Components: Query Planning
    Affects Versions: 2.1.0
            Reporter: zhihai xu
            Assignee: zhihai xu
            Priority: Critical


Column Pruning generates out of order columns in SelectOperator which cause ArrayIndexOutOfBoundsException.

{code}
2016-07-26 21:49:24,390 FATAL [main] org.apache.hadoop.hive.ql.exec.mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"_col0":null,"_col1":0,"_col2":36,"_col3":"499ec44-6dd2-4709-a019-33d6d484ed90�\u0001U5�\u001c��\t\u001b�\u0000","_col4":"5264db53-d650-4678-9261-cdd51efab8bb","_col5":"cb5233dd-214a-4b0b-b43e-0f41befb5c5c","_col6":"","_col8":48,"_col9":null,"_col10":"1befb5c5c�\u00192016-06-09T15:31:15+00:00\u0002\u0005Rider\u0011svc-dash","_col11":64,"_col12":null,"_col13":null,"_col14":"ber.com�\u0001U5ߨP�\u0001U5ᷨider) - 1000\u0005Rider\u0011svc-dash@uber.com�\u0001U4�;x�\u0001U5\u0004��\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000","_col15":"","_col16":null}
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException
	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
	... 9 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
	at java.lang.System.arraycopy(Native Method)
	at org.apache.hadoop.io.Text.set(Text.java:225)
	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
	at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
	at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
	at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.makeValueWritable(ReduceSinkOperator.java:550)
	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:377)
	... 13 more
{code}

The exception is because the serialization and deserialization doesn't match.
The serialization by LazyBinarySerDe from previous MapReduce job used different order of columns. When the current MapReduce job deserialized the intermediate sequence file generated by previous MapReduce job, it will get corrupted data from the deserialization using wrong order of columns by LazyBinaryStruct. The unmatched columns between  serialization and deserialization caused by SelectOperator's Column Pruning {{ColumnPrunerSelectProc}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)