Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/02/14 09:00:07 UTC

[GitHub] [iceberg] cbwn123 opened a new issue #4113: Hive query on Iceberg partitioned table throws ArrayIndexOutOfBoundsException

cbwn123 opened a new issue #4113:
URL: https://github.com/apache/iceberg/issues/4113


   Iceberg 0.13
   Hive 3.1.2
   Hive on Spark
   Hive SQL: select count(1) from table where event_day = '20220213'
   event_day is the partition column.
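   
   For reference, a minimal setup matching this report might look like the following. This is only a sketch: the table name events and the non-partition columns are assumptions, not taken from the report, and the DDL follows the identity-partitioning form described in the Iceberg Hive docs.
   
   	-- Hypothetical Iceberg table with an identity partition on a string column.
   	-- With the Iceberg storage handler, Hive translates PARTITIONED BY into an
   	-- identity partition spec, and event_day becomes a regular schema column.
   	CREATE TABLE events (id BIGINT, payload STRING)
   	PARTITIONED BY (event_day STRING)
   	STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler';
   
   	-- The failing query from the report: a count with a partition-column filter.
   	SELECT COUNT(1) FROM events WHERE event_day = '20220213';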
   
   Driver stacktrace:
   	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:16
   	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1649)
   	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1648)
   	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
   	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
   	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1648)
   	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
   	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
   	at scala.Option.foreach(Option.scala:257)
   	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
   	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1882)
   	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1831)
   	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1820)
   	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
   Caused by: java.lang.RuntimeException: Error processing row: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row [Error getting row data with exception java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Long
   	at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:40)
   	at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:234)
   	at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:373)
   	at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:203)
   	at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:189)
   	at org.apache.hadoop.hive.ql.exec.MapOperator.toErrorMessage(MapOperator.java:596)
   	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:562)
   	at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:136)
   	at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:48)
   	at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27)
   	at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85)
   	at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
   	at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
   	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
   	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
   	at org.apache.spark.scheduler.Task.run(Task.scala:109)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:748)
    ]
   	at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:149)
   	at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:48)
   	at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27)
   	at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85)
   	at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
   	at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
   	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
   	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
   	at org.apache.spark.scheduler.Task.run(Task.scala:109)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:748)
   Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row [Error getting row data with exception java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Long
   	at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaLongObjectInspector.get(JavaLongObjectInspector.java:40)
   	at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:234)
   	at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:373)
   	at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:203)
   	at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:189)
   	at org.apache.hadoop.hive.ql.exec.MapOperator.toErrorMessage(MapOperator.java:596)
   	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:562)
   	at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:136)
   	at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:48)
   	at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27)
   	at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85)
   	at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
   	at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
   	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
   	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
   	at org.apache.spark.scheduler.Task.run(Task.scala:109)
   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:748)
    ]
   	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:570)
   	at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:136)
   	... 12 more
   Caused by: java.lang.ArrayIndexOutOfBoundsException: 136
   	at org.apache.iceberg.data.GenericRecord.get(GenericRecord.java:114)
   	at org.apache.iceberg.mr.hive.serde.objectinspector.IcebergRecordObjectInspector.getStructFieldData(IcebergRecordObjectInspector.java
   	at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:95)
   	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
   	at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:88)
   	at org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqual.evaluate(GenericUDFOPEqual.java:108)
   	at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:197)
   	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80)
   	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68)
   	at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:112)
   	at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
   	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
   	at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
   	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:153)
   	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:555)
   	... 13 more
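   
   One reading of the trace (a hypothesis, not something confirmed in the thread): the out-of-range index 136 in GenericRecord.get means Hive's FilterOperator asked the Iceberg record for a column position that was never materialized, i.e. a mismatch between the column projection Hive planned and the row Iceberg actually produced. A first step in narrowing that down is to compare the schemas each side sees, for example (using the hypothetical table name from the sketch above):
   
   	-- Shows Hive's view of the columns plus the Iceberg table properties
   	-- (storage handler, metadata location) recorded for the table.
   	DESCRIBE FORMATTED events;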
   
   



[GitHub] [iceberg] pvary commented on issue #4113: Hive query on Iceberg partitioned table throws ArrayIndexOutOfBoundsException

pvary commented on issue #4113:
URL: https://github.com/apache/iceberg/issues/4113#issuecomment-1038824482


   @cbwn123: What execution engine are you using?
   If you are using Tez, you might want to take a look at this: https://iceberg.apache.org/docs/latest/hive/#hive-on-tez-configuration
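   
   For reference, the linked section centers on one tez-site.xml property; shown below as a session-level override. This is a sketch based on the property named in those docs, so verify it against the docs for your exact versions:
   
   	-- Without this, Tez may not forward Hive's column-projection settings
   	-- (read column names/ids) to the record reader, which can surface as
   	-- exactly this kind of out-of-range column access.
   	SET tez.mrreader.config.update.properties=hive.io.file.readcolumn.names,hive.io.file.readcolumn.ids;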
   
   Thanks,
   Peter



[GitHub] [iceberg] cbwn123 commented on issue #4113: Hive query on Iceberg partitioned table throws ArrayIndexOutOfBoundsException

cbwn123 commented on issue #4113:
URL: https://github.com/apache/iceberg/issues/4113#issuecomment-1038829720


   > @cbwn123: What execution engine are you using? If you are using Tez, you might want to take a look at this: https://iceberg.apache.org/docs/latest/hive/#hive-on-tez-configuration
   > 
   > Thanks, Peter
   
   Thank you for your reply.
   My execution engine is Hive on Spark.



[GitHub] [iceberg] pvary commented on issue #4113: Hive query on Iceberg partitioned table throws ArrayIndexOutOfBoundsException

pvary commented on issue #4113:
URL: https://github.com/apache/iceberg/issues/4113#issuecomment-1038831029


   I am afraid that Hive on Spark (HoS) is not supported; at the very least, we have not tested it at all.
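   
   Given that, a practical workaround (not stated in the thread, but standard Hive configuration) is to check which engine the session is using and switch to one the Iceberg Hive integration supports:
   
   	-- Prints the current value of the setting.
   	SET hive.execution.engine;
   	-- Switch to Tez (see the Hive-on-Tez configuration linked above);
   	-- 'mr' also exists but is deprecated in Hive 3.
   	SET hive.execution.engine=tez;
   	-- Re-run the failing query (hypothetical table from the earlier sketch).
   	SELECT COUNT(1) FROM events WHERE event_day = '20220213';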

