Posted to dev@hive.apache.org by "JunAn Chen (JIRA)" <ji...@apache.org> on 2016/04/21 08:08:25 UTC

[jira] [Created] (HIVE-13573) [Spark Branch] ClassNotFoundException occurs during query case with group by and UDF defined

JunAn Chen created HIVE-13573:
---------------------------------

             Summary: [Spark Branch] ClassNotFoundException occurs during query case with group by and UDF defined
                 Key: HIVE-13573
                 URL: https://issues.apache.org/jira/browse/HIVE-13573
             Project: Hive
          Issue Type: Sub-task
          Components: Spark, spark-branch
         Environment: CentOS 6
JDK1.7
            Reporter: JunAn Chen
            Assignee: Chengxiang Li


I have a query that uses GROUP BY together with a UDF I defined:
{code:borderStyle=solid}
select
  tx.a, tx.b, tx.c
from
(
  select t.*
  from dual
  lateral view explode_json_array(
    named_struct(
      'json', '[{"a":"a1", "b":"b1", "c":"c1"}, {"a":"a2", "b":"b2", "c":"c2"}]',
      'paths', array('$..a', '$..b', '$..c')
    )
  ) t as a,b,c
) tx
group by tx.a, tx.b, tx.c
{code}
I use "create function explode_json_array 
as 'com.baidu.dft.ymir.hadoop.hive.udf.UDTFExpldeJsonArray' 
using jar 'hdfs:///hadoop-tools/hadoop-tools-0.0.1-SNAPSHOT.jar'; " to create udf.
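
For reference, a minimal sketch of the shape such a UDTF typically has. The real UDTFExpldeJsonArray ships only inside hadoop-tools-0.0.1-SNAPSHOT.jar and its source is not part of this report, so the column names, types, and JSON handling below are illustrative assumptions only:
{code:borderStyle=solid}
package com.baidu.dft.ymir.hadoop.hive.udf;

import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

// Hypothetical skeleton; the actual implementation is unknown.
public class UDTFExpldeJsonArray extends GenericUDTF {

  @Override
  public StructObjectInspector initialize(ObjectInspector[] args)
      throws UDFArgumentException {
    // Declare one string output column per extracted JSON path
    // (names and types here are illustrative).
    List<String> names = Arrays.asList("a", "b", "c");
    List<ObjectInspector> types = Arrays.<ObjectInspector>asList(
        PrimitiveObjectInspectorFactory.javaStringObjectInspector,
        PrimitiveObjectInspectorFactory.javaStringObjectInspector,
        PrimitiveObjectInspectorFactory.javaStringObjectInspector);
    return ObjectInspectorFactory.getStandardStructObjectInspector(names, types);
  }

  @Override
  public void process(Object[] args) throws HiveException {
    // Parse the JSON array from the struct argument and emit one row
    // per element (parsing omitted in this sketch).
    forward(new Object[] {"a1", "b1", "c1"});
  }

  @Override
  public void close() throws HiveException {
    // Nothing buffered in this sketch.
  }
}
{code}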

The query works on both MR and Spark cluster mode when the GROUP BY clause is omitted, but fails on Spark cluster mode once the GROUP BY clause is added.
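
The error itself is the generic Kryo failure mode: when the serialized plan is deserialized, every class named in it must be resolvable by the classloader Kryo is using. A standalone sketch of that mechanism, using stock Kryo rather than Hive's shaded org.apache.hive.com.esotericsoftware copy; the class and loader setup are purely illustrative:
{code:borderStyle=solid}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.net.URL;
import java.net.URLClassLoader;

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

public class KryoClassResolutionDemo {

  // Stand-in for a plan node referencing a class that exists only in
  // the UDF jar.
  static class PlanNode {
    String udtfClassName = "com.baidu.dft.ymir.hadoop.hive.udf.UDTFExpldeJsonArray";
  }

  public static void main(String[] args) {
    // "Driver side": serialize an object graph containing PlanNode.
    Kryo writer = new Kryo();
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    Output out = new Output(bytes);
    writer.writeClassAndObject(out, new PlanNode());
    out.close();

    // "Executor side": deserialize with a classloader that cannot see
    // PlanNode (empty URL list, null parent skips the application
    // classpath). Kryo resolves the recorded class name through this
    // loader and throws
    //   KryoException: Unable to find class ... PlanNode
    // caused by java.lang.ClassNotFoundException -- the same shape as
    // the trace below.
    Kryo reader = new Kryo();
    reader.setClassLoader(new URLClassLoader(new URL[0], null));
    reader.readClassAndObject(new Input(new ByteArrayInputStream(bytes.toByteArray())));
  }
}
{code}
This suggests the jar registered via {{using jar}} is visible when the simple plan runs, but not on the classloader in effect when the GROUP BY plan's {{map.xml}} is deserialized.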

The detailed error message is below:

{code:borderStyle=solid}
WARN scheduler.DAGScheduler: Creating new stage failed due to exception - job: 3
java.lang.RuntimeException: Failed to load plan: hdfs://cq01-mcp-master.epc.baidu.com:8020/tmp/hive/mcpweb/4f7216b8-b5f2-443c-a0f1-8520841c1caf/hive_2016-04-21_04-15-39_787_4732405294118301221-15/-mr-10004/8e110379-9b0a-4c0f-b99f-6ecde63369f6/map.xml: org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: com.baidu.dft.ymir.hadoop.hive.udf.UDTFExpldeJsonArray
Serialization trace:
genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.LateralViewJoinOperator)
childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.LateralViewForwardOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
	at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:462)
	at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:301)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:268)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:505)
	at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
	at scala.Option.getOrElse(Option.scala:120)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
	at scala.Option.getOrElse(Option.scala:120)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
	at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:82)
	at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:78)
	at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:206)
	at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:204)
	at scala.Option.getOrElse(Option.scala:120)
	at org.apache.spark.rdd.RDD.dependencies(RDD.scala:204)
	at org.apache.spark.scheduler.DAGScheduler.visit$1(DAGScheduler.scala:321)
	at org.apache.spark.scheduler.DAGScheduler.getParentStages(DAGScheduler.scala:333)
	at org.apache.spark.scheduler.DAGScheduler.getParentStagesAndId(DAGScheduler.scala:234)
	at org.apache.spark.scheduler.DAGScheduler.newResultStage(DAGScheduler.scala:270)
	at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:768)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1426)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1418)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: com.baidu.dft.ymir.hadoop.hive.udf.UDTFExpldeJsonArray
Serialization trace:
genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.LateralViewJoinOperator)
childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.LateralViewForwardOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672)
	at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:1081)
	at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:972)
	at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:986)
	at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:422)
	... 27 more
Caused by: java.lang.ClassNotFoundException: com.baidu.dft.ymir.hadoop.hive.udf.UDTFExpldeJsonArray
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
	... 69 more
{code}


