Posted to issues@bigtop.apache.org by "Luca Toscano (Jira)" <ji...@apache.org> on 2022/02/13 07:33:00 UTC

[jira] [Commented] (BIGTOP-3641) Hive on Spark error

    [ https://issues.apache.org/jira/browse/BIGTOP-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17491517#comment-17491517 ] 

Luca Toscano commented on BIGTOP-3641:
--------------------------------------

Hi! Very interesting - I can repro the issue. I ran Hive in debug mode and this is the stack trace:

{code}
2022-02-13T07:22:38,753 DEBUG [RPC-Handler-2] rpc.KryoMessageCodec: Encoded message of type org.apache.hive.spark.client.rpc.Rpc$NullMessage (2 bytes)
Job failed with java.lang.ClassNotFoundException: oot_20220213072222_a2f5687a-9e58-4dab-92f8-3963450a2fcd:1
2022-02-13T07:22:39,414 ERROR [141385f4-f28a-464d-9e5f-4fe1df0d946e main] status.SparkJobMonitor: Job failed with java.lang.ClassNotFoundException: oot_20220213072222_a2f5687a-9e58-4dab-92f8-3963450a2fcd:1
com.esotericsoftware.kryo.KryoException: Unable to find class: oot_20220213072222_a2f5687a-9e58-4dab-92f8-3963450a2fcd:1
Serialization trace:
invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:160)
	at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
	at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:693)
	at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClass(SerializationUtilities.java:181)
	at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543)
	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:709)
	at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:206)
	at org.apache.hadoop.hive.ql.exec.spark.KryoSerializer.deserialize(KryoSerializer.java:60)
	at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:329)
	at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:378)
	at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:343)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: oot_20220213072222_a2f5687a-9e58-4dab-92f8-3963450a2fcd:1
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154)
	... 15 more
{code}
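One thing stands out: the "class" that can't be found is the Hive query ID with its first character chopped off (the reporter's query ID {{root_20220209133134_...}} shows up as {{oot_20220209133134_...}}, and my trace follows the same pattern), and a query ID is not a class at all. So the Kryo stream carrying the serialized SparkWork seems to be read out of alignment - a string value is consumed where a class name is expected - which could happen, for instance, if the Hive and Spark sides end up with incompatible Kryo versions on the classpath. Just to illustrate the failure mode (a toy sketch, not Hive code - the payload class and the deliberate one-byte corruption are invented), damaging the class name inside a Kryo stream reproduces exactly this error shape:

{code:java}
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

import java.nio.charset.StandardCharsets;

// Toy reproduction, NOT Hive code: a single damaged byte in the class-name
// region of a Kryo stream surfaces as KryoException "Unable to find class:
// <garbled name>", caused by java.lang.ClassNotFoundException -- the same
// shape as the failure above.
public class KryoCorruptionDemo {

    public static class Payload {
        // stand-in for the kind of string data the serialized plan carries
        String queryId = "root_20220213072222_a2f5687a";
    }

    public static void main(String[] args) {
        Kryo writer = new Kryo();
        writer.setRegistrationRequired(false); // unregistered classes travel by name

        Output out = new Output(1024);
        writer.writeClassAndObject(out, new Payload());
        byte[] bytes = out.toBytes();

        // Locate the serialized class name and damage its first character,
        // mimicking the one-character-off name in the report ("root_" -> "oot_").
        // (Search on a prefix: some Kryo versions mark the last byte of an
        // ASCII string with the high bit.)
        String namePrefix = Payload.class.getName();
        namePrefix = namePrefix.substring(0, namePrefix.length() - 1);
        int pos = new String(bytes, StandardCharsets.ISO_8859_1).indexOf(namePrefix);
        bytes[pos] = (byte) 'X';

        // DefaultClassResolver.readName() hands the damaged string straight to
        // Class.forName(), which throws ClassNotFoundException.
        Kryo reader = new Kryo();
        reader.setRegistrationRequired(false);
        reader.readClassAndObject(new Input(bytes)); // -> "Unable to find class: Xryo..."
    }
}
{code}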



> Hive on Spark error
> -------------------
>
>                 Key: BIGTOP-3641
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-3641
>             Project: Bigtop
>          Issue Type: Bug
>          Components: hive, spark
>    Affects Versions: 3.0.0, 3.1.0
>            Reporter: Andrew
>            Priority: Major
>
> Hi! I've tried to launch the Hadoop stack in Docker in two ways:
>  # successfully built _hdfs, yarn, mapreduce, hbase, hive, spark, zookeeper_ from the Bigtop master branch (version 3.1.0) and launched Docker from the local repo via the provisioner with all of these components
>  # same as the first approach, but with the Bigtop repo (version 3.0.0)
> In both cases everything else works fine, but Hive on Spark fails with an error:
> {code:java}
> hive> set hive.execution.engine=spark;
> hive> select id, count(*) from default.test group by id;
> Query ID = root_20220209133134_cf3aec7d-ee2e-4d38-b200-6d616020d4b6
> Total jobs = 1
> Launching Job 1 out of 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> Job failed with java.lang.ClassNotFoundException: oot_20220209133134_cf3aec7d-ee2e-4d38-b200-6d616020d4b6:1
> FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Spark job failed during runtime. Please check stacktrace for the root cause.{code}
>  
> From spark-shell everything works fine:
> {code:java}
> scala> sql("select id, count(*) from default.test group by id").show()
> +---+--------+                                                                  
> | id|count(1)|
> +---+--------+
> |  1|       1|
> |  2|       1|
> +---+--------+{code}
>  
> I've also tried to create an HDFS dir with the Spark libs and to specify the config as was done in https://issues.apache.org/jira/browse/BIGTOP-3333 - it didn't help (a sketch of what I ran is below). Any ideas what is missing and how to fix it?
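> Roughly, following the Hive on Spark getting-started guide (the paths and namenode address are just examples from my environment):
> {code}
> # upload the Spark jars to HDFS so the YARN containers can load them
> hdfs dfs -mkdir -p /spark-jars
> hdfs dfs -put /usr/lib/spark/jars/*.jar /spark-jars/
> {code}
> and then point Hive at them (equivalently via hive-site.xml):
> {code}
> hive> set spark.yarn.jars=hdfs://namenode:8020/spark-jars/*;
> {code}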
> P.S. Spark is deployed as spark-on-yarn.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)