Posted to user@spark.apache.org by "Xiao, Alton" <al...@sap.com.INVALID> on 2022/09/20 08:05:26 UTC

Re: NoClassDefError and SparkSession should only be created and accessed on the driver.

Can you show us your code?
It looks to me as if your UDFs weren't serialized by Spark. Were they defined outside the code that Spark runs?

From: rajat kumar <ku...@gmail.com>
Date: Tuesday, 20 September 2022 15:58
To: user @spark <us...@spark.apache.org>
Subject: NoClassDefError and SparkSession should only be created and accessed on the driver.
Hello,

I am using Spark 3 with some UDFs, and I am using the DataFrame API to write Parquet. I am getting a NoClassDefFoundError along with the error below.

If I comment out all the UDFs, it works fine.

Could someone suggest what could be wrong? It was working fine in Spark 2.4.

22/09/20 06:33:17 WARN TaskSetManager: Lost task 9.0 in stage 1.0 (TID 10) (vm-36408481 executor 2): java.lang.ExceptionInInitializerError
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.lang.invoke.SerializedLambda.readResolve(SerializedLambda.java:230)
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeReadResolve(ObjectStreamClass.java:1274)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2222)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2119)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1657)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2119)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1657)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2119)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1657)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2119)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1657)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2119)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1657)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2119)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1657)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:115)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:83)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1491)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.IllegalStateException: SparkSession should only be created and accessed on the driver.

Re: NoClassDefError and SparkSession should only be created and accessed on the driver.

Posted by Paul Rogalinski <pa...@paul.vc>.
Hi Rajat,


I recently faced a similar problem and solved it by moving the UDF implementation into a dedicated class instead of implementing it in the driver class/object.
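
A minimal sketch of the refactoring described above, assuming Scala and hypothetical names (`TextUdfs`, `normalize`, `DriverApp`, the output path); it is an illustration, not code from this thread. The UDF lives in its own object that holds no `SparkSession`, and the session is created inside `main` rather than in a field of the driver object. The `SerializedLambda.readResolve` frame in the trace suggests that deserializing the UDF lambda on an executor initialized its enclosing class, which would fail if that class builds a `SparkSession` during initialization.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical holder for UDF logic. It has no SparkSession (or other
// driver-only state) as a field, so initializing it on an executor
// should be harmless.
object TextUdfs {
  // Plain method: can be unit-tested without a running Spark job.
  def normalize(s: String): String =
    if (s == null) null else s.trim.toLowerCase

  // The lambda is defined inside TextUdfs, so deserializing it should
  // initialize TextUdfs rather than the driver object.
  val normalizeUdf: UserDefinedFunction = udf((s: String) => normalize(s))
}

// Hypothetical driver entry point. The session is created inside main(),
// not in a field initializer of the object.
object DriverApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("udf-example").getOrCreate()
    import spark.implicits._

    val df = Seq("  Foo ", "BAR").toDF("name")
    df.withColumn("name_norm", TextUdfs.normalizeUdf(col("name")))
      .write
      .mode("overwrite")
      .parquet("/tmp/udf-example-output") // hypothetical output path

    spark.stop()
  }
}
```

Because `normalize` is a plain method, it can be checked without a cluster: `TextUdfs.normalize("  Foo ")` yields `"foo"`.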


Regards,
Paul.

On Tuesday 20 September 2022 10:11:31 (+02:00), rajat kumar wrote:


Hi Alton, it's all in the same Scala class. Is there a change in Spark 3 that requires the UDFs to be serialized separately?


Regards
Rajat


--
/**
* Paul Rogalinski-Pinter
*
* paul@paul.vc · +49-179-3574356
* Aidenbachstr. 126 · 81479 München · Germany
*/

Re: NoClassDefError and SparkSession should only be created and accessed on the driver.

Posted by rajat kumar <ku...@gmail.com>.
Hi Alton, it's all in the same Scala class. Is there a change in Spark 3 that
requires the UDFs to be serialized separately?

Regards
Rajat
