Posted to issues@spark.apache.org by "Terry Moschou (Jira)" <ji...@apache.org> on 2020/12/17 04:34:00 UTC
[jira] [Created] (SPARK-33820) java.lang.ClassCastException: com.example.A cannot be cast to com.example.A when spark.executor.userClassPathFirst is true
Terry Moschou created SPARK-33820:
-------------------------------------
Summary: java.lang.ClassCastException: com.example.A cannot be cast to com.example.A when spark.executor.userClassPathFirst is true
Key: SPARK-33820
URL: https://issues.apache.org/jira/browse/SPARK-33820
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.0.1, 2.4.7
Environment: {noformat}
$ java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_212-b04)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.212-b04, mixed mode)
$ mvn -version
Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
Maven home: /usr/local/Cellar/maven/3.6.3_1/libexec
Java version: 1.8.0_212, vendor: AdoptOpenJDK, runtime: /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre
Default locale: en_AU, platform encoding: UTF-8
OS name: "mac os x", version: "10.15.7", arch: "x86_64", family: "Mac"
{noformat}
Reporter: Terry Moschou
Specifying {{--conf spark.executor.userClassPathFirst=true}} as an option to {{spark-submit}} causes a {{java.lang.ClassCastException}} to be thrown when a class is cast to itself. The exception is not thrown when the option is omitted.
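For context (this sketch is not from the original report; all names are hypothetical): the JVM treats two classes with the same fully-qualified name but different defining classloaders as distinct types, so a cast between them fails. The sketch below reproduces the same "X cannot be cast to X" symptom outside Spark, using a child-first loader roughly analogous to what {{spark.executor.userClassPathFirst=true}} enables on executors:

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

public class CastDemo {

    // Stand-in for com.example.A in the real report.
    public static class Payload {}

    // Mimics a child-first classpath: define Payload from its own bytes
    // instead of delegating to the parent loader.
    static class ChildFirstLoader extends ClassLoader {
        ChildFirstLoader() { super(CastDemo.class.getClassLoader()); }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals("CastDemo$Payload")) return super.loadClass(name, resolve);
            Class<?> already = findLoadedClass(name);
            if (already != null) return already;
            try (InputStream in = getParent().getResourceAsStream("CastDemo$Payload.class")) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
                byte[] bytes = out.toByteArray();
                // Same bytes, same name -- but a different defining loader.
                return defineClass(name, bytes, 0, bytes.length);
            } catch (Exception e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Class<?> parentCopy = Payload.class;
        Class<?> childCopy = new ChildFirstLoader().loadClass("CastDemo$Payload");

        System.out.println(parentCopy.getName().equals(childCopy.getName())); // true: identical names
        System.out.println(parentCopy == childCopy);                          // false: distinct Class objects
        Object obj = childCopy.getDeclaredConstructor().newInstance();
        System.out.println(parentCopy.isInstance(obj));                       // false: a cast would throw
    }
}
```

If the real failure has the same cause, the two {{com.example.A}} {{Class}} objects involved on the executor would report different {{getClassLoader()}} values despite identical names.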
A minimal test case that reproduces the error is below.
{{com/example/App.scala}}
{code:scala}
package com.example

import org.apache.spark.sql.{Dataset, SaveMode, SparkSession}

case class A(id: String, value: Long)

object App {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().getOrCreate()
    import spark.implicits._

    val ds: Dataset[A] = spark
      .range(5)
      .map { v => A(s"$v", v) }

    ds.write.format("json").mode(SaveMode.Overwrite).save("out.json")

    spark.close()
  }
}
{code}
{{pom.xml}}
{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.example</groupId>
  <artifactId>test-bug</artifactId>
  <version>1.0-SNAPSHOT</version>

  <properties>
    <version.java>1.8</version.java>
    <maven.compiler.source>${version.java}</maven.compiler.source>
    <maven.compiler.target>${version.java}</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <scala.binary.version>2.12</scala.binary.version>
    <scala.version>2.12.12</scala.version>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_${scala.binary.version}</artifactId>
      <version>3.0.1</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>

  <build>
    <pluginManagement>
      <plugins>
        <plugin>
          <groupId>net.alchim31.maven</groupId>
          <artifactId>scala-maven-plugin</artifactId>
          <version>4.3.1</version>
          <executions>
            <execution>
              <goals>
                <goal>add-source</goal>
                <goal>compile</goal>
                <goal>testCompile</goal>
              </goals>
            </execution>
          </executions>
          <configuration>
            <scalaVersion>${scala.version}</scalaVersion>
            <recompileMode>incremental</recompileMode>
            <args>
              <arg>-unchecked</arg>
              <arg>-deprecation</arg>
              <arg>-feature</arg>
              <arg>-explaintypes</arg>
              <arg>-Yno-adapted-args</arg>
              <arg>-Ypartial-unification</arg>
              <arg>-language:higherKinds</arg>
            </args>
            <javacArgs>
              <javacArg>-source</javacArg>
              <javacArg>${version.java}</javacArg>
              <javacArg>-target</javacArg>
              <javacArg>${version.java}</javacArg>
              <javacArg>-Xlint:all,-serial,-path,-try</javacArg>
            </javacArgs>
          </configuration>
        </plugin>
      </plugins>
    </pluginManagement>

    <plugins>
      <plugin>
        <groupId>net.alchim31.maven</groupId>
        <artifactId>scala-maven-plugin</artifactId>
      </plugin>
    </plugins>
  </build>
</project>
{code}
Using a fresh unpack of the Spark distribution {{spark-3.0.1-bin-hadoop3.2}}, build and submit:
{code:bash}
mvn clean package

SPARK_HOME=$HOME/dist/spark-3.0.1-bin-hadoop3.2
$SPARK_HOME/bin/spark-submit \
  --class com.example.App \
  --conf spark.executor.userClassPathFirst=true \
  --master 'local[*]' \
  target/test-bug-1.0-SNAPSHOT.jar
{code}
The following exception is thrown:
{noformat}
20/12/17 14:31:22 ERROR Executor: Exception in task 15.0 in stage 0.0 (TID 15)
java.lang.ClassCastException: com.example.A cannot be cast to com.example.A
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.mapelements_doConsume_0$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.deserializetoobject_doConsume_0$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:729)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:260)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$15(FileFormatWriter.scala:205)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:127)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
{noformat}
This error also happens against Spark 2.4.7 (Scala 2.12).
--
This message was sent by Atlassian Jira
(v8.3.4#803005)