Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2022/08/31 02:23:00 UTC
[jira] [Commented] (SPARK-40282) DataType argument in StructType.add is incorrectly throwing scala.MatchError
[ https://issues.apache.org/jira/browse/SPARK-40282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17598132#comment-17598132 ]
Hyukjin Kwon commented on SPARK-40282:
--------------------------------------
We don't have this problem in the languages officially supported by Apache Spark. Is this a problem specific to Kotlin?
> DataType argument in StructType.add is incorrectly throwing scala.MatchError
> ----------------------------------------------------------------------------
>
> Key: SPARK-40282
> URL: https://issues.apache.org/jira/browse/SPARK-40282
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: M. Manna
> Priority: Major
> Attachments: SparkApplication.kt, retailstore.csv
>
>
> *Problem Description*
> As part of the contract mentioned here, Spark should be able to support {{IntegerType}} as an argument to the {{StructType.add}} method. However, it fails with {{scala.MatchError}} today.
>
> If we call the overloaded version that accepts the type as a String value, e.g. "Integer", it works.
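> For illustration, here is a minimal Java sketch of the two {{add}} overloads under discussion (the column name is made up for the example and is not taken from the attached code):
> {code:java}
> import org.apache.spark.sql.types.DataTypes;
> import org.apache.spark.sql.types.StructType;
>
> public class SchemaOverloads {
>     public static void main(String[] args) {
>         // DataType-object overload: the report says this path ends in scala.MatchError
>         StructType byDataType = new StructType().add("age", DataTypes.IntegerType);
>
>         // String type-name overload: the report says this one works
>         StructType byName = new StructType().add("age", "Integer");
>
>         System.out.println(byDataType.treeString());
>         System.out.println(byName.treeString());
>     }
> }
> {code}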
> *How to Reproduce*
> # Create a Kotlin project - I have used Kotlin, but Java will also work (with minor adjustments)
> # Place the attached CSV file in {{src/main/resources}}
> # Compile the project with Java 11
> # Run it - it will fail with the error below.
> {code:java}
> Exception in thread "main" scala.MatchError: org.apache.spark.sql.types.IntegerType@363fe35a (of class org.apache.spark.sql.types.IntegerType)
> at org.apache.spark.sql.catalyst.encoders.RowEncoder$.externalDataTypeFor(RowEncoder.scala:240)
> at org.apache.spark.sql.catalyst.encoders.RowEncoder$.externalDataTypeForInput(RowEncoder.scala:236)
> at org.apache.spark.sql.catalyst.expressions.objects.ValidateExternalType.<init>(objects.scala:1890)
> at org.apache.spark.sql.catalyst.encoders.RowEncoder$.$anonfun$serializerFor$3(RowEncoder.scala:197)
> at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
> at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
> at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
> at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
> at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
> at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
> at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:198)
> at org.apache.spark.sql.catalyst.encoders.RowEncoder$.serializerFor(RowEncoder.scala:192)
> at org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:73)
> at org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:81)
> at org.apache.spark.sql.Dataset$.$anonfun$ofRows$1(Dataset.scala:92)
> at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:89)
> at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:444)
> at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
> at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
> at scala.Option.getOrElse(Option.scala:189)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:185) {code}
> # Now change the line (commented as HERE) to use a String value instead, i.e. "Integer"
> # It works (a rough Java sketch of the reproduction follows this list)
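> A hedged Java sketch of what the reproduction presumably looks like (the attached SparkApplication.kt is not reproduced here; the CSV path, column names, and class name are assumptions for illustration):
> {code:java}
> import org.apache.spark.sql.Dataset;
> import org.apache.spark.sql.Row;
> import org.apache.spark.sql.SparkSession;
> import org.apache.spark.sql.types.DataTypes;
> import org.apache.spark.sql.types.StructType;
>
> public class SparkApplicationSketch {
>     public static void main(String[] args) {
>         SparkSession spark = SparkSession.builder()
>                 .appName("SPARK-40282-repro")
>                 .master("local[*]")
>                 .getOrCreate();
>
>         // HERE: DataType-object overload; swapping in the String overload
>         // (e.g. .add("customerId", "Integer")) reportedly works
>         StructType schema = new StructType()
>                 .add("customerId", DataTypes.IntegerType)
>                 .add("country", DataTypes.StringType);
>
>         // The scala.MatchError in the trace above surfaces while the DataFrame is built from the CSV source
>         Dataset<Row> df = spark.read()
>                 .option("header", "true")
>                 .schema(schema)
>                 .csv("src/main/resources/retailstore.csv");
>
>         df.show();
>         spark.stop();
>     }
> }
> {code}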
> *Ask*
> # Why does it not accept {{IntegerType}} or {{StringType}} as the {{DataType}} parameter supplied to the {{add}} function of {{StructType}}?
> # If this is a bug, do we know when a fix can be expected?
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org