Posted to issues@spark.apache.org by "Wenchen Fan (Jira)" <ji...@apache.org> on 2020/12/07 13:41:00 UTC

[jira] [Resolved] (SPARK-33641) Invalidate new char-like type in public APIs that result in incorrect results

     [ https://issues.apache.org/jira/browse/SPARK-33641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-33641.
---------------------------------
    Fix Version/s: 3.1.0
       Resolution: Fixed

Issue resolved by pull request 30586
[https://github.com/apache/spark/pull/30586]

> Invalidate new char-like type in public APIs that result in incorrect results
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-33641
>                 URL: https://issues.apache.org/jira/browse/SPARK-33641
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Kent Yao
>            Assignee: Kent Yao
>            Priority: Blocker
>             Fix For: 3.1.0
>
>
> 1. udf
> {code:java}
> scala> spark.udf.register("abcd", () => "12345", org.apache.spark.sql.types.VarcharType(2))
> scala> spark.sql("select abcd()").show
> scala.MatchError: VarcharType(2) (of class org.apache.spark.sql.types.VarcharType)
>   at org.apache.spark.sql.catalyst.encoders.RowEncoder$.externalDataTypeFor(RowEncoder.scala:215)
>   at org.apache.spark.sql.catalyst.encoders.RowEncoder$.externalDataTypeForInput(RowEncoder.scala:212)
>   at org.apache.spark.sql.catalyst.expressions.objects.ValidateExternalType.<init>(objects.scala:1741)
>   at org.apache.spark.sql.catalyst.encoders.RowEncoder$.$anonfun$serializerFor$3(RowEncoder.scala:175)
>   at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245)
>   at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
>   at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
>   at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245)
>   at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242)
>   at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:198)
>   at org.apache.spark.sql.catalyst.encoders.RowEncoder$.serializerFor(RowEncoder.scala:171)
>   at org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:66)
>   at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:768)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
>   at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:611)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:768)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:606)
>   ... 47 elided
> {code}
> 2. spark.createDataFrame
> {code:java}
> scala> spark.createDataFrame(spark.read.text("README.md").rdd, new org.apache.spark.sql.types.StructType().add("c", "char(1)")).show
> +--------------------+
> |                   c|
> +--------------------+
> |      # Apache Spark|
> |                    |
> |Spark is a unifie...|
> |high-level APIs i...|
> |supports general ...|
> |rich set of highe...|
> |MLlib for machine...|
> |and Structured St...|
> |                    |
> |<https://spark.ap...|
> |                    |
> |[![Jenkins Build]...|
> |[![AppVeyor Build...|
> |[![PySpark Covera...|
> |                    |
> |                    |
> |## Online Documen...|
> |                    |
> |You can find the ...|
> |guide, on the [pr...|
> +--------------------+
> only showing top 20 rows
> {code}
> 3. reader.schema
> ```
> scala> spark.read.schema("a varchar(2)").text("./README.md").show(100)
> +--------------------+
> |                   a|
> +--------------------+
> |      # Apache Spark|
> |                    |
> |Spark is a unifie...|
> |high-level APIs i...|
> |supports general ...|
> ```
> 4. etc
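
The three cases above share one cause: a user-supplied data type or schema carries CharType/VarcharType into a code path that either cannot handle it or silently ignores the length limit. Below is a minimal sketch of a fail-fast check, assuming Spark 3.1+ on the classpath (for example in spark-shell). The helpers `containsCharLike` and `requireNoCharLike` are hypothetical names chosen for illustration. They are not part of Spark and are not necessarily how pull request 30586 implements the fix.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper (not a Spark API): returns true if the given data type,
// or any type nested inside it, is CharType or VarcharType.
def containsCharLike(dt: DataType): Boolean = dt match {
  case _: CharType | _: VarcharType => true
  case ArrayType(elementType, _) => containsCharLike(elementType)
  case MapType(keyType, valueType, _) =>
    containsCharLike(keyType) || containsCharLike(valueType)
  case StructType(fields) => fields.exists(f => containsCharLike(f.dataType))
  case _ => false
}

// Hypothetical guard: reject the type up front instead of returning
// rows that ignore the declared length.
def requireNoCharLike(dt: DataType): Unit =
  require(!containsCharLike(dt),
    s"char/varchar type is not supported here: ${dt.simpleString}")

// Usage: a schema like the one in case 2 is flagged, a plain string schema is not.
val charSchema = new StructType().add("c", CharType(1))
val stringSchema = new StructType().add("c", StringType)
println(containsCharLike(charSchema))
println(containsCharLike(stringSchema))
```

Calling `requireNoCharLike` before passing a schema to an API such as spark.createDataFrame would turn the silent wrong result into an immediate IllegalArgumentException. Whether to reject the type or to fall back to a plain string type is a design choice this sketch does not settle.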


