Posted to issues@spark.apache.org by "Henrique dos Santos Goulart (JIRA)" <ji...@apache.org> on 2018/01/30 13:17:00 UTC
[jira] [Updated] (SPARK-23273) Spark Dataset withColumn - schema
column order isn't the same as case class parameter order
[ https://issues.apache.org/jira/browse/SPARK-23273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Henrique dos Santos Goulart updated SPARK-23273:
------------------------------------------------
Description:
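{code:scala}
import org.apache.spark.sql.functions.lit
import spark.implicits._

case class OnlyAge(age: Int)
case class NameAge(name: String, age: Int)

val ds1 = spark.emptyDataset[NameAge]
// withColumn appends "name" after "age", so ds2's columns are (age, name)
val ds2 = spark
  .createDataset(Seq(OnlyAge(1)))
  .withColumn("name", lit("henriquedsg89"))
  .as[NameAge]

ds1.show()
ds2.show()
ds1.union(ds2) // union resolves columns by position, not by name
{code}
Running this raises the following error: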
{noformat}
Cannot up cast `age` from string to int as it may truncate
The type path of the target object is:
- field (class: "scala.Int", name: "age")
- root class: "dw.NameAge"{noformat}
It seems that .as[CaseClass] doesn't keep the order of the parameters declared in the case class.
If I change the case class parameter order, it works, for example:
{code:java}
case class NameAge(age: Int, name: String){code}
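A possible workaround that keeps the original NameAge definition is to select the columns in case class order before calling .as[NameAge]. This is only a minimal sketch: the local SparkSession, the local[1] master and the UnionColumnOrder object name are assumptions made for illustration and are not part of the original report.
{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit

// Case classes are declared at the top level so Spark can derive encoders for them.
case class OnlyAge(age: Int)
case class NameAge(name: String, age: Int)

object UnionColumnOrder {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption, not from the report).
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-23273")
      .getOrCreate()
    import spark.implicits._

    val ds1 = spark.emptyDataset[NameAge]

    // withColumn appends "name" after "age", so the schema here is (age, name).
    val df = spark
      .createDataset(Seq(OnlyAge(1)))
      .withColumn("name", lit("henriquedsg89"))

    // Reorder the columns to match NameAge(name, age) before converting.
    val ds2 = df.select("name", "age").as[NameAge]

    // union matches columns by position, so both sides now line up.
    val result = ds1.union(ds2).collect()
    println(result.mkString(", "))

    spark.stop()
  }
}
{code}
On Spark 2.3.0 and later, ds1.unionByName(ds2) should also avoid the problem, since it resolves columns by name rather than by position. That method is not available in the affected version 2.2.1.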
was:
{code:java}
case class OnlyAge(age: Int)
case class NameAge(name: String, age: Int)
val ds1 = spark.emptyDataset[NameAge]
val ds2 = spark
  .createDataset(Seq(OnlyAge(1)))
  .withColumn("name", lit("henriquedsg89"))
  .as[NameAge]
ds1.show()
ds2.show()
ds1.union(ds2)
{code}
It's going to raise this error:
{noformat}
Cannot up cast `age` from string to int as it may truncate
The type path of the target object is:
- field (class: "scala.Int", name: "age")
- root class: "dw.NameAge"{noformat}
It seems that .as[CaseClass] doesn't keep the order of the parameters declared in the case class.
If I change the case class parameter order, it works, for example: `case class NameAge(age: Int, name: String)`
> Spark Dataset withColumn - schema column order isn't the same as case class parameter order
> -------------------------------------------------------------------------------------------
>
> Key: SPARK-23273
> URL: https://issues.apache.org/jira/browse/SPARK-23273
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.1
> Reporter: Henrique dos Santos Goulart
> Priority: Major