Posted to issues@spark.apache.org by "Wenchen Fan (JIRA)" <ji...@apache.org> on 2017/02/23 21:30:44 UTC

[jira] [Updated] (SPARK-19716) Dataset should allow by-name resolution for struct type elements in array

     [ https://issues.apache.org/jira/browse/SPARK-19716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan updated SPARK-19716:
--------------------------------
    Description: 
If we have a DataFrame with schema {{<a: int, b: int, c: int>}} and convert it to a Dataset with {{case class Data(a: Int, c: Int)}}, the conversion works: the `a` and `c` columns are resolved by name and extracted to build the Data objects.

However, if the struct is nested inside an array, e.g. the schema is {{<arr: array<struct<a: int, b: int, c: int>>>}}, and we want to convert it to a Dataset with {{case class ComplexData(arr: Seq[Data])}}, the conversion fails. We should support this case.

  was:
if we have a DataFrame with schema {{<a: int, b: int, c: int>}}, and convert it to Dataset with {{case class Data(a: Int, c: Int)}}, it works and we will extract the `a` and `c` columns to build the Data.

However, if the struct is inside array, e.g. schema is {{<arr: array<a: int, b: int, c: int>>}}, and we wanna convert it to Dataset with {{case class ComplexData(arr: Seq[Data])}}, we will fail. we should support this case.


> Dataset should allow by-name resolution for struct type elements in array
> -------------------------------------------------------------------------
>
>                 Key: SPARK-19716
>                 URL: https://issues.apache.org/jira/browse/SPARK-19716
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Wenchen Fan
>
> If we have a DataFrame with schema {{<a: int, b: int, c: int>}} and convert it to a Dataset with {{case class Data(a: Int, c: Int)}}, the conversion works: the `a` and `c` columns are resolved by name and extracted to build the Data objects.
> However, if the struct is nested inside an array, e.g. the schema is {{<arr: array<struct<a: int, b: int, c: int>>>}}, and we want to convert it to a Dataset with {{case class ComplexData(arr: Seq[Data])}}, the conversion fails. We should support this case.
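
The two cases described above could be reproduced with something like the following sketch. The object name, the helper method names, the sample values, and the local SparkSession settings are illustrative assumptions, not part of the issue; only the case classes and the schemas come from the description. The nested conversion is wrapped in Try because whether it succeeds depends on the Spark version being run.

```scala
import org.apache.spark.sql.SparkSession
import scala.util.Try

// Case classes from the issue description. They are declared at the top
// level so that Spark can derive encoders for them.
case class Data(a: Int, c: Int)
case class ComplexData(arr: Seq[Data])

// Hypothetical object and method names, used only for this illustration.
object ByNameResolutionSketch {

  // Flat case: schema <a: int, b: int, c: int> converted to Dataset[Data].
  // Columns `a` and `c` are matched by name; `b` has no matching field.
  def flatCase(spark: SparkSession): Seq[Data] = {
    import spark.implicits._
    Seq((1, 2, 3)).toDF("a", "b", "c").as[Data].collect().toSeq
  }

  // Nested case: schema <arr: array<struct<a: int, b: int, c: int>>>
  // converted to Dataset[ComplexData]. This is the conversion the issue
  // reports as failing, so the result is captured in a Try.
  def nestedCase(spark: SparkSession): Try[Seq[ComplexData]] = {
    val df = spark.sql(
      "SELECT array(named_struct('a', 1, 'b', 2, 'c', 3)) AS arr")
    import spark.implicits._
    Try(df.as[ComplexData].collect().toSeq)
  }

  def main(args: Array[String]): Unit = {
    // Assumed local session settings for a quick experiment.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-19716-sketch")
      .getOrCreate()
    try {
      println(s"flat:   ${flatCase(spark)}")
      println(s"nested: ${nestedCase(spark)}")
    } finally {
      spark.stop()
    }
  }
}
```

On a build affected by this issue, the nested call would be expected to return a Failure carrying the analysis error, while the flat call returns the Data objects built from `a` and `c`.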


