Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2015/05/21 18:37:17 UTC

[jira] [Assigned] (SPARK-7616) Column order can be corrupted when saving DataFrame as a partitioned table

     [ https://issues.apache.org/jira/browse/SPARK-7616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-7616:
-----------------------------------

    Assignee: Apache Spark  (was: Cheng Lian)

> Column order can be corrupted when saving DataFrame as a partitioned table
> --------------------------------------------------------------------------
>
>                 Key: SPARK-7616
>                 URL: https://issues.apache.org/jira/browse/SPARK-7616
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.0
>            Reporter: Yin Huai
>            Assignee: Apache Spark
>            Priority: Blocker
>
> When a DataFrame is saved as a partitioned table, its partition columns are moved after the data columns. However, the column names are not adjusted to match, so values can appear under the wrong column names.
> {code}
> import sqlContext._
> import sqlContext.implicits._
> val df = (1 to 3).map(i => i -> i * 2).toDF("a", "b")
> df.write
>   .format("parquet")
>   .mode("overwrite")
>   .partitionBy("a")
>   .saveAsTable("t")
> table("t").orderBy('a).show()
> {code}
> Expected output:
> {noformat}
> +-+-+
> |b|a|
> +-+-+
> |2|1|
> |4|2|
> |6|3|
> +-+-+
> {noformat}
> Actual output:
> {noformat}
> +-+-+
> |b|a|
> +-+-+
> |1|2|
> |2|4|
> |3|6|
> +-+-+
> {noformat}
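
The reordering described above can be sketched in plain Scala, without Spark. This is only an illustration of the intended behaviour, not Spark's actual implementation: the object `PartitionedColumnOrder` and its methods `reorder` and `reorderRow` are hypothetical names, and the sketch assumes that partition columns keep the order in which they were passed to `partitionBy`. The point is that column names and row values have to be permuted together.

```scala
// Hypothetical helper, not part of Spark: models the column layout of a
// partitioned table, where data columns come first and partition columns last.
object PartitionedColumnOrder {

  // Returns the column names with partition columns moved to the end.
  // Assumption: partition columns keep the order given in partitionColumns.
  def reorder(columns: Seq[String], partitionColumns: Seq[String]): Seq[String] = {
    val dataCols = columns.filterNot(c => partitionColumns.contains(c))
    val partCols = partitionColumns.filter(c => columns.contains(c))
    dataCols ++ partCols
  }

  // Applies the same permutation to one row of values, so that every value
  // stays under its own column name after the reordering.
  def reorderRow[A](columns: Seq[String], partitionColumns: Seq[String], row: Seq[A]): Seq[A] = {
    val valueByName = columns.zip(row).toMap
    reorder(columns, partitionColumns).map(valueByName)
  }
}
```

For the DataFrame in the report, `reorder(Seq("a", "b"), Seq("a"))` gives `Seq("b", "a")`, and `reorderRow(Seq("a", "b"), Seq("a"), Seq(1, 2))` gives `Seq(2, 1)`, which matches the first row of the expected output. If only the names or only the values are permuted, values end up under the wrong names, as in the actual output.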


