Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2015/05/21 18:37:17 UTC
[jira] [Assigned] (SPARK-7616) Column order can be corrupted when saving DataFrame as a partitioned table
[ https://issues.apache.org/jira/browse/SPARK-7616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-7616:
-----------------------------------
Assignee: Apache Spark (was: Cheng Lian)
> Column order can be corrupted when saving DataFrame as a partitioned table
> --------------------------------------------------------------------------
>
> Key: SPARK-7616
> URL: https://issues.apache.org/jira/browse/SPARK-7616
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.4.0
> Reporter: Yin Huai
> Assignee: Apache Spark
> Priority: Blocker
>
> When a DataFrame is saved as a partitioned table, its partition columns are moved after the data columns. However, the column names are not reordered to match, so names end up attached to the wrong values.
> {code}
> import sqlContext._
> import sqlContext.implicits._
> val df = (1 to 3).map(i => i -> i * 2).toDF("a", "b")
> df.write
>   .format("parquet")
>   .mode("overwrite")
>   .partitionBy("a")
>   .saveAsTable("t")
> table("t").orderBy('a).show()
> {code}
> Expected output:
> {noformat}
> +-+-+
> |b|a|
> +-+-+
> |2|1|
> |4|2|
> |6|3|
> +-+-+
> {noformat}
> Actual output:
> {noformat}
> +-+-+
> |b|a|
> +-+-+
> |1|2|
> |2|4|
> |3|6|
> +-+-+
> {noformat}
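
For illustration only, here is a minimal Spark-free sketch of the invariant the description implies: when partition columns are moved after the data columns, the same permutation has to be applied to the column names and to every row. The names PartitionColumnOrder and reorder are hypothetical and are not part of Spark's API; this is not the actual fix.

```scala
// Hypothetical model of the reordering described in this issue (not Spark code).
object PartitionColumnOrder {
  type Row = Vector[Any]

  // Moves partition columns after the data columns, applying the same
  // index permutation to the column names and to every row.
  def reorder(
      names: Vector[String],
      rows: Seq[Row],
      partitionCols: Set[String]): (Vector[String], Seq[Row]) = {
    // Split column indices into partition columns and data columns,
    // preserving the original relative order within each group.
    val (partIdx, dataIdx) =
      names.indices.toVector.partition(i => partitionCols.contains(names(i)))
    val order = dataIdx ++ partIdx
    (order.map(names), rows.map(row => order.map(row)))
  }

  def main(args: Array[String]): Unit = {
    // Same data as the report: a = i, b = i * 2, partitioned by "a".
    val names = Vector("a", "b")
    val rows = (1 to 3).map(i => Vector[Any](i, i * 2))
    val (newNames, newRows) = reorder(names, rows, Set("a"))
    println(newNames.mkString("|"))
    newRows.foreach(r => println(r.mkString("|")))
  }
}
```

Because names and values share one permutation, each value stays paired with its own column name, which is the pairing shown in the expected output above.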
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)