Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2016/06/07 19:11:21 UTC
[jira] [Assigned] (SPARK-15808) Wrong Results or Strange Errors In
Append-mode DataFrame Writing Due to Mismatched File Formats
[ https://issues.apache.org/jira/browse/SPARK-15808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-15808:
------------------------------------
Assignee: Apache Spark
> Wrong Results or Strange Errors In Append-mode DataFrame Writing Due to Mismatched File Formats
> -----------------------------------------------------------------------------------------------
>
> Key: SPARK-15808
> URL: https://issues.apache.org/jira/browse/SPARK-15808
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.0.0
> Reporter: Xiao Li
> Assignee: Apache Spark
>
> Example 1: PARQUET -> ORC
> {noformat}
> createDF(0, 9).write.format("parquet").saveAsTable("appendParquetToOrc")
> createDF(10, 19).write.mode(SaveMode.Append).format("orc").saveAsTable("appendParquetToOrc")
> {noformat}
> Error we got:
> {noformat}
> Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 2, localhost): java.lang.RuntimeException: file:/private/var/folders/4b/sgmfldk15js406vk7lw5llzw0000gn/T/warehouse-bc8fedf2-aa6a-4002-a18b-524c6ac859d4/appendorctoparquet/part-r-00000-c0e3f365-1d46-4df5-a82c-b47d7af9feb9.snappy.orc is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [79, 82, 67, 23]
> {noformat}
> Example 2: Json -> CSV
> {noformat}
> createDF(0, 9).write.format("json").saveAsTable("appendJsonToCSV")
> createDF(10, 19).write.mode(SaveMode.Append).format("parquet").saveAsTable("appendJsonToCSV")
> {noformat}
> No exception, but wrong results:
> {noformat}
> +----+----+
> |  c1|  c2|
> +----+----+
> |null|null|
> |null|null|
> |null|null|
> |null|null|
> |   0|str0|
> |   1|str1|
> |   2|str2|
> |   3|str3|
> |   4|str4|
> |   5|str5|
> |   6|str6|
> |   7|str7|
> |   8|str8|
> |   9|str9|
> +----+----+
> {noformat}
> Example 3: Json -> Text
> {noformat}
> createDF(0, 9).write.format("json").saveAsTable("appendJsonToText")
> createDF(10, 19).write.mode(SaveMode.Append).format("text").saveAsTable("appendJsonToText")
> {noformat}
> Error we got:
> {noformat}
> Text data source supports only a single column, and you have 2 columns.
> {noformat}
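
Note on the Example 1 error: the two byte arrays in the message are easier to read as ASCII. The sketch below is not part of the original report; it only decodes the numbers quoted in the error text. The object and method names (MagicBytes, render) are made up for illustration. The expected bytes spell the Parquet footer magic, and the bytes found start with "ORC", which is consistent with the Parquet reader being pointed at a file written by the ORC writer. The trailing byte 23 is most likely part of the ORC file tail, but that is an assumption and not something the report states.

```scala
object MagicBytes {
  // Expected tail bytes, copied from the error message.
  val parquetMagic: Seq[Int] = Seq(80, 65, 82, 49)
  // Tail bytes actually found, copied from the error message.
  val foundTail: Seq[Int] = Seq(79, 82, 67, 23)

  // Render printable ASCII bytes as characters and all other bytes as \xNN.
  def render(bytes: Seq[Int]): String =
    bytes.map { b =>
      if (b >= 32 && b < 127) b.toChar.toString
      else "\\x" + "%02x".format(b)
    }.mkString

  def main(args: Array[String]): Unit = {
    println(render(parquetMagic)) // prints PAR1
    println(render(foundTail))    // prints ORC\x17
  }
}
```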
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)