Posted to issues@spark.apache.org by "Michael Armbrust (JIRA)" <ji...@apache.org> on 2015/05/18 22:02:01 UTC

[jira] [Resolved] (SPARK-4523) Improve handling of serialized schema information

     [ https://issues.apache.org/jira/browse/SPARK-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Armbrust resolved SPARK-4523.
-------------------------------------
    Resolution: Won't Fix

We haven't changed this for a few releases now, and it seems unlikely that we will, so I'm going to close this issue.

> Improve handling of serialized schema information
> -------------------------------------------------
>
>                 Key: SPARK-4523
>                 URL: https://issues.apache.org/jira/browse/SPARK-4523
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Michael Armbrust
>            Priority: Critical
>
> There are several issues with our current handling of metadata serialization, which is especially troublesome since this is the only place that we persist information directly using Spark SQL.  Moving forward we should do the following:
>  - Relax the parsing so that it does not fail when optional fields are missing (e.g. containsNull or metadata).
>  - Include a regression suite that attempts to read old Parquet files written by previous versions of Spark SQL.
>  - Provide better warning messages when various forms of parsing fail (I think it is silent right now, which makes tracking down bugs more difficult than it needs to be).
>  - Deprecate the old case class schema representation (display a warning when reading data that uses it) and eventually remove it.
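
The first and third points above (tolerating missing optional fields and warning instead of failing silently) can be illustrated with a minimal sketch. This is not Spark's parser: the function names are hypothetical, the JSON layout only loosely follows the struct/array schema format, and the defaults chosen for containsNull (True) and metadata (empty) are assumptions made for illustration.

```python
import json
import warnings


def parse_type(node):
    """Parse a type node, tolerating a missing optional 'containsNull'."""
    if isinstance(node, str):
        return node  # primitive type name such as "integer" or "string"
    kind = node.get("type")
    if kind == "array":
        if "containsNull" not in node:
            # Assumed default: treat elements as nullable, and say so loudly.
            warnings.warn("array type has no 'containsNull'; assuming True")
        return {
            "type": "array",
            "elementType": parse_type(node["elementType"]),
            "containsNull": node.get("containsNull", True),
        }
    if kind == "struct":
        return {"type": "struct",
                "fields": [parse_field(f) for f in node["fields"]]}
    raise ValueError(f"unsupported type node: {node!r}")


def parse_field(node):
    """Parse a struct field, tolerating a missing optional 'metadata'."""
    if "metadata" not in node:
        # Assumed default: an empty metadata map.
        warnings.warn(f"field {node['name']!r} has no 'metadata'; assuming empty")
    return {
        "name": node["name"],
        "type": parse_type(node["type"]),
        "nullable": node["nullable"],  # still required in this sketch
        "metadata": node.get("metadata", {}),
    }


def parse_schema(text):
    """Parse a JSON schema string into plain dictionaries."""
    return parse_type(json.loads(text))


# A schema string that lacks both optional fields.
old_schema = (
    '{"type": "struct", "fields": [{"name": "xs", "nullable": true, '
    '"type": {"type": "array", "elementType": "integer"}}]}'
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    schema = parse_schema(old_schema)

for w in caught:
    print("warning:", w.message)
```

A regression suite along the lines of the second point could feed schema strings saved from older versions through such a parser and assert that each one still loads.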


