Posted to issues@spark.apache.org by "Marcelo Vanzin (JIRA)" <ji...@apache.org> on 2018/01/22 20:58:00 UTC

[jira] [Created] (SPARK-23181) Add compatibility tests for SHS serialized data / disk format

Marcelo Vanzin created SPARK-23181:
--------------------------------------

             Summary: Add compatibility tests for SHS serialized data / disk format
                 Key: SPARK-23181
                 URL: https://issues.apache.org/jira/browse/SPARK-23181
             Project: Spark
          Issue Type: Task
          Components: Tests
    Affects Versions: 2.3.0
            Reporter: Marcelo Vanzin


The SHS in 2.3.0 has the ability to serialize history data to disk (see SPARK-18085 and its sub-tasks). This means that if either the serialized data or the disk format changes, the code needs to be modified to either support the old formats, or discard the old data (and re-create it from logs).
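
As a rough illustration of those two options (this is not the SHS code), the sketch below stamps a store directory with a format version. On open, it either keeps the existing data or discards it so that it can be rebuilt from the event logs. The names STORE_VERSION, open_store and the version.json file are hypothetical, chosen only for this example.

```python
import json
import shutil
from pathlib import Path

# Hypothetical format version written by the current code; not an SHS constant.
STORE_VERSION = 2


def open_store(path: Path, supported=(STORE_VERSION,)) -> bool:
    """Return True if existing data was kept, False if the store was (re)created."""
    meta = path / "version.json"
    if meta.exists():
        try:
            version = json.loads(meta.read_text()).get("version")
        except (ValueError, AttributeError):
            # Corrupt or unexpected metadata is treated as an unknown format.
            version = None
        if version in supported:
            return True
        # Unsupported format: discard the data so it can be rebuilt from logs.
        shutil.rmtree(path)
    path.mkdir(parents=True, exist_ok=True)
    meta.write_text(json.dumps({"version": STORE_VERSION}))
    return False
```

Supporting an old format would correspond to passing it in `supported` (and teaching the reader to handle it); anything else falls through to the discard path.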

We should add integration tests that help us detect whether one of these changes has occurred. They should check data generated by old versions of Spark and fail if that data cannot be read back.

The Hive suites recently added the ability to download old Spark versions and generate data from those old versions to test that new code can read it. We could use something similar to test this (starting once 2.3.0 is released).
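
A minimal sketch of the shape such a check could take, assuming a hypothetical layout with one directory of previously generated data per old release. Here read_store is only a stand-in for the real reader, and apps.json is an invented file name; neither reflects the actual SHS disk format.

```python
import json
from pathlib import Path


def read_store(store_dir: Path) -> dict:
    """Stand-in for the current reader; raises if the data cannot be parsed."""
    return json.loads((store_dir / "apps.json").read_text())


def check_compat(fixtures_root: Path) -> list:
    """Return the names of the versions whose data the current reader cannot load."""
    failures = []
    for version_dir in sorted(p for p in fixtures_root.iterdir() if p.is_dir()):
        try:
            read_store(version_dir)
        except (OSError, ValueError):
            # Missing file or unparseable content: this version's data is unreadable.
            failures.append(version_dir.name)
    return failures
```

A test suite would assert that `check_compat(...)` returns an empty list, so that any format change that breaks old data shows up as a named failing version.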



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
