Posted to issues@spark.apache.org by "Shixiong Zhu (Jira)" <ji...@apache.org> on 2021/08/16 07:29:00 UTC

[jira] [Created] (SPARK-36519) Store the RocksDB format in the checkpoint for a streaming query

Shixiong Zhu created SPARK-36519:
------------------------------------

             Summary: Store the RocksDB format in the checkpoint for a streaming query
                 Key: SPARK-36519
                 URL: https://issues.apache.org/jira/browse/SPARK-36519
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 3.2.0
            Reporter: Shixiong Zhu
            Assignee: Shixiong Zhu


RocksDB provides backward compatibility, but it doesn't always provide forward compatibility: a newer RocksDB version can read data written by an older one, but not necessarily the reverse. It's better to store the RocksDB format version in the checkpoint, so that we have the information needed to provide a rollback guarantee when a new Spark version upgrades RocksDB to a version that may introduce an incompatible format change.

A typical case: a user upgrades their query to a new Spark version, and this new Spark version ships a new RocksDB version which may use a new format. The user then hits some bug and decides to roll back. But in the old Spark version, the old RocksDB version cannot read the new format.

In order to handle this case, we will write the RocksDB format version to the checkpoint. When restarting from a checkpoint, we will force RocksDB to use the format version stored in the checkpoint. This will ensure the user can roll back their Spark version if needed.
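
The write-once, read-on-restart behaviour described above could look roughly like the following. This is only an illustrative sketch, not the actual Spark change: the metadata file name, the JSON key, the function name, and the version numbers are all hypothetical, and the real checkpoint layout may differ.

```python
import json
import os
import tempfile

# Hypothetical file and key names, chosen for illustration only.
METADATA_FILE = "rocksdb_format.json"
FORMAT_KEY = "format_version"


def resolve_format_version(checkpoint_dir: str, current_default: int) -> int:
    """Return the format version this run should force RocksDB to use.

    On the first run the current default is recorded in the checkpoint.
    On every restart the recorded value wins, even if the running Spark
    version ships a RocksDB with a newer default format.
    """
    path = os.path.join(checkpoint_dir, METADATA_FILE)
    if os.path.exists(path):
        # Restart: reuse the version recorded when the checkpoint was created.
        with open(path) as f:
            return int(json.load(f)[FORMAT_KEY])
    # First run: record the default of the version creating the checkpoint.
    os.makedirs(checkpoint_dir, exist_ok=True)
    with open(path, "w") as f:
        json.dump({FORMAT_KEY: current_default}, f)
    return current_default


if __name__ == "__main__":
    checkpoint = tempfile.mkdtemp()
    # Query first started on the "old" version (default format 5, made up).
    print(resolve_format_version(checkpoint, current_default=5))
    # Restarted after an upgrade whose default is 6: the stored 5 is kept,
    # so the old version can still read the state after a rollback.
    print(resolve_format_version(checkpoint, current_default=6))
```

Pinning the format to the value in the checkpoint means an upgraded query keeps writing the old format, which is what makes the rollback safe. A query only picks up a newer format when it starts from a fresh checkpoint.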



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
