You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/07/13 08:10:00 UTC

[jira] [Updated] (FLINK-9376) Allow upgrading to incompatible state serializers (state schema evolution)

     [ https://issues.apache.org/jira/browse/FLINK-9376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-9376:
----------------------------------
    Labels: pull-request-available  (was: )

> Allow upgrading to incompatible state serializers (state schema evolution)
> --------------------------------------------------------------------------
>
>                 Key: FLINK-9376
>                 URL: https://issues.apache.org/jira/browse/FLINK-9376
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing, Type Serialization System
>            Reporter: Tzu-Li (Gordon) Tai
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.6.0
>
>
> Currently, users have access to upgrade state serializers on the restore run of a stateful job, as long as the upgraded new serializer remains backwards compatible with all previous written data in the savepoint (i.e. it can read all previous and current schema of serialized state objects).
> What is still lacking is the ability to upgrade to incompatible serializers. Upon being registered an incompatible serializer for existing restored state, that state needs to go through the process of -
>  1. read serialized state with the previous serializer
>  2. passing each deserialized state object through a “migration map function”, and
>  3. writing back the state with the new serializer
> The availability of this process should be strictly limited to state registrations that occur before the actual processing begins (e.g. in the {{open}} or {{initializeState}} methods), so that we avoid performing these operations during processing.
> How this procedure actually occurs, differs across different types of state backends.
> For example, for state backends that eagerly deserialize / lazily serialize state (e.g. {{HeapStateBackend}}), the job execution itself can be seen as a "migration"; everything is deserialized to state objects on restore, and is only serialized again, with the new serializer, on checkpoints.
> Therefore, for these state backends, the above process is irrelevant.
> On the other hand, for state backends that lazily deserialize / eagerly serialize state (e.g. {{RocksDBStateBackend}}), the state evolution process needs to happen for every state with a newly registered incompatible serializer.
> Procedure 2. will allow even state type migrations, but that is out-of-scope of this JIRA.
>  This ticket focuses only on procedures 1. and 3., where we try to enable schema evolution without state type changes.
> This is an umbrella JIRA ticket that overlooks this feature, including a few preliminary tasks that work towards enabling it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)