Posted to issues@flink.apache.org by "Tzu-Li (Gordon) Tai (JIRA)" <ji...@apache.org> on 2018/05/16 09:48:00 UTC

[jira] [Updated] (FLINK-9376) Allow upgrading to incompatible state serializers (state schema evolution)

     [ https://issues.apache.org/jira/browse/FLINK-9376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tzu-Li (Gordon) Tai updated FLINK-9376:
---------------------------------------
    Description: 
Currently, users can upgrade state serializers on the restore run of a stateful job, as long as the upgraded serializer remains backwards compatible with all previously written data in the savepoint (i.e. it can read all previous schemas of the serialized state objects as well as the current one).

What is still lacking is the ability to upgrade to incompatible serializers. When an incompatible serializer is registered for existing restored state, that state needs to go through the following process:
 1. reading the serialized state with the previous serializer,
 2. passing each deserialized state object through a “migration map function”, and
 3. writing the state back with the new serializer.
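
The three steps above can be sketched in plain Java. This is only an illustration under stated assumptions: the {{Serializer}} interface, the {{User}} state type, the {{V1}}/{{V2}} schemas and the {{migrate}} helper are all hypothetical names invented for this example, not Flink classes or the design this ticket will end up with.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Illustrative only: none of these types are Flink classes.
class StateMigrationSketch {

    /** Hypothetical stand-in for a state serializer. */
    interface Serializer<T> {
        void write(T value, DataOutputStream out) throws IOException;
        T read(DataInputStream in) throws IOException;
    }

    /** State object whose type stays the same while its schema gains a field. */
    static final class User {
        final String name;
        final int age; // field added in schema v2
        User(String name, int age) { this.name = name; this.age = age; }
    }

    /** Previous schema: name only; age defaults to 0 on read. */
    static final Serializer<User> V1 = new Serializer<User>() {
        public void write(User u, DataOutputStream out) throws IOException { out.writeUTF(u.name); }
        public User read(DataInputStream in) throws IOException { return new User(in.readUTF(), 0); }
    };

    /** New schema: name followed by age; it cannot read bytes written by V1. */
    static final Serializer<User> V2 = new Serializer<User>() {
        public void write(User u, DataOutputStream out) throws IOException {
            out.writeUTF(u.name);
            out.writeInt(u.age);
        }
        public User read(DataInputStream in) throws IOException {
            return new User(in.readUTF(), in.readInt());
        }
    };

    /** Rewrites restored bytes entry by entry: read (1), map (2), write (3). */
    static <A, B> byte[] migrate(byte[] restored, Serializer<A> previous,
                                 Function<A, B> migrationFn, Serializer<B> next) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(restored));
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            while (in.available() > 0) {
                A oldValue = previous.read(in);           // step 1
                B newValue = migrationFn.apply(oldValue); // step 2
                next.write(newValue, out);                // step 3
            }
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Helper: serializes all values back to back with the given serializer. */
    static <T> byte[] writeAll(List<T> values, Serializer<T> serializer) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            for (T value : values) {
                serializer.write(value, out);
            }
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Helper: reads values until the byte array is exhausted. */
    static <T> List<T> readAll(byte[] bytes, Serializer<T> serializer) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            List<T> values = new ArrayList<>();
            while (in.available() > 0) {
                values.add(serializer.read(in));
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<User> users = new ArrayList<>();
        users.add(new User("alice", 0));
        users.add(new User("bob", 0));
        byte[] oldBytes = writeAll(users, V1);
        // Identity migration function: schema evolution without a state type change.
        byte[] newBytes = migrate(oldBytes, V1, u -> u, V2);
        for (User u : readAll(newBytes, V2)) {
            System.out.println(u.name + " " + u.age);
        }
    }
}
```

With an identity migration function, as in {{main}}, only steps 1 and 3 do real work, which matches the case of schema evolution without state type changes.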

This migration should be strictly limited to state registrations that occur before the actual processing begins (e.g. in the {{open}} or {{initializeState}} methods), so that we avoid performing these operations during processing.
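
One way to picture this restriction is a guard that rejects migration-triggering registrations once processing has started. The sketch below is a hypothetical illustration, not Flink code: {{RegistrationGuardSketch}}, its methods, and the use of an integer schema version (where any version difference is treated as "incompatible") are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a hypothetical guard, not part of Flink.
class RegistrationGuardSketch {

    // State name -> schema version of the serializer found in the restored savepoint.
    private final Map<String, Integer> restoredVersions = new HashMap<>();
    private boolean processingStarted = false;

    RegistrationGuardSketch(Map<String, Integer> restored) {
        restoredVersions.putAll(restored);
    }

    /** Called once setup (e.g. open / initializeState) has finished. */
    void startProcessing() {
        processingStarted = true;
    }

    /**
     * Registers a serializer version for a state.
     * Returns true if the restored state had to be migrated.
     */
    boolean register(String stateName, int newVersion) {
        Integer restored = restoredVersions.get(stateName);
        // Simplification: any version difference counts as incompatible.
        boolean needsMigration = restored != null && restored != newVersion;
        if (!needsMigration) {
            return false;
        }
        if (processingStarted) {
            throw new IllegalStateException(
                "State '" + stateName + "' requires migration, but processing has already begun");
        }
        // Placeholder for the read / map / write migration of the state's bytes.
        restoredVersions.put(stateName, newVersion);
        return true;
    }
}
```

Registering version 2 for a state restored at version 1 before {{startProcessing()}} returns true; the same kind of registration afterwards throws instead of migrating in the middle of processing.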

Step 2 would even allow state type migrations, but that is out of scope for this JIRA.
This ticket focuses only on steps 1 and 3, where we try to enable schema evolution without state type changes.

This is an umbrella JIRA ticket that oversees this feature, including a few preliminary tasks that work towards enabling it.


> Allow upgrading to incompatible state serializers (state schema evolution)
> --------------------------------------------------------------------------
>
>                 Key: FLINK-9376
>                 URL: https://issues.apache.org/jira/browse/FLINK-9376
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing, Type Serialization System
>            Reporter: Tzu-Li (Gordon) Tai
>            Priority: Critical
>             Fix For: 1.6.0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)