Posted to dev@flink.apache.org by Piotr Nowojski <pi...@gmail.com> on 2022/10/19 13:13:38 UTC

[DISCUSS] Drop TypeSerializerConfigSnapshot and savepoint support from Flink versions < 1.8.0

Hi devs,

I would like to open a discussion about removing the long-deprecated
(@PublicEvolving) TypeSerializerConfigSnapshot class [1] and the related
code.

The motivation behind this move is twofold. One reason is that the class
complicates our code base unnecessarily and creates confusion about how to
actually implement custom serializers. The immediate reason is that I
wanted to clean up Flink's configuration stack a bit and refactor the
ExecutionConfig class [2]. This refactoring would keep the API compatibility
of ExecutionConfig, but it would break savepoint compatibility with
snapshots written by some of the old serializers, which had
ExecutionConfig as a field and were themselves serialized into the snapshot.
This issue was resolved by the introduction of TypeSerializerSnapshot in
Flink 1.7 [3], where serializers are no longer part of the snapshot.
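
To illustrate the idea of a snapshot that does not contain the serializer
itself, here is a minimal, self-contained sketch in plain Java. It does not
use Flink's API: the class ConfigOnlySnapshotSketch, the FlagSerializer type,
and its "compact" setting are all hypothetical names made up for this example.
The snapshot stores only a format version and the one setting needed to
rebuild the serializer, so it does not depend on the layout of any class the
serializer happens to hold as a field.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch, not Flink code: all names here are assumptions.
final class ConfigOnlySnapshotSketch {

    /** Hypothetical serializer whose behaviour depends on a single setting. */
    static final class FlagSerializer {
        final boolean compact;

        FlagSerializer(boolean compact) {
            this.compact = compact;
        }
    }

    /** Writes a format version and the setting, never the serializer object. */
    static byte[] writeSnapshot(FlagSerializer serializer) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(1); // snapshot format version
            out.writeBoolean(serializer.compact);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds a fresh serializer from the stored setting. */
    static FlagSerializer restoreSerializer(byte[] snapshot) {
        try (DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(snapshot))) {
            int version = in.readInt();
            if (version != 1) {
                throw new IllegalStateException(
                        "Unsupported snapshot version: " + version);
            }
            return new FlagSerializer(in.readBoolean());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

By contrast, Java-serializing the whole serializer object would tie the
snapshot bytes to every class reachable from its fields, which is the kind of
coupling to ExecutionConfig described above.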

TypeSerializerConfigSnapshot has been deprecated, and no longer used by
built-in serializers, since Flink 1.8 [4] and [5]. Since then, users have
been encouraged to migrate their own custom serializers to
TypeSerializerSnapshot. That has been plenty of time for the migration.

This proposal would have the following impact on users:
1. We would drop support for recovery from savepoints taken with Flink <
1.7.0 for all built-in type serializers.
2. We would drop support for recovery from savepoints taken with Flink <
1.8.0 for built-in Kryo serializers.
3. We would drop support for recovery from savepoints taken with Flink <
1.17 for custom serializers using the deprecated TypeSerializerConfigSnapshot.

1. and 2. would have a simple migration path. Users migrating from those
old savepoints would first have to start their job using a Flink version from
the [1.8, 1.16] range and take a new savepoint, which would be compatible
with Flink 1.17.
3. This is a bit more problematic, because users would first have to
migrate their own custom serializers to TypeSerializerSnapshot (using a
Flink version from the [1.8, 1.16] range), take a savepoint, and only then
migrate to Flink 1.17. However, users have already had 4 years to migrate,
which in my opinion has been plenty of time to do so.
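
The three cases and their migration paths can be sketched as a small decision
helper. This is only an illustration of the rules listed above, in plain Java
with no Flink dependency; the class SavepointMigrationPlanner, the
SerializerKind enum, and the wording of the steps are assumptions made up for
this example, and versions are reduced to the minor number of a 1.x release
(for example 6 for Flink 1.6).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: encodes the three cases above.
final class SavepointMigrationPlanner {

    /** Serializer categories distinguished in the proposal. */
    enum SerializerKind { BUILT_IN, BUILT_IN_KRYO, CUSTOM_LEGACY_CONFIG_SNAPSHOT }

    /**
     * Returns true if a savepoint taken with Flink 1.{savepointMinor} could
     * not be restored directly by Flink 1.17 under this proposal.
     */
    static boolean needsIntermediateHop(int savepointMinor, SerializerKind kind) {
        switch (kind) {
            case BUILT_IN:
                return savepointMinor < 7;   // case 1: Flink < 1.7.0
            case BUILT_IN_KRYO:
                return savepointMinor < 8;   // case 2: Flink < 1.8.0
            case CUSTOM_LEGACY_CONFIG_SNAPSHOT:
                return savepointMinor < 17;  // case 3: Flink < 1.17
            default:
                throw new IllegalArgumentException("Unknown kind: " + kind);
        }
    }

    /** Lists the steps needed to get the job's state onto Flink 1.17. */
    static List<String> plan(int savepointMinor, SerializerKind kind) {
        List<String> steps = new ArrayList<>();
        if (needsIntermediateHop(savepointMinor, kind)) {
            if (kind == SerializerKind.CUSTOM_LEGACY_CONFIG_SNAPSHOT) {
                steps.add("Migrate the custom serializer to TypeSerializerSnapshot");
            }
            steps.add("Start the job from the old savepoint on a Flink version in [1.8, 1.16]");
            steps.add("Take a new savepoint");
        }
        steps.add("Start the job from the savepoint on Flink 1.17");
        return steps;
    }

    public static void main(String[] args) {
        // Example: a Flink 1.6 savepoint that uses only built-in serializers.
        for (String step : plan(6, SerializerKind.BUILT_IN)) {
            System.out.println(step);
        }
    }
}
```

Only the custom-serializer case requires a code change before the
intermediate savepoint; the other two cases need just the extra restore and
savepoint on a version from the [1.8, 1.16] range.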

As a side effect, we could also drop support for some of the legacy
metadata serializers from LegacyStateMetaInfoReaders and potentially other
places that we are keeping for the sake of compatibility with old
savepoints.

What do you think?

Best,
Piotrek

[1]
https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/api/common/typeutils/TypeSerializerConfigSnapshot.html
[2] https://issues.apache.org/jira/browse/FLINK-29379
[3] https://issues.apache.org/jira/browse/FLINK-9377
[4] https://issues.apache.org/jira/browse/FLINK-9376
[5] https://issues.apache.org/jira/browse/FLINK-11323

Re: [DISCUSS] Drop TypeSerializerConfigSnapshot and savepoint support from Flink versions < 1.8.0

Posted by Dawid Wysakowicz <dw...@apache.org>.
+1, I think that is a sensible thing to do. I don't think this will
affect many users, as those versions are already quite old.

As a side note, there is a different effort around serializers that
might introduce another incompatibility (in the API). I wonder if we
could bundle the two together somehow so that only a single version
introduces friction:
https://lists.apache.org/thread/t0bdkx1161rlbnsf06x0kswb05mch164

Best,

Dawid


Re: [DISCUSS] Drop TypeSerializerConfigSnapshot and savepoint support from Flink versions < 1.8.0

Posted by Piotr Nowojski <pn...@apache.org>.
Hi,

Thanks for the support. It looks like nobody objects to this proposal, so I
will start a formal vote.

Best,
Piotrek

On Fri, 21 Oct 2022 at 08:24, Timo Walther <tw...@apache.org> wrote:

> Makes sense to me. The serializer stack is pretty complex right now, the
> more legacy we remove the better.
>
> Regards,
> Timo
>
>
> On 20.10.22 12:49, Chesnay Schepler wrote:
> > +1
> >
> > Sounds like a good reason to drop these long-deprecated APIs.

Re: [DISCUSS] Drop TypeSerializerConfigSnapshot and savepoint support from Flink versions < 1.8.0

Posted by Timo Walther <tw...@apache.org>.
Makes sense to me. The serializer stack is pretty complex right now, the 
more legacy we remove the better.

Regards,
Timo


On 20.10.22 12:49, Chesnay Schepler wrote:
> +1
> 
> Sounds like a good reason to drop these long-deprecated APIs.


Re: [DISCUSS] Drop TypeSerializerConfigSnapshot and savepoint support from Flink versions < 1.8.0

Posted by Chesnay Schepler <ch...@apache.org>.
+1

Sounds like a good reason to drop these long-deprecated APIs.
