You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Yun Gao (Jira)" <ji...@apache.org> on 2020/08/07 15:25:00 UTC

[jira] [Comment Edited] (FLINK-18631) Serializer for scala sealed trait hierarchies

    [ https://issues.apache.org/jira/browse/FLINK-18631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173207#comment-17173207 ] 

Yun Gao edited comment on FLINK-18631 at 8/7/20, 3:24 PM:
----------------------------------------------------------

Hi [~rgrebennikov], very thanks for the PR and very sorry for the late response. I read the PR and as a whole I agree with the main thought of the PR. However, currently we are met with a general problem when upgrading the serializers, namely compatibility: users may already use the old generic serializer to store their records (like sealed trait types in this issue) in the state, if we directly change the serializer, users would not be able to recovery from their existing checkpoint after upgrading. In fact, we already have some similar issues blocking on the compatibility problem, like [FLINK-12412|https://issues.apache.org/jira/browse/FLINK-12412]. Currently we are seeking to refactor the framework of the serializers, during which we also want to add support for the compatibility of upgrading serializer, but it still require some time to finish.

One possible solution might be first only introducing the type info and serializers, and users could explicitly set the type info if they required. However, one special point for this issue is that users are not easy to create the SealedTraitTypeInfo manually. 


was (Author: gaoyunhaii):
Hi [~rgrebennikov], very thanks for the PR and very sorry for the late response. I read the PR and as a whole I agree with the main thought of the PR. However, currently we are met with a general problem when upgrading the serializers, namely compatibility: users may already use the old generic serializer to store their records (like sealed trait types in this issue) in the state, if we directly change the serializer, users would not be able to recovery from their existing checkpoint after upgrading. In fact, we already have some similar issues blocking on the compatibility problem, like [FLINK-12412|https://issues.apache.org/jira/browse/FLINK-12412]. Currently we are seeking to refactor the framework of the serializers, during which we also want to add support for the compatibility of upgrading serializer, but it seems to still require some time.

One possible solution might be first only introducing the type info and serializers, and users could explicitly set the type info if they required. However, one special point for this issue is that users are not easy to create the SealedTraitTypeInfo manually. 

> Serializer for scala sealed trait hierarchies
> ---------------------------------------------
>
>                 Key: FLINK-18631
>                 URL: https://issues.apache.org/jira/browse/FLINK-18631
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / Type Serialization System
>    Affects Versions: 1.11.0
>            Reporter: Roman Grebennikov
>            Priority: Major
>              Labels: performance, pull-request-available
>
> Currently, when flink serialization system spots an ADT-style class hierarchy in the Scala code, it falls back to GenericType and kryo serialization, which may introduce performance issues. For example, for code:
> {{sealed trait ADT}}
> {{case class Foo(a: String) extends ADT}}
> {{case class Bar(b: Int) extends ADT}}
> {{env.fromCollection(List[ADT](Foo("a"),Bar(1))).collect()}}
>  
> It will fall back to Kryo even if there is no problem with dealing with List[Foo] or List[Bar] separately. Using ADTs is a convenient way in Scala to model different types of messages, but Flink type system performance limits it to only a non performance-critical paths.
>  
> It would be nice to have a sealed trait hierarchies support out of the box without kryo fallback.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)