Posted to issues@flink.apache.org by "Tzu-Li (Gordon) Tai (JIRA)" <ji...@apache.org> on 2017/06/10 11:11:18 UTC

[jira] [Updated] (FLINK-6883) Serializer for collection of Scala case classes are generated with different anonymous class names in 1.3

     [ https://issues.apache.org/jira/browse/FLINK-6883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tzu-Li (Gordon) Tai updated FLINK-6883:
---------------------------------------
    Description: 
In the Scala API, serializers are generated using Scala macros (via the {{org.apache.flink.streaming.api.scala.createTypeInformation(..)}} util).
The generated serializers are anonymous inner classes, so their class names depend on when, and in what order, the serializers are generated.
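
To illustrate the order dependence, here is a minimal sketch in plain Java rather than the Scala macro output. The names {{AnonymousNames}} and {{Ser}} are hypothetical and not part of Flink. It only shows that the compiler numbers anonymous classes by their position in the enclosing class, which is the general JVM behavior the description relies on; scalac's exact naming scheme differs in detail.

```java
import java.io.Serializable;

class AnonymousNames {
    // Hypothetical stand-in for a serializer type; not a Flink class.
    interface Ser extends Serializable {}

    // Returns the binary names of two anonymous classes, in declaration order.
    static String[] names() {
        Ser first = new Ser() {};   // first anonymous class in this outer class
        Ser second = new Ser() {};  // second anonymous class in this outer class
        return new String[] {
            first.getClass().getName(),
            second.getClass().getName()
        };
    }

    public static void main(String[] args) {
        // With javac and the default package this prints
        // AnonymousNames$1 and then AnonymousNames$2.
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```

Adding another anonymous class ahead of {{first}} would shift both numbers by one, even though the two classes themselves are unchanged.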

Between Flink 1.1 / 1.2 and Flink 1.3, the generated class name of a serializer for a collection of case classes (e.g. {{List[SomeUserCaseClass]}}) differs. In other words, compiling the exact same Scala API user code against 1.1 / 1.2 and against 1.3 yields different class names.

This is problematic for restoring older savepoints that have Scala case class collections in their state, because the old serializer cannot be recovered (due to the generated classname change).
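
A minimal sketch of why a changed name breaks recovery, assuming the savepoint refers to the serializer by its class name. The class {{LookupByName}} and the stale name used here are hypothetical, and this is not Flink's actual restore code; it only shows that resolving a recorded anonymous class name fails once the compiler has assigned a different one.

```java
class LookupByName {
    // Name of an anonymous class as the current build generates it.
    static String currentName() {
        Runnable r = new Runnable() {
            @Override
            public void run() {}
        };
        return r.getClass().getName();
    }

    // Tries to resolve a recorded class name, as a restore by name would.
    static boolean canResolve(String recordedName) {
        try {
            Class.forName(recordedName);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String current = currentName();
        // The name generated by this build resolves.
        System.out.println(canResolve(current));
        // A hypothetical name recorded by a different build does not.
        System.out.println(canResolve(current + "00"));
    }
}
```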

So far, I've identified the root cause: in 1.3 the {{TypeSerializer}} base class additionally extends the {{TypeDeserializer}} interface. Removing this extension resolves the problem. Why this affects the generated class name is still being investigated.

  was:
In the Scala API, serializers are generated using Scala macros (via the {{org.apache.flink.streaming.api.scala.createTypeInformation(..)}} util).
The generated serializers are inner anonymous classes, therefore classnames will differ depending on when / order that the serializers are generated.

From 1.1 / 1.2 to Flink 1.3, the generated classnames for a serializer for a collections of case classes (e.g. {{List[SomeUserCaseClass]}}) will be different. In other words, the exact same user code written in the Scala API, compiling it with 1.1 / 1.2 and with 1.3 will result in different classnames.

This is problematic for restoring older savepoints that have Scala case class collections in their state, because the old serializer cannot be recovered (due to the generated classname change).

For now, I've managed to identify that the root cause for this is that in 1.3 the {{TypeSerializer}} base class additionally extends the {{TypeDeserializer}} interface. Removing this extending resolves the problem. The actual reason what this affects the generated classname is still being investigated.


> Serializer for collection of Scala case classes are generated with different anonymous class names in 1.3
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-6883
>                 URL: https://issues.apache.org/jira/browse/FLINK-6883
>             Project: Flink
>          Issue Type: Bug
>          Components: Scala API, Type Serialization System
>    Affects Versions: 1.3.0
>            Reporter: Tzu-Li (Gordon) Tai
>            Assignee: Tzu-Li (Gordon) Tai
>            Priority: Blocker
>              Labels: flink-rel-1.3.1-blockers
>             Fix For: 1.3.1
>
>
> In the Scala API, serializers are generated using Scala macros (via the {{org.apache.flink.streaming.api.scala.createTypeInformation(..)}} util).
> The generated serializers are anonymous inner classes, so their class names depend on when, and in what order, the serializers are generated.
> Between Flink 1.1 / 1.2 and Flink 1.3, the generated class name of a serializer for a collection of case classes (e.g. {{List[SomeUserCaseClass]}}) differs. In other words, compiling the exact same Scala API user code against 1.1 / 1.2 and against 1.3 yields different class names.
> This is problematic for restoring older savepoints that have Scala case class collections in their state, because the old serializer cannot be recovered (due to the generated classname change).
> So far, I've identified the root cause: in 1.3 the {{TypeSerializer}} base class additionally extends the {{TypeDeserializer}} interface. Removing this extension resolves the problem. Why this affects the generated class name is still being investigated.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)