Posted to issues@flink.apache.org by "Yun Tang (Jira)" <ji...@apache.org> on 2019/12/19 13:50:00 UTC

[jira] [Comment Edited] (FLINK-13910) Many serializable classes have no explicit 'serialVersionUID'

    [ https://issues.apache.org/jira/browse/FLINK-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000086#comment-17000086 ] 

Yun Tang edited comment on FLINK-13910 at 12/19/19 1:49 PM:
------------------------------------------------------------

[~twalthr] 
 # Without an explicit serialVersionUID, we already have an unintended side effect: a job graph serialized on the client side might not match the classes on the cluster side. By adding serialVersionUID, we can at least ensure that Flink-1.10.x releases have compatible serialized classes.
 # The Flink community suggests adding serialVersionUID = 1L to all serializable classes (ref [doc|https://flink.apache.org/contributing/code-style-and-quality-java.html#java-serialization]), and we should set serialVersionUID to 1L whenever we add a new class. However, the existing classes already have their own default serialVersionUID rather than 1L (please note that these are not random IDs; they are generated according to [a specific rule|https://docs.oracle.com/javase/7/docs/platform/serialization/spec/class.html#4100]).
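
To illustrate the point about default versus explicit IDs, here is a minimal sketch that prints the ID the JVM assigns to a class with and without an explicit field. The class names ({{WithoutExplicitUid}}, {{WithExplicitUid}}, {{SerialVersionUidDemo}}) are hypothetical and are not Flink classes; only {{java.io.ObjectStreamClass}} from the JDK is used.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical class for illustration: no explicit serialVersionUID,
// so the JVM derives one from the class structure.
class WithoutExplicitUid implements Serializable {
    int value;
}

// Hypothetical class for illustration: the ID is pinned and does not
// change when fields or methods are added later.
class WithExplicitUid implements Serializable {
    private static final long serialVersionUID = 1L;
    int value;
}

class SerialVersionUidDemo {
    // Returns the serialVersionUID that Java serialization uses for the class.
    static long uidOf(Class<?> clazz) {
        return ObjectStreamClass.lookup(clazz).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("default:  " + uidOf(WithoutExplicitUid.class));
        System.out.println("explicit: " + uidOf(WithExplicitUid.class));
    }
}
```

The same lookup could be run against an existing class on a released version to find the default value to pin, assuming the goal is to stay compatible with that release.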



> Many serializable classes have no explicit 'serialVersionUID'
> -------------------------------------------------------------
>
>                 Key: FLINK-13910
>                 URL: https://issues.apache.org/jira/browse/FLINK-13910
>             Project: Flink
>          Issue Type: Bug
>          Components: API / Type Serialization System
>            Reporter: Yun Tang
>            Assignee: Yun Tang
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.9.2, 1.10.0
>
>         Attachments: SerializableNoSerialVersionUIDField
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, many serializable classes in Flink have no explicit 'serialVersionUID'. As the [official doc|https://flink.apache.org/contributing/code-style-and-quality-java.html#java-serialization] says, {{Serializable classes must define a Serial Version UID}}. 
> Without an explicit 'serialVersionUID', the JVM computes a default value from the class structure, which causes compatibility problems whenever the class changes. Take {{TwoPhaseCommitSinkFunction}} for example: since it defines no explicit 'serialVersionUID', its default 'serialVersionUID' changed from "4584405056408828651" to "4064406918549730832" after [FLINK-10455|https://github.com/apache/flink/commit/489be82a6d93057ed4a3f9bf38ef50d01d11d96b] was introduced. In other words, if we submit a job that uses {{TwoPhaseCommitSinkFunction}} from a local Flink-1.6.3 home to a remote Flink-1.6.2 cluster, we get an exception like:
> {code:java}
> org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot instantiate user function.
>         at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:239)
>         at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:104)
>         at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:267)
>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.InvalidClassException: org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction; local class incompatible: stream classdesc serialVersionUID = 4584405056408828651, local class serialVersionUID = 4064406918549730832
>         at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
>         at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1885)
>         at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
>         at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1885)
>         at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2042)
>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
>         at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
>         at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
>         at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
>         at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
>         at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
>         at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:537)
>         at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:524)
>         at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:512)
>         at org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:473)
>         at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:224)
>         ... 4 more
> {code}
> A similar problem exists in {{org.apache.flink.streaming.api.operators.SimpleOperatorFactory}}, whose 'serialVersionUID' differs between release-1.9 and the current master branch.
> IMO, we have two options to fix this bug:
> # Add an explicit serialVersionUID to those classes, identical to the value in the latest Flink-1.9.0 release code.
> # Use a mechanism similar to {{FailureTolerantObjectInputStream}} in {{InstantiationUtil}} to ignore serialVersionUID mismatches.
> I have collected all production classes without serialVersionUID from the latest master branch in the attachment; there are 639 such classes.
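
The second option in the quoted description could be sketched roughly as follows. This is a hypothetical, simplified illustration and not the actual {{FailureTolerantObjectInputStream}} code: {{Payload}}, {{UidTolerantObjectInputStream}} and {{UidMismatchDemo}} are made-up names, and the demo simulates a "foreign" version by flipping one bit of the serialVersionUID in the serialized bytes. It assumes the field layout of the class is unchanged and only the ID differs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

// Hypothetical payload without an explicit serialVersionUID.
class Payload implements Serializable {
    String name = "flink";
}

// Hypothetical sketch: falls back to the local class descriptor when only
// the serialVersionUID differs from the one found in the stream.
class UidTolerantObjectInputStream extends ObjectInputStream {
    UidTolerantObjectInputStream(InputStream in) throws IOException {
        super(in);
    }

    @Override
    protected ObjectStreamClass readClassDescriptor() throws IOException, ClassNotFoundException {
        ObjectStreamClass fromStream = super.readClassDescriptor();
        Class<?> localClass;
        try {
            localClass = Class.forName(fromStream.getName(), false,
                    UidTolerantObjectInputStream.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            return fromStream; // leave error reporting to the normal path
        }
        ObjectStreamClass local = ObjectStreamClass.lookup(localClass);
        if (local != null && local.getSerialVersionUID() != fromStream.getSerialVersionUID()) {
            return local; // ignore the mismatch, trust the local layout
        }
        return fromStream;
    }
}

class UidMismatchDemo {
    // Serializes obj, then flips one bit of the serialVersionUID that
    // follows the class name in the stream to simulate another version.
    static byte[] serializeWithForeignUid(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        byte[] bytes = buffer.toByteArray();
        byte[] name = obj.getClass().getName().getBytes(StandardCharsets.UTF_8);
        int nameStart = indexOf(bytes, name);
        bytes[nameStart + name.length] ^= 0x01;
        return bytes;
    }

    // Returns the index of the first occurrence of needle in haystack.
    static int indexOf(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i <= haystack.length - needle.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        throw new IllegalStateException("class name not found in stream");
    }

    static Object readStrict(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    static Object readTolerant(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new UidTolerantObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = serializeWithForeignUid(new Payload());
        try {
            readStrict(bytes);
        } catch (InvalidClassException e) {
            System.out.println("strict read failed: " + e.getMessage());
        }
        Payload restored = (Payload) readTolerant(bytes);
        System.out.println("tolerant read restored: " + restored.name);
    }
}
```

The trade-off of this approach is that it hides genuine incompatibilities: if the field layout really changed between versions, falling back to the local descriptor may fail later or read wrong data, which is one argument for pinning explicit IDs instead.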


