Posted to issues@beam.apache.org by "Maximilian Michels (JIRA)" <ji...@apache.org> on 2018/11/20 14:00:00 UTC

[jira] [Commented] (BEAM-6055) CoderTypeSerializer#duplicate() should create a deep copy of the coder

    [ https://issues.apache.org/jira/browse/BEAM-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16693269#comment-16693269 ] 

Maximilian Michels commented on BEAM-6055:
------------------------------------------

Thanks for reporting this [~srichter]. The wrapped {{Coder}} element should be thread-safe, as this is a requirement for Coders. The Avro coder in use in FLINK-10860 also appears to be using thread locals for thread-safety.
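
As a rough illustration of the thread-local approach mentioned above, here is a minimal sketch of a coder-like class that keeps its mutable scratch state in a {{ThreadLocal}}, so a single instance can be shared between threads. The class {{ThreadLocalCoderSketch}} and its {{encode}} method are hypothetical names made up for this example; this is not the Avro coder's actual code.

```java
// Hypothetical sketch: NOT Beam's AvroCoder, only an illustration of the idea.
final class ThreadLocalCoderSketch {
    // Each thread lazily gets its own StringBuilder, so the mutable scratch
    // state is never shared even though the coder instance is.
    private final ThreadLocal<StringBuilder> scratch =
        ThreadLocal.withInitial(StringBuilder::new);

    String encode(int value) {
        StringBuilder buffer = scratch.get(); // this thread's private buffer
        buffer.setLength(0);                  // clear leftovers from the previous call
        buffer.append("value=").append(value);
        return buffer.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadLocalCoderSketch shared = new ThreadLocalCoderSketch();
        // Both threads use the same coder instance concurrently.
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                if (!shared.encode(i).equals("value=" + i)) {
                    throw new IllegalStateException("corrupted encoding for " + i);
                }
            }
        };
        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        first.join();
        second.join();
        System.out.println("finished");
    }
}
```

If the buffer were a plain instance field instead, the two threads could clear and append to it at the same time, which is the kind of corruption a thread-safety requirement on coders is meant to rule out.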

I think the error lies somewhere else, perhaps in specifying an incorrect coder. This needs further investigation. I think we can move FLINK-10860 to BEAM.

> CoderTypeSerializer#duplicate() should create a deep copy of the coder
> ----------------------------------------------------------------------
>
>                 Key: BEAM-6055
>                 URL: https://issues.apache.org/jira/browse/BEAM-6055
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>    Affects Versions: 2.8.0
>            Reporter: Stefan Richter
>            Priority: Critical
>
> I think that {{CoderTypeSerializer#duplicate()}} must make a deep copy of the {{coder}} field here https://github.com/apache/beam/blob/f2f0b02babf745d0d9645e0526637ef967dd2228/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/types/CoderTypeSerializer.java#L53. It seems to me that the {{coder}} objects can be stateful, and the current implementation shares them across multiple serializer instances. Flink can use different serializer instances from different threads, so sharing a stateful coder can lead to concurrency problems and corruption, as in this example: https://issues.apache.org/jira/browse/FLINK-10860
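
The difference between sharing the coder and deep-copying it in {{duplicate()}} can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions: {{SketchCoder}}, {{BufferingStringCoder}}, {{SketchSerializer}}, {{duplicateShallow()}} and {{duplicateDeep()}} are hypothetical stand-ins invented for this example, not the Beam {{Coder}} or Flink {{TypeSerializer}} APIs.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: none of these types are real Beam or Flink classes.
final class DuplicateSketch {

    // Stand-in for a coder that knows how to copy itself.
    interface SketchCoder<T> {
        byte[] encode(T value);
        SketchCoder<T> deepCopy();
    }

    // Deliberately stateful: one internal buffer is reused across calls, so two
    // threads using the same instance could interleave reset/write/read steps.
    static final class BufferingStringCoder implements SketchCoder<String> {
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

        @Override
        public byte[] encode(String value) {
            buffer.reset();
            byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
            buffer.write(bytes, 0, bytes.length);
            return buffer.toByteArray();
        }

        @Override
        public SketchCoder<String> deepCopy() {
            return new BufferingStringCoder(); // fresh buffer, no shared state
        }
    }

    // Stand-in for a serializer that wraps a coder.
    static final class SketchSerializer<T> {
        private final SketchCoder<T> coder;

        SketchSerializer(SketchCoder<T> coder) {
            this.coder = coder;
        }

        SketchCoder<T> coder() {
            return coder;
        }

        // Sharing variant: the duplicate reuses the very same coder instance.
        SketchSerializer<T> duplicateShallow() {
            return new SketchSerializer<>(coder);
        }

        // Deep-copy variant: the duplicate gets its own coder instance.
        SketchSerializer<T> duplicateDeep() {
            return new SketchSerializer<>(coder.deepCopy());
        }
    }

    public static void main(String[] args) {
        SketchSerializer<String> original =
            new SketchSerializer<>(new BufferingStringCoder());
        // Prints "true": both serializers point at one stateful coder.
        System.out.println(original.duplicateShallow().coder() == original.coder());
        // Prints "false": the duplicate owns an independent coder.
        System.out.println(original.duplicateDeep().coder() == original.coder());
    }
}
```

With the deep-copy variant, each serializer instance handed to a different thread works on its own buffer. Whether that is needed depends on whether the wrapped coder is in fact stateful and not already thread-safe.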



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)