Posted to issues@beam.apache.org by "Kenneth Knowles (Jira)" <ji...@apache.org> on 2021/05/15 17:57:02 UTC

[jira] [Updated] (BEAM-11146) Add option to disable copying between Flink runner

     [ https://issues.apache.org/jira/browse/BEAM-11146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kenneth Knowles updated BEAM-11146:
-----------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Resolved)

Hello! Due to a bug in our Jira configuration, this issue had status:Resolved but resolution:Unresolved.

I am bulk editing these issues to have resolution:Fixed

If a different resolution is appropriate, please change it. To do this, click the "Resolve" button (you can do this even for closed issues) and set the Resolution field to the right value.

> Add option to disable copying between Flink runner 
> ---------------------------------------------------
>
>                 Key: BEAM-11146
>                 URL: https://issues.apache.org/jira/browse/BEAM-11146
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-flink
>            Reporter: Teodor Spæren
>            Assignee: Teodor Spæren
>            Priority: P2
>              Labels: performance
>             Fix For: 2.26.0
>
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> To implement Flink's [TypeSerializer|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/typeutils/TypeSerializer.java], the runner provides [CoderTypeSerializer|https://github.com/apache/beam/blob/master/runners/flink/1.8/src/main/java/org/apache/beam/runners/flink/translation/types/CoderTypeSerializer.java#L84]. Its {{copy}} function is implemented by first serializing and then deserializing each element. This means that a deep copy is made between each operator, which can become a bottleneck.
> The reason the {{copy}} functions need to be implemented is that Flink guarantees that elements will be deep copied between each operator. In Beam this is the user's responsibility, so the copy is not strictly necessary.
> The aim of this improvement is to introduce an option on the Flink runner that eliminates this overhead by simply returning the value.
> [Here is the mailing list discussion|https://lists.apache.org/thread.html/r24129dba98782e1cf4d18ec738ab9714dceb05ac23f13adfac5baad1%40%3Cdev.beam.apache.org%3E]
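
To illustrate the trade-off described in the issue, here is a minimal, self-contained sketch. It is not the actual CoderTypeSerializer code: the class CopyingSerializer and the flag skipCopy are hypothetical names, and plain Java serialization stands in for a Beam Coder. It only shows the difference between a deep copy made by serializing and deserializing, and a "copy" that returns the same reference.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class CopySketch {

  /** Hypothetical stand-in for a serializer whose copy() can be short-circuited. */
  public static final class CopyingSerializer<T extends Serializable> {
    // Hypothetical flag modelling the proposed runner option.
    private final boolean skipCopy;

    public CopyingSerializer(boolean skipCopy) {
      this.skipCopy = skipCopy;
    }

    @SuppressWarnings("unchecked")
    public T copy(T value) {
      if (skipCopy) {
        // Proposed behaviour: no copy at all, the caller gets the same reference.
        return value;
      }
      // Default behaviour: deep copy via a serialize/deserialize round trip.
      try {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
          out.writeObject(value);
        }
        try (ObjectInputStream in =
            new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
          return (T) in.readObject();
        }
      } catch (IOException | ClassNotFoundException e) {
        throw new IllegalStateException("Round-trip copy failed", e);
      }
    }
  }

  public static void main(String[] args) {
    ArrayList<String> element = new ArrayList<>(List.of("a", "b"));

    CopyingSerializer<ArrayList<String>> deep = new CopyingSerializer<>(false);
    CopyingSerializer<ArrayList<String>> skip = new CopyingSerializer<>(true);

    // The deep copy is an equal but distinct object; the skipped copy is the same object.
    System.out.println("deep copy is same reference: " + (deep.copy(element) == element));
    System.out.println("deep copy is equal: " + deep.copy(element).equals(element));
    System.out.println("skipped copy is same reference: " + (skip.copy(element) == element));
  }
}
```

Returning the same reference is only safe if downstream operators do not mutate their inputs, which is the responsibility the issue description assigns to the Beam user.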



--
This message was sent by Atlassian Jira
(v8.3.4#803005)