Posted to issues@flink.apache.org by "zhijiang (Jira)" <ji...@apache.org> on 2019/09/23 08:11:00 UTC

[jira] [Updated] (FLINK-10995) Copy intermediate serialization results only once for broadcast mode

     [ https://issues.apache.org/jira/browse/FLINK-10995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhijiang updated FLINK-10995:
-----------------------------
    Component/s:     (was: Runtime / Network)
                 Runtime / Task

> Copy intermediate serialization results only once for broadcast mode
> --------------------------------------------------------------------
>
>                 Key: FLINK-10995
>                 URL: https://issues.apache.org/jira/browse/FLINK-10995
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Task
>    Affects Versions: 1.8.0
>            Reporter: zhijiang
>            Assignee: zhijiang
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Records emitted by an operator are first serialized into an intermediate byte array in {{RecordSerializer}}, and the intermediate result is then copied into the target buffers of the individual subpartitions. In broadcast mode, the same intermediate result is copied once per subpartition, which can seriously hurt performance in large-scale jobs.
> Instead, we can copy the result into a single target buffer that is shared by all subpartitions, which reduces the overhead. To emit a latency marker in broadcast mode, we should first flush the previously shared target buffer and then request a new buffer for the one target subpartition that receives the latency marker.
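
The difference in copy counts described above can be illustrated with a minimal, self-contained sketch. It does not use Flink's classes: {{BroadcastCopySketch}}, {{copyToTargetBuffer}} and the other names are hypothetical, and plain {{java.nio.ByteBuffer}} instances stand in for network buffers. Latency markers are not modelled.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: none of these names are Flink classes. */
final class BroadcastCopySketch {

    /** Number of copies from the intermediate byte array into a target buffer. */
    static int copyCount = 0;

    /** Stand-in for serializing a record into an intermediate byte array. */
    static byte[] serialize(String record) {
        return record.getBytes(StandardCharsets.UTF_8);
    }

    /** Copies the intermediate bytes into a freshly allocated target buffer. */
    static ByteBuffer copyToTargetBuffer(byte[] intermediate) {
        copyCount++;
        ByteBuffer target = ByteBuffer.allocate(intermediate.length);
        target.put(intermediate);
        target.flip(); // switch from writing to reading
        return target;
    }

    /** Behaviour described in the issue: one copy per subpartition. */
    static List<ByteBuffer> broadcastCopyPerSubpartition(String record, int numSubpartitions) {
        byte[] intermediate = serialize(record);
        List<ByteBuffer> perSubpartition = new ArrayList<>();
        for (int i = 0; i < numSubpartitions; i++) {
            perSubpartition.add(copyToTargetBuffer(intermediate));
        }
        return perSubpartition;
    }

    /** Proposed behaviour: a single copy, shared by all subpartitions. */
    static List<ByteBuffer> broadcastSharedBuffer(String record, int numSubpartitions) {
        byte[] intermediate = serialize(record);
        ByteBuffer shared = copyToTargetBuffer(intermediate);
        List<ByteBuffer> perSubpartition = new ArrayList<>();
        for (int i = 0; i < numSubpartitions; i++) {
            // Each view shares the same content but keeps its own read position.
            perSubpartition.add(shared.asReadOnlyBuffer());
        }
        return perSubpartition;
    }

    public static void main(String[] args) {
        int numSubpartitions = 4;

        copyCount = 0;
        broadcastCopyPerSubpartition("record", numSubpartitions);
        System.out.println("copies, one buffer per subpartition: " + copyCount);

        copyCount = 0;
        broadcastSharedBuffer("record", numSubpartitions);
        System.out.println("copies, one shared buffer: " + copyCount);
    }
}
```

In this sketch the number of copies grows with the number of subpartitions in the first variant and stays at one in the second. Each subpartition reads through its own read-only view, so consumers can progress independently over the same bytes.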



--
This message was sent by Atlassian Jira
(v8.3.4#803005)