Posted to issues@beam.apache.org by "Luke Cwik (Jira)" <ji...@apache.org> on 2020/04/21 20:13:00 UTC

[jira] [Commented] (BEAM-9014) Update CachingShuffleBatchReader to record weights by size in bytes

    [ https://issues.apache.org/jira/browse/BEAM-9014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089033#comment-17089033 ] 

Luke Cwik commented on BEAM-9014:
---------------------------------

This change needs to be rolled back because it negatively impacts large iterable shuffle performance in Dataflow. The rollback is in https://github.com/apache/beam/pull/11483

> Update CachingShuffleBatchReader to record weights by size in bytes
> -------------------------------------------------------------------
>
>                 Key: BEAM-9014
>                 URL: https://issues.apache.org/jira/browse/BEAM-9014
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-dataflow
>            Reporter: Luke Cwik
>            Assignee: Tyson Hamilton
>            Priority: Minor
>             Fix For: 2.21.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, CachingShuffleBatchReader bounds its cache by the number of batches, not by the size of those batches. This task updates CachingShuffleBatchReader to weigh cached batches by their size in bytes instead.
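
To illustrate the difference between bounding a cache by entry count and bounding it by bytes, here is a minimal sketch in plain Java. It is not the Beam implementation: the class name ByteWeightedLruCache, its methods, and the use of byte[] as the batch type are all hypothetical, chosen only to show the idea of charging each cached batch its size in bytes and evicting least-recently-used entries once a byte budget is exceeded.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: an LRU cache bounded by total bytes, not by entry count. */
class ByteWeightedLruCache<K> {
  private final long maxBytes;
  private long currentBytes = 0;

  // accessOrder=true keeps iteration order from least to most recently used.
  private final LinkedHashMap<K, byte[]> entries = new LinkedHashMap<>(16, 0.75f, true);

  ByteWeightedLruCache(long maxBytes) {
    this.maxBytes = maxBytes;
  }

  /** Returns the cached batch, or null if absent; a hit marks the entry as recently used. */
  synchronized byte[] get(K key) {
    return entries.get(key);
  }

  /** Caches a batch, charging its length in bytes against the budget. */
  synchronized void put(K key, byte[] batch) {
    byte[] previous = entries.put(key, batch);
    if (previous != null) {
      currentBytes -= previous.length;
    }
    currentBytes += batch.length;

    // Evict least-recently-used entries until the total weight fits the budget.
    // A batch larger than the whole budget ends up evicting itself as well.
    Iterator<Map.Entry<K, byte[]>> it = entries.entrySet().iterator();
    while (currentBytes > maxBytes && it.hasNext()) {
      Map.Entry<K, byte[]> eldest = it.next();
      currentBytes -= eldest.getValue().length;
      it.remove();
    }
  }

  synchronized long weightInBytes() {
    return currentBytes;
  }

  synchronized int size() {
    return entries.size();
  }
}
```

With a count-based limit, two 4-byte batches and two 4-megabyte batches cost the same; with the byte-based limit sketched here, the number of resident batches shrinks as batches grow. Whether that trade-off is related to the regression mentioned in the comment above is not stated in this message.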



--
This message was sent by Atlassian Jira
(v8.3.4#803005)