Posted to commits@beam.apache.org by "Daniel Halperin (JIRA)" <ji...@apache.org> on 2016/03/31 07:09:25 UTC

[jira] [Comment Edited] (BEAM-68) Support for limiting parallelism of a step

    [ https://issues.apache.org/jira/browse/BEAM-68?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15219353#comment-15219353 ] 

Daniel Halperin edited comment on BEAM-68 at 3/31/16 5:08 AM:
--------------------------------------------------------------

Eugene: I disagree with the premise that even the GroupByKey trick works without runner support. Fusion breaks and dynamic work rebalancing violate all such assumptions. Only model changes will guarantee anything here without completely removing modularity from the system.


was (Author: dhalperi@google.com):
Eugene: I disagree with the premise that even the group by key trick works without runner support. Fusion breaks and dynamic work rebalancing violate all such assumptions. Model changes are all that will guarantee anything here.

> Support for limiting parallelism of a step
> ------------------------------------------
>
>                 Key: BEAM-68
>                 URL: https://issues.apache.org/jira/browse/BEAM-68
>             Project: Beam
>          Issue Type: New Feature
>          Components: beam-model
>            Reporter: Daniel Halperin
>
> Users may want to limit the parallelism of a step. Two classic use cases are:
> - User wants to produce at most k files, so sets TextIO.Write.withNumShards(k).
> - External API only supports k QPS, so user sets a limit of k/(expected QPS/step) on the ParDo that makes the API call.
> Unfortunately, there is no way to do this effectively within the Beam model. A GroupByKey with exactly k keys will guarantee that only k elements are produced, but runners are free to break fusion in such a way that each of those elements may later be processed in parallel.
> To implement this functionality, I believe we need to add this support to the Beam model.
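
The GroupByKey trick and the QPS arithmetic described above can be sketched in plain Python. This is only an illustration under stated assumptions: it does not use the Beam SDK, and the names assign_shard_keys, group_by_key, and max_parallel_workers are hypothetical helpers invented for this sketch. It shows that grouping on exactly k keys bounds the number of grouped outputs to k. It does not show any bound on how a runner schedules the steps that follow, which is the gap the issue describes.

```python
import math
from collections import defaultdict


def assign_shard_keys(elements, k):
    """Pair each element with one of k shard keys, round-robin (hypothetical helper)."""
    return [(index % k, element) for index, element in enumerate(elements)]


def group_by_key(keyed_elements):
    """Collect values per key, mimicking what a GroupByKey produces (hypothetical helper)."""
    groups = defaultdict(list)
    for key, value in keyed_elements:
        groups[key].append(value)
    return dict(groups)


def max_parallel_workers(k_qps, expected_qps_per_worker):
    """Compute k / (expected QPS per worker), rounded down, with a floor of 1.

    Rounding down and the floor of 1 are assumptions of this sketch.
    """
    return max(1, math.floor(k_qps / expected_qps_per_worker))


# Ten records grouped on k = 3 keys yield at most 3 grouped outputs.
records = [f"record-{n}" for n in range(10)]
shards = group_by_key(assign_shard_keys(records, k=3))

# An API that allows 100 QPS, called by workers that each issue about 25 QPS.
worker_limit = max_parallel_workers(k_qps=100, expected_qps_per_worker=25)
```

Nothing in this sketch constrains what happens after the grouping step: a runner that breaks fusion may still fan the k groups back out, so the bound holds only at the grouping boundary.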



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)