Posted to issues@beam.apache.org by "Henning Rohde (JIRA)" <ji...@apache.org> on 2018/11/07 20:19:00 UTC

[jira] [Assigned] (BEAM-5791) Bound the amount of data on the data plane by time.

     [ https://issues.apache.org/jira/browse/BEAM-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henning Rohde reassigned BEAM-5791:
-----------------------------------

    Assignee:     (was: Henning Rohde)

> Bound the amount of data on the data plane by time.
> ---------------------------------------------------
>
>                 Key: BEAM-5791
>                 URL: https://issues.apache.org/jira/browse/BEAM-5791
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-dataflow, sdk-java-harness, sdk-py-harness
>            Reporter: Robert Bradshaw
>            Priority: Major
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> This is especially important for Fn API reads, where each element represents a shard to read and may be very expensive, but many elements may be waiting in the Fn API buffer.
> The need for this will be mitigated with full SDF support for liquid sharding over the Fn API, but not eliminated unless the runner can "unread" elements it has already sent. 
> This is especially important for Dataflow jobs that start out small but then detect that they need more workers (e.g. due to the initial inputs being an SDF).
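
One possible reading of the request above, as a minimal sketch and not the actual Beam or Dataflow implementation: the runner gates what it pushes into a worker's Fn API buffer by estimated processing time instead of by element count. The class name TimeBoundedSender, its parameters, and the running-mean estimate are all hypothetical and chosen only for illustration.

```python
from collections import deque


class TimeBoundedSender:
    """Hypothetical runner-side gate that caps buffered work by estimated time."""

    def __init__(self, budget_seconds, initial_estimate_seconds=1.0):
        # Assumption: the runner is given a time budget and starts from a guess
        # of how long one element takes to process.
        self.budget_seconds = budget_seconds
        self.estimate_seconds = initial_estimate_seconds
        self.in_flight = 0
        self._completed = 0
        self._total_seconds = 0.0

    def can_send(self):
        # Always allow one element so the worker is never starved.
        if self.in_flight == 0:
            return True
        # Otherwise, send only if the buffered work plus one more element
        # still fits inside the time budget.
        projected = (self.in_flight + 1) * self.estimate_seconds
        return projected <= self.budget_seconds

    def send(self, pending):
        """Pop elements from `pending` (a deque) while the budget allows."""
        sent = []
        while pending and self.can_send():
            sent.append(pending.popleft())
            self.in_flight += 1
        return sent

    def record_completion(self, elapsed_seconds):
        # Update the per-element estimate with a running mean of observed times.
        self.in_flight -= 1
        self._completed += 1
        self._total_seconds += elapsed_seconds
        self.estimate_seconds = self._total_seconds / self._completed
```

With a 10 second budget and a 1 second initial guess, the sender hands out ten elements up front. Once the first one reports taking 30 seconds, it stops sending, so the remaining elements stay with the runner and can still go to workers added later.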



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)