Posted to commits@beam.apache.org by "Eugene Kirpichov (JIRA)" <ji...@apache.org> on 2017/05/01 20:03:04 UTC

[jira] [Updated] (BEAM-1824) Adapter for running SDF on a statically known input as a Source

     [ https://issues.apache.org/jira/browse/BEAM-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Kirpichov updated BEAM-1824:
-----------------------------------
    Description: 
[~bchambers] suggested the following idea: while the runner implementation of SDF [BEAM-65] is not yet complete enough to support dynamic rebalancing (especially over the Fn API), we can special-case Create.of(single input) + ParDo(SDF) by running it via BoundedSource.

This will allow us to start transitioning bounded IO connectors to the SDF API while preserving dynamic rebalancing in the common case where the source is known at pipeline submission time.

Then, when SDF runner support catches up, we'll simply add APIs to the IO connectors for reading from a PCollection of inputs, and those will enjoy the same benefits. We can actually add such APIs earlier, with the caveat that they won't support dynamic rebalancing. That is acceptable: these APIs didn't exist before, so there is no performance regression.

Proposal document: http://s.apache.org/sdf-via-source
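
To make the idea concrete, here is a minimal, self-contained sketch in plain Python. It does not use the Beam SDK, and every name in it (OffsetRange, OffsetTracker, SdfAsSource, read_lines, split_off_remainder) is hypothetical and chosen only for illustration. It assumes an SDF whose restriction is an offset range, and shows how a single statically known element plus that SDF could be wrapped in a source-like object that supports both initial splitting and splitting off unclaimed work while a read is in progress.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass(frozen=True)
class OffsetRange:
    """Hypothetical restriction: the half-open range [start, stop) of work."""
    start: int
    stop: int


class OffsetTracker:
    """Hypothetical tracker: guards claims and can give away unclaimed work."""

    def __init__(self, restriction: OffsetRange):
        self.restriction = restriction
        self.last_claimed: Optional[int] = None

    def try_claim(self, position: int) -> bool:
        # A claim fails once the position falls outside the current restriction.
        if position >= self.restriction.stop:
            return False
        self.last_claimed = position
        return True

    def try_split(self) -> Optional[OffsetRange]:
        # Keep the first half of the unclaimed work, return the second half.
        start, stop = self.restriction.start, self.restriction.stop
        first_unclaimed = start if self.last_claimed is None else self.last_claimed + 1
        remaining = stop - first_unclaimed
        if remaining < 2:
            return None
        split_point = first_unclaimed + remaining // 2
        self.restriction = OffsetRange(start, split_point)
        return OffsetRange(split_point, stop)


def read_lines(element: List[str], tracker: OffsetTracker) -> Iterator[str]:
    """Stand-in for an SDF: emits element[i] for every offset i it can claim."""
    position = tracker.restriction.start
    while tracker.try_claim(position):
        yield element[position]
        position += 1


class SdfAsSource:
    """Hypothetical adapter: one known element + SDF + restriction, exposed as a source."""

    def __init__(self, element: List[str], sdf: Callable, restriction: OffsetRange):
        self.element = element
        self.sdf = sdf
        self.restriction = restriction
        self._tracker: Optional[OffsetTracker] = None

    def split(self, num_bundles: int) -> List["SdfAsSource"]:
        # Initial splitting: cut the restriction into roughly equal sub-ranges.
        start, stop = self.restriction.start, self.restriction.stop
        step = max(1, -(-(stop - start) // num_bundles))  # ceiling division
        return [
            SdfAsSource(self.element, self.sdf, OffsetRange(lo, min(lo + step, stop)))
            for lo in range(start, stop, step)
        ]

    def read(self) -> Iterator[str]:
        # The tracker is created when the first record is requested.
        self._tracker = OffsetTracker(self.restriction)
        yield from self.sdf(self.element, self._tracker)

    def split_off_remainder(self) -> Optional["SdfAsSource"]:
        # Dynamic rebalancing: hand unclaimed work to a new source mid-read.
        if self._tracker is None:
            return None
        residual = self._tracker.try_split()
        if residual is None:
            return None
        return SdfAsSource(self.element, self.sdf, residual)


lines = ["a", "b", "c", "d", "e", "f"]
source = SdfAsSource(lines, read_lines, OffsetRange(0, len(lines)))

# Initial splitting into three bundles, each read independently.
initial_output = [x for bundle in source.split(3) for x in bundle.read()]

# Dynamic rebalancing: read one record, split off the rest, finish both parts.
reader = source.read()
primary_output = [next(reader)]
residual = source.split_off_remainder()
primary_output += list(reader)
residual_output = list(residual.read())
```

In this sketch the primary reader stops at the split point because try_claim re-checks the shrunken restriction on every claim, so the primary and residual outputs together cover each offset exactly once. How the real adapter maps these steps onto BoundedSource is described in the proposal document, not here.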

  was:
[~bchambers] suggested the following idea: while the runner implementation of SDF [BEAM-65] is not yet complete enough to support dynamic rebalancing (especially over the Fn API), we can special-case Create.of(single input) + ParDo(SDF) by running it via BoundedSource.

This will allow us to start transitioning bounded IO connectors to the SDF API while preserving dynamic rebalancing in the common case where the source is known at pipeline submission time.

Then, when SDF runner support catches up, we'll simply add APIs to the IO connectors for reading from a PCollection of inputs, and those will enjoy the same benefits. We can actually add such APIs earlier, with the caveat that they won't support dynamic rebalancing. That is acceptable: these APIs didn't exist before, so there is no performance regression.

        Summary: Adapter for running SDF on a statically known input as a Source  (was: Adapter for running SDF on a statically known input as a BoundedSource)

> Adapter for running SDF on a statically known input as a Source
> ---------------------------------------------------------------
>
>                 Key: BEAM-1824
>                 URL: https://issues.apache.org/jira/browse/BEAM-1824
>             Project: Beam
>          Issue Type: New Feature
>          Components: runner-dataflow, sdk-java-core
>            Reporter: Eugene Kirpichov
>            Assignee: Eugene Kirpichov
>
> [~bchambers] suggested the following idea: while the runner implementation of SDF [BEAM-65] is not yet complete enough to support dynamic rebalancing (especially over the Fn API), we can special-case Create.of(single input) + ParDo(SDF) by running it via BoundedSource.
> This will allow us to start transitioning bounded IO connectors to the SDF API while preserving dynamic rebalancing in the common case where the source is known at pipeline submission time.
> Then, when SDF runner support catches up, we'll simply add APIs to the IO connectors for reading from a PCollection of inputs, and those will enjoy the same benefits. We can actually add such APIs earlier, with the caveat that they won't support dynamic rebalancing. That is acceptable: these APIs didn't exist before, so there is no performance regression.
> Proposal document: http://s.apache.org/sdf-via-source



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)