Posted to commits@beam.apache.org by chamikaramj <gi...@git.apache.org> on 2017/01/05 09:57:54 UTC

[GitHub] beam pull request #1738: [Beam-564] Updates Python source API to allow repor...

GitHub user chamikaramj opened a pull request:

    https://github.com/apache/beam/pull/1738

    [BEAM-564] Updates Python source API to allow reporting consumed and remaining number of split points

    With this update, the Python BoundedSource/RangeTracker API can report the consumed and remaining number of split points while performing a source read operation. The Java SDK source API already supports reporting these signals.
    
    Runner implementations can use these signals, for example, to make scaling decisions.
    
    This provides a slightly simplified API compared to the previous PR, https://github.com/apache/incubator-beam/pull/881.
    
    The main differences compared to https://github.com/apache/incubator-beam/pull/881 are as follows.
    
    (1) The set_done()/done() methods were removed from the RangeTracker interface.
    
    The downside is that RangeTracker can no longer signal that all records have been consumed. I think this signal is unnecessary, since a runner can detect that anyway: the reader loop of the source ends at that point.
    
    (2) The callback between BoundedSource and RangeTracker was changed from reporting the remaining number of split points to reporting the unclaimed number of split points. This makes the implementation of the callback simpler for source authors.
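
To illustrate how an "unclaimed" callback can be turned into consumed/remaining counts, here is a minimal, self-contained sketch. It does not import Beam and is not the code from this pull request: the class SketchOffsetRangeTracker, the sentinel SPLIT_POINTS_UNKNOWN, the method names and the read_records helper are all hypothetical, chosen only to mirror the description above. It assumes every record is a split point, and that the split point currently being read counts as remaining rather than consumed.

```python
import threading

# Hypothetical sentinel meaning "the source cannot tell how many are left".
SPLIT_POINTS_UNKNOWN = object()


class SketchOffsetRangeTracker(object):
    """Illustrative tracker over the offset range [start, stop)."""

    def __init__(self, start, stop):
        self._start = start
        self._stop = stop
        self._lock = threading.Lock()
        self._split_points_seen = 0
        self._unclaimed_callback = None

    def set_split_points_unclaimed_callback(self, callback):
        # The source supplies callback(stop_position) -> number of split
        # points it has not yet claimed before stop_position.
        self._unclaimed_callback = callback

    def try_claim(self, position):
        with self._lock:
            if position < self._start or position >= self._stop:
                return False
            self._split_points_seen += 1
            return True

    def split_points(self):
        """Returns (consumed, remaining) for the work done so far."""
        with self._lock:
            # The most recently claimed split point is still being read,
            # so it is not counted as consumed yet.
            consumed = max(0, self._split_points_seen - 1)
            if self._unclaimed_callback is None:
                return consumed, SPLIT_POINTS_UNKNOWN
            unclaimed = self._unclaimed_callback(self._stop)
            if unclaimed is SPLIT_POINTS_UNKNOWN:
                return consumed, SPLIT_POINTS_UNKNOWN
            # Remaining = unclaimed + the one currently being read (if any).
            current = 1 if self._split_points_seen > 0 else 0
            return consumed, unclaimed + current


def read_records(tracker, records, log):
    """Reader loop of a toy source; appends split_points() to log per record."""
    next_index = [0]  # index of the first record not yet claimed

    def unclaimed(stop_position):
        return max(0, min(stop_position, len(records)) - next_index[0])

    tracker.set_split_points_unclaimed_callback(unclaimed)
    for i, record in enumerate(records):
        if not tracker.try_claim(i):
            return  # the end of the loop tells the caller the read is done
        next_index[0] = i + 1
        log.append(tracker.split_points())
        yield record


signals = []
output = list(read_records(SketchOffsetRangeTracker(0, 5), list("abcde"), signals))
```

In this sketch the source only answers "how many split points have I not claimed yet?", and the tracker adds the split point in progress to get the remaining count. There is no done() signal; the caller learns that the read has finished when the generator is exhausted.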


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chamikaramj/incubator-beam limited_parallelism_updated

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/1738.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1738
    
----
commit d3fb7d86fcf8c42b8da27943318d08b5e97c41c6
Author: Chamikara Jayalath <ch...@google.com>
Date:   2017-01-05T03:10:09Z

    Updates Python SDK source API so that sources can report limited parallelism signals.
    
    With this update, the Python BoundedSource/RangeTracker API can report the consumed and remaining number of split points while performing a source read operation, similar to Java SDK sources.
    
    Runner implementations can use these signals, for example, to make scaling decisions.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] beam pull request #1738: [BEAM-564] Updates Python source API to allow repor...

Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:

    https://github.com/apache/beam/pull/1738

