Posted to commits@beam.apache.org by angoenka <gi...@git.apache.org> on 2017/11/15 02:07:50 UTC

[GitHub] beam pull request #4134: [BEAM-3189] Sdk worker multithreading

GitHub user angoenka opened a pull request:

    https://github.com/apache/beam/pull/4134

    [BEAM-3189] Sdk worker multithreading

    Follow this checklist to help us incorporate your contribution quickly and easily:
    
     - [ ] Make sure there is a [JIRA issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the change (usually before you start working on it).  Trivial changes like typos do not require a JIRA issue.  Your pull request should address just this issue, without pulling in other changes.
     - [ ] Each commit in the pull request should have a meaningful subject line and body.
     - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue.
     - [ ] Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
     - [ ] Run `mvn clean verify` to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
     - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
    
    ---
    The Beam Python SDK is a couple of orders of magnitude slower than the Java SDK when it comes to stream processing.
    This PR addresses the following issue:
    Given a single core, we currently do not fully utilize it because the single thread spends a lot of its time on IO. This is a limitation of our implementation rather than a limitation of Python.
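
To illustrate the reasoning above (this is not code from the PR), here is a minimal sketch of why extra threads help on a single core when the work is IO-bound. `process_bundle`, `run_sequential`, `run_threaded`, and the `worker_threads` parameter are hypothetical names chosen for this example, and `time.sleep` stands in for blocking IO; the actual SDK worker changes may look quite different.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def process_bundle(bundle_id, io_wait=0.05):
    """Hypothetical stand-in for handling one unit of work."""
    time.sleep(io_wait)   # simulated blocking IO; the GIL is released while waiting
    return bundle_id * 2  # trivial CPU step


def run_sequential(bundle_ids):
    # One thread: every IO wait is paid one after another.
    return [process_bundle(b) for b in bundle_ids]


def run_threaded(bundle_ids, worker_threads=8):
    # Several threads: the IO waits overlap, so the core stays busier.
    with ThreadPoolExecutor(max_workers=worker_threads) as pool:
        return list(pool.map(process_bundle, bundle_ids))


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    ids = list(range(8))
    _, seq_time = timed(run_sequential, ids)
    _, thr_time = timed(run_threaded, ids)
    print("sequential: %.2fs, threaded: %.2fs" % (seq_time, thr_time))
```

Both variants return the same results in the same order; only the wall-clock time differs, since the threaded version overlaps the waits instead of serializing them. The benefit applies to IO-bound work only: CPU-bound steps would still contend for the GIL.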


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/angoenka/beam sdk_worker_multithreading

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/4134.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4134
    
----
commit 4339a4a52eddccd567d449c3b390cfa0bcabccce
Author: Ankur Goenka <go...@goenka.svl.corp.google.com>
Date:   2017-11-07T00:16:44Z

    Adding multi threaded function registration test

commit 533b82b45f89edff9262216b1f6252bd3f75f35b
Author: Ankur Goenka <go...@goenka.svl.corp.google.com>
Date:   2017-11-08T00:03:45Z

    Wrapping SDKWoker to associate more state to it.

commit 7719145a4044b323e4b88b87171dbf88bd957c9a
Author: Ankur Goenka <go...@goenka.svl.corp.google.com>
Date:   2017-11-09T22:02:19Z

    Making multiple workers to work in parallel

commit c73fc59d9191f3c337e17710f764f23d9f5c7ba8
Author: Ankur Goenka <go...@goenka.svl.corp.google.com>
Date:   2017-11-15T01:59:11Z

    Adding experimental option for worker_threads

----

