Posted to commits@beam.apache.org by jkff <gi...@git.apache.org> on 2017/05/15 23:12:01 UTC

[GitHub] beam pull request #3156: [BEAM-2301] Splits SplittableParDo into a core-cons...

GitHub user jkff opened a pull request:

    https://github.com/apache/beam/pull/3156

    [BEAM-2301] Splits SplittableParDo into a core-construction part and a runners-core part

    SplittableParDo itself goes into core-construction, and expands into a slightly different transform.
    
    This change consists almost entirely of moving code around.
    
    Before:
    ```
    elements: InputT
    | pair with restriction -> ElementAndRestriction<InputT, RestrictionT>
    | split restriction -> same
    | explode windows -> same
    | assign unique key -> KV<String, ElementAndRestriction<InputT, RestrictionT>>
    | GBKIntoKeyedWorkItems -> KeyedWorkItem<String, ElementAndRestriction<InputT, RestrictionT>>
    | ProcessElements -> PCollection<OutputT>
    ```
    
    After:
    ```
    elements: InputT
    | ...
    | assign unique key -> KV<String, ElementAndRestriction<InputT, RestrictionT>>
    | SplittableProcessKeyed -> PCollection<OutputT>
    ```
    
    Most runners (except Dataflow) will still want to go through KeyedWorkItem. That part is encapsulated in `SplittableParDoViaKeyedWorkItems`, which has an `OverrideFactory` for `SplittableProcessKeyed` expanding it into the good old `GBKIntoKeyedWorkItems` and `ProcessElements`. So runner changes are very minor.
    
    Dataflow, however, cannot use runners-core during expansion, so it will translate `SplittableProcessKeyed` directly, perform its expansion service-side, and instantiate `ProcessFn` worker-side.
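    
    To make the two expansion paths concrete, here is a toy sketch of the idea. It is not Beam code: the `OverrideSketch` class, its `OverrideFactory` interface, and `applyOverrides` are hypothetical stand-ins that model a pipeline as a list of step names, and only the step names themselves come from the description above.
    
    ```java
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    /** Toy model of override-based expansion; none of these types are Beam APIs. */
    public class OverrideSketch {
        /** Hypothetical stand-in for a runner-side override factory. */
        public interface OverrideFactory {
            /** Returns the steps that replace the given step. */
            List<String> expand(String step);
        }

        /** Replaces SplittableProcessKeyed with the two KeyedWorkItem-based steps. */
        public static final OverrideFactory VIA_KEYED_WORK_ITEMS = step ->
            step.equals("SplittableProcessKeyed")
                ? Arrays.asList("GBKIntoKeyedWorkItems", "ProcessElements")
                : Arrays.asList(step);

        /** Leaves every step untouched, like a runner that translates the step itself. */
        public static final OverrideFactory DIRECT = step -> Arrays.asList(step);

        /** Applies the factory to each step in order and concatenates the results. */
        public static List<String> applyOverrides(List<String> pipeline, OverrideFactory factory) {
            List<String> result = new ArrayList<>();
            for (String step : pipeline) {
                result.addAll(factory.expand(step));
            }
            return result;
        }

        public static void main(String[] args) {
            List<String> pipeline = Arrays.asList("assign unique key", "SplittableProcessKeyed");
            // prints [assign unique key, GBKIntoKeyedWorkItems, ProcessElements]
            System.out.println(applyOverrides(pipeline, VIA_KEYED_WORK_ITEMS));
            // prints [assign unique key, SplittableProcessKeyed]
            System.out.println(applyOverrides(pipeline, DIRECT));
        }
    }
    ```
    
    The point of the split is visible here: the construction-time pipeline always ends in `SplittableProcessKeyed`, and each runner decides whether to replace that step or translate it as-is.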
    
    R: @tgroh 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jkff/incubator-beam sdf-expansion

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/3156.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3156
    
----
commit be95bdd679fba755785a8e35a87eb1ec6c440882
Author: Eugene Kirpichov <ki...@google.com>
Date:   2017-05-15T22:54:03Z

    Splits SplittableParDo into a core-construction part and a KWI-related part

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] beam pull request #3156: [BEAM-2301] Splits SplittableParDo into a core-cons...

Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:

    https://github.com/apache/beam/pull/3156

