Posted to commits@beam.apache.org by kennknowles <gi...@git.apache.org> on 2016/08/05 05:11:37 UTC

[GitHub] incubator-beam pull request #793: [BEAM-25] WIP: Tweeze StateSpec out of Sta...

GitHub user kennknowles opened a pull request:

    https://github.com/apache/incubator-beam/pull/793

    [BEAM-25] WIP: Tweeze StateSpec out of StateTag

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:
    
     - [x] Make sure the PR title is formatted like:
       `[BEAM-<Jira issue #>] Description of pull request`
     - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
           Travis-CI on your fork and ensure the whole test matrix passes).
     - [x] Replace `<Jira issue #>` in the title with the actual Jira issue
           number, if there is one.
     - [ ] If this contribution is large, please file an Apache
           [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).
    
    ---
    
    R: @bjchambers @tgroh 
    
    This was a hack sprint. The tests pass for the SDK and the direct runner; I've introduced a minor issue in the Flink runner that I will fix tomorrow.
    
    My thoughts when doing this:
    
     - I can prep `StateSpec` now so it is ready for incorporation into `DoFn` whenever that work takes off.
     - Since the `StateSpec` carries the disjoint union now, the binder should visit that.
     - `StateTag` = `String id` + `StateSpec`. The first step would be to express it that way; the second step might be to delete it.
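
A minimal Java sketch of the shape described in the list above, under stated assumptions: every type here (`State`, `ValueState`, `BagState`, `StateBinder`, `StateSpec`, `StateSpecs`, `StateTag`, `RecordingBinder`) is a simplified, hypothetical stand-in written for illustration, not the actual code in this pull request. It shows the spec carrying the disjoint union of state kinds, the binder visiting the spec, and the tag reduced to an id plus a spec.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; not Beam's real interfaces.
interface State {}

final class ValueState<T> implements State {
  private T value;
  T read() { return value; }
  void write(T newValue) { value = newValue; }
}

final class BagState<T> implements State {
  private final List<T> items = new ArrayList<>();
  void add(T item) { items.add(item); }
  List<T> read() { return new ArrayList<>(items); }
}

/** Visitor with one method per kind of state. The id is a parameter because a spec has none. */
interface StateBinder {
  <T> ValueState<T> bindValue(String id, StateSpec<ValueState<T>> spec);
  <T> BagState<T> bindBag(String id, StateSpec<BagState<T>> spec);
}

/** The spec carries the disjoint union: each implementation dispatches to one binder method. */
interface StateSpec<S extends State> {
  S bind(String id, StateBinder binder);
}

final class StateSpecs {
  private StateSpecs() {}

  static <T> StateSpec<ValueState<T>> value() {
    return new StateSpec<ValueState<T>>() {
      @Override public ValueState<T> bind(String id, StateBinder binder) {
        return binder.bindValue(id, this);
      }
    };
  }

  static <T> StateSpec<BagState<T>> bag() {
    return new StateSpec<BagState<T>>() {
      @Override public BagState<T> bind(String id, StateBinder binder) {
        return binder.bindBag(id, this);
      }
    };
  }
}

/** A tag is just an id paired with a spec; binding supplies the id to the spec. */
final class StateTag<S extends State> {
  final String id;
  final StateSpec<S> spec;

  StateTag(String id, StateSpec<S> spec) {
    this.id = id;
    this.spec = spec;
  }

  S bind(StateBinder binder) { return spec.bind(id, binder); }
}

/** Toy binder that creates in-memory cells and records which ids it was asked to bind. */
final class RecordingBinder implements StateBinder {
  final List<String> boundIds = new ArrayList<>();

  @Override public <T> ValueState<T> bindValue(String id, StateSpec<ValueState<T>> spec) {
    boundIds.add(id);
    return new ValueState<>();
  }

  @Override public <T> BagState<T> bindBag(String id, StateSpec<BagState<T>> spec) {
    boundIds.add(id);
    return new BagState<>();
  }
}
```

In this sketch, `new StateTag<>("count", StateSpecs.<Integer>value()).bind(binder)` ends up in `bindValue` with the id `"count"`; the tag adds nothing beyond supplying the id, which is why deleting it looks plausible as a second step.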
    
    It actually turned out to be a nice cleanup, but there are wrinkles:
    
     - In the general case a `State` has to know where to write, so it still needs an `id`, and therefore so does the visitor. `StateSpec` doesn't have one, so the `id` is just passed along, and the visitor becomes a curried version of the prior one.
     - That's all fine, but then `StateTable#get` also needs the spec because it lazily initializes based on it. This is the only time the spec is used; after that it is contained in the state cell.
     - And then, to hack up `CopyOnAccessInMemoryStateInternals`, I even had to rebuild the tag `id`. So eliminating `StateTag` entirely would just mean more parameters in a bunch of places.
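
To make the lazy-initialization wrinkle concrete, here is a small Java sketch under stated assumptions: `CellSpec`, `Counter`, and `LazyStateTable` are hypothetical names invented for this illustration and do not mirror the real `StateTable`. It shows why a lookup needs both the id and the spec once the two are separated: the spec is consulted only on first access, and later lookups return the existing cell.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a spec: it can create a cell but has no id of its own. */
interface CellSpec<S> {
  S newCell(String id);
}

/** Hypothetical state cell that remembers the id it was created under. */
final class Counter {
  final String id;
  int value;

  Counter(String id) { this.id = id; }
}

/** Hypothetical table that creates cells lazily, so every lookup must pass the spec along. */
final class LazyStateTable {
  private final Map<String, Object> cells = new HashMap<>();

  @SuppressWarnings("unchecked")
  <S> S get(String id, CellSpec<S> spec) {
    // The spec is used only when no cell exists for this id yet;
    // on later calls the stored cell is returned and the spec is ignored.
    return (S) cells.computeIfAbsent(id, spec::newCell);
  }
}
```

Calling `get("count", spec)` twice on the same table returns the same `Counter` and invokes `spec.newCell` once. The extra `spec` argument on every call is the kind of additional parameter that removing the tag would spread to other call sites.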
    
    So I have a feeling that a simpler visitor pattern, or no visitor pattern at all, may turn out to be more natural.
    
    Anyhow, I'm not happy with the change, and certainly haven't polished `StateSpecs` to where it needs to be for user consumption, but I wanted to put this out for early feedback.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kennknowles/incubator-beam StateSpec

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-beam/pull/793.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #793
    
----
commit 488e8955000ee905ab26635e0efbd0834a3d0dcf
Author: Kenneth Knowles <kl...@google.com>
Date:   2016-08-05T03:50:28Z

    Create StateSpec parallel to StateTag

commit e6294682daba9835a030c146389bc633e8f280a5
Author: Kenneth Knowles <kl...@google.com>
Date:   2016-08-05T04:48:48Z

    Make StateTag carry a StateSpec separately from its id

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-beam pull request #793: [BEAM-25] WIP: Tweeze StateSpec out of Sta...

Posted by kennknowles <gi...@git.apache.org>.
GitHub user kennknowles closed the pull request at:

    https://github.com/apache/incubator-beam/pull/793

