Posted to commits@beam.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/01/19 03:24:26 UTC

[jira] [Commented] (BEAM-1281) GlobalWindow needs non-empty encoding in StateNamespaces

    [ https://issues.apache.org/jira/browse/BEAM-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829235#comment-15829235 ] 

ASF GitHub Bot commented on BEAM-1281:
--------------------------------------

GitHub user kennknowles opened a pull request:

    https://github.com/apache/beam/pull/1793

    [BEAM-1281] Encode GlobalWindow in one byte when it is the whole stream

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:
    
     - [x] Make sure the PR title is formatted like:
       `[BEAM-<Jira issue #>] Description of pull request`
     - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
           Travis-CI on your fork and ensure the whole test matrix passes).
     - [x] Replace `<Jira issue #>` in the title with the actual Jira issue
           number, if there is one.
     - [x] If this contribution is large, please file an Apache
           [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).
    
    ---
    
    See JIRA and inline comment for explanation. TL;DR: empty encodings are pretty fragile. Many coders are sensitive to a one-byte addition, but this change will not affect anything in the main element collections, since those are encoded within a `WindowedValueCoder`. The motivating case is actually quite mundane.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kennknowles/beam GlobalWindow-encoding

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/1793.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1793
    
----
commit 219d0cc8f5094240d2eafab6d2c1579aa2c70219
Author: Kenneth Knowles <kl...@google.com>
Date:   2017-01-19T03:20:36Z

    Encoding GlobalWindow in one byte when it is the whole stream

----


> GlobalWindow needs non-empty encoding in StateNamespaces
> --------------------------------------------------------
>
>                 Key: BEAM-1281
>                 URL: https://issues.apache.org/jira/browse/BEAM-1281
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>            Reporter: Kenneth Knowles
>            Assignee: Kenneth Knowles
>            Priority: Minor
>
> Because the GlobalWindow is encoded to zero bytes, a StateNamespace built from that window has the stringKey "//", while the global namespace's stringKey is "/". As paths, these are identical, although we do not currently treat them quite as paths. It isn't clear whether this is desirable. It may be harmless, but it complicates parsing and interpretation.
> For a system that actually builds hierarchical paths out of, say, some prefix, the StateNamespace, and a subsequent ID, the two canonicalized paths are the same, so it is not possible to deserialize back to the original namespace.
> Zero-length encodings have other gotchas as well. For example, some APIs return zero bytes when no more data is ready, which is indistinguishable from a value that is legitimately encoded in zero bytes because the expected type is known from context.
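
The path collision described in the issue can be illustrated with a small sketch. It is not Beam's StateNamespaces code: the class StateKeySketch, its methods, the "state" prefix, the "tag" ID, and the "AA" stand-in for a non-empty encoded window are all hypothetical names chosen for illustration. It assumes a window namespace key of the form "/" + encodedWindow + "/" and a store that collapses repeated slashes when canonicalizing paths.

```java
// Hypothetical sketch; names and key layout are assumptions, not Beam's API.
final class StateKeySketch {

    // Assumed key of the global namespace, as quoted in the issue.
    static String globalKey() {
        return "/";
    }

    // Assumed key of a window namespace: the encoded window wrapped in slashes.
    static String windowKey(String encodedWindow) {
        return "/" + encodedWindow + "/";
    }

    // Joins prefix, namespace key, and ID, then collapses runs of slashes,
    // the way a path-based store might canonicalize a hierarchical path.
    static String canonicalPath(String prefix, String namespaceKey, String id) {
        return (prefix + namespaceKey + id).replaceAll("/+", "/");
    }

    public static void main(String[] args) {
        String emptyEncoding = "";      // window encoded to zero bytes
        String nonEmptyEncoding = "AA"; // stand-in for any non-empty encoding

        String global = canonicalPath("state", globalKey(), "tag");
        String emptyWindow = canonicalPath("state", windowKey(emptyEncoding), "tag");
        String nonEmptyWindow = canonicalPath("state", windowKey(nonEmptyEncoding), "tag");

        // With an empty encoding, the window namespace collides with the global one.
        System.out.println(global.equals(emptyWindow));
        // With a non-empty encoding, the two paths stay distinct.
        System.out.println(global.equals(nonEmptyWindow));
    }
}
```

Under these assumptions, any non-empty window encoding keeps the window namespace distinguishable from the global namespace after canonicalization, which is the property the issue asks for.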



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)