Posted to commits@beam.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/04/22 02:24:04 UTC

[jira] [Commented] (BEAM-2052) Windowed file sinks should support dynamic sharding

    [ https://issues.apache.org/jira/browse/BEAM-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979710#comment-15979710 ] 

ASF GitHub Bot commented on BEAM-2052:
--------------------------------------

GitHub user reuvenlax opened a pull request:

    https://github.com/apache/beam/pull/2647

    [BEAM-2052] Allow dynamic sharding in windowed file sinks

    Windowed FileBasedSinks can now use dynamic sharding. This requires encoding the window and pane in the FileResult object, and deferring the call into the FilenamePolicy until the finalize step, when the number of shards is known. It also requires ensuring that elements from different windows are written to different temporary files in the WriteBundles step (since at that point a single bundle might contain elements from several windows).
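
The flow described above can be illustrated with a small standalone sketch. This is not the code from the pull request and does not use the Beam API: the class WindowedShardingSketch, the TempResult type, the "tmp-..." and "out-..." name patterns, and the use of plain strings for windows are all assumptions made for illustration, and panes are left out for brevity. It only shows the two ideas: each bundle writes one temporary file per window, and final names are assigned at finalize time, once the shard count per window is known.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WindowedShardingSketch {

    /** Hypothetical stand-in for a file result: a temp file tagged with its window. */
    public static final class TempResult {
        public final String window;
        public final String tempFile;
        public final List<String> lines;

        TempResult(String window, String tempFile, List<String> lines) {
            this.window = window;
            this.tempFile = tempFile;
            this.lines = lines;
        }
    }

    /**
     * Write-bundle step (sketch): a bundle may mix windows, so elements are
     * split into one temporary file per window. Each element is {window, value}.
     */
    public static List<TempResult> writeBundle(String bundleId, List<String[]> elements) {
        Map<String, List<String>> byWindow = new LinkedHashMap<>();
        for (String[] e : elements) {
            byWindow.computeIfAbsent(e[0], w -> new ArrayList<>()).add(e[1]);
        }
        List<TempResult> results = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : byWindow.entrySet()) {
            // Assumed temp-name pattern; nothing is actually written to disk here.
            String temp = "tmp-" + bundleId + "-" + entry.getKey();
            results.add(new TempResult(entry.getKey(), temp, entry.getValue()));
        }
        return results;
    }

    /**
     * Finalize step (sketch): only now is the shard count per window known,
     * so this is where final names are chosen. Returns temp name -> final name.
     */
    public static Map<String, String> finalizeNames(List<TempResult> results) {
        Map<String, List<TempResult>> perWindow = new LinkedHashMap<>();
        for (TempResult r : results) {
            perWindow.computeIfAbsent(r.window, w -> new ArrayList<>()).add(r);
        }
        Map<String, String> tempToFinal = new LinkedHashMap<>();
        for (Map.Entry<String, List<TempResult>> entry : perWindow.entrySet()) {
            int numShards = entry.getValue().size();
            for (int shard = 0; shard < numShards; shard++) {
                // Assumed final-name pattern, standing in for a filename policy.
                String name = String.format(
                    "out-%s-%05d-of-%05d", entry.getKey(), shard, numShards);
                tempToFinal.put(entry.getValue().get(shard).tempFile, name);
            }
        }
        return tempToFinal;
    }

    public static void main(String[] args) {
        List<TempResult> all = new ArrayList<>();
        all.addAll(writeBundle("b0", Arrays.asList(
            new String[] {"w1", "a"}, new String[] {"w2", "b"}, new String[] {"w1", "c"})));
        all.addAll(writeBundle("b1", Arrays.asList(
            new String[] {"w1", "d"}, new String[] {"w1", "e"})));
        finalizeNames(all).forEach((temp, name) -> System.out.println(temp + " -> " + name));
    }
}
```

In this sketch the shard count for a window is simply the number of temporary files that ended up carrying that window's tag, which is why the name cannot be chosen while bundles are still being written.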

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/reuvenlax/incubator-beam streaming_gcs_output

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2647.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2647
    
----
commit 57a52f02b4077344b421676ab548f9bfbd256e73
Author: Reuven Lax <re...@google.com>
Date:   2017-04-05T19:13:44Z

    Start the process of getting rid of Sink.

commit eb512cc5f343451ca83d981d957884aae62f8a55
Author: Reuven Lax <re...@google.com>
Date:   2017-04-05T22:06:14Z

    Remove Sink class, and rename Write to WriteFiles.

commit 430f51a89e56cf606ac1d4b3de873b439177c939
Author: Reuven Lax <re...@google.com>
Date:   2017-04-06T00:22:15Z

    Get rid of Sink and initialize. We keep standin versions of the old Write and Sink transforms around as a stopgap solution for HDFSFileSink.

commit 176223954e73e42d4e40d47985083ec907ce02b5
Author: Reuven Lax <re...@google.com>
Date:   2017-04-19T17:05:09Z

    Fix Javadoc issues.

commit cfad16ceec00f7336cbbca8d1f45edcde4619f84
Author: Reuven Lax <re...@google.com>
Date:   2017-04-19T18:01:14Z

    Fix javadoc

commit 6b587c5079a24b4ab747c71af8b525bf0f615fea
Author: Reuven Lax <re...@google.com>
Date:   2017-04-19T22:18:44Z

    Deleting unneeded test.

commit 0dedaf39356170d61ee7b5777bcdda760e78c685
Author: Reuven Lax <re...@google.com>
Date:   2017-04-21T17:23:12Z

    Foo

commit dfbbe3c2e86963c911681e05149a172881b6466e
Author: Reuven Lax <re...@google.com>
Date:   2017-04-21T19:23:31Z

    Finish making windowed writes work dynamically.

commit 4bf76edc3ec50857a5f277a5391d5d61eaa2236d
Author: Reuven Lax <re...@google.com>
Date:   2017-04-22T02:00:04Z

    Finish implementing dynamic-sharding for windowed file outputs, and add an integration test.

----


> Windowed file sinks should support dynamic sharding
> ---------------------------------------------------
>
>                 Key: BEAM-2052
>                 URL: https://issues.apache.org/jira/browse/BEAM-2052
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>            Reporter: Reuven Lax
>            Assignee: Davor Bonaci
>
> Currently, windowed file sinks (WriteFiles and FileBasedSink) require withNumShards to be set explicitly. We should remove this requirement and allow dynamic output.


