Posted to commits@beam.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/08/10 10:36:22 UTC

[jira] [Commented] (BEAM-542) Spark batch interval should be a configuration instead of an interpretation of the Pipeline's windows

    [ https://issues.apache.org/jira/browse/BEAM-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15415091#comment-15415091 ] 

ASF GitHub Bot commented on BEAM-542:
-------------------------------------

GitHub user amitsela opened a pull request:

    https://github.com/apache/incubator-beam/pull/808

    [BEAM-542] Spark batch interval should be a configuration instead of an interpretation of the Pipeline's windows

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/amitsela/incubator-beam BEAM-542

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-beam/pull/808.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #808
    
----
commit f0201f7f1426d1b5d7058aff02202a9cb54b3ac0
Author: Sela <an...@paypal.com>
Date:   2016-08-10T10:30:30Z

    Add the batch interval to the pipeline options, default arbitrarily to 1000 msec.

commit d0eab7b8a4f179d1c2beefe471a983c26c75ce86
Author: Sela <an...@paypal.com>
Date:   2016-08-10T10:32:14Z

    Pick-up the batch interval from pipeline options and remove StreamingWindowPipelineDetector.

commit 8c6a7b5ca04b61af2f6b8acef695f9ffe1aa32a0
Author: Sela <an...@paypal.com>
Date:   2016-08-10T10:32:59Z

    Use SDK API to get the window function.

commit 3cf66d7d25829548436abbb56b0699612a534768
Author: Sela <an...@paypal.com>
Date:   2016-08-10T10:33:25Z

    Update the README

commit c22232e333aa86b5ac97d25ac7d6a2b83f699f34
Author: Sela <an...@paypal.com>
Date:   2016-08-10T10:33:34Z

    Update streaming tests

----


> Spark batch interval should be a configuration instead of an interpretation of the Pipeline's windows
> -----------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-542
>                 URL: https://issues.apache.org/jira/browse/BEAM-542
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>            Reporter: Amit Sela
>            Assignee: Amit Sela
>
> Currently, the SparkRunner extracts the batch interval from the duration of the first window.
> This is wrong in several ways:
> # It does not work for GlobalWindow pipelines, which have no window duration to extract.
> # The batch interval is an engine-specific property, so it should be supplied as configuration for executing the pipeline rather than expressed as part of the pipeline logic.
> # It effectively forces the definition of Fixed/SlidingWindows even when they are not needed (stateless processing), which also makes the pipeline code not portable.
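
The contrast described in the issue can be sketched in plain Java. This is only an illustration under stated assumptions, not the code from the pull request: `BatchIntervalSketch`, `RunnerOptions`, `intervalFromFirstWindow`, and `intervalFromOptions` are hypothetical names, and the sketch deliberately avoids the Beam and Spark APIs. The 1000 msec default mirrors the first commit message. How the removed detector actually behaved for a global window is not stated in the issue, so the exception below is purely illustrative.

```java
import java.time.Duration;
import java.util.Optional;

// Hypothetical sketch; none of these names come from the Beam code base.
class BatchIntervalSketch {

    // Default taken from the commit message ("default arbitrarily to 1000 msec").
    static final long DEFAULT_BATCH_INTERVAL_MILLIS = 1000L;

    // Stand-in for runner-specific pipeline options (assumed shape).
    static final class RunnerOptions {
        private long batchIntervalMillis = DEFAULT_BATCH_INTERVAL_MILLIS;

        long getBatchIntervalMillis() {
            return batchIntervalMillis;
        }

        void setBatchIntervalMillis(long millis) {
            if (millis <= 0) {
                throw new IllegalArgumentException("batch interval must be positive");
            }
            this.batchIntervalMillis = millis;
        }
    }

    // Approach criticized in the issue: infer the interval from the first
    // window's duration. An empty Optional models a pipeline (for example a
    // globally windowed one) that offers no such duration.
    static Duration intervalFromFirstWindow(Optional<Duration> firstWindowSize) {
        return firstWindowSize.orElseThrow(
            () -> new IllegalStateException("no window duration to derive a batch interval from"));
    }

    // Approach proposed in the issue: read the interval from execution
    // configuration, independent of how the pipeline is windowed.
    static Duration intervalFromOptions(RunnerOptions options) {
        return Duration.ofMillis(options.getBatchIntervalMillis());
    }

    public static void main(String[] args) {
        RunnerOptions options = new RunnerOptions();
        System.out.println(intervalFromOptions(options).toMillis());

        options.setBatchIntervalMillis(250L);
        System.out.println(intervalFromOptions(options).toMillis());
    }
}
```

Because the interval lives in the options object, a pipeline that needs no windowing still gets a valid batch interval, and the pipeline code carries no Spark-specific assumptions.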


