Posted to commits@beam.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/05/20 16:19:12 UTC

[jira] [Commented] (BEAM-160) Port 'NexMark Queries' to Beam for use as integration test

    [ https://issues.apache.org/jira/browse/BEAM-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293635#comment-15293635 ] 

ASF GitHub Bot commented on BEAM-160:
-------------------------------------

GitHub user mshields822 opened a pull request:

    https://github.com/apache/incubator-beam/pull/366

    [BEAM-160] DRAFT NexMark

    Rough cut of the NexMark integration suite, ported to Beam and generalized to support multiple runners.
    
    I've been running this regularly for the last few months to test pubsub, Flink's implementation of unbounded sources, and so on.
    
    Still to do:
     - Implement an InProcess runner driver.
     - Bring over the internal unit tests.
     - Include a readme with some recipes.
     - Backport some of the changes that have happened since I peeled off.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mshields822/incubator-beam nexmark

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-beam/pull/366.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #366
    
----
commit cfbd716fe994257938ac7279613a07f708cf30e4
Author: Mark Shields <ma...@google.com>
Date:   2016-03-28T23:25:29Z

    NexMark

----


> Port 'NexMark Queries' to Beam for use as integration test
> ----------------------------------------------------------
>
>                 Key: BEAM-160
>                 URL: https://issues.apache.org/jira/browse/BEAM-160
>             Project: Beam
>          Issue Type: Test
>          Components: testing
>            Reporter: Mark Shields
>            Assignee: Mark Shields
>
> A while back we implemented the 'queries' from
>   http://datalab.cs.pdx.edu/niagara/NEXMark/
> as Google Dataflow pipelines. We found them useful
> for uncovering performance problems with the SDK, our runners,
> and our service. Many of those problems only manifested under
> high load, multi-day runs, or with high 'backlog' on the incoming
> pub/sub subscriptions.
> We thus think they would be useful for other runners.
> Disclaimer: Though the original 'queries' were proposed as a way to
> benchmark 'continuous SQL' implementations, we have so far only
> used them for internal A/B and regression testing and have not validated
> them as representative of customer workloads. We would thus discourage their use for competitive benchmarks without more work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)