Posted to commits@beam.apache.org by "Mark Shields (JIRA)" <ji...@apache.org> on 2016/03/30 21:55:25 UTC

[jira] [Created] (BEAM-160) Port 'NexMark Queries' to Beam for use as integration test

Mark Shields created BEAM-160:
---------------------------------

             Summary: Port 'NexMark Queries' to Beam for use as integration test
                 Key: BEAM-160
                 URL: https://issues.apache.org/jira/browse/BEAM-160
             Project: Beam
          Issue Type: Test
          Components: testing
            Reporter: Mark Shields
            Assignee: Davor Bonaci


A while back we implemented the 'queries' from
  http://datalab.cs.pdx.edu/niagara/NEXMark/
as Google Dataflow pipelines. We found them useful
for uncovering performance problems with the SDK, our runners,
and our service. Many of those problems only manifested under
high load, multi-day runs, or with high 'backlog' on the incoming
Pub/Sub subscriptions.

We thus think they would be useful for other runners.

Disclaimer: Though the original 'queries' were proposed as a way to
benchmark 'continuous SQL' implementations, we have so far only
used them for internal A/B and regression testing and have not validated
them as representative of customer workloads. We would thus discourage
their use for competitive benchmarks without more work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)