Posted to derby-dev@db.apache.org by "Bryan Pendleton (JIRA)" <ji...@apache.org> on 2009/09/07 17:40:57 UTC

[jira] Updated: (DERBY-4363) Add simple benchmark for measuring GROUP BY performance

     [ https://issues.apache.org/jira/browse/DERBY-4363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Pendleton updated DERBY-4363:
-----------------------------------

    Attachment: multiColumnBenchmark.diff

Attached is 'multiColumnBenchmark.diff', an enhanced version of the GroupByClient proposal
which can now generate a richer variety of GROUP BY statements.

It also executes only a single statement per run, since I agree with the
observation that it is hard to interpret the results of running a mixture
of statements in the same run.

I put a lot of comments into the GroupByClient header which should explain
how to invoke the benchmark to run a richer set of statements.

I gave getLoadOpt package visibility so that the GroupByClient could
interrogate the -load_opts settings in a more convenient fashion.

Continued suggestions and comments would be greatly appreciated.

Soon, I hope to find the time to run this benchmark against the current trunk,
as well as against the DERBY-3002 patch proposal, to get a first set of
numbers to explore the overall performance characteristics in a coarse fashion.

I'm hoping it will be sufficient to run, say, 5 different GROUP BY statements
against each version of the code, at scales of 10 thousand, 100 thousand, and
250 thousand rows. That will give us 15 numbers for each branch, and
maybe we can see some trends in that data.
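
As a rough illustration of that measurement plan, the sketch below enumerates the 5 x 3 run matrix for one branch. The class name and the statement labels are hypothetical placeholders (the actual five GROUP BY statements have not been chosen yet), and it does not invoke perf.clients.Runner; only the three row counts come from the plan above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the attached patch.
public class GroupByRunMatrix {

    // Table sizes from the plan: 10 thousand, 100 thousand, 250 thousand rows.
    static final int[] SCALES = {10_000, 100_000, 250_000};

    // Placeholder labels; the real GROUP BY statements are still to be decided.
    static final String[] STATEMENTS = {
        "stmt1", "stmt2", "stmt3", "stmt4", "stmt5"
    };

    // Build one entry per (scale, statement) pair, grouped by scale so that
    // each table size only needs to be loaded once before its five runs.
    static List<String> buildRuns() {
        List<String> runs = new ArrayList<>();
        for (int rows : SCALES) {
            for (String stmt : STATEMENTS) {
                runs.add("rows=" + rows + " statement=" + stmt);
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        List<String> runs = buildRuns();
        for (String run : runs) {
            System.out.println(run);
        }
        System.out.println(runs.size() + " measurements per branch");
    }
}
```

Running the same list against the current trunk and against the DERBY-3002 patch would give two sets of 15 numbers to compare side by side.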

I should be able to post those runs as a "script" of 18 perf.clients.Runner statements
to be run in sequence against each code branch.


> Add simple benchmark for measuring GROUP BY performance
> -------------------------------------------------------
>
>                 Key: DERBY-4363
>                 URL: https://issues.apache.org/jira/browse/DERBY-4363
>             Project: Derby
>          Issue Type: Sub-task
>          Components: Test, Tools
>            Reporter: Bryan Pendleton
>            Assignee: Bryan Pendleton
>            Priority: Minor
>         Attachments: multiColumnBenchmark.diff, simpleBenchmark.diff
>
>
> As part of ROLLUP implementation (DERBY-3002), it will be helpful to be able to measure the performance of GROUP BY.
> Using the o.a.dT.perf.clients framework, this sub-task proposes to add a GroupByClient to the performance runner
> library; the GroupByClient will run GROUP BY statements against the Wisconsin benchmark database.
