Posted to issues@calcite.apache.org by "Julian Hyde (JIRA)" <ji...@apache.org> on 2015/12/09 21:34:10 UTC
[jira] [Comment Edited] (CALCITE-1012) Benchmark SQL parser
[ https://issues.apache.org/jira/browse/CALCITE-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049350#comment-15049350 ]
Julian Hyde edited comment on CALCITE-1012 at 12/9/15 8:33 PM:
---------------------------------------------------------------
In CALCITE-459 [~vlsi] wrote:
[~julianhyde], what is the recommended way to parse queries?
I've crafted a first draft of a benchmark, and it looks like {{SqlParser.create(sql).parseQuery()}} takes a noticeable amount of time:
https://github.com/apache/calcite/pull/176
Even parsing a trivial query takes something like 20-40 µs and allocates about 70 KiB of Java objects.
It looks like I'm using the wrong API.
{noformat}
Benchmark                                  (comments)  (length)  Mode  Cnt        Score      Error  Units
ParserBenchmark.parse                            true        10  avgt    5       46,718 ±   16,213  us/op
ParserBenchmark.parse:·gc.alloc.rate.norm        true        10  avgt    5    74487,751 ±  177,889  B/op
ParserBenchmark.parse                            true       100  avgt    5       54,542 ±   24,144  us/op
ParserBenchmark.parse:·gc.alloc.rate.norm        true       100  avgt    5    76262,748 ±  162,214  B/op
ParserBenchmark.parse                            true      1000  avgt    5      250,789 ±  180,010  us/op
ParserBenchmark.parse:·gc.alloc.rate.norm        true      1000  avgt    5    96880,116 ±    0,083  B/op
ParserBenchmark.parse                            true    100000  avgt    5    12608,364 ± 1293,125  us/op
ParserBenchmark.parse:·gc.alloc.rate.norm        true    100000  avgt    5  2396007,715 ±   16,160  B/op
ParserBenchmark.parse                           false        10  avgt    5       46,253 ±    9,299  us/op
ParserBenchmark.parse:·gc.alloc.rate.norm       false        10  avgt    5    74408,256 ±  137,292  B/op
ParserBenchmark.parse                           false       100  avgt    5       54,753 ±    8,506  us/op
ParserBenchmark.parse:·gc.alloc.rate.norm       false       100  avgt    5    77034,399 ±  152,759  B/op
ParserBenchmark.parse                           false      1000  avgt    5      227,980 ±   23,439  us/op
ParserBenchmark.parse:·gc.alloc.rate.norm       false      1000  avgt    5   105440,105 ±    0,010  B/op
ParserBenchmark.parse                           false    100000  avgt    5    18675,561 ±  242,296  us/op
ParserBenchmark.parse:·gc.alloc.rate.norm       false    100000  avgt    5  3202680,561 ±    0,269  B/op
{noformat}
A query of length 10 looks something like:
{code:sql}
select 1, '7935759579887025813642400976320251869'
from dual{code}
100:
{code:sql}
select 1, ?, 1910525591, -33731156, -140885363, '-28690016290210702076711497928626237037', '-76148433530322602893458688481236178054'
from dual{code}
1000:
{code:sql}
select 1, 1482506249, ?, 932852513, 540573036, -527274831, ?, ?// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong())
, 1692895102, '50951735439143566-7690637232772497893', ?, ?, ?, ?, ?// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong())
, '4380376358709010212-1173038460074161045', -1918523131, '66668745555380426053762426347771867584', '-4774732703008184939-3817796209798356166', '89039746234244470812022748182366240377', '-12480157603806734075299113791518246288', 1171721598// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong())
, '47274140693996363363917338148185131534', -80138859, ?, ?, 1676036541, ?, -983339282// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong())
, '70734587149480568336941566409423582112', '-8847237352737739855-2917641826110218142', '4087359824551837905-5660719534397547355', ?, ?, 1192173528, '-4576995549188621938-8764240972870282558'// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong())
from dual{code}
100'000:
{noformat}you get the idea{noformat}
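For readers who want to reproduce inputs of the same shape as the sample queries above, here is a rough sketch of a generator. It is not the code from the pull request. The class name QueryGenerator, the method generate, the constant ITEMS_PER_LINE, and the filler comment text are all hypothetical, and treating "length" as a minimum character count is only a guess based on the samples. It uses nothing beyond java.util.Random.

```java
import java.util.Random;

/** Hypothetical generator for queries shaped like the samples; not the PR's code. */
final class QueryGenerator {
    /** Assumed number of select items per line before a line break. */
    private static final int ITEMS_PER_LINE = 7;

    private QueryGenerator() {}

    /**
     * Builds "select 1, <items> from dual", adding random items until the
     * text before "from dual" is at least minLength characters long.
     * Interpreting the benchmark's length parameter this way is an assumption.
     */
    static String generate(long seed, int minLength, boolean comments) {
        Random rnd = new Random(seed);
        StringBuilder sb = new StringBuilder("select 1");
        int itemsOnLine = 1; // the leading literal 1 counts as an item
        while (sb.length() < minLength) {
            if (itemsOnLine == ITEMS_PER_LINE) {
                if (comments) {
                    // The samples end each line with a "//" comment.
                    sb.append("// filler comment");
                }
                sb.append('\n');
                itemsOnLine = 0;
            }
            sb.append(", ");
            switch (rnd.nextInt(3)) {
                case 0:
                    sb.append('?'); // dynamic parameter
                    break;
                case 1:
                    sb.append(rnd.nextInt()); // integer literal
                    break;
                default:
                    // String literal made of two concatenated longs.
                    sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append('\'');
                    break;
            }
            itemsOnLine++;
        }
        return sb.append("\nfrom dual").toString();
    }
}
```

Calling QueryGenerator.generate(42L, 100, true) returns one query string. A fixed seed makes the output repeatable, so successive benchmark runs would parse identical text.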
> Benchmark SQL parser
> --------------------
>
> Key: CALCITE-1012
> URL: https://issues.apache.org/jira/browse/CALCITE-1012
> Project: Calcite
> Issue Type: Bug
> Reporter: Julian Hyde
> Assignee: Julian Hyde
>