Posted to issues@calcite.apache.org by "Julian Hyde (JIRA)" <ji...@apache.org> on 2015/12/09 21:34:10 UTC
[jira] [Comment Edited] (CALCITE-1012) Benchmark SQL parser

    [ https://issues.apache.org/jira/browse/CALCITE-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049350#comment-15049350 ] 

Julian Hyde edited comment on CALCITE-1012 at 12/9/15 8:33 PM:
---------------------------------------------------------------

In CALCITE-459 [~vlsi] wrote:

[~julianhyde], what is the recommended way to parse queries?
I've put together a first draft of a benchmark, and {{SqlParser.create(sql).parseQuery()}} appears to take a noticeable amount of time:
https://github.com/apache/calcite/pull/176

Even parsing a trivial query takes roughly 20-40 us and allocates about 70 KiB of Java objects.
It looks like I'm using the wrong API.

{noformat}

Benchmark                                 (comments)  (length)  Mode  Cnt        Score        Error   Units
ParserBenchmark.parse                           true        10  avgt    5       46,718 ±     16,213   us/op
ParserBenchmark.parse:·gc.alloc.rate.norm       true        10  avgt    5    74487,751 ±    177,889    B/op
ParserBenchmark.parse                           true       100  avgt    5       54,542 ±     24,144   us/op
ParserBenchmark.parse:·gc.alloc.rate.norm       true       100  avgt    5    76262,748 ±    162,214    B/op
ParserBenchmark.parse                           true      1000  avgt    5      250,789 ±    180,010   us/op
ParserBenchmark.parse:·gc.alloc.rate.norm       true      1000  avgt    5    96880,116 ±      0,083    B/op
ParserBenchmark.parse                           true    100000  avgt    5    12608,364 ±   1293,125   us/op
ParserBenchmark.parse:·gc.alloc.rate.norm       true    100000  avgt    5  2396007,715 ±     16,160    B/op
ParserBenchmark.parse                          false        10  avgt    5       46,253 ±      9,299   us/op
ParserBenchmark.parse:·gc.alloc.rate.norm      false        10  avgt    5    74408,256 ±    137,292    B/op
ParserBenchmark.parse                          false       100  avgt    5       54,753 ±      8,506   us/op
ParserBenchmark.parse:·gc.alloc.rate.norm      false       100  avgt    5    77034,399 ±    152,759    B/op
ParserBenchmark.parse                          false      1000  avgt    5      227,980 ±     23,439   us/op
ParserBenchmark.parse:·gc.alloc.rate.norm      false      1000  avgt    5   105440,105 ±      0,010    B/op
ParserBenchmark.parse                          false    100000  avgt    5    18675,561 ±    242,296   us/op
ParserBenchmark.parse:·gc.alloc.rate.norm      false    100000  avgt    5  3202680,561 ±      0,269    B/op
{noformat}
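
To separate the fixed per-call overhead from the cost that grows with query size, one can take the slope between the shortest and longest rows of the table above. The following is only a rough sketch: the class and method names are hypothetical, the inputs are the {{comments=true}} rows copied from the table, and it assumes cost grows about linearly in the {{(length)}} parameter, whose exact unit the table does not state.

```java
import java.util.Locale;

final class ParserCostEstimate {
    /** Slope between two measurements: extra cost per extra unit of the length parameter. */
    static double marginalCost(double lengthA, double costA, double lengthB, double costB) {
        return (costB - costA) / (lengthB - lengthA);
    }

    public static void main(String[] args) {
        // Values copied from the comments=true rows for length 10 and length 100000.
        double bytesPerUnit = marginalCost(10, 74487.751, 100000, 2396007.715);
        double usPerUnit = marginalCost(10, 46.718, 100000, 12608.364);
        // Locale.ROOT keeps '.' as the decimal separator regardless of the machine's locale.
        System.out.printf(Locale.ROOT, "%.2f B and %.3f us per unit of length%n",
                bytesPerUnit, usPerUnit);
    }
}
```

On these figures the slope is small (a few tens of bytes per unit of length), so for short queries nearly all of the roughly 74 KB and 46 us would be fixed overhead of a single parse call rather than a cost of the query text itself.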

A query of length 10 looks something like:
{code:sql}select 1, '7935759579887025813642400976320251869'
 from dual{code}

100:
{code:sql}
select 1, ?, 1910525591, -33731156, -140885363, '-28690016290210702076711497928626237037', '-76148433530322602893458688481236178054'
 from dual{code}

1000:
{code:sql}
select 1, 1482506249, ?, 932852513, 540573036, -527274831, ?, ?// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong())
, 1692895102, '50951735439143566-7690637232772497893', ?, ?, ?, ?, ?// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong())
, '4380376358709010212-1173038460074161045', -1918523131, '66668745555380426053762426347771867584', '-4774732703008184939-3817796209798356166', '89039746234244470812022748182366240377', '-12480157603806734075299113791518246288', 1171721598// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong())
, '47274140693996363363917338148185131534', -80138859, ?, ?, 1676036541, ?, -983339282// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong())
, '70734587149480568336941566409423582112', '-8847237352737739855-2917641826110218142', '4087359824551837905-5660719534397547355', ?, ?, 1192173528, '-4576995549188621938-8764240972870282558'// sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append(rnd.nextLong())
 from dual{code}

100'000:
{noformat}you get the idea{noformat}
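
For readers who want to reproduce inputs of this shape, here is a minimal sketch of a generator. It is an assumption based only on the sample queries above, not the code from the pull request: the class name, the 80-character line threshold, the comment text, and the even mix of {{?}}, integers, and quoted strings are all hypothetical.

```java
import java.util.Random;

final class QuerySketch {
    /** Builds "select 1, ... from dual", adding items until the text reaches minLength chars. */
    static String generate(int minLength, boolean comments, long seed) {
        Random rnd = new Random(seed); // fixed seed keeps benchmark inputs reproducible
        StringBuilder sb = new StringBuilder("select 1");
        int lineStart = 0;
        while (sb.length() < minLength) {
            sb.append(", ");
            switch (rnd.nextInt(3)) {
                case 0:
                    sb.append('?'); // bind parameter
                    break;
                case 1:
                    sb.append(rnd.nextInt()); // signed integer literal
                    break;
                default:
                    // quoted string made of two concatenated random longs
                    sb.append('\'').append(rnd.nextLong()).append(rnd.nextLong()).append('\'');
            }
            // Hypothetical rule: end long lines with a trailing comment, as in the samples.
            if (comments && sb.length() - lineStart > 80) {
                sb.append("// line comment\n");
                lineStart = sb.length();
            }
        }
        return sb.append("\n from dual").toString();
    }

    public static void main(String[] args) {
        System.out.println(generate(100, false, 42L));
    }
}
```

Seeding the generator means every benchmark run parses the same text, which makes scores comparable across runs.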


> Benchmark SQL parser
> --------------------
>
>                 Key: CALCITE-1012
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1012
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Assignee: Julian Hyde
>