Posted to dev@calcite.apache.org by Andrei Sereda <an...@sereda.cc> on 2019/04/30 16:41:04 UTC

calcite overhead for simple queries (optimization and conversion phase)

Hello,

One of our applications uses Calcite as a translation layer between SQL and
destination data sources (Mongo, Elasticsearch, etc.). The queries are fairly
simple and similar to the one below:

select col1, col2, agg3(col3), agg4(col4), ..., aggN(colN) from table
where id in (1, 2, 3) group by col1, col2

The only complexity is that the number of columns can be fairly large (up to
150); otherwise it is a standard aggregation with some simple predicates
(no joins). The number of rows is small, usually less than 1k.

We have observed that the overhead for such queries is 2x-3x (95th percentile)
compared to executing the produced queries directly on the data source (e.g. a
Mongo / Elasticsearch query). The difference is a few hundred milliseconds:
200 ms (direct) vs 600 ms (Calcite). Unfortunately such latency is noticeable
in the UI.

Originally I thought it had to do with compilation time (Janino), but
profiling showed that most of the time is spent in the following methods:

   1. .prepare.Prepare.optimize() (VolcanoPlanner)
   2. .sql2rel.SqlToRelConverter.convertQuery()

What can be done to avoid such overhead?

   1. Have the Avatica / Calcite connection cache connection.prepareStatement()
   so the same optimization is not done twice? Manually re-running the same
   PreparedStatement helps.
   2. Use the interpreter?
   3. Manually re-use a PreparedStatement (e.g. from a Cache<String,
   PreparedStatement>)? This introduces other complexities, such as executing
   the same query in parallel.
   4. Minimize the number of rules?
   5. Cache the logical plan (RelNode)?
   6. Anything else?
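Regarding option 3, here is a minimal sketch of what such a cache could look like. All names here are hypothetical: prepareExpensively stands in for Calcite's convert + optimize phase, and a real cache would hold PreparedStatement objects, which are not safe to share between threads, so parallel execution of the same query would still need per-thread statements.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of a per-SQL "prepared query" cache, so the optimization
 *  cost is paid once per distinct statement (hypothetical names). */
public final class StatementCache {
  static final AtomicInteger prepareCalls = new AtomicInteger();

  /** Stand-in for the expensive convert + optimize phase; a real
   *  implementation would return a PreparedStatement. */
  static String prepareExpensively(String sql) {
    prepareCalls.incrementAndGet();
    return "plan-for:" + sql;
  }

  static final Map<String, String> cache = new ConcurrentHashMap<>();

  /** computeIfAbsent guarantees one preparation per distinct SQL,
   *  even when many threads ask for the same query concurrently. */
  static String prepare(String sql) {
    return cache.computeIfAbsent(sql, StatementCache::prepareExpensively);
  }

  public static void main(String[] args) {
    String sql = "select col1, sum(col3) from t group by col1";
    for (int i = 0; i < 100; i++) {
      prepare(sql);  // 99 of these are cache hits
    }
    System.out.println("preparations: " + prepareCalls.get());
  }
}
```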

Many thanks in advance.

Andrei.

Re: calcite overhead for simple queries (optimization and conversion phase)

Posted by Andrei Sereda <an...@sereda.cc>.
In case somebody is interested, here is an isolated test to compare
performance:
https://gist.github.com/asereda-gs/9f09f3c46904c4b6670aeca602164c6b

I get the following results (10x difference); my previous comparison was with
Elasticsearch (now it is HSQLDB):

=== Column columnCount: 10
direct count:200 50%:0 80%:0 90%:0 95%:0 max:1
calcite count:200 50%:25 80%:28 90%:30 95%:32 max:40

=== Column columnCount: 100
direct count:200 50%:3 80%:3 90%:3 95%:3 max:5
calcite count:200 50%:48 80%:49 90%:50 95%:51 max:60

=== Column columnCount: 500
direct count:200 50%:19 80%:20 90%:21 95%:22 max:29
calcite count:200 50%:190 80%:195 90%:200 95%:204 max:241

=== Column columnCount: 1000
direct count:200 50%:43 80%:47 90%:50 95%:51 max:56
calcite count:200 50%:331 80%:342 90%:352 95%:361 max:389

It's possible that my test is doing something wrong or that I'm measuring a
different operation (an independent "audit" would be appreciated).

I will check if I can narrow down the issue with a profiler.
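For readers puzzling over figures like "50%:25 80%:28", here is a hypothetical
nearest-rank percentile helper of the kind such a benchmark might use to
summarize recorded per-query latencies (this is not the code from the gist):

```java
import java.util.Arrays;

/** Hypothetical helper: nearest-rank percentiles over a list of
 *  per-query latencies in milliseconds. */
public final class Percentiles {
  /** p in (0, 100]; input must already be sorted ascending. */
  static long percentile(long[] sortedMillis, double p) {
    // Nearest-rank: smallest index covering p percent of the samples.
    int idx = (int) Math.ceil(p / 100.0 * sortedMillis.length) - 1;
    return sortedMillis[Math.max(0, idx)];
  }

  public static void main(String[] args) {
    long[] latencies = {9, 2, 7, 4, 5, 6, 3, 8, 1, 10};
    Arrays.sort(latencies);
    System.out.println("50%:" + percentile(latencies, 50)
        + " 95%:" + percentile(latencies, 95)
        + " max:" + latencies[latencies.length - 1]);
    // prints 50%:5 95%:10 max:10
  }
}
```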


Re: calcite overhead for simple queries (optimization and conversion phase)

Posted by Julian Hyde <jh...@apache.org>.
> As a temporary workaround, can I re-use a PreparedStatement?

If you mean re-execute a PreparedStatement, yes. Its main purpose is to pay the cost of preparation only once but execute multiple times (possibly with different parameter values).

But each time you execute it will be the same query and the same plan.

Julian
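[Editor's note] A minimal prepare-once / execute-many sketch of this pattern over
plain JDBC (not compiled here; the connection URL is a placeholder, and the
table/column names are illustrative):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public final class ReuseDemo {
  public static void main(String[] args) throws Exception {
    // Placeholder URL: a Calcite connection would use jdbc:calcite:...
    try (Connection conn =
             DriverManager.getConnection("jdbc:calcite:model=model.json");
         // Preparation (parse, convert, optimize) happens once, here.
         PreparedStatement ps = conn.prepareStatement(
             "select col1, col2, sum(col3) from t"
                 + " where id in (?, ?, ?) group by col1, col2")) {
      for (int run = 0; run < 100; run++) {
        // Same query and same plan each run; only parameter values vary.
        ps.setInt(1, 1);
        ps.setInt(2, 2);
        ps.setInt(3, 3);
        try (ResultSet rs = ps.executeQuery()) {
          while (rs.next()) {
            // consume rows
          }
        }
      }
    }
  }
}
```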



Re: calcite overhead for simple queries (optimization and conversion phase)

Posted by Andrei Sereda <an...@sereda.cc>.
> Consider using Hep planner rather than Volcano planner.
Will check with Hep. Working on an isolated unit test.

> If you reduce the number of columns (to say 10), does the time reduce
> significantly? That might be a clue that there is a performance bug
> somewhere.
The number of columns seems to be correlated with the performance penalty:
- for 10 columns: 120 ms (raw) vs 150 ms (Calcite)
- for 50 columns: 260 ms (raw) vs 740 ms (Calcite)

> Are these numbers on the first query, or after the system has warmed up?
After. Following warmup, I'm running 100 queries sequentially.

As a temporary workaround, can I re-use a PreparedStatement?


Re: calcite overhead for simple queries (optimization and conversion phase)

Posted by Julian Hyde <jh...@apache.org>.
Consider using Hep planner rather than Volcano planner. (Measure the number of rule firings. Is it higher than you think is necessary, given the complexity of the query?)

If you reduce the number of columns (to say 10), does the time reduce significantly? That might be a clue that there is a performance bug somewhere.

Are these numbers on the first query, or after the system has warmed up?

Julian
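[Editor's note] A sketch of what this suggestion might look like against
Calcite's Hep planner API (not compiled here; the rule list is illustrative
and would need to be chosen for the actual query shape):

```java
import org.apache.calcite.plan.hep.HepPlanner;
import org.apache.calcite.plan.hep.HepProgram;
import org.apache.calcite.plan.hep.HepProgramBuilder;
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.rules.AggregateProjectMergeRule;
import org.apache.calcite.rel.rules.ProjectMergeRule;

// Given a logical plan, e.g. from SqlToRelConverter.convertQuery():
RelNode logicalPlan = ...;

// A HepProgram fires a fixed rule list in order, instead of Volcano's
// cost-based search; it is typically much cheaper for simple queries.
HepProgram program = new HepProgramBuilder()
    .addRuleInstance(ProjectMergeRule.INSTANCE)
    .addRuleInstance(AggregateProjectMergeRule.INSTANCE)
    .build();

HepPlanner planner = new HepPlanner(program);
planner.setRoot(logicalPlan);
RelNode optimized = planner.findBestExp();
```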

