Posted to dev@iotdb.apache.org by Chao Wang <cc...@163.com> on 2022/01/01 08:13:00 UTC

Re: Performance Improvements with Code Generation

Hi Julian,
    Great Job!


    Looking forward to your later work.


    BTW, happy new year to everybody!


Thanks!


Chao Wang
BONC ltd
ccgowork@163.com
On 1/1/2022 06:44,Julian Feinauer<j....@pragmaticminds.de> wrote:
Hi all,

After a short discussion with Jialin today, I took a few minutes to see whether and how code generation could help us improve query performance.

The first thing I looked at was the filtering in the SeriesReader.
Currently, many Filters have if conditions / branches in their `satisfy` method.
As this method is called on every data point (that is not filtered out on another level), there is a lot of potential for improvement.

I pushed a branch with my code changes (there are not that many, actually); see [1] for all technical details.

What did I do?

Basically, I changed the existing filters to also return an AST of their "filter operation". That way I can visit all filters and dynamically generate the Java code for the "perfect" filter in that situation.
This is then dynamically compiled into a Java class and loaded at runtime.
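
To illustrate the idea of turning a filter AST into Java source, here is a minimal sketch. It is not the code from the branch: `FilterNode`, `Compare`, `Combine` and `FilterCodeGen` are hypothetical names, and the filter is assumed to test only a `long` timestamp `t`. The generated class implements the JDK interface `java.util.function.LongPredicate` so that it needs nothing beyond the standard library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical AST for a filter over a long timestamp "t" (not IoTDB's actual classes).
interface FilterNode {
  // Emits a Java boolean expression over the variable "t".
  String toJava();
}

// Leaf node: compares "t" with a constant.
final class Compare implements FilterNode {
  private final String op; // one of "<", "<=", ">", ">=", "==", "!="
  private final long value;

  Compare(String op, long value) {
    this.op = op;
    this.value = value;
  }

  @Override
  public String toJava() {
    return "(t " + op + " " + value + "L)";
  }
}

// Inner node: joins its children with "&&" or "||".
final class Combine implements FilterNode {
  private final String op;
  private final List<FilterNode> children;

  Combine(String op, FilterNode... children) {
    this.op = op;
    this.children = Arrays.asList(children);
  }

  @Override
  public String toJava() {
    return children.stream()
        .map(FilterNode::toJava)
        .collect(Collectors.joining(" " + op + " ", "(", ")"));
  }
}

final class FilterCodeGen {
  // Wraps the expression in a class whose test(long) is a single return statement.
  static String generateClass(String className, FilterNode root) {
    return "public final class " + className
        + " implements java.util.function.LongPredicate {\n"
        + "  @Override public boolean test(long t) {\n"
        + "    return " + root.toJava() + ";\n"
        + "  }\n"
        + "}\n";
  }
}
```

For example, `new Combine("&&", new Compare(">=", 10), new Compare("<", 20))` yields the expression `((t >= 10L) && (t < 20L))`, so the generated method evaluates the whole filter in one expression instead of walking a tree of filter objects.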

Of course, this process takes some time (about 100 ms or a bit less, I think), but it then improves the speed on every data point.
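
The compile-and-load step that causes this one-time cost could look roughly like the sketch below. It uses the standard `javax.tools.JavaCompiler` API and a `URLClassLoader`; whether the branch does it this way is an assumption. `RuntimeFilterCompiler` is a hypothetical name, the generated class is assumed to be public, to have a no-argument constructor and to implement `java.util.function.LongPredicate`, and a JDK (not just a JRE) must be available at runtime.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.LongPredicate;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical helper: compiles generated source at runtime and loads the resulting class.
final class RuntimeFilterCompiler {

  static LongPredicate compile(String className, String source) {
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
    if (compiler == null) {
      throw new IllegalStateException("No system Java compiler available; a JDK is required");
    }
    try {
      // Write the source into a fresh temporary directory.
      Path dir = Files.createTempDirectory("generated-filter");
      Path file = dir.resolve(className + ".java");
      Files.write(file, source.getBytes(StandardCharsets.UTF_8));

      // Compile into the same directory; run() returns 0 on success.
      int status = compiler.run(null, null, null, "-d", dir.toString(), file.toString());
      if (status != 0) {
        throw new IllegalStateException("Compilation failed for " + className);
      }

      // Load the compiled class from that directory and instantiate it.
      // The loader is intentionally left open for the lifetime of the filter.
      URLClassLoader loader = new URLClassLoader(new URL[] {dir.toUri().toURL()});
      Class<?> cls = loader.loadClass(className);
      return (LongPredicate) cls.getDeclaredConstructor().newInstance();
    } catch (IOException | ReflectiveOperationException e) {
      throw new IllegalStateException("Could not compile or load " + className, e);
    }
  }
}
```

Wrapping the call to `compile` in two `System.nanoTime()` readings is a simple way to measure the overhead on a given machine.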

What are my results so far?

I generated an example file with about 50 million data points and three different types of filters (simple, standard, complex).

So far, my results show that performance can be improved by up to 20%!

For simple queries it's not beneficial and can even make them a bit slower (see the overhead above).

For standard queries I saw about 5% improvement in query times.

For complex queries I saw between 10% and 20% improvement in query time.

I will run some more benchmarks over the next few days and would love to work together with other devs from the community to get this feature into the next release!

Best!
Julian

PS: Happy new year to everybody!

[1] Branch: https://github.com/apache/iotdb/tree/experimental/code-generation


Re: Performance Improvements with Code Generation

Posted by Xiangdong Huang <sa...@gmail.com>.
Hi,

Julian has shared some results with me, and indeed it brings a big improvement
in some cases.
This work is promising, but we must avoid the cases where the code generation
time exceeds the time it saves...
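
One way to express this condition is a simple break-even check: generating code pays off only if the number of data points times the time saved per point exceeds the one-time generation cost. The sketch below is only an illustration; `CodegenPolicy` and its parameters are hypothetical, and the per-point saving would have to be estimated from benchmarks.

```java
// Hypothetical break-even heuristic for deciding whether to generate a filter.
final class CodegenPolicy {

  // Number of data points at which the one-time compile cost is paid back.
  static long breakEvenPoints(long compileCostNanos, double savedNanosPerPoint) {
    if (savedNanosPerPoint <= 0) {
      return Long.MAX_VALUE; // no per-point saving, so generation never pays off
    }
    return (long) Math.ceil(compileCostNanos / savedNanosPerPoint);
  }

  // True only if the estimated number of points lies beyond the break-even point.
  static boolean shouldGenerate(
      long estimatedPoints, long compileCostNanos, double savedNanosPerPoint) {
    return estimatedPoints > breakEvenPoints(compileCostNanos, savedNanosPerPoint);
  }
}
```

With a compile cost of 100 ms (100,000,000 ns) and an assumed, purely illustrative saving of 5 ns per point, the break-even point is 20,000,000 data points; queries that touch fewer points would be better served by the existing filters.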

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院

