Posted to dev@calcite.apache.org by M Singh <ma...@yahoo.com.INVALID> on 2022/01/07 17:08:52 UTC

Apache Calcite - Generated code

Hi:
I am working on a project that requires changing the query and the data at run time. The data to be processed will be stored in memory as a list of strings. I am using Java 8 at the moment.
I would like to understand how Calcite generates classes at run time using Janino.
Questions:
1. If the query is executed on the same data twice, does it generate the code twice? If so, are all the classes regenerated or only specific ones?
2. If the query changes, are all the classes regenerated?
3. If the process keeps running, will the regenerated classes cause OOM? If so, is there any way to avoid this?
4. Is there a way to remove the generated classes at runtime?
5. Is there any way in Calcite to avoid generating the classes if the data or query changes while the process is running?
I took one of the CSV example tests and added the same SQL call twice, as shown in the snippet below (https://github.com/apache/calcite/blob/master/example/csv/src/test/java/org/apache/calcite/test/CsvTest.java#L351). It does appear to generate some classes twice, but please correct me if I am mistaken.
<snippet>
  @Test void testFilterableWhereTwice() throws SQLException {
    final String sql =
        "select empno, gender, name from EMPS where name = 'John'";
    sql("filterable-model", sql)
        .returns("EMPNO=110; GENDER=M; NAME=John").ok();
    sql("filterable-model", sql)
        .returns("EMPNO=110; GENDER=M; NAME=John").ok();
  }
</snippet>

If there is any documentation, example, or advice on how code generation works, or on whether there is a way to avoid it, please let me know.
Thanks

Re: Apache Calcite - Generated code

Posted by Julian Hyde <jh...@gmail.com>.
+1 everything Scott said

But also, if you can use a JDBC PreparedStatement, do so. Calcite will generate the code only once, even if you execute the statement multiple times. None of the caches in Calcite are needed if you use PreparedStatement in your application.
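
A minimal sketch of that advice, using only the standard java.sql API. The class and method names are made up for illustration, the EMPS table and its columns are borrowed from the CSV example earlier in the thread, and the Connection is assumed to come from Calcite's JDBC driver, configured elsewhere. The point is only the shape: prepare once outside the loop, bind and execute inside it.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: prepares a statement once and executes it many times. */
final class PreparedOnce {
  private PreparedOnce() {}

  /** Returns the EMPNO of every row matching each of the given names. */
  public static List<Integer> empnosFor(Connection conn, List<String> names)
      throws SQLException {
    // Assumed query shape; only the bound value changes between executions.
    final String sql = "select empno from EMPS where name = ?";
    final List<Integer> result = new ArrayList<>();
    // The statement is prepared a single time, before the loop.
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
      for (String name : names) {
        ps.setString(1, name);
        try (ResultSet rs = ps.executeQuery()) {
          while (rs.next()) {
            result.add(rs.getInt(1));
          }
        }
      }
    }
    return result;
  }
}
```

If the SQL text itself changes at run time, each distinct text would still need its own prepare call; this pattern helps only when the same statement is executed repeatedly.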

> On Jan 7, 2022, at 9:31 AM, Scott Reynolds <sd...@gmail.com> wrote:
> 
> I will try to answer a few of your questions. The Enumerable
> implementation generates Java code as a String. A Java system property
> controls a cache keyed by the generated Java code string, so when the same
> Java code is generated again, the same compiled bytecode is reused.
> 
> Where the Property is used
> https://github.com/apache/calcite/blob/9c0e3130e6692d1960a34a680dc13d11083ff1c8/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableInterpretable.java#L159-L166
> 
> The Property definition:
> https://github.com/apache/calcite/blob/a8a6569e6ba75efe9d5725c49338a7f181d3ab5c/core/src/main/java/org/apache/calcite/config/CalciteSystemProperty.java#L353
> 
> Therefore, if you set the calcite.bindable.cache.maxSize Java property
> (however you choose to set it), you will get a cache of the bytecode.
> 
>> requires changing the query and the data at run time
> 
> I am unclear on what this means, so let me explain how Calcite handles a
> query. Calcite takes in a logical query and "rewrites" it (better known as
> optimizing it) via RelRules (
> https://github.com/apache/calcite/blob/a8a6569e6ba75efe9d5725c49338a7f181d3ab5c/core/src/main/java/org/apache/calcite/plan/RelRule.java#L112).
> Once the query is optimized, a Java code string is generated, compiled,
> and cached. This means every query goes through the optimization process,
> so if you want to change how it fetches the data, the most straightforward
> place is in your own RelRule.


Re: Apache Calcite - Generated code

Posted by Scott Reynolds <sd...@gmail.com>.
I will try to answer a few of your questions. The Enumerable
implementation generates Java code as a String. A Java system property
controls a cache keyed by the generated Java code string, so when the same
Java code is generated again, the same compiled bytecode is reused.
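
To make the idea concrete, here is a rough standalone sketch of a bounded cache keyed by the generated source text. It is not Calcite's implementation (see the links below for the real code); the class name, the compiler callback, and the LRU eviction policy are all assumptions made for illustration. It only shows why identical generated code need not be compiled twice, and why a size bound keeps the cache from growing without limit.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative only: a size-bounded cache keyed by generated source text. */
final class SourceKeyedCache<V> {
  private final Map<String, V> cache;
  private final Function<String, V> compiler;
  private int compileCount;

  /**
   * @param maxSize  maximum number of cached entries; 0 caches nothing
   * @param compiler stand-in for the real compile step (hypothetical)
   */
  public SourceKeyedCache(final int maxSize, Function<String, V> compiler) {
    this.compiler = compiler;
    // accessOrder=true keeps entries ordered from least to most recently used,
    // so the "eldest" entry is the least recently used one.
    this.cache = new LinkedHashMap<String, V>(16, 0.75f, true) {
      @Override protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        return size() > maxSize;
      }
    };
  }

  /** Returns the cached result for this source, compiling only on a miss. */
  public synchronized V get(String source) {
    V compiled = cache.get(source);
    if (compiled == null) {
      compiled = compiler.apply(source);
      compileCount++;
      cache.put(source, compiled);
    }
    return compiled;
  }

  /** Number of times the compile step actually ran. */
  public synchronized int compileCount() {
    return compileCount;
  }
}
```

With a sketch like this, two requests for the same source string trigger one compile, while a different source string (for example, from a changed query) is a cache miss and compiles again.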

Where the Property is used
https://github.com/apache/calcite/blob/9c0e3130e6692d1960a34a680dc13d11083ff1c8/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableInterpretable.java#L159-L166

The Property definition:
https://github.com/apache/calcite/blob/a8a6569e6ba75efe9d5725c49338a7f181d3ab5c/core/src/main/java/org/apache/calcite/config/CalciteSystemProperty.java#L353

Therefore, if you set the calcite.bindable.cache.maxSize Java property
(however you choose to set it), you will get a cache of the bytecode.
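
One possible way to set it from code is sketched below. The property name is the one given above; the helper class, the method name, and the value 1000 are arbitrary choices for illustration. This assumes Calcite reads the value from the JVM system properties when its configuration class is first loaded, so the call would need to run before any Calcite class is used. Passing -Dcalcite.bindable.cache.maxSize=1000 on the java command line is an alternative that avoids the ordering question.

```java
/** Hypothetical helper that sets the cache-size property named in this thread. */
final class CalciteCacheConfig {
  /** Property name as given in the thread. */
  static final String BINDABLE_CACHE_MAX_SIZE = "calcite.bindable.cache.maxSize";

  private CalciteCacheConfig() {}

  /**
   * Sets the JVM system property. Assumption: this must happen before
   * Calcite's configuration is first read, or it may have no effect.
   */
  public static void enableBindableCache(int maxSize) {
    System.setProperty(BINDABLE_CACHE_MAX_SIZE, Integer.toString(maxSize));
  }

  public static void main(String[] args) {
    enableBindableCache(1000); // arbitrary example value
    // ... only after this point create the Calcite connection ...
  }
}
```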

>requires changing the query and the data at run time

I am unclear on what this means, so let me explain how Calcite handles a
query. Calcite takes in a logical query and "rewrites" it (better known as
optimizing it) via RelRules (
https://github.com/apache/calcite/blob/a8a6569e6ba75efe9d5725c49338a7f181d3ab5c/core/src/main/java/org/apache/calcite/plan/RelRule.java#L112).
Once the query is optimized, a Java code string is generated, compiled,
and cached. This means every query goes through the optimization process,
so if you want to change how it fetches the data, the most straightforward
place is in your own RelRule.
