You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Cyril Catheu <cy...@startree.ai.INVALID> on 2021/11/03 20:53:11 UTC

Calcite for macro parsing in SQL ?

Hey,

I'm working on a product that executes arbitrary SQL on a "data source".
A data source implements a SQL language, for instance, it can be MySQL,
Pinot, BigQuery, etc...
I'd like to introduce a macro language on top of the SQL.
Very similar to what is done in grafana:
https://grafana.com/docs/grafana/latest/datasources/mysql/#macros
These macros would be resolved before the SQL is executed on the data
source.

A macro looks like a function but is a string replacement, for instance:
 __timeFilter(timeColumn, start, end) -->  timeColumn>=start and
timeColumn<end

Grafana uses regex replacements, but I'm considering parsing the SQL with
Calcite to apply the macros safely and manage nested macros/functions.

--> Is this a good use case for Calcite? What seemed interesting to me is
that Calcite has all those SqlDialect implementations.

I had this flow in mind:
1. get the SqlDialect of my datasource
*2.* add to this dialect my custom macros (add to an SqlFunction list? )
3. parse
4. replace
5. rebuild

I'm wondering if 2. is possible? I'm not sure to understand how I could add
a list of SqlFunction to a dialect, or build a new dialect from an existing
one dynamically.

If you've taken the time to read this message, thanks a lot!
Have a nice day.

-- 
[image: Startree] <https://www.startree.ai/>

Cyril de Catheu

Software Engineer, StarTree

cyril@startree.ai | +33 684-829-908

[image: Linkedin] <https://www.linkedin.com/in/cyril-de-catheu/>[image:
Twitter] <https://twitter.com/deCatheu>

Re: Calcite for macro parsing in SQL ?

Posted by Julian Hyde <jh...@gmail.com>.
Calcite already has something called macros.  They work like Lisp macros (substituting the AST) rather than C macros (your model of string replacement). See JdbcTest.testTableMacro [1].

Does this approach solve your problem?

Julian

[1] https://github.com/apache/calcite/blob/82dd78a14f6aef2eeec2f9c94978d04b4acc5359/core/src/test/java/org/apache/calcite/test/JdbcTest.java#L427
 

> On Nov 3, 2021, at 1:53 PM, Cyril Catheu <cy...@startree.ai.INVALID> wrote:
> 
> Hey,
> 
> I'm working on a product that executes arbitrary SQL on a "data source".
> A data source implements a SQL language, for instance, it can be MySQL,
> Pinot, BigQuery, etc...
> I'd like to introduce a macro language on top of the SQL.
> Very similar to what is done in grafana:
> https://grafana.com/docs/grafana/latest/datasources/mysql/#macros
> These macros would be resolved before the SQL is executed on the data
> source.
> 
> A macro looks like a function but is a string replacement, for instance:
> __timeFilter(timeColumn, start, end) -->  timeColumn>=start and
> timeColumn<end
> 
> Grafana uses regex replacements, but I'm considering parsing the SQL with
> Calcite to apply the macros safely and manage nested macros/functions.
> 
> --> Is this a good use case for Calcite? What seemed interesting to me is
> that Calcite has all those SqlDialect implementations.
> 
> I had this flow in mind:
> 1. get the SqlDialect of my datasource
> *2.* add to this dialect my custom macros (add to an SqlFunction list? )
> 3. parse
> 4. replace
> 5. rebuild
> 
> I'm wondering if 2. is possible? I'm not sure to understand how I could add
> a list of SqlFunction to a dialect, or build a new dialect from an existing
> one dynamically.
> 
> If you've taken the time to read this message, thanks a lot!
> Have a nice day.
> 
> -- 
> [image: Startree] <https://www.startree.ai/>
> 
> Cyril de Catheu
> 
> Software Engineer, StarTree
> 
> cyril@startree.ai | +33 684-829-908
> 
> [image: Linkedin] <https://www.linkedin.com/in/cyril-de-catheu/>[image:
> Twitter] <https://twitter.com/deCatheu>


Re: Calcite for macro parsing in SQL ?

Posted by Julian Hyde <jh...@gmail.com>.
No worries.

Thanks for following up and sharing what worked for you. It will help other people who are facing similar problems in future. Iā€™m pleased that you found an example that you could copy!

Julian


> On Nov 10, 2021, at 7:03 AM, Cyril Catheu <cy...@startree.ai.INVALID> wrote:
> 
> Also sorry Julian Hyde I forgot to subscribe to the mailing list when I
> sent my question so I did not see your answer šŸ¤¦ā€ā™‚ļø.
> I'll have a look at your suggestions.
> 
> On Wed, Nov 10, 2021 at 3:45 PM Cyril Catheu <cy...@startree.ai> wrote:
> 
>> Had a deeper look, I can answer my own questions now.
>> 
>> 1. Adding a list of SqlFunction to a parser?
>> No need. The SqlParser will parse unknown symbols that look like function
>> calls as SqlCall nodes with SqlKind "OTHER_FUNCTION"  and
>> SqlFunctionCategory "USER_DEFINED_FUNCTION".
>> 
>> Note: I was actually getting confused by how functions identifiers are
>> escaped in the toString method.
>> For instance: select EXP(col), myMacro(col) from ... will be parsed then
>> returned as String as EXP(`col`), `myMacro`(`col`) from ...
>> Not sure to understand why the EXP is not escaped, but I was able to play
>> with the toSqlString method to get what I want.
>> 
>> 2. Parse a SQL query, traverse the tree recursively, replace some nodes,
>> then get back a Sql string ?
>> Yes, it's a good use case for Calcite. A good inspiration was the Hoist
>> <https://calcite.apache.org/javadocAggregate/org/apache/calcite/tools/Hoist.html>
>> class. It replaces constants in q SQL string.
>> I implemented something similar that replaces my "USER_DEFINED_FUNCTION"
>> macros.
>> 
>> It works like a charm.
>> Have a nice day.
>> 
>> On Wed, Nov 3, 2021 at 9:53 PM Cyril Catheu <cy...@startree.ai> wrote:
>> 
>>> Hey,
>>> 
>>> I'm working on a product that executes arbitrary SQL on a "data source".
>>> A data source implements a SQL language, for instance, it can be MySQL,
>>> Pinot, BigQuery, etc...
>>> I'd like to introduce a macro language on top of the SQL.
>>> Very similar to what is done in grafana:
>>> https://grafana.com/docs/grafana/latest/datasources/mysql/#macros
>>> These macros would be resolved before the SQL is executed on the data
>>> source.
>>> 
>>> A macro looks like a function but is a string replacement, for instance:
>>> __timeFilter(timeColumn, start, end) -->  timeColumn>=start and
>>> timeColumn<end
>>> 
>>> Grafana uses regex replacements, but I'm considering parsing the SQL with
>>> Calcite to apply the macros safely and manage nested macros/functions.
>>> 
>>> --> Is this a good use case for Calcite? What seemed interesting to me is
>>> that Calcite has all those SqlDialect implementations.
>>> 
>>> I had this flow in mind:
>>> 1. get the SqlDialect of my datasource
>>> *2.* add to this dialect my custom macros (add to an SqlFunction list? )
>>> 3. parse
>>> 4. replace
>>> 5. rebuild
>>> 
>>> I'm wondering if 2. is possible? I'm not sure to understand how I could
>>> add a list of SqlFunction to a dialect, or build a new dialect from an
>>> existing one dynamically.
>>> 
>>> If you've taken the time to read this message, thanks a lot!
>>> Have a nice day.
>>> 
>>> --
>>> [image: Startree] <https://www.startree.ai/>
>>> 
>>> Cyril de Catheu
>>> 
>>> Software Engineer, StarTree
>>> 
>>> cyril@startree.ai | +33 684-829-908
>>> 
>>> [image: Linkedin] <https://www.linkedin.com/in/cyril-de-catheu/>[image:
>>> Twitter] <https://twitter.com/deCatheu>
>>> 
>> 


Re: Calcite for macro parsing in SQL ?

Posted by Cyril Catheu <cy...@startree.ai.INVALID>.
Also sorry Julian Hyde I forgot to subscribe to the mailing list when I
sent my question so I did not see your answer šŸ¤¦ā€ā™‚ļø.
I'll have a look at your suggestions.

On Wed, Nov 10, 2021 at 3:45 PM Cyril Catheu <cy...@startree.ai> wrote:

> Had a deeper look, I can answer my own questions now.
>
> 1. Adding a list of SqlFunction to a parser?
> No need. The SqlParser will parse unknown symbols that look like function
> calls as SqlCall nodes with SqlKind "OTHER_FUNCTION"  and
> SqlFunctionCategory "USER_DEFINED_FUNCTION".
>
> Note: I was actually getting confused by how functions identifiers are
> escaped in the toString method.
> For instance: select EXP(col), myMacro(col) from ... will be parsed then
> returned as String as EXP(`col`), `myMacro`(`col`) from ...
> Not sure to understand why the EXP is not escaped, but I was able to play
> with the toSqlString method to get what I want.
>
> 2. Parse a SQL query, traverse the tree recursively, replace some nodes,
> then get back a Sql string ?
> Yes, it's a good use case for Calcite. A good inspiration was the Hoist
> <https://calcite.apache.org/javadocAggregate/org/apache/calcite/tools/Hoist.html>
> class. It replaces constants in q SQL string.
> I implemented something similar that replaces my "USER_DEFINED_FUNCTION"
> macros.
>
> It works like a charm.
> Have a nice day.
>
> On Wed, Nov 3, 2021 at 9:53 PM Cyril Catheu <cy...@startree.ai> wrote:
>
>> Hey,
>>
>> I'm working on a product that executes arbitrary SQL on a "data source".
>> A data source implements a SQL language, for instance, it can be MySQL,
>> Pinot, BigQuery, etc...
>> I'd like to introduce a macro language on top of the SQL.
>> Very similar to what is done in grafana:
>> https://grafana.com/docs/grafana/latest/datasources/mysql/#macros
>> These macros would be resolved before the SQL is executed on the data
>> source.
>>
>> A macro looks like a function but is a string replacement, for instance:
>>  __timeFilter(timeColumn, start, end) -->  timeColumn>=start and
>> timeColumn<end
>>
>> Grafana uses regex replacements, but I'm considering parsing the SQL with
>> Calcite to apply the macros safely and manage nested macros/functions.
>>
>> --> Is this a good use case for Calcite? What seemed interesting to me is
>> that Calcite has all those SqlDialect implementations.
>>
>> I had this flow in mind:
>> 1. get the SqlDialect of my datasource
>> *2.* add to this dialect my custom macros (add to an SqlFunction list? )
>> 3. parse
>> 4. replace
>> 5. rebuild
>>
>> I'm wondering if 2. is possible? I'm not sure to understand how I could
>> add a list of SqlFunction to a dialect, or build a new dialect from an
>> existing one dynamically.
>>
>> If you've taken the time to read this message, thanks a lot!
>> Have a nice day.
>>
>> --
>> [image: Startree] <https://www.startree.ai/>
>>
>> Cyril de Catheu
>>
>> Software Engineer, StarTree
>>
>> cyril@startree.ai | +33 684-829-908
>>
>> [image: Linkedin] <https://www.linkedin.com/in/cyril-de-catheu/>[image:
>> Twitter] <https://twitter.com/deCatheu>
>>
>

Re: Calcite for macro parsing in SQL ?

Posted by Cyril Catheu <cy...@startree.ai.INVALID>.
Had a deeper look, I can answer my own questions now.

1. Adding a list of SqlFunction to a parser?
No need. The SqlParser will parse unknown symbols that look like function
calls as SqlCall nodes with SqlKind "OTHER_FUNCTION"  and
SqlFunctionCategory "USER_DEFINED_FUNCTION".

Note: I was actually getting confused by how functions identifiers are
escaped in the toString method.
For instance: select EXP(col), myMacro(col) from ... will be parsed then
returned as String as EXP(`col`), `myMacro`(`col`) from ...
Not sure to understand why the EXP is not escaped, but I was able to play
with the toSqlString method to get what I want.

2. Parse a SQL query, traverse the tree recursively, replace some nodes,
then get back a Sql string ?
Yes, it's a good use case for Calcite. A good inspiration was the Hoist
<https://calcite.apache.org/javadocAggregate/org/apache/calcite/tools/Hoist.html>
class. It replaces constants in q SQL string.
I implemented something similar that replaces my "USER_DEFINED_FUNCTION"
macros.

It works like a charm.
Have a nice day.

On Wed, Nov 3, 2021 at 9:53 PM Cyril Catheu <cy...@startree.ai> wrote:

> Hey,
>
> I'm working on a product that executes arbitrary SQL on a "data source".
> A data source implements a SQL language, for instance, it can be MySQL,
> Pinot, BigQuery, etc...
> I'd like to introduce a macro language on top of the SQL.
> Very similar to what is done in grafana:
> https://grafana.com/docs/grafana/latest/datasources/mysql/#macros
> These macros would be resolved before the SQL is executed on the data
> source.
>
> A macro looks like a function but is a string replacement, for instance:
>  __timeFilter(timeColumn, start, end) -->  timeColumn>=start and
> timeColumn<end
>
> Grafana uses regex replacements, but I'm considering parsing the SQL with
> Calcite to apply the macros safely and manage nested macros/functions.
>
> --> Is this a good use case for Calcite? What seemed interesting to me is
> that Calcite has all those SqlDialect implementations.
>
> I had this flow in mind:
> 1. get the SqlDialect of my datasource
> *2.* add to this dialect my custom macros (add to an SqlFunction list? )
> 3. parse
> 4. replace
> 5. rebuild
>
> I'm wondering if 2. is possible? I'm not sure to understand how I could
> add a list of SqlFunction to a dialect, or build a new dialect from an
> existing one dynamically.
>
> If you've taken the time to read this message, thanks a lot!
> Have a nice day.
>
> --
> [image: Startree] <https://www.startree.ai/>
>
> Cyril de Catheu
>
> Software Engineer, StarTree
>
> cyril@startree.ai | +33 684-829-908
>
> [image: Linkedin] <https://www.linkedin.com/in/cyril-de-catheu/>[image:
> Twitter] <https://twitter.com/deCatheu>
>