You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Cyril Catheu <cy...@startree.ai.INVALID> on 2021/11/03 20:53:11 UTC
Calcite for macro parsing in SQL ?
Hey,
I'm working on a product that executes arbitrary SQL on a "data source".
A data source implements a SQL language, for instance, it can be MySQL,
Pinot, BigQuery, etc...
I'd like to introduce a macro language on top of the SQL.
Very similar to what is done in grafana:
https://grafana.com/docs/grafana/latest/datasources/mysql/#macros
These macros would be resolved before the SQL is executed on the data
source.
A macro looks like a function but is a string replacement, for instance:
__timeFilter(timeColumn, start, end) --> timeColumn>=start and
timeColumn<end
Grafana uses regex replacements, but I'm considering parsing the SQL with
Calcite to apply the macros safely and manage nested macros/functions.
--> Is this a good use case for Calcite? What seemed interesting to me is
that Calcite has all those SqlDialect implementations.
I had this flow in mind:
1. get the SqlDialect of my datasource
*2.* add to this dialect my custom macros (add to an SqlFunction list? )
3. parse
4. replace
5. rebuild
I'm wondering if 2. is possible? I'm not sure to understand how I could add
a list of SqlFunction to a dialect, or build a new dialect from an existing
one dynamically.
If you've taken the time to read this message, thanks a lot!
Have a nice day.
--
[image: Startree] <https://www.startree.ai/>
Cyril de Catheu
Software Engineer, StarTree
cyril@startree.ai | +33 684-829-908
[image: Linkedin] <https://www.linkedin.com/in/cyril-de-catheu/>[image:
Twitter] <https://twitter.com/deCatheu>
Re: Calcite for macro parsing in SQL ?
Posted by Julian Hyde <jh...@gmail.com>.
Calcite already has something called macros. They work like Lisp macros (substituting the AST) rather than C macros (your model of string replacement). See JdbcTest.testTableMacro [1].
Does this approach solve your problem?
Julian
[1] https://github.com/apache/calcite/blob/82dd78a14f6aef2eeec2f9c94978d04b4acc5359/core/src/test/java/org/apache/calcite/test/JdbcTest.java#L427
> On Nov 3, 2021, at 1:53 PM, Cyril Catheu <cy...@startree.ai.INVALID> wrote:
>
> Hey,
>
> I'm working on a product that executes arbitrary SQL on a "data source".
> A data source implements a SQL language, for instance, it can be MySQL,
> Pinot, BigQuery, etc...
> I'd like to introduce a macro language on top of the SQL.
> Very similar to what is done in grafana:
> https://grafana.com/docs/grafana/latest/datasources/mysql/#macros
> These macros would be resolved before the SQL is executed on the data
> source.
>
> A macro looks like a function but is a string replacement, for instance:
> __timeFilter(timeColumn, start, end) --> timeColumn>=start and
> timeColumn<end
>
> Grafana uses regex replacements, but I'm considering parsing the SQL with
> Calcite to apply the macros safely and manage nested macros/functions.
>
> --> Is this a good use case for Calcite? What seemed interesting to me is
> that Calcite has all those SqlDialect implementations.
>
> I had this flow in mind:
> 1. get the SqlDialect of my datasource
> *2.* add to this dialect my custom macros (add to an SqlFunction list? )
> 3. parse
> 4. replace
> 5. rebuild
>
> I'm wondering if 2. is possible? I'm not sure to understand how I could add
> a list of SqlFunction to a dialect, or build a new dialect from an existing
> one dynamically.
>
> If you've taken the time to read this message, thanks a lot!
> Have a nice day.
>
> --
> [image: Startree] <https://www.startree.ai/>
>
> Cyril de Catheu
>
> Software Engineer, StarTree
>
> cyril@startree.ai | +33 684-829-908
>
> [image: Linkedin] <https://www.linkedin.com/in/cyril-de-catheu/>[image:
> Twitter] <https://twitter.com/deCatheu>
Re: Calcite for macro parsing in SQL ?
Posted by Julian Hyde <jh...@gmail.com>.
No worries.
Thanks for following up and sharing what worked for you. It will help other people who are facing similar problems in future. Iām pleased that you found an example that you could copy!
Julian
> On Nov 10, 2021, at 7:03 AM, Cyril Catheu <cy...@startree.ai.INVALID> wrote:
>
> Also sorry Julian Hyde I forgot to subscribe to the mailing list when I
> sent my question so I did not see your answer š¤¦āāļø.
> I'll have a look at your suggestions.
>
> On Wed, Nov 10, 2021 at 3:45 PM Cyril Catheu <cy...@startree.ai> wrote:
>
>> Had a deeper look, I can answer my own questions now.
>>
>> 1. Adding a list of SqlFunction to a parser?
>> No need. The SqlParser will parse unknown symbols that look like function
>> calls as SqlCall nodes with SqlKind "OTHER_FUNCTION" and
>> SqlFunctionCategory "USER_DEFINED_FUNCTION".
>>
>> Note: I was actually getting confused by how functions identifiers are
>> escaped in the toString method.
>> For instance: select EXP(col), myMacro(col) from ... will be parsed then
>> returned as String as EXP(`col`), `myMacro`(`col`) from ...
>> Not sure to understand why the EXP is not escaped, but I was able to play
>> with the toSqlString method to get what I want.
>>
>> 2. Parse a SQL query, traverse the tree recursively, replace some nodes,
>> then get back a Sql string ?
>> Yes, it's a good use case for Calcite. A good inspiration was the Hoist
>> <https://calcite.apache.org/javadocAggregate/org/apache/calcite/tools/Hoist.html>
>> class. It replaces constants in q SQL string.
>> I implemented something similar that replaces my "USER_DEFINED_FUNCTION"
>> macros.
>>
>> It works like a charm.
>> Have a nice day.
>>
>> On Wed, Nov 3, 2021 at 9:53 PM Cyril Catheu <cy...@startree.ai> wrote:
>>
>>> Hey,
>>>
>>> I'm working on a product that executes arbitrary SQL on a "data source".
>>> A data source implements a SQL language, for instance, it can be MySQL,
>>> Pinot, BigQuery, etc...
>>> I'd like to introduce a macro language on top of the SQL.
>>> Very similar to what is done in grafana:
>>> https://grafana.com/docs/grafana/latest/datasources/mysql/#macros
>>> These macros would be resolved before the SQL is executed on the data
>>> source.
>>>
>>> A macro looks like a function but is a string replacement, for instance:
>>> __timeFilter(timeColumn, start, end) --> timeColumn>=start and
>>> timeColumn<end
>>>
>>> Grafana uses regex replacements, but I'm considering parsing the SQL with
>>> Calcite to apply the macros safely and manage nested macros/functions.
>>>
>>> --> Is this a good use case for Calcite? What seemed interesting to me is
>>> that Calcite has all those SqlDialect implementations.
>>>
>>> I had this flow in mind:
>>> 1. get the SqlDialect of my datasource
>>> *2.* add to this dialect my custom macros (add to an SqlFunction list? )
>>> 3. parse
>>> 4. replace
>>> 5. rebuild
>>>
>>> I'm wondering if 2. is possible? I'm not sure to understand how I could
>>> add a list of SqlFunction to a dialect, or build a new dialect from an
>>> existing one dynamically.
>>>
>>> If you've taken the time to read this message, thanks a lot!
>>> Have a nice day.
>>>
>>> --
>>> [image: Startree] <https://www.startree.ai/>
>>>
>>> Cyril de Catheu
>>>
>>> Software Engineer, StarTree
>>>
>>> cyril@startree.ai | +33 684-829-908
>>>
>>> [image: Linkedin] <https://www.linkedin.com/in/cyril-de-catheu/>[image:
>>> Twitter] <https://twitter.com/deCatheu>
>>>
>>
Re: Calcite for macro parsing in SQL ?
Posted by Cyril Catheu <cy...@startree.ai.INVALID>.
Also sorry Julian Hyde I forgot to subscribe to the mailing list when I
sent my question so I did not see your answer š¤¦āāļø.
I'll have a look at your suggestions.
On Wed, Nov 10, 2021 at 3:45 PM Cyril Catheu <cy...@startree.ai> wrote:
> Had a deeper look, I can answer my own questions now.
>
> 1. Adding a list of SqlFunction to a parser?
> No need. The SqlParser will parse unknown symbols that look like function
> calls as SqlCall nodes with SqlKind "OTHER_FUNCTION" and
> SqlFunctionCategory "USER_DEFINED_FUNCTION".
>
> Note: I was actually getting confused by how functions identifiers are
> escaped in the toString method.
> For instance: select EXP(col), myMacro(col) from ... will be parsed then
> returned as String as EXP(`col`), `myMacro`(`col`) from ...
> Not sure to understand why the EXP is not escaped, but I was able to play
> with the toSqlString method to get what I want.
>
> 2. Parse a SQL query, traverse the tree recursively, replace some nodes,
> then get back a Sql string ?
> Yes, it's a good use case for Calcite. A good inspiration was the Hoist
> <https://calcite.apache.org/javadocAggregate/org/apache/calcite/tools/Hoist.html>
> class. It replaces constants in q SQL string.
> I implemented something similar that replaces my "USER_DEFINED_FUNCTION"
> macros.
>
> It works like a charm.
> Have a nice day.
>
> On Wed, Nov 3, 2021 at 9:53 PM Cyril Catheu <cy...@startree.ai> wrote:
>
>> Hey,
>>
>> I'm working on a product that executes arbitrary SQL on a "data source".
>> A data source implements a SQL language, for instance, it can be MySQL,
>> Pinot, BigQuery, etc...
>> I'd like to introduce a macro language on top of the SQL.
>> Very similar to what is done in grafana:
>> https://grafana.com/docs/grafana/latest/datasources/mysql/#macros
>> These macros would be resolved before the SQL is executed on the data
>> source.
>>
>> A macro looks like a function but is a string replacement, for instance:
>> __timeFilter(timeColumn, start, end) --> timeColumn>=start and
>> timeColumn<end
>>
>> Grafana uses regex replacements, but I'm considering parsing the SQL with
>> Calcite to apply the macros safely and manage nested macros/functions.
>>
>> --> Is this a good use case for Calcite? What seemed interesting to me is
>> that Calcite has all those SqlDialect implementations.
>>
>> I had this flow in mind:
>> 1. get the SqlDialect of my datasource
>> *2.* add to this dialect my custom macros (add to an SqlFunction list? )
>> 3. parse
>> 4. replace
>> 5. rebuild
>>
>> I'm wondering if 2. is possible? I'm not sure to understand how I could
>> add a list of SqlFunction to a dialect, or build a new dialect from an
>> existing one dynamically.
>>
>> If you've taken the time to read this message, thanks a lot!
>> Have a nice day.
>>
>> --
>> [image: Startree] <https://www.startree.ai/>
>>
>> Cyril de Catheu
>>
>> Software Engineer, StarTree
>>
>> cyril@startree.ai | +33 684-829-908
>>
>> [image: Linkedin] <https://www.linkedin.com/in/cyril-de-catheu/>[image:
>> Twitter] <https://twitter.com/deCatheu>
>>
>
Re: Calcite for macro parsing in SQL ?
Posted by Cyril Catheu <cy...@startree.ai.INVALID>.
Had a deeper look, I can answer my own questions now.
1. Adding a list of SqlFunction to a parser?
No need. The SqlParser will parse unknown symbols that look like function
calls as SqlCall nodes with SqlKind "OTHER_FUNCTION" and
SqlFunctionCategory "USER_DEFINED_FUNCTION".
Note: I was actually getting confused by how functions identifiers are
escaped in the toString method.
For instance: select EXP(col), myMacro(col) from ... will be parsed then
returned as String as EXP(`col`), `myMacro`(`col`) from ...
Not sure to understand why the EXP is not escaped, but I was able to play
with the toSqlString method to get what I want.
2. Parse a SQL query, traverse the tree recursively, replace some nodes,
then get back a Sql string ?
Yes, it's a good use case for Calcite. A good inspiration was the Hoist
<https://calcite.apache.org/javadocAggregate/org/apache/calcite/tools/Hoist.html>
class. It replaces constants in q SQL string.
I implemented something similar that replaces my "USER_DEFINED_FUNCTION"
macros.
It works like a charm.
Have a nice day.
On Wed, Nov 3, 2021 at 9:53 PM Cyril Catheu <cy...@startree.ai> wrote:
> Hey,
>
> I'm working on a product that executes arbitrary SQL on a "data source".
> A data source implements a SQL language, for instance, it can be MySQL,
> Pinot, BigQuery, etc...
> I'd like to introduce a macro language on top of the SQL.
> Very similar to what is done in grafana:
> https://grafana.com/docs/grafana/latest/datasources/mysql/#macros
> These macros would be resolved before the SQL is executed on the data
> source.
>
> A macro looks like a function but is a string replacement, for instance:
> __timeFilter(timeColumn, start, end) --> timeColumn>=start and
> timeColumn<end
>
> Grafana uses regex replacements, but I'm considering parsing the SQL with
> Calcite to apply the macros safely and manage nested macros/functions.
>
> --> Is this a good use case for Calcite? What seemed interesting to me is
> that Calcite has all those SqlDialect implementations.
>
> I had this flow in mind:
> 1. get the SqlDialect of my datasource
> *2.* add to this dialect my custom macros (add to an SqlFunction list? )
> 3. parse
> 4. replace
> 5. rebuild
>
> I'm wondering if 2. is possible? I'm not sure to understand how I could
> add a list of SqlFunction to a dialect, or build a new dialect from an
> existing one dynamically.
>
> If you've taken the time to read this message, thanks a lot!
> Have a nice day.
>
> --
> [image: Startree] <https://www.startree.ai/>
>
> Cyril de Catheu
>
> Software Engineer, StarTree
>
> cyril@startree.ai | +33 684-829-908
>
> [image: Linkedin] <https://www.linkedin.com/in/cyril-de-catheu/>[image:
> Twitter] <https://twitter.com/deCatheu>
>