Posted to dev@calcite.apache.org by "Rohan Garg (Jira)" <ji...@apache.org> on 2022/03/31 09:42:00 UTC

[jira] [Created] (CALCITE-5074) Allowing parser extensions

Rohan Garg created CALCITE-5074:
-----------------------------------

             Summary: Allowing parser extensions
                 Key: CALCITE-5074
                 URL: https://issues.apache.org/jira/browse/CALCITE-5074
             Project: Calcite
          Issue Type: New Feature
            Reporter: Rohan Garg


[~cheddar] and I recently needed to extend several parts of Calcite with non-standard SQL in a project that uses Calcite.  For example, Calcite's TABLESAMPLE keyword always takes a percentage to sample, and we wanted it to also accept an explicit number of rows.  Because this is non-standard, we felt it wouldn't make sense to open a PR that changes Calcite's normal behavior, so we looked for a way to extend Calcite with our capabilities while still benefiting from everything Calcite offers.

We came up with the idea of an "override map" that can be provided in configuration and causes the parser to override a specific production rule.  This commit shows the idea in action: https://github.com/rohangarg/calcite/commit/7f0c6ad8c8f6bb2a1d1cca025d35670a8c62b3c4.  It's a bit fiddly in that each override point needs to be wrapped in if/else template syntax, but that pattern already seems to exist, so maybe it's a feature rather than a bug?  Either way, if that's too fiddly but the general pattern makes sense, a follow-up task could look for a place to add this more generically, without needing the if/else in the template.  Does this seem like something that could be merged?

We have included an override grammar that also relaxes the percentage constraints on TABLESAMPLE as an example of the type of customization that this is attempting to enable.
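
To make the "override map" idea more concrete, here is a minimal, self-contained Java sketch of the dispatch pattern. It is not the code from the linked commit and does not use Calcite's API: the class OverrideMapSketch, the SampleSpec type, the "TableSample" key, and the "<n> ROWS" syntax are all hypothetical names chosen for illustration. The real change operates on the parser template rather than on hand-written functions, so treat this only as a model of "use the configured override for a production if one exists, otherwise fall back to the default rule".

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch only: none of these names exist in Calcite.
final class OverrideMapSketch {

  /** Parsed sample clause; kind is either "PERCENT" or "ROWS". */
  static final class SampleSpec {
    final String kind;
    final double value;

    SampleSpec(String kind, double value) {
      this.kind = kind;
      this.value = value;
    }

    @Override public String toString() {
      return kind + " " + value;
    }
  }

  /** Default production: the argument must be a percentage in [0, 100]. */
  static SampleSpec defaultSample(String arg) {
    double pct = Double.parseDouble(arg.trim());
    if (pct < 0 || pct > 100) {
      throw new IllegalArgumentException("percentage out of range: " + pct);
    }
    return new SampleSpec("PERCENT", pct);
  }

  /** Override production: also accepts "<n> ROWS", else defers to the default. */
  static SampleSpec rowsAwareSample(String arg) {
    String s = arg.trim().toUpperCase(Locale.ROOT);
    if (s.endsWith("ROWS")) {
      String n = s.substring(0, s.length() - "ROWS".length()).trim();
      return new SampleSpec("ROWS", Long.parseLong(n));
    }
    return defaultSample(arg);
  }

  /** Dispatch point: use the configured override if present, else the default. */
  static SampleSpec parseSample(
      String arg, Map<String, Function<String, SampleSpec>> overrides) {
    Function<String, SampleSpec> rule =
        overrides.getOrDefault("TableSample", OverrideMapSketch::defaultSample);
    return rule.apply(arg);
  }

  public static void main(String[] args) {
    Map<String, Function<String, SampleSpec>> overrides = new HashMap<>();
    // With an empty map, the default percentage rule applies.
    System.out.println(parseSample("10", overrides));
    // After registering an override, an explicit row count is accepted too.
    overrides.put("TableSample", OverrideMapSketch::rowsAwareSample);
    System.out.println(parseSample("1000 ROWS", overrides));
  }
}
```

With an empty map the behavior is unchanged, which is the property we care about: standard Calcite parsing is untouched unless a project opts in through configuration.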

Just for reference, this customization does exist in other systems as well, so it's non-standard but also not unheard of:

1. MS SQL Server : https://docs.microsoft.com/en-us/sql/t-sql/queries/from-transact-sql?view=sql-server-ver15#tablesample-clause
2. Snowflake : https://docs.snowflake.com/en/sql-reference/constructs/sample.html
3. Google Spanner : https://cloud.google.com/spanner/docs/reference/standard-sql/query-syntax#tablesample_operator
4. Apache Spark : https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-sampling.html



--
This message was sent by Atlassian Jira
(v8.20.1#820001)