Posted to issues@calcite.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2020/12/15 14:18:00 UTC

[jira] [Commented] (CALCITE-4438) Calcite SQLParser: Calcite throwing error while parsing Spark SQL syntax - INSERT OVERWRITE, RLIKE,DATE

    [ https://issues.apache.org/jira/browse/CALCITE-4438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249716#comment-17249716 ] 

Stamatis Zampetakis commented on CALCITE-4438:
----------------------------------------------

Thanks for working on this [~sambekar]. I had a quick look at the PR and here are a few general comments:

* Smaller contributions are easier to review and merge, so when possible break them down into separate JIRAs/PRs.
* Syntax and functions outside the SQL standard, such as INSERT OVERWRITE, generally go into the babel module and not core.
* Non-standard functions such as RLIKE generally do not need to be added to the parser (check [SqlLibraryOperators|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/fun/SqlLibraryOperators.java]).

When the core is extended with non-standard features there should be some justification.

Why are there so many changes in core/src/main/codegen/config.fmpp?


> Calcite SQLParser: Calcite throwing error while parsing Spark SQL syntax - INSERT OVERWRITE, RLIKE,DATE 
> --------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-4438
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4438
>             Project: Calcite
>          Issue Type: Improvement
>          Components: spark
>            Reporter: shradha
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Calcite's SqlParser throws errors when parsing Spark SQL syntax: INSERT OVERWRITE, RLIKE, DATE, DAY, YEAR, MONTH. I am using version 1.26.0 of calcite-core and calcite-server.
> It also throws an error if year, day, month, identity, value, or date is used as an alias in a SQL query.
> For example:
> 1) DATE QUERIES
> val query1 = "select DATE(gns_date) as dt"
> StatementMetadataFragment(com.intuit.superglue.pipeline.parsers.CalciteStatementParser,UNKNOWN,List(),List(),List(org.apache.calcite.sql.parser.SqlParseException: Incorrect syntax near the keyword 'DATE' at line 1, column 10.
> val query3 = "select date('2020-11-07') as date"
> StatementMetadataFragment(com.intuit.superglue.pipeline.parsers.CalciteStatementParser,UNKNOWN,List(),List(),List(org.apache.calcite.sql.parser.SqlParseException: Encountered "date (" at line 1, column 8.
> val query3 = "select loan_id as year, as_of_date as date"
> StatementMetadataFragment(com.intuit.superglue.pipeline.parsers.CalciteStatementParser,UNKNOWN,List(),List(),List(org.apache.calcite.sql.parser.SqlParseException: Encountered "year" at line 1, column 19.
>  
> val query4 = "select {D'1990-01-01'} as day,a as month from table"
> StatementMetadataFragment(com.intuit.superglue.pipeline.parsers.CalciteStatementParser,UNKNOWN,List(),List(),List(org.apache.calcite.sql.parser.SqlParseException: Encountered "day" at line 1, column 27.
> 2) INSERT OVERWRITE
> INSERT OVERWRITE TABLE sbg_schema.tableA
> select distinct SURV_MARK_CUST_IDEN
> ,latest_surv_oci
> ,tran_mark_cust_iden
> ,surv_tran_mci
> ,orgz_mark_cust_iden from tableB
> StatementMetadataFragment(com.intuit.superglue.pipeline.parsers.CalciteStatementParser,UNKNOWN,List(),List(),List(org.apache.calcite.sql.parser.SqlParseException: Encountered "OVERWRITE" at line 2, column 8.
> 3) RLIKE
> val query1 = "select cola from tableA where MAX(realm_email) rlike '.+@.+\\\\..+'"
> StatementMetadataFragment(com.intuit.superglue.pipeline.parsers.CalciteStatementParser,UNKNOWN,List(),List(),List(org.apache.calcite.sql.parser.SqlParseException: Encountered "rlike" at line 1, column 48.
>  
> All of these queries are parsed correctly by Spark SQL, but Calcite's parser grammar file doesn't support these tokens.
>  
> *This is the code for creating the sqlParser instance:*
> object Spark3SqlDialect {
>   val DefaultContext: SqlDialect.Context = SqlDialect.EMPTY_CONTEXT
>     .withDatabaseProduct(SqlDialect.DatabaseProduct.SPARK)
>     .withNullCollation(NullCollation.LOW)
>     .withCaseSensitive(false)
>     .withConformance(SqlConformanceEnum.BABEL)
>     .withIdentifierQuoteString("`")
>     .withQuotedCasing(Casing.UNCHANGED)
>     .withUnquotedCasing(Casing.UNCHANGED)
>   val DEFAULT = new SparkSqlDialect(DefaultContext)
> }
>
> val sqlParser: SqlParser = SqlParser.create(sqlQuery, createSqlParserConfig())
>
> private def createSqlParserConfig() = {
>   Spark3SqlDialect.DEFAULT.configureParser(SqlParser.config()
>     .withParserFactory(SqlDdlParserImpl.FACTORY))
>     .withConformance(SqlConformanceEnum.BABEL)
> }
>  


