You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by 徐仁和 <xu...@gmail.com> on 2022/02/16 08:17:36 UTC

Identifier starting with number parse fail

Hi community:

Parsing sql fails, when meeting the identifier starting with number.

Here, test case:
// org.apache.calcite.test.SqlToRelConverterTest
@Test void testIdentifierStartWithNumber() {
  final String sql =
      "select 1_c1 from (select deptno as 1_c1 from emp)";
  sql(sql).ok();
}

It throws an exception as below:
java.lang.RuntimeException:
org.apache.calcite.sql.parser.SqlParseException: Encountered "1" at line 1,
column 36.
Was expecting one of:
    <QUOTED_STRING> ...
    <BRACKET_QUOTED_IDENTIFIER> ...
    <QUOTED_IDENTIFIER> ...
    <BACK_QUOTED_IDENTIFIER> ...
    <BIG_QUERY_BACK_QUOTED_IDENTIFIER> ...
    <HYPHENATED_IDENTIFIER> ...
    <IDENTIFIER> ...
    <UNICODE_QUOTED_IDENTIFIER> ...

at
org.apache.calcite.sql.parser.SqlParseException.writeReplace(SqlParseException.java:171)

*Mysql supports this case, should calcite support it?*

*Failed png in Calcite:*
[image: image.png]

*Successful png in MySql:*
[image: image.png]



Best XuRenhe

Re: Identifier starting with number parse fail

Posted by 徐仁和 <xu...@gmail.com>.
Hi Ruben, Xiong Duan and Julian
Thanks for your guidance
This problem about parsing identifier fail is without any quoting.
As Julian suggested, I will log a Jira later.

Julian Hyde <jh...@gmail.com> 于2022年2月17日周四 04:01写道:

> +1 Babel is an appropriate place to support this. (It would not be
> practical to implement in parser config, because tokenization needs to be
> efficient. It is better done in a separate parser, which is what Babel
> has.)
>
> XuRenhe, Please log a feature request in Jira to support MySQL-style
> identifiers in Babel.
>
> I know that you were talking about unquoted identifiers, but note that
> identifiers can start with numerics in Calcite and standard SQL provided
> that you quote them, e.g.
>
>   SELECT deptno AS “12c3” FROM emp
>
> Julian
>
>
> > On Feb 16, 2022, at 3:02 AM, xiong duan <no...@gmail.com> wrote:
> >
> > I do some search in:
> >
> > PostgreSQL: https://www.postgresql.org/docs/9.1/sql-syntax-lexical.html
> > ----> don't support
> > SqlServer:
> >
> https://docs.microsoft.com/en-us/sql/relational-databases/databases/database-identifiers?view=sql-server-ver15
> > ---->don't support
> > Oracle: https://docs.oracle.com/javadb/10.6.2.1/ref/crefsqlj1003454.html
> > ----->don't support
> >
> > So If we need to support this. The Babel module is appropriate.
> >
> > Ruben Q L <ru...@gmail.com> 于2022年2月16日周三 18:31写道:
> >
> >> Hello XuRenhe,
> >>
> >> Please be aware that our dev list does not allow images.
> >>
> >> Regarding your question, I found this on the SQL standard:
> >>
> >> « An <identifier start> is any character in the Unicode General Category
> >> classes “Lu”, “Ll”, “Lt”, “Lm”, “Lo”, or “Nl”.
> >> NOTE 95 — The Unicode General Category classes “Lu”, “Ll”, “Lt”, “Lm”,
> >> “Lo”, and “Nl” are assigned to Unicode characters that are,
> respectively,
> >> upper-case letters, lower case letters, title-case letters, modifier
> >> letters, other letters, and letter numbers. »
> >>
> >> In case it helps, I found in Wikipedia a link to an older version of the
> >> standard [1] which basically says the same thing (page 85).
> >>
> >> To sum up, identifiers must start with a letter, so Calcite is
> respecting
> >> the standard (and it seems MySql deviates from it).
> >>
> >> I am not sure if this is something that could (or should) be achieved
> with
> >> the Calcite SqlDialect mechanism; I am not very familiar with it, maybe
> >> someone else can provide more information.
> >>
> >> Best regards,
> >> Ruben
> >>
> >> [1] http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
> >>
> >>
> >> On Wed, Feb 16, 2022 at 8:18 AM 徐仁和 <xu...@gmail.com> wrote:
> >>
> >>> Hi community:
> >>>
> >>> Parsing sql fails, when meeting the identifier starting with number.
> >>>
> >>> Here, test case:
> >>> // org.apache.calcite.test.SqlToRelConverterTest
> >>> @Test void testIdentifierStartWithNumber() {
> >>>  final String sql =
> >>>      "select 1_c1 from (select deptno as 1_c1 from emp)";
> >>>  sql(sql).ok();
> >>> }
> >>>
> >>> It throws an exception as below:
> >>> java.lang.RuntimeException:
> >>> org.apache.calcite.sql.parser.SqlParseException: Encountered "1" at
> line
> >> 1,
> >>> column 36.
> >>> Was expecting one of:
> >>>    <QUOTED_STRING> ...
> >>>    <BRACKET_QUOTED_IDENTIFIER> ...
> >>>    <QUOTED_IDENTIFIER> ...
> >>>    <BACK_QUOTED_IDENTIFIER> ...
> >>>    <BIG_QUERY_BACK_QUOTED_IDENTIFIER> ...
> >>>    <HYPHENATED_IDENTIFIER> ...
> >>>    <IDENTIFIER> ...
> >>>    <UNICODE_QUOTED_IDENTIFIER> ...
> >>>
> >>> at
> >>>
> >>
> org.apache.calcite.sql.parser.SqlParseException.writeReplace(SqlParseException.java:171)
> >>>
> >>> *Mysql supports this case, should calcite support it?*
> >>>
> >>> *Failed png in Calcite:*
> >>> [image: image.png]
> >>>
> >>> *Successful png in MySql:*
> >>> [image: image.png]
> >>>
> >>>
> >>>
> >>> Best XuRenhe
> >>>
> >>>
> >>
>
>

Re: Identifier starting with number parse fail

Posted by Julian Hyde <jh...@gmail.com>.
+1 Babel is an appropriate place to support this. (It would not be practical to implement in parser config, because tokenization needs to be efficient. It is better done in a separate parser, which is what Babel has.) 

XuRenhe, Please log a feature request in Jira to support MySQL-style identifiers in Babel.

I know that you were talking about unquoted identifiers, but note that identifiers can start with numerics in Calcite and standard SQL provided that you quote them, e.g.

  SELECT deptno AS “12c3” FROM emp

Julian


> On Feb 16, 2022, at 3:02 AM, xiong duan <no...@gmail.com> wrote:
> 
> I do some search in:
> 
> PostgreSQL: https://www.postgresql.org/docs/9.1/sql-syntax-lexical.html
> ----> don't support
> SqlServer:
> https://docs.microsoft.com/en-us/sql/relational-databases/databases/database-identifiers?view=sql-server-ver15
> ---->don't support
> Oracle: https://docs.oracle.com/javadb/10.6.2.1/ref/crefsqlj1003454.html
> ----->don't support
> 
> So If we need to support this. The Babel module is appropriate.
> 
> Ruben Q L <ru...@gmail.com> 于2022年2月16日周三 18:31写道:
> 
>> Hello XuRenhe,
>> 
>> Please be aware that our dev list does not allow images.
>> 
>> Regarding your question, I found this on the SQL standard:
>> 
>> « An <identifier start> is any character in the Unicode General Category
>> classes “Lu”, “Ll”, “Lt”, “Lm”, “Lo”, or “Nl”.
>> NOTE 95 — The Unicode General Category classes “Lu”, “Ll”, “Lt”, “Lm”,
>> “Lo”, and “Nl” are assigned to Unicode characters that are, respectively,
>> upper-case letters, lower case letters, title-case letters, modifier
>> letters, other letters, and letter numbers. »
>> 
>> In case it helps, I found in Wikipedia a link to an older version of the
>> standard [1] which basically says the same thing (page 85).
>> 
>> To sum up, identifiers must start with a letter, so Calcite is respecting
>> the standard (and it seems MySql deviates from it).
>> 
>> I am not sure if this is something that could (or should) be achieved with
>> the Calcite SqlDialect mechanism; I am not very familiar with it, maybe
>> someone else can provide more information.
>> 
>> Best regards,
>> Ruben
>> 
>> [1] http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
>> 
>> 
>> On Wed, Feb 16, 2022 at 8:18 AM 徐仁和 <xu...@gmail.com> wrote:
>> 
>>> Hi community:
>>> 
>>> Parsing sql fails, when meeting the identifier starting with number.
>>> 
>>> Here, test case:
>>> // org.apache.calcite.test.SqlToRelConverterTest
>>> @Test void testIdentifierStartWithNumber() {
>>>  final String sql =
>>>      "select 1_c1 from (select deptno as 1_c1 from emp)";
>>>  sql(sql).ok();
>>> }
>>> 
>>> It throws an exception as below:
>>> java.lang.RuntimeException:
>>> org.apache.calcite.sql.parser.SqlParseException: Encountered "1" at line
>> 1,
>>> column 36.
>>> Was expecting one of:
>>>    <QUOTED_STRING> ...
>>>    <BRACKET_QUOTED_IDENTIFIER> ...
>>>    <QUOTED_IDENTIFIER> ...
>>>    <BACK_QUOTED_IDENTIFIER> ...
>>>    <BIG_QUERY_BACK_QUOTED_IDENTIFIER> ...
>>>    <HYPHENATED_IDENTIFIER> ...
>>>    <IDENTIFIER> ...
>>>    <UNICODE_QUOTED_IDENTIFIER> ...
>>> 
>>> at
>>> 
>> org.apache.calcite.sql.parser.SqlParseException.writeReplace(SqlParseException.java:171)
>>> 
>>> *Mysql supports this case, should calcite support it?*
>>> 
>>> *Failed png in Calcite:*
>>> [image: image.png]
>>> 
>>> *Successful png in MySql:*
>>> [image: image.png]
>>> 
>>> 
>>> 
>>> Best XuRenhe
>>> 
>>> 
>> 


Re: Identifier starting with number parse fail

Posted by xiong duan <no...@gmail.com>.
I do some search in:

PostgreSQL: https://www.postgresql.org/docs/9.1/sql-syntax-lexical.html
----> don't support
SqlServer:
https://docs.microsoft.com/en-us/sql/relational-databases/databases/database-identifiers?view=sql-server-ver15
---->don't support
Oracle: https://docs.oracle.com/javadb/10.6.2.1/ref/crefsqlj1003454.html
----->don't support

So If we need to support this. The Babel module is appropriate.

Ruben Q L <ru...@gmail.com> 于2022年2月16日周三 18:31写道:

> Hello XuRenhe,
>
> Please be aware that our dev list does not allow images.
>
> Regarding your question, I found this on the SQL standard:
>
> « An <identifier start> is any character in the Unicode General Category
> classes “Lu”, “Ll”, “Lt”, “Lm”, “Lo”, or “Nl”.
> NOTE 95 — The Unicode General Category classes “Lu”, “Ll”, “Lt”, “Lm”,
> “Lo”, and “Nl” are assigned to Unicode characters that are, respectively,
> upper-case letters, lower case letters, title-case letters, modifier
> letters, other letters, and letter numbers. »
>
> In case it helps, I found in Wikipedia a link to an older version of the
> standard [1] which basically says the same thing (page 85).
>
> To sum up, identifiers must start with a letter, so Calcite is respecting
> the standard (and it seems MySql deviates from it).
>
> I am not sure if this is something that could (or should) be achieved with
> the Calcite SqlDialect mechanism; I am not very familiar with it, maybe
> someone else can provide more information.
>
> Best regards,
> Ruben
>
> [1] http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
>
>
> On Wed, Feb 16, 2022 at 8:18 AM 徐仁和 <xu...@gmail.com> wrote:
>
> > Hi community:
> >
> > Parsing sql fails, when meeting the identifier starting with number.
> >
> > Here, test case:
> > // org.apache.calcite.test.SqlToRelConverterTest
> > @Test void testIdentifierStartWithNumber() {
> >   final String sql =
> >       "select 1_c1 from (select deptno as 1_c1 from emp)";
> >   sql(sql).ok();
> > }
> >
> > It throws an exception as below:
> > java.lang.RuntimeException:
> > org.apache.calcite.sql.parser.SqlParseException: Encountered "1" at line
> 1,
> > column 36.
> > Was expecting one of:
> >     <QUOTED_STRING> ...
> >     <BRACKET_QUOTED_IDENTIFIER> ...
> >     <QUOTED_IDENTIFIER> ...
> >     <BACK_QUOTED_IDENTIFIER> ...
> >     <BIG_QUERY_BACK_QUOTED_IDENTIFIER> ...
> >     <HYPHENATED_IDENTIFIER> ...
> >     <IDENTIFIER> ...
> >     <UNICODE_QUOTED_IDENTIFIER> ...
> >
> > at
> >
> org.apache.calcite.sql.parser.SqlParseException.writeReplace(SqlParseException.java:171)
> >
> > *Mysql supports this case, should calcite support it?*
> >
> > *Failed png in Calcite:*
> > [image: image.png]
> >
> > *Successful png in MySql:*
> > [image: image.png]
> >
> >
> >
> > Best XuRenhe
> >
> >
>

Re: Identifier starting with number parse fail

Posted by Ruben Q L <ru...@gmail.com>.
Hello XuRenhe,

Please be aware that our dev list does not allow images.

Regarding your question, I found this on the SQL standard:

« An <identifier start> is any character in the Unicode General Category
classes “Lu”, “Ll”, “Lt”, “Lm”, “Lo”, or “Nl”.
NOTE 95 — The Unicode General Category classes “Lu”, “Ll”, “Lt”, “Lm”,
“Lo”, and “Nl” are assigned to Unicode characters that are, respectively,
upper-case letters, lower case letters, title-case letters, modifier
letters, other letters, and letter numbers. »

In case it helps, I found in Wikipedia a link to an older version of the
standard [1] which basically says the same thing (page 85).

To sum up, identifiers must start with a letter, so Calcite is respecting
the standard (and it seems MySql deviates from it).

I am not sure if this is something that could (or should) be achieved with
the Calcite SqlDialect mechanism; I am not very familiar with it, maybe
someone else can provide more information.

Best regards,
Ruben

[1] http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt


On Wed, Feb 16, 2022 at 8:18 AM 徐仁和 <xu...@gmail.com> wrote:

> Hi community:
>
> Parsing sql fails, when meeting the identifier starting with number.
>
> Here, test case:
> // org.apache.calcite.test.SqlToRelConverterTest
> @Test void testIdentifierStartWithNumber() {
>   final String sql =
>       "select 1_c1 from (select deptno as 1_c1 from emp)";
>   sql(sql).ok();
> }
>
> It throws an exception as below:
> java.lang.RuntimeException:
> org.apache.calcite.sql.parser.SqlParseException: Encountered "1" at line 1,
> column 36.
> Was expecting one of:
>     <QUOTED_STRING> ...
>     <BRACKET_QUOTED_IDENTIFIER> ...
>     <QUOTED_IDENTIFIER> ...
>     <BACK_QUOTED_IDENTIFIER> ...
>     <BIG_QUERY_BACK_QUOTED_IDENTIFIER> ...
>     <HYPHENATED_IDENTIFIER> ...
>     <IDENTIFIER> ...
>     <UNICODE_QUOTED_IDENTIFIER> ...
>
> at
> org.apache.calcite.sql.parser.SqlParseException.writeReplace(SqlParseException.java:171)
>
> *Mysql supports this case, should calcite support it?*
>
> *Failed png in Calcite:*
> [image: image.png]
>
> *Successful png in MySql:*
> [image: image.png]
>
>
>
> Best XuRenhe
>
>