Posted to dev@phoenix.apache.org by "Josh Elser (JIRA)" <ji...@apache.org> on 2016/09/03 23:24:20 UTC

[jira] [Commented] (PHOENIX-3246) U+2002 (En Space) not handled as whitespace in grammar

    [ https://issues.apache.org/jira/browse/PHOENIX-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15461867#comment-15461867 ] 

Josh Elser commented on PHOENIX-3246:
-------------------------------------

The .001 patch treats U+2002 as whitespace, removes it as a fieldchar, and adds a test showing that a query whose terms are separated by that character can be parsed.

> U+2002 (En Space) not handled as whitespace in grammar
> ------------------------------------------------------
>
>                 Key: PHOENIX-3246
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3246
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>             Fix For: 4.9.0, 4.8.1
>
>         Attachments: PHOENIX-3246.001.patch
>
>
> I had the goofiest query issue the other day. A seemingly fine query was throwing a parse error via sqlline.
> {noformat}
> Error: ERROR 601 (42P00): Syntax error. Unexpected char: ' ' (state=42P00,code=601)
> org.apache.phoenix.exception.PhoenixParserException: ERROR 601 (42P00): Syntax error. Unexpected char: ' '
>        	at org.apache.phoenix.exception.PhoenixParserException.newException(PhoenixParserException.java:33)
>        	at org.apache.phoenix.parse.SQLParser.parseStatement(SQLParser.java:118)
>        	at org.apache.phoenix.jdbc.PhoenixStatement$PhoenixStatementParser.parseStatement(PhoenixStatement.java:1280)
>        	at org.apache.phoenix.jdbc.PhoenixStatement.parseStatement(PhoenixStatement.java:1363)
>        	at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1434)
>        	at sqlline.Commands.execute(Commands.java:822)
>        	at sqlline.Commands.sql(Commands.java:732)
>        	at sqlline.SqlLine.dispatch(SqlLine.java:807)
>        	at sqlline.SqlLine.runCommands(SqlLine.java:1710)
>        	at sqlline.Commands.run(Commands.java:1285)
>        	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>        	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>        	at java.lang.reflect.Method.invoke(Method.java:606)
>        	at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36)
>        	at sqlline.SqlLine.dispatch(SqlLine.java:803)
>        	at sqlline.SqlLine.initArgs(SqlLine.java:613)
>        	at sqlline.SqlLine.begin(SqlLine.java:656)
>        	at sqlline.SqlLine.start(SqlLine.java:398)
>        	at sqlline.SqlLine.main(SqlLine.java:292)
> Caused by: java.lang.RuntimeException: Unexpected char: ' '
>        	at org.apache.phoenix.parse.PhoenixSQLLexer.mOTHER(PhoenixSQLLexer.java:4324)
>        	at org.apache.phoenix.parse.PhoenixSQLLexer.mTokens(PhoenixSQLLexer.java:5437)
>        	at org.apache.phoenix.shaded.org.antlr.runtime.Lexer.nextToken(Lexer.java:85)
>        	at org.apache.phoenix.shaded.org.antlr.runtime.BufferedTokenStream.fetch(BufferedTokenStream.java:143)
>        	at org.apache.phoenix.shaded.org.antlr.runtime.BufferedTokenStream.sync(BufferedTokenStream.java:137)
>        	at org.apache.phoenix.shaded.org.antlr.runtime.CommonTokenStream.skipOffTokenChannels(CommonTokenStream.java:113)
>        	at org.apache.phoenix.shaded.org.antlr.runtime.CommonTokenStream.LT(CommonTokenStream.java:102)
>        	at org.apache.phoenix.shaded.org.antlr.runtime.BufferedTokenStream.LA(BufferedTokenStream.java:174)
>        	at org.apache.phoenix.shaded.org.antlr.runtime.BaseRecognizer.mismatchIsUnwantedToken(BaseRecognizer.java:127)
>        	at org.apache.phoenix.parse.PhoenixSQLParser.recoverFromMismatchedToken(PhoenixSQLParser.java:354)
>        	at org.apache.phoenix.shaded.org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
>        	at org.apache.phoenix.parse.PhoenixSQLParser.parseNoReserved(PhoenixSQLParser.java:9969)
>        	at org.apache.phoenix.parse.PhoenixSQLParser.identifier(PhoenixSQLParser.java:9936)
>        	at org.apache.phoenix.parse.PhoenixSQLParser.column_def(PhoenixSQLParser.java:3938)
>        	at org.apache.phoenix.parse.PhoenixSQLParser.column_defs(PhoenixSQLParser.java:3858)
>        	at org.apache.phoenix.parse.PhoenixSQLParser.create_table_node(PhoenixSQLParser.java:1104)
>        	at org.apache.phoenix.parse.PhoenixSQLParser.oneStatement(PhoenixSQLParser.java:816)
>        	at org.apache.phoenix.parse.PhoenixSQLParser.statement(PhoenixSQLParser.java:508)
>        	at org.apache.phoenix.parse.SQLParser.parseStatement(SQLParser.java:108)
>        	... 18 more
> {noformat}
> Re-typing the statement by hand worked successfully.
> After some hexdump and diff action, I finally found that some of the space characters in the statement were not the normal ASCII 0x20 character but the Unicode U+2002 "En Space" character. The two look identical to the eye, which caused all of the confusion.
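
For anyone chasing a similar parse error, the hexdump-and-diff step described above can be approximated with a small standalone check. This is only a sketch and not part of the patch: the class name SuspiciousSpaceFinder, the method findNonAsciiSpaces, and the sample CREATE TABLE statement are all made up for illustration. It uses only JDK calls and reports any space-like code point outside the ASCII range.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Phoenix): flags space-like characters
// that are not plain ASCII, such as U+2002 "En Space".
public class SuspiciousSpaceFinder {

    /** Returns one entry per non-ASCII whitespace/space code point found in sql. */
    public static List<String> findNonAsciiSpaces(String sql) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < sql.length(); ) {
            int cp = sql.codePointAt(i);
            // isWhitespace covers most Unicode spaces; isSpaceChar also
            // catches the non-breaking ones (e.g. U+00A0).
            boolean spaceLike = Character.isWhitespace(cp) || Character.isSpaceChar(cp);
            if (cp > 0x7F && spaceLike) {
                hits.add(String.format("index %d: U+%04X", i, cp));
            }
            i += Character.charCount(cp);
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up statement with an En Space between the column name and type.
        String sql = "CREATE TABLE t (pk\u2002VARCHAR PRIMARY KEY)";
        for (String hit : findNonAsciiSpaces(sql)) {
            System.out.println(hit); // prints "index 18: U+2002"
        }
    }
}
```

Running a check like this over a pasted statement before handing it to sqlline points straight at the offending character, which is otherwise invisible in a terminal.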



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)