Posted to issues-all@impala.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2019/01/08 00:10:00 UTC

[jira] [Comment Edited] (IMPALA-8051) Compute stats fails on a column with comment character in name

    [ https://issues.apache.org/jira/browse/IMPALA-8051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16736503#comment-16736503 ] 

Paul Rogers edited comment on IMPALA-8051 at 1/8/19 12:09 AM:
--------------------------------------------------------------

It turns out that the "#" character poses a unique problem. It is a comment character, and it seems to fall through the cracks in the technique we use to detect that an identifier must be quoted. To check whether a name needs quotes, we let Hive parse it. If the name is, say, "foo+", we get two tokens, and so we quote the name. But for "foo#" we get just one token, because the "#" starts a comment and the rest is discarded, so the name is wrongly left unquoted.
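
The failure mode can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lexer below is a toy written for this example, not the Hive or Impala lexer, and the names tokenize, needs_quotes_naive, needs_quotes_fixed and to_sql_ident are all hypothetical. The "fixed" check shows one possible guard (requiring the single token to cover the whole name); the actual patch is not shown in this message and may differ.

```python
import re

# Toy token rules (assumption): '#' starts a comment that runs to end of line,
# identifiers are letters, digits and underscores, anything else is "other".
_TOKEN_RE = re.compile(
    r"(?P<comment>#.*)"
    r"|(?P<ident>[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<space>\s+)"
    r"|(?P<other>.)"
)


def tokenize(text):
    """Return (kind, text) pairs, dropping comments and whitespace."""
    tokens = []
    for match in _TOKEN_RE.finditer(text):
        if match.lastgroup in ("comment", "space"):
            continue  # a real lexer also hides these from the caller
        tokens.append((match.lastgroup, match.group()))
    return tokens


def needs_quotes_naive(name):
    """Quote unless the name lexes as exactly one identifier token."""
    tokens = tokenize(name)
    return not (len(tokens) == 1 and tokens[0][0] == "ident")


def needs_quotes_fixed(name):
    """As above, but the single token must also cover the entire name."""
    tokens = tokenize(name)
    return not (len(tokens) == 1 and tokens[0] == ("ident", name))


def to_sql_ident(name):
    """Wrap the name in backticks when needed (embedded backticks not handled)."""
    return "`" + name + "`" if needs_quotes_fixed(name) else name


print(needs_quotes_naive("foo+"))  # True: two tokens, so the name is quoted
print(needs_quotes_naive("foo#"))  # False: '#' was eaten as a comment (the bug)
print(needs_quotes_fixed("foo#"))  # True: the token "foo" is not the whole name
print("NDV({0}) AS {0}".format(to_sql_ident("colb#")))  # NDV(`colb#`) AS `colb#`
```

With the naive check, "colb#" would be emitted unquoted, which matches the NDV(colb#) AS colb# fragment in the reported error.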

Added this trivial fix to the patch for IMPALA-7905.


> Compute stats fails on a column with comment character in name
> --------------------------------------------------------------
>
>                 Key: IMPALA-8051
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8051
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 3.1.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>
> Problem - A "compute stats" query executed on a table that has the special character "#" in one of its column names fails with the error below:
> WARNINGS: AnalysisException: Syntax error in line 1:
> ...length(cola)), NDV(colb#) AS colb#, CAST(-1 as BIG...
>                              ^
> Encountered: Unexpected character
> Expected: ADD, ALTER, AND, ARRAY, AS, ASC, BETWEEN, BIGINT, BINARY, BLOCK_SIZE, BOOLEAN, CACHED, CASCADE, CHANGE, CHAR, COMMENT, COMPRESSION, CROSS, DATE, DATETIME, DECIMAL, DEFAULT, DESC, DIV, REAL, DROP, ELSE, ENCODING, END, FLOAT, FOLLOWING, FROM, FULL, GROUP, IGNORE, HAVING, ILIKE, IN, INNER, INTEGER, IREGEXP, IS, JOIN, LEFT, LIKE, LIMIT, LOCATION, MAP, NOT, NULL, NULLS, OFFSET, ON, OR, ORDER, PARTITION, PARTITIONED, PRECEDING, PRIMARY, PURGE, RANGE, RECOVER, REGEXP, RENAME, REPLACE, RESTRICT, RIGHT, RLIKE, ROW, ROWS, SELECT, SET, SMALLINT, SORT, STORED, STRAIGHT_JOIN, STRING, STRUCT, TABLESAMPLE, TBLPROPERTIES, THEN, TIMESTAMP, TINYINT, TO, UNCACHED, UNION, USING, VALUES, VARCHAR, WHEN, WHERE, WITH, COMMA, IDENTIFIER
> Steps to reproduce the issue -
> # From Hive, create a table with a special character in one of its column names. For example:
> {code:sql}
> CREATE TABLE test_special_character (`id#` int);
> {code}
> # Execute "INVALIDATE METADATA test_special_character" from Impala.
> # Execute "COMPUTE STATS test_special_character" from Impala; this produces the error shown above.
> Impala does not allow creating tables whose column names contain special characters, but Hive allows it when the name is escaped with backticks (``). However, Impala can still load the metadata of such a table and can read from a column whose name contains a special character, provided the name is escaped with backticks (``). For example, the query below can be executed from Impala -
> {code:sql}
> select `id#` from test_special_character;
> {code}
> However, when the "compute stats" command is executed, the query it triggers to compute column-level stats does not escape the column name with backticks, because it does not know that the name contains a special character. This is the cause of the issue.
> (Reported by a user.)
