Posted to issues@impala.apache.org by "Csaba Ringhofer (Jira)" <ji...@apache.org> on 2022/09/15 18:08:00 UTC

[jira] [Created] (IMPALA-11587) Improve handling of special chars in column names

Csaba Ringhofer created IMPALA-11587:
----------------------------------------

             Summary: Improve handling of special chars in column names
                 Key: IMPALA-11587
                 URL: https://issues.apache.org/jira/browse/IMPALA-11587
             Project: IMPALA
          Issue Type: Improvement
          Components: Catalog, Frontend
            Reporter: Csaba Ringhofer


Hive accepts several special characters in a column name if the name is quoted with backticks (` `):
create table tspeccharcol (`@"!£"!$%^=&)(-` int);

The table above can be used by Impala, but there are some caveats:
- Impala returns an error for a similar column name: Invalid column/field name
- SHOW CREATE TABLE in Impala does not quote the column, so it will return a statement that is not executable in Hive (Hive quotes it correctly)

I am not sure whether we should accept all of these special characters; the original question that prompted this investigation was only about @.

The error is returned due to a Hive function:
https://github.com/apache/impala/blob/cfd79b40beab86f08ad72e0bea41eabf736d0a99/fe/src/main/java/org/apache/impala/catalog/Hive3MetastoreShimBase.java#L166
The second parameter should be a HiveConf, which will decide whether to accept special chars:
https://github.com/apache/hive/blob/293a448296933b7498a91e7eeb91edc88dfaa07e/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java#L219
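
The following is a rough, self-contained sketch of how such a configuration-dependent check might behave. It assumes that the validation is a regular-expression match whose allowed character set widens when a flag is enabled. The class name, the boolean parameter (standing in for the HiveConf lookup) and the list of extra characters are all hypothetical, and the code does not reproduce Hive's actual logic.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for a config-dependent name check; not Hive's code.
final class ColumnNameValidation {
    // Assumed extra characters for illustration; the real list is defined in Hive.
    private static final String EXTRA_CHARS = "@$%";

    // allowSpecialChars stands in for a setting that would be read from a HiveConf.
    static boolean validateName(String name, boolean allowSpecialChars) {
        StringBuilder allowed = new StringBuilder("\\w");
        if (allowSpecialChars) {
            for (char c : EXTRA_CHARS.toCharArray()) {
                // Backslash-escape each character so it is literal inside [...].
                allowed.append('\\').append(c);
            }
        }
        return Pattern.matches("[" + allowed + "]+", name);
    }

    public static void main(String[] args) {
        System.out.println(validateName("a@b", false));
        System.out.println(validateName("a@b", true));
    }
}
```

Under these assumptions, a caller that supplies no configuration always gets the strict behaviour, so a name containing @ is rejected even when the metastore itself would allow it.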

Besides this function, Hive also seems to have some other rules, e.g. ':' is not accepted:
create table if not exists tspeccharcol (`:` int);
Error: Error while compiling statement: FAILED: ParseException line 1:48 Failed to recognize predicate ')'. Failed rule: '[., :] can not be used in column name in create table statement.' in column specification (state=42000,code=40000)


Also noticed some weirdness in Hive / beeline:
While this is accepted:
create table if not exists tspeccharcol (`""` int);
these ones are not:
create table if not exists tspeccharcol (`"` int);
create table if not exists tspeccharcol (`"\"` int);

Both fail with: Error: Error while compiling statement: FAILED: ParseException line 1:49 extraneous input ';' expecting EOF near '<EOF>' (state=42000,code=40000)

Some part of the client/parser does not seem to notice the backtick quoting and applies escaping / quoting rules to the text inside.


