Posted to issues-all@impala.apache.org by "Csaba Ringhofer (Jira)" <ji...@apache.org> on 2022/09/29 13:41:00 UTC

[jira] [Updated] (IMPALA-11587) Improve handling of special chars in column names

     [ https://issues.apache.org/jira/browse/IMPALA-11587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Csaba Ringhofer updated IMPALA-11587:
-------------------------------------
    Description: 
Hive accepts several special characters in column names if the name is quoted with backticks:
create table tspeccharcol (`@"!£"!$%^=&)(-` int);

The table above can be used by Impala, but there are some caveats:
- Impala returns an error for a similar column name: Invalid column/field name
- SHOW CREATE TABLE in Impala does not quote the column, so it will return a statement that is not executable in Hive (Hive quotes it correctly)
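
A minimal sketch of the quoting that SHOW CREATE TABLE would need so that its output can be executed in Hive. This is not Impala's implementation: quote_identifier and show_create_table are hypothetical helpers, the "plain identifier" rule is an assumption, reserved words are not handled, and doubling an embedded backtick is assumed to be the escape convention.

```python
import re

# Assumption: names made only of ASCII letters, digits and underscores
# are safe to print without quotes. Reserved words are not considered.
_PLAIN_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def quote_identifier(name: str) -> str:
    """Wrap name in backticks unless it is a plain identifier."""
    if _PLAIN_IDENTIFIER.match(name):
        return name
    # Assumption: an embedded backtick is escaped by doubling it.
    return "`" + name.replace("`", "``") + "`"


def show_create_table(table: str, columns: list) -> str:
    """Build a CREATE TABLE statement from (column name, type) pairs."""
    cols = ", ".join(f"{quote_identifier(col)} {typ}" for col, typ in columns)
    return f"CREATE TABLE {quote_identifier(table)} ({cols})"


if __name__ == "__main__":
    print(show_create_table("tspeccharcol", [('@"!£"!$%^=&)(-', "int")]))
```

With the column from the example above, the generated statement keeps the backticks around the column name, which is the part the current Impala output is missing.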

I am not sure whether we should accept all of these special characters; the question that prompted this investigation only asked about @.

The error is returned due to a Hive function:
https://github.com/apache/impala/blob/cfd79b40beab86f08ad72e0bea41eabf736d0a99/fe/src/main/java/org/apache/impala/catalog/Hive3MetastoreShimBase.java#L166
The second parameter should be a HiveConf, which decides whether special chars are accepted:
https://github.com/apache/hive/blob/293a448296933b7498a91e7eeb91edc88dfaa07e/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java#L219
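
The effect of that second parameter can be sketched as a name check whose allowed character set depends on a configuration flag. This is a simplified illustration, not the Hive code linked above: validate_name, allow_special_chars and EXTRA_CHARS are hypothetical names, and the extra character set here contains only @ because that is the character the original question asked about.

```python
import re

# Hypothetical set of extra characters. The real list is defined in Hive.
EXTRA_CHARS = "@"


def validate_name(name: str, allow_special_chars: bool = False) -> bool:
    """Return True if name uses only allowed characters.

    allow_special_chars stands in for the setting that would be read
    from the configuration object passed as the second parameter.
    """
    allowed = r"\w"
    if allow_special_chars:
        allowed += re.escape(EXTRA_CHARS)
    # re.ASCII restricts \w to [A-Za-z0-9_].
    return re.fullmatch("[" + allowed + "]+", name, flags=re.ASCII) is not None


if __name__ == "__main__":
    print(validate_name("user@host"))
    print(validate_name("user@host", allow_special_chars=True))
```

Without the flag the check rejects every name containing a special character, which matches the "Invalid column/field name" error described above.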

Besides this function, Hive also seems to have some other rules, e.g. ':' is not accepted:
create table if not exists tspeccharcol (`:` int);
Error: Error while compiling statement: FAILED: ParseException line 1:48 Failed to recognize predicate ')'. Failed rule: '[., :] can not be used in column name in create table statement.' in column specification (state=42000,code=40000)


Also noticed some weirdness in Hive / beeline:
While this is accepted:
create table if not exists tspeccharcol (`""` int);
these ones are not:
create table if not exists tspeccharcol (`"` int);
create table if not exists tspeccharcol (`"\"` int);

Both fail with: Error: Error while compiling statement: FAILED: ParseException line 1:49 extraneous input ';' expecting EOF near '<EOF>' (state=42000,code=40000)

Some part of the client/parser does not seem to notice the backtick quoting and applies escaping / quoting rules to the text inside.
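
One way this could happen, shown as an illustrative sketch only: a statement splitter that tracks single and double quotes (with backslash escapes) but does not treat backticks as quotes. This is not beeline's or Hive's code and split_statements is a hypothetical function. It is just one explanation that is consistent with the three statements above.

```python
def split_statements(text: str, quote_chars: str) -> list:
    """Split text on ';' outside quoted regions.

    quote_chars lists the characters treated as quotes. Inside a quoted
    region a backslash escapes the next character.
    """
    statements, current = [], []
    open_quote = None
    escaped = False
    for ch in text:
        if open_quote is not None:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == open_quote:
                open_quote = None
        elif ch in quote_chars:
            open_quote = ch
        elif ch == ";":
            # Terminator outside any quote: finish the statement, drop ';'.
            statements.append("".join(current).strip())
            current = []
            continue
        current.append(ch)
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


if __name__ == "__main__":
    naive = "'\""   # backticks are not recognized as quotes
    aware = "'\"`"  # backticks are recognized as quotes
    for stmt in ('create table t (`""` int);',
                 'create table t (`"` int);',
                 'create table t (`"\\"` int);'):
        print(split_statements(stmt, naive), split_statements(stmt, aware))
```

In the naive mode the balanced `""` case is split correctly, while the other two leave a double quote open, so the trailing ';' stays inside the statement text. That would be consistent with the "extraneous input ';'" error, but the actual cause has not been confirmed.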


> Improve handling of special chars in column names
> -------------------------------------------------
>
>                 Key: IMPALA-11587
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11587
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog, Frontend
>            Reporter: Csaba Ringhofer
>            Priority: Major


