You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "David Mollitor (Jira)" <ji...@apache.org> on 2020/04/24 19:33:00 UTC

[jira] [Comment Edited] (HIVE-19064) Add mode to support delimited identifiers enclosed within double quotation

    [ https://issues.apache.org/jira/browse/HIVE-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091846#comment-17091846 ] 

David Mollitor edited comment on HIVE-19064 at 4/24/20, 7:32 PM:
-----------------------------------------------------------------

[~jcamachorodriguez] and I spoke directly on this. I better understand the context of this patch. Here are my concerns with the current implementation, but just something to consider, you can move forward with this patch if you want, but I'll document the discussion as I understand it:

How I understand this is that "standard" mode will implement the SQL Standard (better support for PostgreSQL applications). If users wants to support MySQL workloads, they can continue to use the "column" mode.

"column" mode = MySQL = Identifiers are surrounded by back ticks
 "standard" mode = PostgreSQL = Identifiers are surrounded by double quotes

The thing I am struggling with still is that this same objective can be met with a feature like ANSI_QUOTES and it will less confusing for users and easier to implement.
{code:sql|title=Ansi Quotes}
set hive.support.quoted.identifiers=column;
-- This works
CREATE `db`.`table` (...);

set ansi_quotes=true;
-- This works
CREATE "db"."table" (...);

-- This still works
CREATE `db`.`table` (...);
{code}
{code:sql|title=Standard Mode}
set hive.support.quoted.identifiers=standard;
-- This works
CREATE "db"."table" (...);

-- This no longer works works
CREATE `db`.`table` (...);
{code}
OK, so you can add support for back ticks on top of double quotes later perhaps, but "standard" mode isn't backwards compatible with "column" mode so, to do this well, every time you output an identifier, in SHOW CREATE TABLE, in a log message, in an error message, you need to always track the current mode, and then, if the user changes modes, the output has to change to reflect that:
{code:sql}
-- Column + ansi_quotes
set hive.support.quoted.identifiers=column;

CREATE TABLE `db`.`table` (...);

CREATE TABLE `db`.`table` (...);
> Table `db`.`table` already exists.
select * from `db`.`table`;

-- This just works when flipping back-and-forth between quotes/back ticks seamlessly 
set ansi_quotes=true;
CREATE TABLE "db"."table" (...);
> Table `db`.`table` already exists.
select * from `db`.`table`;
set ansi_quotes=false;
select * from `db`.`table`;


-- Standard + Back Ticks

set hive.support.quoted.identifiers=standard;
CREATE TABLE "db"."table" (...);
> Table "db"."table" already exists.
select * from "db"."table";

set hive.support.quoted.identifiers=column;
select * from "db"."table";
> Syntax error
{code}


was (Author: belugabehr):
[~jcamachorodriguez] and I spoke directly on this. I better understand the context of this patch. Here are my concerns with the current implementation, but just something to consider, you can move forward with this patch if you want, but I'll document the discussion as I understand it:

How I understand this is that "standard" mode will implement the SQL Standard (better support for PostgreSQL applications). If users wants to support MySQL workloads, they can continue to use the "column" mode.

"column" mode = MySQL = Identifiers are surrounded by back ticks
 "standard" mode = PostgreSQL = Identifiers are surrounded by double quotes

The thing I am struggling with still is that this same objective can be met with a feature like ANSI_QUOTES and it will less confusing for users and easier to implement.
{code:sql|title=Ansi Quotes}
set hive.support.quoted.identifiers=column;
-- This works
CREATE `db`.`table` (...);

set ansi_quotes=true;
-- This works
CREATE "db"."table" (...);

-- This still works
CREATE `db`.`table` (...);
{code}
{code:sql|title=Standard Mode}
set hive.support.quoted.identifiers=standard;
-- This works
CREATE "db"."table" (...);

-- This no longer works works
CREATE `db`.`table` (...);
{code}
OK, so you can add support for back ticks on top of double quotes later perhaps, but "standard" mode isn't backwards compatible with "column" mode so, to do this well, every time you output an identifier, you need to always track the current mode, and then, if the user changes modes, the user has to also change their output:
{code:sql}
-- Column + ansi_quotes
set hive.support.quoted.identifiers=column;

CREATE TABLE `db`.`table` (...);

CREATE TABLE `db`.`table` (...);
> Table `db`.`table` already exists.
select * from `db`.`table`;

-- This just works when flipping back-and-forth between quotes/back ticks seamlessly 
set ansi_quotes=true;
CREATE TABLE "db"."table" (...);
> Table `db`.`table` already exists.
select * from `db`.`table`;
set ansi_quotes=false;
select * from `db`.`table`;


-- Standard + Back Ticks

set hive.support.quoted.identifiers=standard;
CREATE TABLE "db"."table" (...);
> Table "db"."table" already exists.
select * from "db"."table";

set hive.support.quoted.identifiers=column;
select * from "db"."table";
> Syntax error
{code}

> Add mode to support delimited identifiers enclosed within double quotation
> --------------------------------------------------------------------------
>
>                 Key: HIVE-19064
>                 URL: https://issues.apache.org/jira/browse/HIVE-19064
>             Project: Hive
>          Issue Type: Improvement
>          Components: Parser, SQL
>    Affects Versions: 3.0.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Krisztian Kasa
>            Priority: Major
>         Attachments: HIVE-19064.01.patch, HIVE-19064.02.patch, HIVE-19064.03.patch, HIVE-19064.4.patch, HIVE-19064.5.patch, HIVE-19064.6.patch, HIVE-19064.7.patch, HIVE-19064.7.patch
>
>
> As per SQL standard. Hive currently uses `` (backticks). Default will continue being backticks, but we will support identifiers within double quotation via configuration parameter.
> This issue will also extends support for arbitrary char sequences, e.g., containing {{~ ! @ # $ % ^ & * () , < >}}, in database and table names. Currently, special characters are only supported for column names.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)