You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Zoltan Ivanfi (Code Review)" <ge...@cloudera.org> on 2016/09/23 18:03:57 UTC

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Zoltan Ivanfi has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4527

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................

IMPALA-784: Use `-s in SHOW CREATE TABLE output

Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
---
M fe/src/main/java/com/cloudera/impala/analysis/ColumnDef.java
M fe/src/main/java/com/cloudera/impala/analysis/DeleteStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/InsertStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/TableName.java
M fe/src/main/java/com/cloudera/impala/analysis/TableRef.java
M fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
M fe/src/main/java/com/cloudera/impala/analysis/UpdateStmt.java
M fe/src/test/java/com/cloudera/impala/analysis/ToSqlTest.java
8 files changed, 339 insertions(+), 338 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/4527/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Zoltan Ivanfi (Code Review)" <ge...@cloudera.org>.
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Patch Set 1:

(1 comment)

> How did you determine all the places that needed to be changed?

I modified the expected results in ToSqlTest, then I checked for failures and fixed them.

http://gerrit.cloudera.org:8080/#/c/4527/1/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

PS1, Line 72: reassamble
> reassemble
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 69:   public static String getIdentSql(String ident) {
> I agree that it's slighly less pleasing to the eye but I think correctness/
For SHOW CREATE TABLE and CREATE VIEW it's fine to quote all identifiers, but I don't think we should needlessly pollute error messages, e.g., in AnalysisExceptions. To the best of my knowledge, Hive does not quote identifiers in error reports, and neither should we.


Line 76:     return Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident))));
> Hmmm. Interesting. On the other hand:
Those are two separate issues. In your example, we are guarding against a limitation in the Hive Metastore that disallows certain column names. Generally, the Hive Metastore is pretty restrictive.

However, within an Impala query you can use more or less arbitrary identifiers, for better of for worse. There's an argument to be made that we should not allow that, but that's the state of the world today.

Ambiguities are a reality that we cannot escape, even if we disallowed dots in quoted identifiers. If you are curious, you can look at AnalyzeStmts#TestSlotRefPathAmbiguity() and the other *Ambiguity tests to get a flavor.

I still don't see how the specific problem we are discussing is not solvable. This function should accept a single identifier and quote it. It's up to the caller (who has more knowledge about the identifier) to pass in the right value.

For the struct case there is an existing JIRA to change the behavior: IMPALA-2287

My basic question here is: Why did you add the splitting?


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ColumnDef.java
File fe/src/main/java/com/cloudera/impala/analysis/ColumnDef.java:

Line 112:       sb.append(" " + type_.toString());
toSql()


Line 114:       sb.append(" " + typeDef_.toString());
toSql()


http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 69:   public static String getIdentSql(String ident) {
The main issue with this approach is that all invocations of toSql() will not have quoted identifiers. Unfortunately, we also use toSql() for error reporting. For example, take a look at AnalyticExpr.java or any other Expr. If analyze() fails we typically we print the offending expr with toSql().

Error messages will now become unnecessarily weird.

I think the callers need to decide the quoting policy.

We should have two versions of toSql():
1. Only adds quotes if necessary for Impala (no need to consider Hive as before)
2. Always quote identifiers

Imo, the default toSql() should have the first behavior, and we may add another version that always quotes identifiers. For example, we could have a setup like this:

private String toSql(boolean quoteAllIdents);
private String toSql() { return toSql(false); }

You can imagine other variants. Happy to discuss.


Line 76:     return Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident))));
I don't think splitting here is the right fix. An identifier itself could contain a dot, even if unlikely. Also, qualified table or column references consist of multiple identifiers and not a single identifier, so the premise of this fix is kind of wrong (based on your comment).


Line 88:     if (ident.equals("*")) {
This doesn't seem right or necessary. First, "*" is not an identifier (it is a terminal), and second those clauses that could contain "*" should handle it specially in toSql().

For my understanding, what use case is this change addressing?


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Lars Volker (Code Review)" <ge...@cloudera.org>.
Lars Volker has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Patch Set 2:

If you abandon this, please leave a link in the Jira so we know it is there. Otherwise it will be easy to miss.

-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Patch Set 1:

(2 comments)

This seems reasonable to me. I think we should get Alex Behm to confirm, since he was the one who originally wrote some of this code. 

How did you determine all the places that needed to be changed? We should also fix this for "SHOW CREATE FUNCTION" and any other similar statements that output SQL for consistency.

The CREATE TABLE expected output is probably coming from .test files. E.g. testdata/workloads/functional-query/queries/QueryTest/kudu-show-create.test

http://gerrit.cloudera.org:8080/#/c/4527/1/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

PS1, Line 72: reassamble
reassemble


Line 73:     // This is safe, because periods are not allowed inside identifier names for
Thanks for the explanation.


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Patch Set 2:

Zoltan - if nothing's going to happen on this patch for a while, could you 'abandon' it so that it doesn't show up in the pending review list? We can always reinstate it later.

-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 76:     return Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident))));
> If I remember correctly, I encountered a problem with the com.cloudera.impa
1. It would be good talk about a concrete example. From my point of view, I don't see how such a TableName would ever be constructed. For nested types, we usually avoid TableName and prefer Path instead.

2. I don't think your comment on InsertStmt and View is relevant here. InsertStmt takes a list of columns which must be quoted without splitting. The same goes for the column labels taken in View. So I don't see a problem here.

3. I agree there may be some required work to properly quote nested types in some case. It would be good understand what these case are exactly before abandoning the effort.

Happy to discuss over a meeting / gchat if you think that would be helpful.


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Zoltan Ivanfi (Code Review)" <ge...@cloudera.org>.
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Patch Set 1:

The functional tests still fail with false positives like
- CREATE TABLE functional_kudu.dimtbl (id BIGINT, name STRING, zip INT)
+ CREATE TABLE `functional_kudu`.`dimtbl` (`id` BIGINT, `name` STRING, `zip` INT)

I haven't fixed these yet, because I still have to figure out where the expected strings come from, but first I wanted to make sure that we agree on the approach to be taken.

The same applies to the overly long lines in the unit test. That test is fixed functionally but not stylistically, as I didn't want to spend too much time on it until I get feedback about whether this is the right approach.

Thanks,

Zoltan

-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Zoltan Ivanfi (Code Review)" <ge...@cloudera.org>.
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 76:     return Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident))));
> 1. It would be good talk about a concrete example. From my point of view, I
Hi Alex,

Thanks for the offer, I would certainly like to discuss the topic. Unfortuntely it looks like I won't have time to finish this change in the near future, as I was assigned to File Formats recently and there is a lot to do. Let's hibernate this review for a while and get back to it later.

Thanks,

Zoltan


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 69:   public static String getIdentSql(String ident) {
> Initially I was also worried that identifiers become "ugly" and considered 
So we are in agreement then?


Line 76:     return Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident))));
> I checked that identifiers can't contain dots, or at least users can't crea
select `a.b.c`.`x.y.z` from (select 1 `x.y.z`) `a.b.c`


Line 88:     if (ident.equals("*")) {
> Without this check a SELECT * became SELECT `*` which is invalid.
SelectListItem.toSql() should take care of that case. Seems like something is not behaving as expected and that's where we should apply the fix.


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Zoltan Ivanfi (Code Review)" <ge...@cloudera.org>.
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 69:   public static String getIdentSql(String ident) {
> So we are in agreement then?
I agree that it's slighly less pleasing to the eye but I think correctness/unambiguity is more important than aesthetics. Also Hive does the same, so I think people are used to it.


Line 76:     return Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident))));
> select `a.b.c`.`x.y.z` from (select 1 `x.y.z`) `a.b.c`
Hmmm. Interesting. On the other hand:

> create view test as select `a.b.c`.`x.y.z` from (select 1 `x.y.z`) `a.b.c`;
ERROR: AnalysisException: Invalid column/field name: x.y.z

This is a serious problem, because it means that identifier names are ambigous. I checked that in case of a struct, functions get "column_name.field_name" as the column name. I haven't checked your example yet, but my guess is that in this case the column name parameter will become "x.y.z". Still, the former should be quoted as `column_name`.`field_name` and latter as `x.y.z`. It seems that it's too late to fix this problem by the time getIdentSql gets called.


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Zoltan Ivanfi (Code Review)" <ge...@cloudera.org>.
Zoltan Ivanfi has abandoned this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Abandoned

Abandoning the review until I can work on it again.

-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Zoltan Ivanfi (Code Review)" <ge...@cloudera.org>.
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 76:     return Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident))));
> Those are two separate issues. In your example, we are guarding against a l
I added splitting because when I just put the whole string between backticks then Impala ended up escaping qualified identifiers of complex types incorrectly. I noticed that these references were already in a string form by the time getIdentSql gets called, so there is no way to properly quote them without refactoring to pass them in a structured format to the affected functions.

The "Invalid column/field name" error message mislead me to believe that dots are not allowed in any identifier, which lead me to come up with my approach which would work if identifiers really couldn't contain dots. Since this turned out to be a false assumption, it seems that I have to discard this change as this task can only be implemented properly after IMPALA-2287 and probably some other refactorings.


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Zoltan Ivanfi (Code Review)" <ge...@cloudera.org>.
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 76:     return Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident))));
> I think we can still make progress here, I'm not sure I completely follow y
If I remember correctly, I encountered a problem with the com.cloudera.impala.analysis.TableName class. It stores the table name as a String, but the table name itself can be qualified in case of complex types, thus it can not be quoted by simply putting it between backticks.

It also seems that com.cloudera.impala.analysis.InsertStmt and com.cloudera.impala.catalog.View constructors take a List<String> parameter for columns and store it in the object.

It seems to me that in order to properly quote these, we would need to refactor the code to use a data structure capable of representing hierarchical names unambiguously to pass the table and column references around and to store them. I thought that the ongoing effort you mentioned is about addressing this.


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Zoltan Ivanfi (Code Review)" <ge...@cloudera.org>.
Zoltan Ivanfi has uploaded a new patch set (#2).

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................

IMPALA-784: Use `-s in SHOW CREATE TABLE output

Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
---
M fe/src/main/java/com/cloudera/impala/analysis/ColumnDef.java
M fe/src/main/java/com/cloudera/impala/analysis/DeleteStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/InsertStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/TableName.java
M fe/src/main/java/com/cloudera/impala/analysis/TableRef.java
M fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
M fe/src/main/java/com/cloudera/impala/analysis/UpdateStmt.java
M fe/src/test/java/com/cloudera/impala/analysis/ToSqlTest.java
8 files changed, 339 insertions(+), 338 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/4527/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Zoltan Ivanfi (Code Review)" <ge...@cloudera.org>.
Zoltan Ivanfi has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 69:   public static String getIdentSql(String ident) {
> The main issue with this approach is that all invocations of toSql() will n
Initially I was also worried that identifiers become "ugly" and considered making this behavior optional. I was thinking of a query option that the user can control using the SET statement. Then I checked how Hive works and I found that it already quotes every identifier (I checked the table and view definitions if I remember correctly), so I came to the conclusion that quoting everything should be okay. Of course I expected some suggestions to how it should work, that's why I didn't fix the styling issues in the test.


Line 76:     return Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident))));
> I don't think splitting here is the right fix. An identifier itself could c
I checked that identifiers can't contain dots, or at least users can't create such identifiers. If you try specifying `a.b` as an identifier, Impala will complain that dots are not allowed in identifier names. Do you know of a code piece that gets around this mechanism and creates identifiers with dots in some other way?

Qualified table or column references are exactly the reason why I only made this version public and the other one private, as only this is safe to use to quote table or column references. In case of complex types, dots can become a part of either the table reference or the column reference.

In case of a struct, a column reference might look like column_name.field_name, which should be quoted as `column_name`.`field_name`. In case of a map or array, table references might look like table_name.map_or_array_name, which should be quoted `table_name`.`map_or_array_name`.

This is my understanding based on the short amount of time I spent on Impala, so please correct me if I'm wrong.


Line 88:     if (ident.equals("*")) {
> This doesn't seem right or necessary. First, "*" is not an identifier (it i
Without this check a SELECT * became SELECT `*` which is invalid.


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-784: Use `-s in SHOW CREATE TABLE output

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-784: Use `-s in SHOW CREATE TABLE output
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4527/2/fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java
File fe/src/main/java/com/cloudera/impala/analysis/ToSqlUtils.java:

Line 76:     return Joiner.on(".").join(getStandaloneIdentSqlList((Splitter.on('.').split(ident))));
> I added splitting because when I just put the whole string between backtick
I think we can still make progress here, I'm not sure I completely follow your reasoning for abandoning the effort.

* It might help to discuss concrete examples where your splitting here is required, so I can follow along.
* IMPALA-2287 seems independent of this work.


-- 
To view, visit http://gerrit.cloudera.org:8080/4527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8e66439daa35d2e3725e7277c16f3cc5fdb6f128
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Zoltan Ivanfi <zi...@cloudera.com>
Gerrit-HasComments: Yes