Posted to reviews@impala.apache.org by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org> on 2022/12/20 21:43:24 UTC

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Wenzhe Zhou has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19383 )


Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................

WIP IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine adds support for non-unique primary keys. It
automatically adds a column 'auto_increment_id' to any table that
has a non-unique primary key. The non-unique primary key and
'auto_increment_id' together form a unique composite primary key.

This patch integrates a new version of Kudu that supports non-unique
primary keys and adds syntactic support for creating tables with a
non-unique primary key.
Example:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

Kudu assigns values for the 'auto_increment_id' column automatically
when inserting rows, so INSERT statements don't need to specify
values for it.
A SELECT statement does not show the 'auto_increment_id' column unless
the column is explicitly specified in the select list.

Testing:
 - Integrated a new version of Kudu with the commits for KUDU-1945

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M be/src/exec/kudu/kudu-util.cc
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
34 files changed, 742 insertions(+), 97 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/1
-- 
To view, visit http://gerrit.cloudera.org:8080/19383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................

WIP IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine recently added support for non-unique primary keys. The
Kudu server automatically adds an auto-incrementing column named
'auto_increment_id' to any Kudu table that has a non-unique primary key.
The non-unique primary key and 'auto_increment_id' form the effective
unique composite primary key.

This patch integrates a new version of Kudu that supports non-unique
primary keys and adds syntactic support for creating tables with a
non-unique primary key.
Example:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

Kudu assigns values for the auto-incrementing column automatically
when inserting rows, so INSERT statements don't need to specify
values for it.
A SELECT statement does not show the auto-incrementing column unless
the column is explicitly specified in the select list.
The UPSERT operation is not currently supported for Kudu tables with
an auto-incrementing column due to a limitation in the Kudu engine.

When creating a Kudu table, specifying PRIMARY KEY is optional.
If no primary key attribute is specified, the partition key
columns are promoted to a non-unique primary key, provided those
columns are the leading columns of the table.
A new column "key_unique" is added to the output of the 'describe'
table command for Kudu tables.

Testing:
 - Integrated a new version of Kudu built on a local machine, ran
   manual tests in impala-shell with queries that create tables
   with a non-unique primary key, and tested insert/update/delete
   operations on Kudu tables with a non-unique primary key.
 - Added front-end and end-to-end tests.
   Passed query_test/test_kudu.py and custom_cluster/test_kudu.py
   in a local environment with the new version of Kudu built on the
   local machine.
 - TODO: build the toolchain with the new version of Kudu, including
   the commits for KUDU-1945. Run core tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/custom_cluster/test_kudu.py
M tests/query_test/test_kudu.py
40 files changed, 1,143 insertions(+), 197 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/10

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................

IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine recently enabled the auto-incrementing column feature.
The feature works by appending a system-generated auto-incrementing
column to the primary key columns to guarantee the uniqueness of the
primary key when the primary key columns can be non-unique. The
non-unique primary key columns and the auto-incrementing column form
the effective unique composite primary key.

This auto-incrementing column is named 'auto_incrementing_id' and has
the BIGINT type. Values are assigned to it automatically during
insertion, so INSERT statements should not specify values for the
auto-incrementing column. In the current Kudu implementation, counters
for auto-incrementing values are not global per Kudu table but global
per Kudu tablet server, so auto-incrementing values are not unique
within a Kudu table.

This patch upgrades the Kudu version to 345fd44ca3 to pick up the Kudu
changes needed to support non-unique primary keys. It adds syntactic
support for creating Kudu tables with a non-unique primary key.
Examples:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

A SELECT statement does not show the system-generated auto-incrementing
column unless the column is explicitly specified in the select list.
The auto-incrementing column cannot be added, removed or renamed with
ALTER TABLE statements.
The UPSERT operation is not currently supported for Kudu tables with
an auto-incrementing column due to a limitation in the Kudu engine.

When creating a Kudu table, specifying PRIMARY KEY is optional.
If no primary key attribute is specified, the partition key
columns are promoted to a non-unique primary key, provided those
columns are the leading columns of the table.
A new column "key_unique" is added to the output of the 'describe'
table command for Kudu tables.

Testing:
 - Ran manual tests in impala-shell with queries that create tables
   with a non-unique primary key, and tested insert/update/delete
   operations on Kudu tables with a non-unique primary key.
 - Added front-end tests and end-to-end tests for Kudu tables
   with a non-unique primary key.
 - Passed exhaustive tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAlterColStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/custom_cluster/test_kudu.py
M tests/metadata/test_ddl_base.py
M tests/query_test/test_kudu.py
44 files changed, 1,235 insertions(+), 204 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/11

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 11
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 5:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/12116/ : Initial code review checks failed. See linked job for details on the failure.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Comment-Date: Fri, 06 Jan 2023 01:41:34 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 9:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/12145/ : Initial code review checks failed. See linked job for details on the failure.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 11 Jan 2023 07:22:26 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 16:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/19383/16//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19383/16//COMMIT_MSG@17
PS16, Line 17: The assignment to it during insertion is automatic so
             : insertion statements should not specify values for auto-incrementing
             : column.
Is there any test that explicitly tries to insert into the auto_incrementing_id column? Say,
INSERT INTO tbl_name (auto_incrementing_id) VALUES (0);


http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
File fe/src/main/java/org/apache/impala/analysis/ColumnDef.java:

http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/analysis/ColumnDef.java@342
PS16, Line 342: if (!isPrimaryKeyUnique_) sb.append(" NON UNIQUE");
              :       sb.append(" PRIMARY KEY");
nit: Use KuduUtil.getPrimaryKeyString() here?


http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
File fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java:

http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@488
PS16, Line 488: if (!isPrimaryKeyUnique) sb.append("NON UNIQUE ");
              :         sb.append("PRIMARY KEY (");
nit: Use KuduUtil.getPrimaryKeyString() here?


http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@501
PS16, Line 501: if (!isPrimaryKeyUnique) sb.append("NON UNIQUE ");
              :         sb.append("PRIMARY KEY (");
nit: Use KuduUtil.getPrimaryKeyString() here?
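
The helper suggested in these nits could be sketched roughly as follows. The signature shown for getPrimaryKeyString() is an assumption (this thread does not show it), and the enclosing class and the two caller methods are hypothetical. The point is only that the three call sites would share one place that decides whether to emit the NON UNIQUE prefix.

```java
import java.util.Arrays;
import java.util.List;

public class PrimaryKeyStringSketch {
  // Assumed shape of the shared helper: returns the key clause keywords.
  public static String getPrimaryKeyString(boolean isPrimaryKeyUnique) {
    return isPrimaryKeyUnique ? "PRIMARY KEY" : "NON UNIQUE PRIMARY KEY";
  }

  // Column-level use, mirroring the ColumnDef snippet quoted above
  // (hypothetical method name).
  public static String columnSuffix(boolean isPrimaryKeyUnique) {
    StringBuilder sb = new StringBuilder();
    sb.append(" ").append(getPrimaryKeyString(isPrimaryKeyUnique));
    return sb.toString();
  }

  // Table-level use, mirroring the ToSqlUtils snippets quoted above
  // (hypothetical method name).
  public static String tableClause(boolean isPrimaryKeyUnique, List<String> cols) {
    StringBuilder sb = new StringBuilder();
    sb.append(getPrimaryKeyString(isPrimaryKeyUnique)).append(" (");
    sb.append(String.join(", ", cols)).append(")");
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println("i INT" + columnSuffix(false));
    System.out.println(tableClause(false, Arrays.asList("i")));
    System.out.println(tableClause(true, Arrays.asList("a", "b")));
  }
}
```

With a helper of this shape, the output strings stay identical to what the quoted snippets build by hand.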


http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
File fe/src/main/java/org/apache/impala/catalog/KuduColumn.java:

http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java@72
PS16, Line 72:     isKey_ = isKey;
             :     isPrimaryKeyUnique_ = isPrimaryKeyUnique;
             :     isNullable_ = isNullable;
             :     isAutoIncrementing_ = isAutoIncrementing;
Does it make sense to add Preconditions here to verify that the combination of these 4 attributes is valid?
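
One way such a check might look is sketched below. It is a standalone illustration that uses plain Java exceptions rather than Guava's Preconditions. The rules it enforces are assumptions drawn from the commit message, not confirmed invariants of KuduColumn: a key column is not nullable, and an auto-incrementing column is a key column that exists only when the primary key is non-unique. The class and method names are hypothetical.

```java
public class KuduColumnAttrCheck {
  // Hypothetical check: throws IllegalStateException when the four flags
  // form a combination that the assumed rules forbid.
  public static void checkValidCombo(boolean isKey, boolean isPrimaryKeyUnique,
      boolean isNullable, boolean isAutoIncrementing) {
    if (isKey && isNullable) {
      throw new IllegalStateException("key column must not be nullable");
    }
    if (isAutoIncrementing && !isKey) {
      throw new IllegalStateException("auto-incrementing column must be a key column");
    }
    if (isAutoIncrementing && isPrimaryKeyUnique) {
      throw new IllegalStateException(
          "auto-incrementing column implies a non-unique primary key");
    }
  }

  // Convenience wrapper that reports the result as a boolean.
  public static boolean isValidCombo(boolean isKey, boolean isPrimaryKeyUnique,
      boolean isNullable, boolean isAutoIncrementing) {
    try {
      checkValidCombo(isKey, isPrimaryKeyUnique, isNullable, isAutoIncrementing);
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Auto-incrementing key column of a non-unique primary key: accepted.
    System.out.println(isValidCombo(true, false, false, true));
    // Auto-incrementing column combined with a unique primary key: rejected.
    System.out.println(isValidCombo(true, true, false, true));
  }
}
```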


http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@391
PS16, Line 391: isPrimaryKeyUnique_ = kuduSchema_.isPrimaryKeyUnique();
              :     hasAutoIncrementingColumn_ = kuduSchema_.hasAutoIncrementingColumn();
If isPrimaryKeyUnique() returns false, is hasAutoIncrementingColumn() guaranteed by the Kudu client to return true?


http://gerrit.cloudera.org:8080/#/c/19383/16/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test:

http://gerrit.cloudera.org:8080/#/c/19383/16/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test@690
PS16, Line 690: AnalysisException: Column already exists: auto_incrementing_id
If auto_incrementing_id is reserved, is it valid to create (or to already have) a Kudu table with an explicit column named auto_incrementing_id, even if there is no NON UNIQUE PRIMARY KEY?




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 16
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Fri, 03 Feb 2023 00:58:32 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#17). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................

IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine recently enabled the auto-incrementing column feature
(KUDU-1945). The feature works by appending a system-generated
auto-incrementing column to the primary key columns to guarantee the
uniqueness of the primary key when the primary key columns can be
non-unique. The non-unique primary key columns and the auto-incrementing
column form the effective unique composite primary key.

This auto-incrementing column is named 'auto_incrementing_id' and has
the BIGINT type. Values are assigned to it automatically during
insertion, so INSERT statements should not specify values for the
auto-incrementing column. In the current Kudu implementation, there is
no central key provider for auto-incrementing columns. Kudu uses a
per-tablet-server global counter to assign values for auto-incrementing
columns. As a result, the values of an auto-incrementing column are not
unique across a Kudu table, but are unique within a continuous region
of the table served by a tablet server.
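
The consequence of per-tablet-server counters can be illustrated with a toy model. The sketch below is not Kudu code: the two-server setup, the modulo routing rule, and all class and method names are assumptions made for illustration. In this model the same counter value can appear on different servers, while the pair (key, counter value) still distinguishes rows that share a key, because rows with the same key are always routed to the same counter.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AutoIncrementSketch {
  // One counter per simulated tablet server; there is no table-wide counter.
  private final long[] counters;
  // Each stored row is {key, autoId}.
  private final List<long[]> rows = new ArrayList<>();

  public AutoIncrementSketch(int numServers) {
    counters = new long[numServers];
  }

  // Routes the row by key (assumed rule: key modulo server count) and
  // assigns the next value from that server's counter.
  public long insert(int key) {
    int server = Math.floorMod(key, counters.length);
    long autoId = ++counters[server];
    rows.add(new long[] {key, autoId});
    return autoId;
  }

  // True if no auto-incrementing value repeats anywhere in the table.
  public boolean autoIdsUniqueInTable() {
    Set<Long> seen = new HashSet<>();
    for (long[] r : rows) {
      if (!seen.add(r[1])) return false;
    }
    return true;
  }

  // True if no (key, autoId) pair repeats anywhere in the table.
  public boolean compositeKeysUnique() {
    Set<String> seen = new HashSet<>();
    for (long[] r : rows) {
      if (!seen.add(r[0] + ":" + r[1])) return false;
    }
    return true;
  }

  public static void main(String[] args) {
    AutoIncrementSketch t = new AutoIncrementSketch(2);
    t.insert(10); // server 0, autoId 1
    t.insert(10); // server 0, autoId 2
    t.insert(11); // server 1, autoId 1 (repeats a value used on server 0)
    System.out.println(t.autoIdsUniqueInTable()); // prints false
    System.out.println(t.compositeKeysUnique());  // prints true
  }
}
```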

This patch also upgrades the Kudu version to 345fd44ca3 to pick up the
Kudu changes needed to support non-unique primary keys. It adds
syntactic support for creating Kudu tables with a non-unique primary key.
When creating a Kudu table, specifying PRIMARY KEY is optional.
If no primary key attribute is specified, the partition key
columns are promoted to a non-unique primary key, provided those
columns are the leading columns of the table.
A new column "key_unique" is added to the output of the 'describe'
table command for Kudu tables.
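
The promotion rule described above amounts to a prefix check. A minimal sketch, assuming that "beginning columns" means the partition columns match the first N columns of the table in the same order and that names compare case-insensitively (the patch's actual check may differ in detail). The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class KeyPromotionSketch {
  // Returns true if partitionCols equals the first partitionCols.size()
  // entries of tableCols, compared case-insensitively and in order.
  public static boolean canPromote(List<String> tableCols, List<String> partitionCols) {
    if (partitionCols.isEmpty() || partitionCols.size() > tableCols.size()) {
      return false;
    }
    for (int i = 0; i < partitionCols.size(); i++) {
      if (!tableCols.get(i).equalsIgnoreCase(partitionCols.get(i))) return false;
    }
    return true;
  }

  public static void main(String[] args) {
    List<String> cols = Arrays.asList("a", "b", "c");
    // Table (a, b, c) partitioned by HASH (a, b): leading columns.
    System.out.println(canPromote(cols, Arrays.asList("a", "b"))); // prints true
    // Partitioned by HASH (b): not a leading column.
    System.out.println(canPromote(cols, Arrays.asList("b")));      // prints false
  }
}
```

Under these assumptions, the table (a INT, b STRING, c FLOAT) partitioned by HASH (a, b) from this commit message's examples would qualify for promotion.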

Examples of CREATE TABLE statement with non unique primary key:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) PARTITIONS 3
  STORED as KUDU;

  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) PARTITIONS 3
  STORED as KUDU;

  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) PARTITIONS 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY RANGE (id)
  (PARTITION VALUES <= 1000,
   PARTITION 1000 < VALUES <= 2000,
   PARTITION 2000 < VALUES <= 3000,
   PARTITION 3000 < VALUES)
  STORED as KUDU
  AS SELECT id, int_col FROM functional.alltypestiny ORDER BY id ASC
   LIMIT 4000;

  CREATE TABLE tbl (id INT, name STRING, NON UNIQUE PRIMARY KEY(id))
  STORED as KUDU;

  CREATE TABLE tbl (a INT, b STRING, c FLOAT)
  PARTITION BY HASH (a, b) PARTITIONS 3
  STORED as KUDU;

A SELECT statement does not show the system-generated auto-incrementing
column unless the column is explicitly specified in the select list.
The auto-incrementing column cannot be added, removed or renamed with
ALTER TABLE statements.
The UPSERT operation is not currently supported for Kudu tables with
an auto-incrementing column due to a limitation in the Kudu engine.

Testing:
 - Ran manual tests in impala-shell with queries that create Kudu tables
   with a non-unique primary key, and tested insert/update/delete
   operations on these tables.
 - Added front-end tests and end-to-end tests for Kudu tables
   with a non-unique primary key.
 - Passed exhaustive tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAlterColStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/custom_cluster/test_kudu.py
M tests/metadata/test_ddl_base.py
M tests/query_test/test_kudu.py
44 files changed, 1,349 insertions(+), 207 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/17

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 17
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................

WIP IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine adds support for non-unique primary keys. It
automatically adds a column 'auto_increment_id' to any table that
has a non-unique primary key. The non-unique primary key and
'auto_increment_id' together form a unique composite primary key.

This patch integrates a new version of Kudu that supports non-unique
primary keys and adds syntactic support for creating tables with a
non-unique primary key.
Example:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

Kudu assigns values to the 'auto_increment_id' column automatically
when rows are inserted, so INSERT statements do not need to specify
values for it.
A SELECT statement does not show the 'auto_increment_id' column
unless the column is explicitly specified in the select list.
The UPSERT operation is not yet supported for Kudu tables with an
auto-incrementing column due to a limitation in the Kudu engine.

Testing:
 - Integrated a new version of Kudu built on a local machine, ran
   manual tests in impala-shell with queries that create tables
   with a non unique primary key, and tested insert/update/delete
   operations on Kudu tables with a non unique primary key.
 - TODO: build the toolchain with the new version of Kudu, including
   the commits for KUDU-1945. Run core tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
33 files changed, 733 insertions(+), 97 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/4

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19383/7/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

http://gerrit.cloudera.org:8080/#/c/19383/7/tests/query_test/test_kudu.py@667
PS7, Line 667: \
flake8: E502 the backslash is redundant between brackets




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Comment-Date: Tue, 10 Jan 2023 00:01:44 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 8:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/12132/ : Initial code review checks failed. See linked job for details on the failure.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 8
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Jan 2023 00:29:07 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 13:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12280/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 13
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Feb 2023 02:53:26 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 18: Code-Review+2

carry +1 from Riza and Qifan



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 18
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Sat, 04 Feb 2023 02:35:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................

WIP IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine adds support for non unique primary keys. It
automatically adds a column 'auto_increment_id' to any table that
has a non unique primary key. The non unique primary key and
'auto_increment_id' together form a unique composite primary key.

This patch integrates a new version of Kudu that supports non
unique primary keys, and adds syntactic support for creating tables
with a non unique primary key.
Example:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

Kudu assigns values to the 'auto_increment_id' column automatically
when rows are inserted, so INSERT statements do not need to specify
values for it.
A SELECT statement does not show the 'auto_increment_id' column
unless the column is explicitly specified in the select list.
The UPSERT operation is not yet supported for Kudu tables with an
auto-incrementing column due to a limitation in the Kudu engine.
When creating a Kudu table, specifying PRIMARY KEY is optional.
If no primary key attribute is specified, the partition key columns
are promoted to a non unique primary key, provided those columns are
the leading columns of the table.
A new column "key_unique" is added to the output of the 'describe'
table command for Kudu tables.

Testing:
 - Integrated a new version of Kudu built on a local machine, ran
   manual tests in impala-shell with queries that create tables
   with a non unique primary key, and tested insert/update/delete
   operations on Kudu tables with a non unique primary key.
 - Added front-end tests and end-to-end tests.
   Passed query_test/test_kudu.py and custom_cluster/test_kudu.py
   in a local environment with the new version of Kudu built on the
   local machine.
 - TODO: build the toolchain with the new version of Kudu, including
   the commits for KUDU-1945. Run core tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/custom_cluster/test_kudu.py
M tests/query_test/test_kudu.py
40 files changed, 1,105 insertions(+), 191 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/9

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................

WIP IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine adds support for non unique primary keys. It
automatically adds a column 'auto_increment_id' to any table that
has a non unique primary key. The non unique primary key and
'auto_increment_id' together form a unique composite primary key.

This patch integrates a new version of Kudu that supports non
unique primary keys, and adds syntactic support for creating tables
with a non unique primary key.
Example:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

Kudu assigns values to the 'auto_increment_id' column automatically
when rows are inserted, so INSERT statements do not need to specify
values for it.
A SELECT statement does not show the 'auto_increment_id' column
unless the column is explicitly specified in the select list.
The UPSERT operation is not yet supported for Kudu tables with an
auto-incrementing column due to a limitation in the Kudu engine.

Testing:
 - Integrated a new version of Kudu built on a local machine, ran
   manual tests in impala-shell with queries that create tables
   with a non unique primary key, and tested insert/update/delete
   operations on Kudu tables with a non unique primary key.
 - Added front-end tests and end-to-end tests.
   Passed query_test/test_kudu.py and custom_cluster/test_kudu.py
   in a local environment with the new version of Kudu built on the
   local machine.
 - TODO: build the toolchain with the new version of Kudu, including
   the commits for KUDU-1945. Run core tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/custom_cluster/test_kudu.py
M tests/query_test/test_kudu.py
40 files changed, 1,101 insertions(+), 190 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/8

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 8
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................

IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine recently enabled the auto-incrementing column feature.
The feature works by appending a system-generated auto-incrementing
column to the primary key columns, which guarantees uniqueness of the
primary key even when the declared primary key columns are non unique.
The non unique primary key columns and the auto-incrementing column
together form the effective unique composite primary key.

This auto-incrementing column is named 'auto_incrementing_id' and has
type BIGINT. Values are assigned to it automatically during insertion,
so INSERT statements should not specify values for the
auto-incrementing column. In the current Kudu implementation, there is
no central key provider for auto-incrementing columns. Kudu uses a per
tablet-server global counter to assign values for auto-incrementing
columns, so the values of the auto-incrementing column are not unique
across a Kudu table, only within a tablet-server.
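
The uniqueness caveat above can be illustrated with a minimal Python
sketch. This is not Kudu code: ToyTabletServer and insert_rows are
hypothetical names, and the model only follows the behavior as this
commit message describes it (one independent counter per tablet-server,
no central key provider, rows routed by their key).

```python
from itertools import count


class ToyTabletServer:
    """Hypothetical stand-in for a tablet-server with its own counter."""

    def __init__(self, name):
        self.name = name
        self._next_id = count(1)  # independent counter, not shared
        self.rows = []            # (key, auto_incrementing_id, value)

    def insert(self, key, value):
        auto_id = next(self._next_id)
        self.rows.append((key, auto_id, value))
        return auto_id


def insert_rows(servers, rows):
    """Route each row by its integer key, so equal keys share a server."""
    for key, value in rows:
        servers[key % len(servers)].insert(key, value)


servers = [ToyTabletServer("ts0"), ToyTabletServer("ts1")]
insert_rows(servers, [(1, "a"), (1, "b"), (2, "c"), (2, "d")])

all_rows = [row for ts in servers for row in ts.rows]
auto_ids = [auto_id for _, auto_id, _ in all_rows]
composite_keys = [(key, auto_id) for key, auto_id, _ in all_rows]
```

In this toy model the bare counter values repeat across servers, while
the (key, auto_incrementing_id) pairs stay distinct because rows with
the same key always reach the same counter.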

This patch upgrades the Kudu version to 345fd44ca3 to pick up the Kudu
changes needed to support non unique primary keys. It adds syntactic
support for creating Kudu tables with a non unique primary key.
When creating a Kudu table, specifying PRIMARY KEY is optional.
If no primary key attribute is specified, the partition key columns
are promoted to a non unique primary key, provided those columns are
the leading columns of the table.
A new column "key_unique" is added to the output of the 'describe'
table command for Kudu tables.
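
One possible reading of the promotion rule above, as a small Python
sketch. The function name is hypothetical and this is not the Impala
implementation. It assumes "beginning columns" means that the partition
key columns occupy the first positions of the column list, and it
returns None when the rule does not apply, since the message does not
say what happens in that case.

```python
def promote_partition_columns(table_columns, partition_columns):
    """Return the columns promoted to a non unique primary key, or None.

    Assumption: promotion applies only when the partition key columns
    are exactly the leading columns of the table (order within that
    prefix is not checked here).
    """
    n = len(partition_columns)
    if n == 0:
        return None
    leading = table_columns[:n]
    if set(leading) == set(partition_columns):
        return leading
    return None


promoted = promote_partition_columns(["i", "s"], ["i"])
not_promoted = promote_partition_columns(["s", "i"], ["i"])
```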

Examples of CREATE TABLE statement with non unique primary key:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

A SELECT statement does not show the system-generated auto-incrementing
column unless the column is explicitly specified in the select list.
The auto-incrementing column cannot be added, removed or renamed with
ALTER TABLE statements.
The UPSERT operation is not yet supported for Kudu tables with an
auto-incrementing column due to a limitation in the Kudu engine.
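
The SELECT behavior described above amounts to skipping the hidden
column during star expansion. The following Python sketch shows the
idea only: expand_select_list is a hypothetical helper, not the logic
in SelectStmt.java, and it treats select-list items as plain column
names.

```python
# Column name as given in this commit message.
AUTO_INCREMENTING_COLUMN = "auto_incrementing_id"


def expand_select_list(select_list, table_columns):
    """Expand '*' to the visible columns; keep named columns as-is."""
    expanded = []
    for item in select_list:
        if item == "*":
            # Star expansion leaves out the system-generated column.
            expanded.extend(
                c for c in table_columns if c != AUTO_INCREMENTING_COLUMN)
        else:
            # An explicitly named column is always returned.
            expanded.append(item)
    return expanded


columns = ["i", AUTO_INCREMENTING_COLUMN, "s"]
star = expand_select_list(["*"], columns)
explicit = expand_select_list(["i", AUTO_INCREMENTING_COLUMN], columns)
```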

Testing:
 - Ran manual tests in impala-shell with queries that create tables
   with a non unique primary key, and tested insert/update/delete
   operations on Kudu tables with a non unique primary key.
 - Added front-end tests and end-to-end tests for Kudu tables
   with a non unique primary key.
 - Passed exhaustive tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAlterColStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/custom_cluster/test_kudu.py
M tests/metadata/test_ddl_base.py
M tests/query_test/test_kudu.py
44 files changed, 1,235 insertions(+), 204 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/12

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 12
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 15:

(7 comments)

Looks great!

http://gerrit.cloudera.org:8080/#/c/19383/15//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19383/15//COMMIT_MSG@23
PS15, Line 23: in a tablet-server.
nit within a continuous region of the table served by a tablet-server.


http://gerrit.cloudera.org:8080/#/c/19383/15//COMMIT_MSG@25
PS15, Line 25: u
nit also


http://gerrit.cloudera.org:8080/#/c/19383/15//COMMIT_MSG@47
PS15, Line 47:   AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;
It would be great to include a DDL for a range-partitioned Kudu table here, assuming the auto-incrementing column is supported for such a table.

Including a DDL for an unpartitioned table here would also be nice.


http://gerrit.cloudera.org:8080/#/c/19383/15//COMMIT_MSG@57
PS15, Line 57: t
Kudu


http://gerrit.cloudera.org:8080/#/c/19383/15//COMMIT_MSG@59
PS15, Line 59: Kudu
nit. for these tables


http://gerrit.cloudera.org:8080/#/c/19383/15/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/19383/15/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@107
PS15, Line 107: 
              :   // Set to true if primary key is unique.
              :   private boolean isPrimaryKeyUnique_ = true;
              : 
              :   // Set to true if the table has auto-incrementing column.
              :   private boolean hasAutoIncrementingColumn_ = false;
These two new members could be put into a new class and reused between KuduTable.java and LocalKuduTable.java.


http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@2365
PS10, Line 2365: partition by hash
> Added test case in kudu-scan-node.test with unpartitioned Kudu table. All r
It would be good to verify, if feasible, that the auto-incrementing column works with a range-partitioned Kudu table such as

CREATE TABLE t1 (id STRING PRIMARY KEY, s STRING)
  PARTITION BY RANGE (PARTITION 'a' <= VALUES < '{', PARTITION 'A' <= VALUES < '[', PARTITION VALUES = '00000')
  STORED AS KUDU;




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 15
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 Feb 2023 16:28:42 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#14). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................

IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine recently enabled the auto-incrementing column feature
(KUDU-1945). The feature works by appending a system-generated
auto-incrementing column to the primary key columns, which guarantees
uniqueness of the primary key even when the declared primary key
columns are non unique. The non unique primary key columns and the
auto-incrementing column together form the effective unique composite
primary key.

This auto-incrementing column is named 'auto_incrementing_id' and has
type BIGINT. Values are assigned to it automatically during insertion,
so INSERT statements should not specify values for the
auto-incrementing column. In the current Kudu implementation, there is
no central key provider for auto-incrementing columns. Kudu uses a per
tablet-server global counter to assign values for auto-incrementing
columns, so the values of the auto-incrementing column are not unique
across a Kudu table, only within a tablet-server.

This patch upgrades the Kudu version to 345fd44ca3 to pick up the Kudu
changes needed to support non unique primary keys. It adds syntactic
support for creating Kudu tables with a non unique primary key.
When creating a Kudu table, specifying PRIMARY KEY is optional.
If no primary key attribute is specified, the partition key columns
are promoted to a non unique primary key, provided those columns are
the leading columns of the table.
A new column "key_unique" is added to the output of the 'describe'
table command for Kudu tables.

Examples of CREATE TABLE statement with non unique primary key:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;

  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;

  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

A SELECT statement does not show the system-generated auto-incrementing
column unless the column is explicitly specified in the select list.
The auto-incrementing column cannot be added, removed or renamed with
ALTER TABLE statements.
The UPSERT operation is not currently supported for Kudu tables with an
auto-incrementing column due to a limitation in the Kudu engine.

Testing:
 - Ran manual tests in impala-shell with queries to create tables
   with non unique primary key, and tested insert/update/delete
   operations for Kudu tables with non unique primary key.
 - Added front end tests and end-to-end unit tests for Kudu tables
   with non unique primary key.
 - Passed exhaustive tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAlterColStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/custom_cluster/test_kudu.py
M tests/metadata/test_ddl_base.py
M tests/query_test/test_kudu.py
44 files changed, 1,294 insertions(+), 206 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/14
-- 
To view, visit http://gerrit.cloudera.org:8080/19383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 14
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 6:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/12125/ : Initial code review checks failed. See linked job for details on the failure.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Comment-Date: Sat, 07 Jan 2023 02:20:11 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 14:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12291/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 14
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 Feb 2023 01:31:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#15). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................

IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine recently enabled the auto-incrementing column feature
(KUDU-1945). The feature works by appending a system-generated
auto-incrementing column to the primary key columns to guarantee the
uniqueness of the primary key when the primary key columns can be non
unique. The non unique primary key columns and the auto-incrementing
column form the effective unique composite primary key.

This auto-incrementing column is named 'auto_incrementing_id' and has
BIGINT type. Kudu assigns its value automatically during insertion, so
insertion statements should not specify values for the auto-incrementing
column. In the current Kudu implementation, there is no central key
provider for auto-incrementing columns. Kudu uses a per tablet-server
global counter to assign values for auto-incrementing columns. So the
values of auto-incrementing columns are not unique within a Kudu table,
but are unique within a tablet-server.

This patch upgrades the Kudu version to 345fd44ca3 to pick up the Kudu
changes needed for supporting non unique primary keys. It adds syntactic
support for creating Kudu tables with a non unique primary key.
When creating a Kudu table, specifying PRIMARY KEY is optional.
If no primary key attribute is specified, the partition key columns
are promoted to a non unique primary key if those columns are the
beginning columns of the table.
A new column "key_unique" is added to the output of the 'describe'
table command for Kudu tables.
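
As a rough illustration of the promotion rule above, the check "the
partition key columns are the beginning columns of the table" can be
read as a prefix test on the column list. The sketch below is not the
code from this patch: the class KeyPromotion and the method canPromote
are hypothetical names, and it assumes that the partition columns must
match the leading table columns in the same order and with exact name
comparison.

```java
import java.util.List;

// Hypothetical helper, not part of Impala. It models the rule that
// partition key columns are promoted to a non unique primary key only
// when they are the leading columns of the table.
class KeyPromotion {
  // Returns true when partitionCols equals the first partitionCols.size()
  // entries of tableCols, in the same order (an assumption of this sketch).
  static boolean canPromote(List<String> tableCols, List<String> partitionCols) {
    if (partitionCols.isEmpty() || partitionCols.size() > tableCols.size()) {
      return false;
    }
    return tableCols.subList(0, partitionCols.size()).equals(partitionCols);
  }
}
```

Under these assumptions, a table with columns (a, b, c) partitioned by
(a, b) qualifies for promotion, while the same table partitioned by (b)
does not.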

Examples of CREATE TABLE statement with non unique primary key:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;

  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;

  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

A SELECT statement does not show the system-generated auto-incrementing
column unless the column is explicitly specified in the select list.
The auto-incrementing column cannot be added, removed or renamed with
ALTER TABLE statements.
The UPSERT operation is not currently supported for Kudu tables with an
auto-incrementing column due to a limitation in the Kudu engine.

Testing:
 - Ran manual tests in impala-shell with queries to create tables
   with non unique primary key, and tested insert/update/delete
   operations for Kudu tables with non unique primary key.
 - Added front end tests and end-to-end unit tests for Kudu tables
   with non unique primary key.
 - Passed exhaustive tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAlterColStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/custom_cluster/test_kudu.py
M tests/metadata/test_ddl_base.py
M tests/query_test/test_kudu.py
44 files changed, 1,321 insertions(+), 206 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/15

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 15
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19383/1/fe/src/main/java/org/apache/impala/analysis/TableDef.java
File fe/src/main/java/org/apache/impala/analysis/TableDef.java:

http://gerrit.cloudera.org:8080/#/c/19383/1/fe/src/main/java/org/apache/impala/analysis/TableDef.java@513
PS1, Line 513:           throw new AnalysisException("Non unique primary key is only supported for Kudu.");
line too long (92 > 90)



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Comment-Date: Tue, 20 Dec 2022 21:44:15 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 17:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
File fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java:

http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@488
PS16, Line 488: sb.append(KuduUtil.getPrimaryKeyString(isPrimaryKeyUnique)).append(" (");
              :         Joiner.on(", ").appendTo(sb
> Done
Done


http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@501
PS16, Line 501: Joiner.on(", ").appendTo(sb, primaryKeysSql).append(")");
              :       }
> Done
Done


http://gerrit.cloudera.org:8080/#/c/19383/17/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
File fe/src/main/java/org/apache/impala/catalog/KuduColumn.java:

http://gerrit.cloudera.org:8080/#/c/19383/17/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java@71
PS17, Line 71: Preconditions.checkArgument((!isKey && !isPrimaryKeyUnique && !isAutoIncrementing)
             :         || (isKey && (isPrimaryKeyUnique && !isAutoIncrementing || !isPrimaryKeyUnique)));
May I suggest splitting this into two branches on isKey?

if (isKey) {
  Preconditions.checkArgument(isPrimaryKeyUnique && !isAutoIncrementing || !isPrimaryKeyUnique);
} else {
  Preconditions.checkArgument(!isPrimaryKeyUnique && !isAutoIncrementing);
}

This way, we can quickly distinguish between the two cases on failure.
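
To make the equivalence of the two forms concrete, here is a
self-contained sketch. It is an illustration only, not the code in the
patch: the class name KuduColumnFlags and the error messages are
hypothetical, and it throws IllegalArgumentException directly (the
exception Guava's Preconditions.checkArgument raises) so that it
compiles without Guava.

```java
// Hypothetical stand-in for the check in the KuduColumn constructor.
class KuduColumnFlags {
  // The single combined condition quoted from PS17, line 71.
  static boolean combined(boolean isKey, boolean isPrimaryKeyUnique,
      boolean isAutoIncrementing) {
    return (!isKey && !isPrimaryKeyUnique && !isAutoIncrementing)
        || (isKey && (isPrimaryKeyUnique && !isAutoIncrementing || !isPrimaryKeyUnique));
  }

  // The branching form: each branch fails with its own message, so a
  // failure shows which case was violated.
  static void check(boolean isKey, boolean isPrimaryKeyUnique,
      boolean isAutoIncrementing) {
    if (isKey) {
      if (!(isPrimaryKeyUnique && !isAutoIncrementing || !isPrimaryKeyUnique)) {
        throw new IllegalArgumentException(
            "key column: a unique primary key column must not be auto-incrementing");
      }
    } else {
      if (!(!isPrimaryKeyUnique && !isAutoIncrementing)) {
        throw new IllegalArgumentException(
            "non-key column: must be neither unique-key nor auto-incrementing");
      }
    }
  }
}
```

Enumerating all eight combinations of the three flags shows that check
throws exactly when combined returns false, so the split changes only
the diagnostics and not the accepted inputs.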


http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@391
PS16, Line 391: isPrimaryKeyUnique_ = kuduSchema_.isPrimaryKeyUnique();
              :     hasAutoIncrementingColumn_ = kuduSchema_.hasAutoIncrementingColumn();
> No. auto-incrementing column must be non unique primary key, but non unique
So is it valid to add this precondition here?

Preconditions.checkState(!isPrimaryKeyUnique_ || !hasAutoIncrementingColumn_);



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 17
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Fri, 03 Feb 2023 16:52:10 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................

WIP IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine adds support for non unique primary keys. It
automatically adds a column 'auto_increment_id' to a table that
has a non unique primary key. The non unique primary key and
'auto_increment_id' form a unique composite primary key.

This patch integrates a new version of Kudu that supports non
unique primary keys and adds syntactic support for creating tables
with a non unique primary key.
Example:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

Kudu assigns values for column 'auto_increment_id' automatically
when inserting rows, so insertion statements don't need to specify
values for column 'auto_increment_id'.
A SELECT statement does not show the 'auto_increment_id' column unless
the column is explicitly specified in the select list.
The UPSERT operation is not currently supported for Kudu tables with
an auto-incrementing column due to a limitation in the Kudu engine.

Testing:
 - Integrated a new version of Kudu built on a local machine, ran
   manual tests in impala-shell with queries to create tables
   with non unique primary key, and tested insert/update/delete
   operations for Kudu tables with non unique primary key.
 - TODO: build the toolchain with the new version of Kudu, including
   the commits for KUDU-1945. Run core tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
31 files changed, 735 insertions(+), 101 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/5

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 18:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12311/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 18
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Sat, 04 Feb 2023 02:23:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 18:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9023/ DRY_RUN=false


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 18
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Sat, 04 Feb 2023 02:36:50 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 13:

(5 comments)

I need to do another pass.
In the meantime, I have some questions and nits.

http://gerrit.cloudera.org:8080/#/c/19383/13//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19383/13//COMMIT_MSG@9
PS13, Line 9: Kudu engine recently enables the auto-incrementing column feature
Is there a KUDU Jira number that we can mention here?


http://gerrit.cloudera.org:8080/#/c/19383/13//COMMIT_MSG@28
PS13, Line 28: specifying PRIMARY KEY is optional
Is this new behavior?
The Kudu doc says declaring a primary key is mandatory:
https://kudu.apache.org/docs/schema_design.html#primary-keys

The Kudu doc says "Primary key columns must be non-nullable, and may not be a boolean, float or double type.". Is this also true for NON UNIQUE PRIMARY KEY columns?


http://gerrit.cloudera.org:8080/#/c/19383/13//COMMIT_MSG@30
PS13, Line 30: columes
nit: column(s)?


http://gerrit.cloudera.org:8080/#/c/19383/13//COMMIT_MSG@36
PS13, Line 36:   CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
             :   PARTITION BY HASH (i) partitions 3
             :   STORED as KUDU;
             :   CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
             :   PARTITION BY HASH (i) partitions 3
             :   STORED as KUDU;
             :   CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
             :   PARTITION BY HASH (id) partitions 3
             :   STORED as KUDU
             :   AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;
nit: Add newline between examples?


http://gerrit.cloudera.org:8080/#/c/19383/13/testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_create.test:

http://gerrit.cloudera.org:8080/#/c/19383/13/testdata/workloads/functional-query/queries/QueryTest/kudu_create.test@533
PS13, Line 533: 'Table has been created.'
Does it make sense to give feedback to the user that (a, b) are automatically promoted to NON UNIQUE PRIMARY KEY?
This can help if the user forgot to name a PRIMARY KEY, which was mandatory before this patch.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 13
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Feb 2023 23:37:46 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................

WIP IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine adds support for non unique primary keys. It
automatically adds a column 'auto_increment_id' to a table that
has a non unique primary key. The non unique primary key and
'auto_increment_id' form a unique composite primary key.

This patch integrates a new version of Kudu that supports non
unique primary keys and adds syntactic support for creating tables
with a non unique primary key.
Example:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

Kudu assigns values for column 'auto_increment_id' automatically
when inserting rows, so insertion statements don't need to specify
values for column 'auto_increment_id'.
A SELECT statement does not show the 'auto_increment_id' column unless
the column is explicitly specified in the select list.

Testing:
 - Integrated a new version of Kudu built on a local machine, ran
   manual tests in impala-shell with queries to create tables
   with non unique primary key, and tested insert/update/delete
   operations for Kudu tables with non unique primary key.
 - TODO: build the toolchain with the new version of Kudu, including
   the commits for KUDU-1945. Run core tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M be/src/exec/kudu/kudu-util.cc
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
34 files changed, 735 insertions(+), 97 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/2
-- 
To view, visit http://gerrit.cloudera.org:8080/19383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 3:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/12055/ : Initial code review checks failed. See linked job for details on the failure.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Comment-Date: Wed, 21 Dec 2022 16:50:23 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 4:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/12057/ : Initial code review checks failed. See linked job for details on the failure.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Comment-Date: Wed, 21 Dec 2022 22:12:28 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................

WIP IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine adds support for non unique primary keys. It
automatically adds a column 'auto_increment_id' to a table that
has a non unique primary key. The non unique primary key and
'auto_increment_id' together form a unique composite primary key.

This patch integrates a new version of Kudu that supports non
unique primary keys and adds syntactic support for creating
tables with a non unique primary key.
Example:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

Kudu assigns values for column 'auto_increment_id' automatically
when inserting rows, so insertion statements don't need to specify
values for column 'auto_increment_id'.
A select statement does not show the 'auto_increment_id' column
unless the column is explicitly specified in the select list.

Testing:
 - Integrated new version of Kudu built on local machine, ran
   manual test in impala-shell with queries to create tables
   with non unique primary key, and tested insert/update/delete
   operations for Kudu tables with non unique primary key.
 - TODO: build toolchain with new version of Kudu, including
   the commits for KUDU-1945. Run core tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
33 files changed, 733 insertions(+), 97 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/3

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 16:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12298/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 16
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 Feb 2023 20:27:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@2365
PS10, Line 2365: partition by hash
> It's in the goals during initial design, but removed later due to too many 
In this case there is only one partition for the table, so the table cannot be large. They consider it a rare use case.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 12
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 31 Jan 2023 02:07:39 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 18: Verified+1



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 18
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Sat, 04 Feb 2023 07:34:54 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 9:

(16 comments)

Thanks Qifan for your review.

http://gerrit.cloudera.org:8080/#/c/19383/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19383/9//COMMIT_MSG@9
PS9, Line 9: Kudu engine adds support for non unique primary key. It adds
> nit recently
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/19383/9/common/thrift/CatalogObjects.thrift@580
PS9, Line 580: add
> nit adds
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/common/thrift/CatalogObjects.thrift@581
PS9, Line 581: .
> nit. ", in this case, this field is set to false"
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java@99
PS9, Line 99: c.isPrimaryKeyUnique() ? "" : "non unique ", c.toString()
> This statement can be integrated into ColumnDef.toString().
Use KuduUtil.getPrimaryKeyString().


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@227
PS9, Line 227: createStmt_.isPrimaryKeyUnique(), createStmt_.getPrimaryKeyColumnDefs(),
> nit. It may be more logical to switch the two arguments, as the new argument 
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/TableDef.java
File fe/src/main/java/org/apache/impala/analysis/TableDef.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/TableDef.java@87
PS9, Line 87: An auto-incrementing column will be added
            :   // automatically by Kudu engine as key column if primary key is not unique.
> nit. If not, an auto-incrementing column will be added
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/TableDef.java@537
PS9, Line 537: only the columns, on which the
             :           // partitions are being created,
> nit. the partition columns have to be first in the primary key columns.
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/TableDef.java@566
PS9, Line 566:       throw new AnalysisException(String.format("Multiple %sprimary keys specified. " +
             :           "Composite %sprimary keys can be specified using the " +
             :           "%sPRIMARY KEY (col1, col2, ...) syntax at the end of the column definition.",
             :           isPrimaryKeyUnique() ? "" : "non unique ",
             :           isPrimaryKeyUnique() ? "" : "non unique ",
             :           isPrimaryKeyUnique() ? "" : "NON UNIQUE "));
> nit. This looks similar to the lines starting at 520.
Use KuduUtil.getPrimaryKeyString().
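
A minimal sketch of the kind of helper this reply points to. The signature of KuduUtil.getPrimaryKeyString() is not shown in this thread, so the method below (one boolean argument, returning the key phrase) and the class name PrimaryKeyStringSketch are assumptions for illustration only. It shows how a single helper could replace the three repeated ternaries in the quoted String.format call.

```java
import java.util.Locale;

class PrimaryKeyStringSketch {
  // Hypothetical stand-in for KuduUtil.getPrimaryKeyString(); the real
  // signature and return values are not shown in this review thread.
  static String getPrimaryKeyString(boolean isPrimaryKeyUnique) {
    return isPrimaryKeyUnique ? "primary key" : "non unique primary key";
  }

  // Builds the "multiple keys" error text from one phrase instead of
  // three separate ternary expressions.
  static String multipleKeysError(boolean isPrimaryKeyUnique) {
    String pk = getPrimaryKeyString(isPrimaryKeyUnique);
    return String.format(
        "Multiple %ss specified. Composite %ss can be specified using the "
            + "%s (col1, col2, ...) syntax at the end of the column definition.",
        pk, pk, pk.toUpperCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    System.out.println(multipleKeysError(true));
    System.out.println(multipleKeysError(false));
  }
}
```

Keeping the phrase in one place means the lower-case and upper-case variants in the message cannot drift apart.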


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/Db.java@242
PS9, Line 242: boolean isPrimaryKeyUnique, List<ColumnDef> primaryKeyColumnDefs,
> As noted in another comment, these two arguments probably should be swapped
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
File fe/src/main/java/org/apache/impala/catalog/KuduColumn.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java@134
PS9, Line 134: position
> nit. 'position'.
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java@135
PS9, Line 135: if
> nit. when
Done


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java@144
PS9, Line 144: false
> One may call this function even when column.isIs_primary_key_unique() is tr
This is a static function and it is only called when the primary key of the table is not unique.


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@133
PS9, Line 133:   if (isKey && !isKeyUnique) {
             :       csb.nonUniqueKey(true);
             :     } else {
             :       csb.key(isKey);
             :     }
> It seems the following covers all combos of isKey and isKeyUnique. 
The Kudu client APIs ColumnSchemaBuilder.key() and ColumnSchemaBuilder.nonUniqueKey() each overwrite the effect of previous calls. ColumnSchemaBuilder.nonUniqueKey(false) sets the column as non key, i.e. both key and keyUnique are set to false.
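
The overwrite behavior described in this reply can be illustrated with a small model. This is a sketch only: FakeColumnSchemaBuilder is a hypothetical stand-in, not the Kudu client class, and it assumes that key(true) marks the column as a unique key while nonUniqueKey(flag) sets key to flag and keyUnique to false, as the reply states. Under those assumptions, the reviewer's two-call form (csb.key(isKey) followed by csb.nonUniqueKey(isKey && !isKeyUnique)) turns a unique key column into a non-key column, while the if/else form keeps it.

```java
class KeyFlagsSketch {
  // Hypothetical model of the overwrite semantics described in the reply;
  // it is NOT the real org.apache.kudu ColumnSchemaBuilder.
  static final class FakeColumnSchemaBuilder {
    boolean key = false;
    boolean keyUnique = false;

    FakeColumnSchemaBuilder key(boolean isKey) {
      this.key = isKey;
      this.keyUnique = isKey;  // assumption: a plain key is a unique key
      return this;
    }

    FakeColumnSchemaBuilder nonUniqueKey(boolean isKey) {
      this.key = isKey;        // overwrites whatever key() set earlier
      this.keyUnique = false;
      return this;
    }
  }

  // The if/else form from the patch: exactly one setter is called.
  static FakeColumnSchemaBuilder branching(boolean isKey, boolean isKeyUnique) {
    FakeColumnSchemaBuilder csb = new FakeColumnSchemaBuilder();
    if (isKey && !isKeyUnique) {
      csb.nonUniqueKey(true);
    } else {
      csb.key(isKey);
    }
    return csb;
  }

  // The two-call form suggested in review: the second call overwrites the first.
  static FakeColumnSchemaBuilder chained(boolean isKey, boolean isKeyUnique) {
    FakeColumnSchemaBuilder csb = new FakeColumnSchemaBuilder();
    csb.key(isKey);
    csb.nonUniqueKey(isKey && !isKeyUnique);
    return csb;
  }

  public static void main(String[] args) {
    FakeColumnSchemaBuilder a = branching(true, true);
    FakeColumnSchemaBuilder b = chained(true, true);
    System.out.println("branching: key=" + a.key + " keyUnique=" + a.keyUnique);
    System.out.println("chained:   key=" + b.key + " keyUnique=" + b.keyUnique);
  }
}
```

For a unique key column the two forms disagree in this model, which is the reason the reply gives for keeping the branch.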


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@515
PS9, Line 515:             String.format("Cannot add %s column to Kudu table %s",
> nit "as its name is identical to auto incremental column name in Kudu".
Remove lines #512 to #518 since Kudu client API AlterTableOptions.addColumn() checks if the column name is identical to its reserved column name.


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@533
PS9, Line 533:           String.format("Cannot drop %s column from Kudu table %s",
> nit. same comment as above.
Remove lines #530 to #535 since Kudu client API AlterTableOptions.dropColumn() checks if the column name is identical to its reserved column name.


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@563
PS9, Line 563:           String.format("Cannot alter %s column for Kudu table %s",
> nit  same comment as above
Remove lines #560 to #565 since the Kudu client AlterTableOptions APIs check whether the column name is identical to the reserved column name.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jan 2023 06:02:27 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 9:

(16 comments)

Looks good!

Will review the test part of the patch next.

http://gerrit.cloudera.org:8080/#/c/19383/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19383/9//COMMIT_MSG@9
PS9, Line 9: Kudu engine adds support for non unique primary key. It adds
nit recently


http://gerrit.cloudera.org:8080/#/c/19383/9/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/19383/9/common/thrift/CatalogObjects.thrift@580
PS9, Line 580: add
nit adds


http://gerrit.cloudera.org:8080/#/c/19383/9/common/thrift/CatalogObjects.thrift@581
PS9, Line 581: .
nit. ", in this case, this field is set to false"


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java@99
PS9, Line 99: c.isPrimaryKeyUnique() ? "" : "non unique ", c.toString()
This statement can be integrated into ColumnDef.toString().


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java@227
PS9, Line 227: createStmt_.isPrimaryKeyUnique(), createStmt_.getPrimaryKeyColumnDefs(),
nit. It may be more logical to switch the two arguments, as the new argument further refines the primary key columns.


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/TableDef.java
File fe/src/main/java/org/apache/impala/analysis/TableDef.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/TableDef.java@87
PS9, Line 87: An auto-incrementing column will be added
            :   // automatically by Kudu engine as key column if primary key is not unique.
nit. If not, an auto-incrementing column will be added
automatically by Kudu engine. This extra key column helps produce a unique composite primary key (primary keys + auto-incrementing construct).


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/TableDef.java@537
PS9, Line 537: only the columns, on which the
             :           // partitions are being created,
nit. the partition columns have to be first in the primary key columns.


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/analysis/TableDef.java@566
PS9, Line 566:       throw new AnalysisException(String.format("Multiple %sprimary keys specified. " +
             :           "Composite %sprimary keys can be specified using the " +
             :           "%sPRIMARY KEY (col1, col2, ...) syntax at the end of the column definition.",
             :           isPrimaryKeyUnique() ? "" : "non unique ",
             :           isPrimaryKeyUnique() ? "" : "non unique ",
             :           isPrimaryKeyUnique() ? "" : "NON UNIQUE "));
nit. This looks similar to the lines starting at 520.


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/Db.java@242
PS9, Line 242: boolean isPrimaryKeyUnique, List<ColumnDef> primaryKeyColumnDefs,
As noted in another comment, these two arguments probably should be swapped.


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
File fe/src/main/java/org/apache/impala/catalog/KuduColumn.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java@134
PS9, Line 134: position
nit. 'position'.


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java@135
PS9, Line 135: if
nit. when


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java@144
PS9, Line 144: false
One may call this function even when column.isIs_primary_key_unique() is true. I suggest replacing false with column.isIs_primary_key_unique().


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@133
PS9, Line 133:   if (isKey && !isKeyUnique) {
             :       csb.nonUniqueKey(true);
             :     } else {
             :       csb.key(isKey);
             :     }
It seems the following covers all combos of isKey and isKeyUnique. 

csb.key(isKey);
csb.nonUniqueKey(isKey && !isKeyUnique);


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@515
PS9, Line 515:             String.format("Cannot add %s column to Kudu table %s",
nit "as its name is identical to auto incremental column name in Kudu".


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@533
PS9, Line 533:           String.format("Cannot drop %s column from Kudu table %s",
nit. same comment as above.


http://gerrit.cloudera.org:8080/#/c/19383/9/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java@563
PS9, Line 563:           String.format("Cannot alter %s column for Kudu table %s",
nit  same comment as above




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Sun, 15 Jan 2023 23:44:40 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 10:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/12176/ : Initial code review checks failed. See linked job for details on the failure.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jan 2023 06:34:16 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 1:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/12049/ : Initial code review checks failed. See linked job for details on the failure.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Comment-Date: Tue, 20 Dec 2022 21:52:53 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 12:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12273/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 12
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 31 Jan 2023 02:14:42 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#16). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................

IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine recently enabled the auto-incrementing column feature
(KUDU-1945). The feature works by appending a system generated
auto-incrementing column to the primary key columns to guarantee the
uniqueness of the primary key when the primary key columns can be non
unique. The non unique primary key columns and the auto-incrementing
column together form the effective unique composite primary key.

This auto-incrementing column is named 'auto_incrementing_id' and has
BIGINT type. Values are assigned to it automatically during insertion,
so insertion statements should not specify values for the
auto-incrementing column. In the current Kudu implementation, there is
no central key provider for auto-incrementing columns. It uses a per
tablet-server global counter to assign values for auto-incrementing
columns. So the values of the auto-incrementing column are not unique
across a Kudu table, but are unique within a continuous region of the
table served by a tablet-server.
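
The counter behaviour described above can be illustrated with a small
Python simulation. This is only a toy model of the description in this
commit message, not Kudu code: the TabletServer class, the route()
function, and the counter starting at 1 are all assumptions made for
the example.

```python
from itertools import count


class TabletServer:
    """Hypothetical stand-in for a tablet server with one global counter."""

    def __init__(self, name):
        self.name = name
        # Assumed start value; the real start value is not stated in the thread.
        self._counter = count(1)
        self.rows = []

    def insert(self, key, value):
        # The caller never supplies auto_incrementing_id; the server assigns it.
        auto_id = next(self._counter)
        self.rows.append((key, auto_id, value))
        return auto_id


servers = [TabletServer("ts-0"), TabletServer("ts-1")]


def route(key):
    # Toy hash partitioning on the non unique key column.
    return servers[key % len(servers)]


# Keys 1 and 2 are deliberately duplicated.
for key, value in [(1, "a"), (1, "b"), (2, "c"), (2, "d"), (3, "e")]:
    route(key).insert(key, value)

all_rows = [row for server in servers for row in server.rows]
auto_ids = [auto_id for _, auto_id, _ in all_rows]
composite_keys = [(key, auto_id) for key, auto_id, _ in all_rows]
```

In this model the same counter value shows up on both servers, so
auto_ids contains duplicates, while every (key, auto_incrementing_id)
pair stays distinct because a given key is always routed to the same
server.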

This patch also upgrades the Kudu version to 345fd44ca3 to pick up the
Kudu changes needed for supporting non unique primary keys. It adds
syntactic support for creating a Kudu table with a non unique primary
key. When creating a Kudu table, specifying PRIMARY KEY is optional.
If no primary key attribute is specified, the partition key columns
are promoted to a non unique primary key, provided those columns are
the beginning columns of the table.
A new column "key_unique" is added to the output of the 'describe'
table command for Kudu tables.
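
As a rough illustration of the promotion rule above, the following
Python sketch checks whether the partition columns are the beginning
columns of the table. The function name is hypothetical, and two
details are assumptions rather than facts from the patch: the order of
columns inside the PARTITION BY clause is ignored, and an empty
partition column list is rejected. The error text is the one quoted
from AnalyzeKuduDDLTest.java in this thread.

```python
ERROR_MESSAGE = (
    "Specify primary key or non unique primary key for the Kudu "
    "table, or create partitions with the beginning columns of the table."
)


def promote_partition_columns(table_columns, partition_columns):
    """Return the columns promoted to a non unique primary key.

    Hypothetical helper: raises ValueError when the partition columns are
    not the leading columns of the table.
    """
    n = len(partition_columns)
    # Assumption: only membership in the leading prefix matters, not the
    # order in which the partition columns were written.
    if n == 0 or set(table_columns[:n]) != set(partition_columns):
        raise ValueError(ERROR_MESSAGE)
    return list(table_columns[:n])


promoted = promote_partition_columns(["a", "b", "c"], ["a", "b"])
```

For the last CREATE TABLE example in this commit message (columns a, b,
c partitioned by hash on a and b), the sketch promotes a and b, while
partitioning only by a non-leading column raises the error.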

Examples of CREATE TABLE statement with non unique primary key:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) PARTITIONS 3
  STORED as KUDU;

  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) PARTITIONS 3
  STORED as KUDU;

  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) PARTITIONS 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY RANGE (id)
  (PARTITION VALUES <= 1000,
   PARTITION 1000 < VALUES <= 2000,
   PARTITION 2000 < VALUES <= 3000,
   PARTITION 3000 < VALUES)
  STORED as KUDU
  AS SELECT id, int_col FROM functional.alltypestiny ORDER BY id ASC
   LIMIT 4000;

  CREATE TABLE tbl (id INT, name STRING, NON UNIQUE PRIMARY KEY(id))
  STORED as KUDU;

  CREATE TABLE tbl (a INT, b STRING, c FLOAT)
  PARTITION BY HASH (a, b) PARTITIONS 3
  STORED as KUDU;

A SELECT statement does not show the system generated auto-incrementing
column unless the column is explicitly specified in the select list.
The auto-incrementing column cannot be added, removed or renamed with
ALTER TABLE statements.
The UPSERT operation is currently not supported for Kudu tables with an
auto-incrementing column due to a limitation in the Kudu engine.
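
The SELECT behaviour described above could be modelled as follows. This
is a simplified Python sketch, not the logic in SelectStmt.java: the
resolve_select_list() function and its list-of-names representation are
assumptions made for the example.

```python
AUTO_INCREMENTING_COLUMN = "auto_incrementing_id"


def resolve_select_list(table_columns, select_list):
    """Expand '*' while skipping the system generated column.

    Hypothetical helper: an explicitly named column is always kept,
    including the auto-incrementing one.
    """
    resolved = []
    for item in select_list:
        if item == "*":
            resolved.extend(
                c for c in table_columns if c != AUTO_INCREMENTING_COLUMN
            )
        elif item in table_columns:
            resolved.append(item)
        else:
            raise KeyError(item)
    return resolved


columns = ["id", AUTO_INCREMENTING_COLUMN, "name"]
star = resolve_select_list(columns, ["*"])
explicit = resolve_select_list(columns, ["id", AUTO_INCREMENTING_COLUMN])
```

Here star holds only id and name, while explicit also contains
auto_incrementing_id because it was named in the select list.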

Testing:
 - Ran manual tests in impala-shell with queries to create Kudu tables
   with non unique primary key, and tested insert/update/delete
   operations for these tables with non unique primary key.
 - Added front end tests and end to end unit tests for Kudu tables
   with non unique primary key.
 - Passed exhaustive tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAlterColStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/custom_cluster/test_kudu.py
M tests/metadata/test_ddl_base.py
M tests/query_test/test_kudu.py
44 files changed, 1,322 insertions(+), 207 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/16
-- 
To view, visit http://gerrit.cloudera.org:8080/19383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 16
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 17:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/12304/ : Initial code review checks failed. See linked job for details on the failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/19383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 17
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Fri, 03 Feb 2023 08:17:23 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#13). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................

IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine recently enabled the auto-incrementing column feature.
The feature works by appending a system generated auto-incrementing
column to the primary key columns to guarantee the uniqueness of the
primary key when the primary key columns can be non-unique. The non
unique primary key columns and the auto-incrementing column together
form the effective unique composite primary key.

This auto-incrementing column is named 'auto_incrementing_id' and has
BIGINT type. Values are assigned to it automatically during insertion,
so insertion statements should not specify values for the
auto-incrementing column. In the current Kudu implementation, there is
no central key provider for auto-incrementing columns. It uses a per
tablet-server global counter to assign values for auto-incrementing
columns. So the values of the auto-incrementing column are not unique
across a Kudu table, but are unique within a tablet-server.

This patch upgrades the Kudu version to 345fd44ca3 to pick up the Kudu
changes needed for supporting non unique primary keys. It adds
syntactic support for creating a Kudu table with a non unique primary
key. When creating a Kudu table, specifying PRIMARY KEY is optional.
If no primary key attribute is specified, the partition key columns
are promoted to a non unique primary key, provided those columns are
the beginning columns of the table.
A new column "key_unique" is added to the output of the 'describe'
table command for Kudu tables.

Examples of CREATE TABLE statement with non unique primary key:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

A SELECT statement does not show the system generated auto-incrementing
column unless the column is explicitly specified in the select list.
The auto-incrementing column cannot be added, removed or renamed with
ALTER TABLE statements.
The UPSERT operation is currently not supported for Kudu tables with an
auto-incrementing column due to a limitation in the Kudu engine.

Testing:
 - Ran manual tests in impala-shell with queries to create tables
   with non unique primary key, and tested insert/update/delete
   operations for Kudu tables with non unique primary key.
 - Added front end tests and end to end unit tests for Kudu tables
   with non unique primary key.
 - Passed exhaustive tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAlterColStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/custom_cluster/test_kudu.py
M tests/metadata/test_ddl_base.py
M tests/query_test/test_kudu.py
44 files changed, 1,235 insertions(+), 204 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/13
-- 
To view, visit http://gerrit.cloudera.org:8080/19383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 13
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 13:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@2365
PS10, Line 2365: partition by hash
> In this case, there is only one partition for a table so the table could no
Yeah, agreed on the tablet concept.

However, for a range-partitioned (such as by months in a year) Kudu table test, we can still benefit from the uniqueness within each partition (aka a Kudu tablet), for example to answer the question of the chronological order of row insertions.

Would it be possible to add such a test case?


http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
File testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test:

http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test@204
PS10, Line 204: order by id
> It's unique per Kudu tablet-server. Clarified in the commit message. They d
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/19383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 13
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Feb 2023 15:59:09 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 10:

(13 comments)

Added some comments on testing and the commit message.

http://gerrit.cloudera.org:8080/#/c/19383/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19383/10//COMMIT_MSG@30
PS10, Line 30: Kudu assign values for auto-incrementing column automatically
             : when inserting rows so insertion statements don't need to specify
             : values for auto-incrementing column.
Kudu recently enabled the auto-incrementing column feature. The feature works by appending a system generated column to the primary key columns to guarantee the uniqueness of the primary key when the primary key columns can be non-unique.

This auto column is of type big int and the assignment to it during insertion is automatic.


http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@2365
PS10, Line 2365: partition by hash
May add some more test cases covering the following.

1. No partition by clause
2. Non unique primary key includes all selected columns

It seems that even when the primary key list is empty, the auto-incrementing column can still function as the primary key. Do we allow such a case?


http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@88
PS10, Line 88: AnalyzesOk("create table tab (x int non unique primary key) partition by hash(x) " +
             :         "partitions 8 stored as kudu", isExternalPurgeTbl);
             :     AnalyzesOk("create table tab (x int, non unique primary key(x)) " +
             :         "partition by hash(x) partitions 8 stored as kudu", isExternalPurgeTbl);
             :     AnalyzesOk("create table tab (x int, y int, non unique primary key (x, y)) " +
             :         "partition by hash(x, y) partitions 8 stored as kudu", isExternalPurgeTbl);
             :     AnalyzesOk("create table tab (x int, y int, non unique primary key (x)) " +
             :         "partition by hash(x) partitions 8 stored as kudu", isExternalPurgeTbl);
             :     AnalyzesOk("create table tab (x int, y int, non unique primary key(x, y)) " +
             :         "partition by hash(y) partitions 8 stored as kudu", isExternalPurgeTbl);
             :     AnalyzesOk("create table tab (x timestamp, y timestamp, non unique primary key(x)) "+
             :         "partition by hash(x) partitions 8 stored as kudu", isExternalPurgeTbl);
             :     // Promote all partition columns as non unique primary key columns if primary keys
             :     // are not declared, but partition columns must be the first columns in the table.
             :     AnalyzesOk("create table tab (x int, y int) partition by hash(x) partitions 8 " +
             :         "stored as kudu", isExternalPurgeTbl);
             :     AnalyzesOk("create table tab (x int, y int) partition by hash(x, y) partitions 8 " +
             :         "stored as kudu", isExternalPurgeTbl);
             :     AnalysisError("create table tab (x int, y int) partition by hash(y) partitions 8 " +
             :         "stored as kudu", "Specify primary key or non unique primary key for the Kudu " +
             :         "table, or create partitions with the beginning columns of the table.",
             :         isExternalPurgeTbl);
In some of the new tests, may try partition by range and no partition to cover more cases. See https://kudu.apache.org/docs/kudu_impala_integration.html.


http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
File testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test:

http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test@204
PS10, Line 204: order by id
Wonder if we can order DESC by auto_incrementing_id which can be useful to get the latest rows.


http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test:

http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test@709
PS10, Line 709: auto-incrementing column auto_incrementing_id
nit. The column name appeared twice.


http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test@714
PS10, Line 714: primary key
nit. system generated primary key


http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_create.test:

http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_create.test@512
PS10, Line 512: a int, b string,
May add a new test case where the types of a and b are swapped (i.e., a string, b int).

Do we allow
create table <t> (a string, non unique primary key(a))?


http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test:

http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test@416
PS10, Line 416: by hash (id)
Wonder if partition by range (auto_incrementing_id) is allowed, which can be quite useful to precisely control the rows in each partition.


http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test:

http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test@139
PS10, Line 139: bigint
It may be useful to allow the type of the auto column to be specified by the user, for performance reasons.  For example, a type of int on the auto column only reads in 4 bytes instead of 8 bytes. Not sure if Kudu allows such a flexibility.


http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test:

http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test@646
PS10, Line 646: (1, 10, 'ten'), (2, 20, 'twenty');
No duplication on id.


http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test:

http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test@43
PS10, Line 43: Computing stats
Should we also get a report on the stats for the auto column here? The stats are useful as the auto column can appear in predicates or in the select list.


http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_update.test:

http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_update.test@431
PS10, Line 431: Key column 'auto_incrementing_id' cannot be updated.
This is a cool test.

nit. System generated primary key column


http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu_update.test@441
PS10, Line 441: order by
nit. May test the auto column in GROUP BY, ORDER BY, or in sequence functions.



-- 
To view, visit http://gerrit.cloudera.org:8080/19383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 10
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jan 2023 15:50:36 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 2:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/12050/ : Initial code review checks failed. See linked job for details on the failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/19383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Comment-Date: Tue, 20 Dec 2022 22:32:11 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 17:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/19383/16//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19383/16//COMMIT_MSG@17
PS16, Line 17: The assignment to it during insertion is automatic so
             : insertion statements should not specify values for auto-incrementing
             : column.
> Is there any test that explicitly trying to insert into auto_incrementing_i
The Kudu server returns an error if auto_incrementing_id is set. Added some negative test cases in kudu_insert.test.


http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
File fe/src/main/java/org/apache/impala/analysis/ColumnDef.java:

http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/analysis/ColumnDef.java@342
PS16, Line 342: sb.append(" ").append(KuduUtil.getPrimaryKeyString(isPrimaryKeyUnique_));
              :     }
> nit: Use KuduUtil.getPrimaryKeyString() here?
Done


http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
File fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java:

http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@488
PS16, Line 488: sb.append(KuduUtil.getPrimaryKeyString(isPrimaryKeyUnique)).append(" (");
              :         Joiner.on(", ").appendTo(sb
> nit: Use KuduUtil.getPrimaryKeyString() here?
Done


http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@501
PS16, Line 501: Joiner.on(", ").appendTo(sb, primaryKeysSql).append(")");
              :       }
> nit: Use KuduUtil.getPrimaryKeyString() here?
Done
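
For readers without the patch at hand, here is a rough Python sketch of
what a helper like KuduUtil.getPrimaryKeyString() might do. The Java
method body is not shown in this thread, so the mapping below is an
assumption based on the NON UNIQUE PRIMARY KEY syntax from the commit
message, and key_clause() is a hypothetical stand-in for the Joiner
code quoted from ToSqlUtils.java.

```python
def get_primary_key_string(is_primary_key_unique):
    # Assumed mapping: the keyword phrase depends only on the uniqueness flag.
    if is_primary_key_unique:
        return "PRIMARY KEY"
    return "NON UNIQUE PRIMARY KEY"


def key_clause(key_columns, is_primary_key_unique):
    """Build a table-level key clause such as 'PRIMARY KEY (a, b)'."""
    keyword = get_primary_key_string(is_primary_key_unique)
    return "{} ({})".format(keyword, ", ".join(key_columns))


unique_clause = key_clause(["id"], True)
non_unique_clause = key_clause(["a", "b"], False)
```

Centralising the keyword choice in one helper keeps the column-level
and table-level SQL output consistent, which appears to be the point of
the nits above.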


http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
File fe/src/main/java/org/apache/impala/catalog/KuduColumn.java:

http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java@72
PS16, Line 72:         || (isKey && (isPrimaryKeyUnique && !isAutoIncrementing || !isPrimaryKeyUnique)));
             :     kuduName_ = name;
             :     isKey_ = isKey;
             :     isPrimaryKeyUnique_ = isPrimaryKeyUnique;
> Does it make sense to add Preconditions here to verify valid combo between 
Done


http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@391
PS16, Line 391: isPrimaryKeyUnique_ = kuduSchema_.isPrimaryKeyUnique();
              :     hasAutoIncrementingColumn_ = kuduSchema_.hasAutoIncrementingColumn();
> If isPrimaryKeyUnique() return False, does hasAutoIncrementingColumn() guar
No. An auto-incrementing column must be a non unique primary key column, but a non unique primary key column is not necessarily an auto-incrementing column.
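
The key-column part of the precondition quoted from KuduColumn.java
above can be spelled out as a small truth table. This Python sketch
covers only the visible clause for key columns; the first part of the
Java condition is cut off in the quote and is not reconstructed here.
The function name is hypothetical.

```python
from itertools import product


def key_column_flags_valid(is_primary_key_unique, is_auto_incrementing):
    """Mirror the visible clause for key columns:

    isPrimaryKeyUnique && !isAutoIncrementing || !isPrimaryKeyUnique
    """
    return (
        is_primary_key_unique and not is_auto_incrementing
    ) or not is_primary_key_unique


# Enumerate all four flag combinations for a key column.
truth_table = {
    (unique, auto_inc): key_column_flags_valid(unique, auto_inc)
    for unique, auto_inc in product([True, False], repeat=2)
}
```

The only rejected combination is a unique primary key column that is
also auto-incrementing; a non unique key column is accepted whether or
not it is the auto-incrementing one.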


http://gerrit.cloudera.org:8080/#/c/19383/16/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test:

http://gerrit.cloudera.org:8080/#/c/19383/16/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test@690
PS16, Line 690: AnalysisException: Column already exists: auto_incrementing_id
> If auto_incrementing_id is reserved, is it valid to create / have existing 
No, Kudu returns an error. There is already a test case in which adding an auto_incrementing_id column with ALTER TABLE fails. Added a test case in which creating a table with an auto_incrementing_id column fails.



-- 
To view, visit http://gerrit.cloudera.org:8080/19383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 17
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Fri, 03 Feb 2023 08:05:19 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 18: Code-Review+1

Thanks Wenzhe! Please carry my +1.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 18
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Sat, 04 Feb 2023 02:27:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................

IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine recently enabled the auto-incrementing column feature
(KUDU-1945). The feature works by appending a system-generated
auto-incrementing column to the primary key columns, which guarantees
uniqueness of the primary key when the primary key columns themselves
can be non unique. The non unique primary key columns and the
auto-incrementing column together form the effective unique composite
primary key.

This auto-incrementing column is named 'auto_incrementing_id' and has
BIGINT type. Values are assigned to it automatically during insertion,
so insertion statements should not specify values for the
auto-incrementing column. In the current Kudu implementation there is
no central key provider for auto-incrementing columns. Instead, a per
tablet-server global counter assigns values for auto-incrementing
columns. As a result, the values of the auto-incrementing column are
not unique across a Kudu table, but are unique within a continuous
region of the table served by a tablet-server.
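
To illustrate why the generated values are unique only within a region and not across the whole table, here is a toy model in Java. It is not Kudu's implementation: the class TabletCounterModel, its method names, and the choice to start each counter at 1 are assumptions made purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (not Kudu code): one independent counter per region of the table.
final class TabletCounterModel {
  private final Map<Integer, Long> nextIdByRegion = new HashMap<>();

  // Returns the next auto-incrementing value for the given region.
  // Starting each counter at 1 is an assumption of this sketch.
  long assign(int regionId) {
    long id = nextIdByRegion.getOrDefault(regionId, 1L);
    nextIdByRegion.put(regionId, id + 1);
    return id;
  }

  public static void main(String[] args) {
    TabletCounterModel model = new TabletCounterModel();
    long a1 = model.assign(0);
    long a2 = model.assign(0);
    long b1 = model.assign(1);
    System.out.println("region 0: " + a1 + ", " + a2);
    System.out.println("region 1: " + b1);
  }
}
```

In this model a counter never repeats a value within its own region, while two different regions can hand out the same value because no central provider coordinates them.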

This patch also upgrades the Kudu version to 345fd44ca3 to pick up the
Kudu changes needed for supporting non unique primary keys. It adds
syntactic support for creating a Kudu table with a non unique primary
key. When creating a Kudu table, specifying PRIMARY KEY is optional.
If no primary key attribute is specified, the partition key columns
are promoted to a non unique primary key, provided those columns are
the beginning columns of the table.
A new column "key_unique" is added to the output of the 'describe'
table command for Kudu tables.
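
The promotion rule above can be sketched as a simple check in Java. This is not the Impala analyzer code: the class PartitionKeyPromotion and the method canPromote are hypothetical names. The sketch also assumes that the partition columns only need to match the first N table columns as a set, ignoring order, which the commit message does not specify.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not the Impala analyzer code.
final class PartitionKeyPromotion {
  private PartitionKeyPromotion() {}

  // Returns true if the partition columns are exactly the first N columns of
  // the table. Assumption of this sketch: the order among those leading
  // columns is ignored.
  static boolean canPromote(List<String> tableColumns, List<String> partitionColumns) {
    Set<String> partitionSet = new HashSet<>(partitionColumns);
    int n = partitionSet.size();
    if (n == 0 || n > tableColumns.size()) return false;
    return partitionSet.equals(new HashSet<>(tableColumns.subList(0, n)));
  }

  public static void main(String[] args) {
    List<String> cols = Arrays.asList("a", "b", "c");
    // Leading columns (a, b): eligible for promotion in this sketch.
    System.out.println(canPromote(cols, Arrays.asList("a", "b")));
    // Trailing columns (b, c): not eligible in this sketch.
    System.out.println(canPromote(cols, Arrays.asList("b", "c")));
  }
}
```

With the table (a INT, b STRING, c FLOAT) partitioned by HASH (a, b), the partition columns are the two leading columns, so this sketch would treat them as promotable.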

Examples of CREATE TABLE statements with a non unique primary key:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) PARTITIONS 3
  STORED as KUDU;

  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) PARTITIONS 3
  STORED as KUDU;

  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) PARTITIONS 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY RANGE (id)
  (PARTITION VALUES <= 1000,
   PARTITION 1000 < VALUES <= 2000,
   PARTITION 2000 < VALUES <= 3000,
   PARTITION 3000 < VALUES)
  STORED as KUDU
  AS SELECT id, int_col FROM functional.alltypestiny ORDER BY id ASC
   LIMIT 4000;

  CREATE TABLE tbl (id INT, name STRING, NON UNIQUE PRIMARY KEY(id))
  STORED as KUDU;

  CREATE TABLE tbl (a INT, b STRING, c FLOAT)
  PARTITION BY HASH (a, b) PARTITIONS 3
  STORED as KUDU;

A SELECT statement does not show the system-generated auto-incrementing
column unless the column is explicitly specified in the select list.
The auto-incrementing column cannot be added, removed or renamed with
ALTER TABLE statements.
The UPSERT operation is currently not supported for Kudu tables with an
auto-incrementing column due to a limitation in the Kudu engine.

Testing:
 - Ran manual tests in impala-shell with queries to create Kudu tables
   with non unique primary keys, and tested insert/update/delete
   operations on these tables.
 - Added front end tests and end-to-end unit tests for Kudu tables
   with non unique primary keys.
 - Passed exhaustive tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Reviewed-on: http://gerrit.cloudera.org:8080/19383
Reviewed-by: Riza Suminto <ri...@cloudera.com>
Reviewed-by: Wenzhe Zhou <wz...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAlterColStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/custom_cluster/test_kudu.py
M tests/metadata/test_ddl_base.py
M tests/query_test/test_kudu.py
44 files changed, 1,353 insertions(+), 207 deletions(-)

Approvals:
  Riza Suminto: Looks good to me, but someone else must approve
  Wenzhe Zhou: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 19
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 11: Code-Review+1

(3 comments)

Looks good!

http://gerrit.cloudera.org:8080/#/c/19383/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19383/11//COMMIT_MSG@45
PS11, Line 45: When creating a Kudu table, specifying PRIMARY KEY is optional.
             : If there is no primary key attribute specified, the partition key
             : columns will be promoted as non unique primary key if those columns
             : are the beginning columns of the table.
             : New column "key_unique" is added to the output of 'describe' table
             : command for Kudu table.
Suggest moving this paragraph to between lines 25 and 26.


http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@2365
PS10, Line 2365: partition by hash
> Added two more test cases as suggested.
I see.  It may be a good feature to have!


http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
File testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test:

http://gerrit.cloudera.org:8080/#/c/19383/10/testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test@204
PS10, Line 204: order by id
> Yes, we can order by auto_incrementing_id. But auto_incrementing_id is not 
When is it unique?  Maybe we can clarify in the commit message.



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 11
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Mon, 30 Jan 2023 23:58:46 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................

WIP IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine adds support for non unique primary keys. It
automatically adds a column 'auto_increment_id' to a table which
has a non unique primary key. The non unique primary key and
'auto_increment_id' form a unique composite primary key.

This patch integrates a new version of Kudu which supports non
unique primary keys, and adds syntactic support for creating tables
with a non unique primary key.
Example:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

Kudu assigns values for column 'auto_increment_id' automatically
when inserting rows, so insertion statements don't need to specify
values for column 'auto_increment_id'.
A SELECT statement does not show the 'auto_increment_id' column unless
the column is explicitly specified in the select list.
The UPSERT operation is currently not supported for Kudu tables with an
auto-incrementing column due to a limitation in the Kudu engine.

Testing:
 - Integrated a new version of Kudu built on a local machine, ran
   manual tests in impala-shell with queries to create tables
   with non unique primary keys, and tested insert/update/delete
   operations for Kudu tables with non unique primary keys.
 - TODO: build the toolchain with the new version of Kudu, including
   the commits for KUDU-1945. Run core tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
33 files changed, 747 insertions(+), 103 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/6
-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................

WIP IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine adds support for non unique primary keys. It
automatically adds a column 'auto_increment_id' to a table which
has a non unique primary key. The non unique primary key and
'auto_increment_id' form a unique composite primary key.

This patch integrates a new version of Kudu which supports non
unique primary keys, and adds syntactic support for creating tables
with a non unique primary key.
Example:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) partitions 3
  STORED as KUDU;
  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) partitions 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

Kudu assigns values for column 'auto_increment_id' automatically
when inserting rows, so insertion statements don't need to specify
values for column 'auto_increment_id'.
A SELECT statement does not show the 'auto_increment_id' column unless
the column is explicitly specified in the select list.
The UPSERT operation is currently not supported for Kudu tables with an
auto-incrementing column due to a limitation in the Kudu engine.

Testing:
 - Integrated a new version of Kudu built on a local machine, ran
   manual tests in impala-shell with queries to create tables
   with non unique primary keys, and tested insert/update/delete
   operations for Kudu tables with non unique primary keys.
 - Added front end and end-to-end unit tests.
   Passed query_test/test_kudu.py and custom_cluster/test_kudu.py
   in a local environment with the new version of Kudu built on the
   local machine.
 - TODO: build the toolchain with the new version of Kudu, including
   the commits for KUDU-1945. Run core tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/custom_cluster/test_kudu.py
M tests/query_test/test_kudu.py
40 files changed, 1,100 insertions(+), 189 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/7
-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 7:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/12130/ : Initial code review checks failed. See linked job for details on the failure.


-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Comment-Date: Tue, 10 Jan 2023 00:12:17 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Qifan Chen (Code Review)" <ge...@cloudera.org>.
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 16: Code-Review+1

(2 comments)

Thanks for the rework!

http://gerrit.cloudera.org:8080/#/c/19383/15/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/19383/15/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@107
PS15, Line 107: 
              :   // Set to true if primary key is unique.
              :   private boolean isPrimaryKeyUnique_ = true;
              : 
              :   // Set to true if the table has auto-incrementing column.
              :   private boolean hasAutoIncrementingColumn_ = false;
> KuduTable extends from Table, LocalKuduTable extends from LocalTable. They 
Sounds like a good plan for some future work. Done


http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@2365
PS10, Line 2365: partition by hash
> Already added similar test case with range partitions in end-to-end kudu_cr
Done



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 16
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 Feb 2023 23:09:27 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 16:

(7 comments)

Thanks Qifan for your review.

http://gerrit.cloudera.org:8080/#/c/19383/15//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19383/15//COMMIT_MSG@23
PS15, Line 23: within a continuous region of the table served by a tablet-server.
> nit within a continuous region of the table served by a tablet-server.
Done


http://gerrit.cloudera.org:8080/#/c/19383/15//COMMIT_MSG@25
PS15, Line 25: a
> nit also
Done


http://gerrit.cloudera.org:8080/#/c/19383/15//COMMIT_MSG@47
PS15, Line 47:   AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;
> It will be great to include a DDL for a range partitioned kudu table here, 
Done


http://gerrit.cloudera.org:8080/#/c/19383/15//COMMIT_MSG@57
PS15, Line 57: 
> Kudu
Done


http://gerrit.cloudera.org:8080/#/c/19383/15//COMMIT_MSG@59
PS15, Line 59:  (id
> nit. for these tables
Done


http://gerrit.cloudera.org:8080/#/c/19383/15/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/19383/15/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@107
PS15, Line 107: 
              :   // Set to true if primary key is unique.
              :   private boolean isPrimaryKeyUnique_ = true;
              : 
              :   // Set to true if the table has auto-incrementing column.
              :   private boolean hasAutoIncrementingColumn_ = false;
> These two new members could be put into a new class and reused between Kudu
KuduTable extends Table and LocalKuduTable extends LocalTable, so they have different parent classes. Two other variables, primaryKeyColumnNames_ and partitionBy_, are also defined in both classes. We could define a new class holding these 4 variables, but that would require more code changes.
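
For reference, the refactoring discussed here might look roughly like the following Java sketch. It is not part of this patch. The class name KuduKeyInfo and its accessors are hypothetical; only the four field names come from the comment above, and the field types are assumptions (partitionBy_ is simplified to a list of strings).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical value class: KuduTable and LocalKuduTable could each hold one
// instance instead of declaring the four fields separately.
final class KuduKeyInfo {
  private final List<String> primaryKeyColumnNames_;
  // Simplified to column names in this sketch.
  private final List<String> partitionBy_;
  // True if the primary key is unique.
  private final boolean isPrimaryKeyUnique_;
  // True if the table has an auto-incrementing column.
  private final boolean hasAutoIncrementingColumn_;

  KuduKeyInfo(List<String> primaryKeyColumnNames, List<String> partitionBy,
      boolean isPrimaryKeyUnique, boolean hasAutoIncrementingColumn) {
    // Defensive copies keep the holder immutable.
    primaryKeyColumnNames_ =
        Collections.unmodifiableList(new ArrayList<>(primaryKeyColumnNames));
    partitionBy_ = Collections.unmodifiableList(new ArrayList<>(partitionBy));
    isPrimaryKeyUnique_ = isPrimaryKeyUnique;
    hasAutoIncrementingColumn_ = hasAutoIncrementingColumn;
  }

  List<String> getPrimaryKeyColumnNames() { return primaryKeyColumnNames_; }
  List<String> getPartitionBy() { return partitionBy_; }
  boolean isPrimaryKeyUnique() { return isPrimaryKeyUnique_; }
  boolean hasAutoIncrementingColumn() { return hasAutoIncrementingColumn_; }
}
```

Using composition like this avoids the problem of the two table classes having different parent classes, at the cost of touching every place that reads the existing fields.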


http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@2365
PS10, Line 2365: partition by hash
> To observe auto column works with a range partitioned kudu table such as 
A similar test case with range partitions was already added to the end-to-end kudu_create.test.



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 16
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 Feb 2023 20:07:43 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] WIP IMPALA-11809: Support non unique primary key for Kudu

Posted by "Abhishek Chennaka (Code Review)" <ge...@cloudera.org>.
Abhishek Chennaka has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: WIP IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 8:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/19383/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/19383/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@107
PS8, Line 107: A primary key is required for a Kudu table
Is this the right error message for this case?
Additionally, if a user wants to create multiple partition levels, the columns on which the partitions are being created have to be specified first, right? In that case, does the order matter as long as all the columns are at the beginning of the table?


http://gerrit.cloudera.org:8080/#/c/19383/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java@715
PS8, Line 715: add
Do we need an equivalent test for dropping of the columns?


http://gerrit.cloudera.org:8080/#/c/19383/8/fe/src/test/java/org/apache/impala/analysis/ParserTest.java
File fe/src/test/java/org/apache/impala/analysis/ParserTest.java:

http://gerrit.cloudera.org:8080/#/c/19383/8/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@2798
PS8, Line 2798: i INT PRIMARY KEY, NON UNIQUE PRIMARY KEY(i)
This looks a bit confusing for the end user, since i is declared as both PRIMARY KEY and NON UNIQUE PRIMARY KEY. Do we want to allow this?
Additionally, what if we do something like the following? What would the result be?
i INT NON UNIQUE PRIMARY KEY, PRIMARY KEY(i)


http://gerrit.cloudera.org:8080/#/c/19383/8/testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_create.test:

http://gerrit.cloudera.org:8080/#/c/19383/8/testdata/workloads/functional-query/queries/QueryTest/kudu_create.test@535
PS8, Line 535: A primary key is required for a Kudu table
As pointed out before, would a more helpful message be better here?



-- 
To view, visit http://gerrit.cloudera.org:8080/19383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 8
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Jan 2023 19:25:02 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 11:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12257/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 11
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Mon, 30 Jan 2023 07:42:33 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has uploaded a new patch set (#18). ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................

IMPALA-11809: Support non unique primary key for Kudu

The Kudu engine recently enabled the auto-incrementing column feature
(KUDU-1945). The feature works by appending a system generated
auto-incrementing column to the primary key columns to guarantee the
uniqueness of the primary key when the primary key columns can be non
unique. The non unique primary key columns and the auto-incrementing
column form the effective unique composite primary key.

This auto-incrementing column is named 'auto_incrementing_id' and has
BIGINT type. Kudu assigns its value automatically during insertion, so
INSERT statements should not specify values for the auto-incrementing
column. In the current Kudu implementation, there is no central key
provider for auto-incrementing columns. It uses a per tablet-server
global counter to assign values for auto-incrementing columns. So the
values of the auto-incrementing column are not unique in a Kudu table,
but are unique within a continuous region of the table served by a
tablet-server.
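
The uniqueness argument above can be illustrated with a toy model. This
is only a sketch under stated assumptions, not Kudu's implementation:
the ToyTable class, the modulo-based partitioning, and counters that
start at 1 are all made up for illustration. It shows how counter values
can repeat across regions of the table while the pair
(key, auto_incrementing_id) stays unique, provided that rows with the
same key always land in the same region.

```python
from collections import defaultdict
from itertools import count


class ToyTable:
    """Toy model: every region of the table keeps its own counter."""

    def __init__(self, num_regions):
        self.num_regions = num_regions
        # One independent counter per region; no central key provider.
        self.counters = defaultdict(lambda: count(1))
        self.rows = []

    def insert(self, key, value):
        # Assumption: hash partitioning on the key, modeled as modulo.
        region = key % self.num_regions
        auto_id = next(self.counters[region])
        self.rows.append((key, auto_id, value))
        return region, auto_id


table = ToyTable(num_regions=3)
for key, value in [(0, "a"), (1, "b"), (1, "c"), (3, "d")]:
    table.insert(key, value)

# Counter values alone repeat across regions ...
auto_ids = [auto_id for _, auto_id, _ in table.rows]
# ... but (key, auto_id) pairs do not.
composite_keys = [(key, auto_id) for key, auto_id, _ in table.rows]
```

In this model the two rows with key 1 receive different counter values
from the same region, which is what keeps the composite key unique.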

This patch also upgraded the Kudu version to 345fd44ca3 to pick up the
Kudu changes needed for supporting non-unique primary key. It added
syntactic support for creating Kudu tables with non unique primary key.
When creating a Kudu table, specifying PRIMARY KEY is optional.
If no primary key attribute is specified, the partition key
columns are promoted to a non unique primary key if those columns
are the beginning columns of the table.
A new column "key_unique" is added to the output of the 'describe' table
command for Kudu tables.
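
The promotion rule described above can be sketched as follows. This is
a hedged illustration, not the code in this patch: the function name
promote_partition_columns is hypothetical, and the sketch assumes that
only the set of partition columns matters (their order in the
PARTITION BY clause is ignored), which the commit message does not
state.

```python
def promote_partition_columns(table_columns, partition_columns):
    """Return the columns promoted to a non unique primary key, or None.

    Hypothetical helper: promotion succeeds only when the partition
    columns are exactly the beginning columns of the table.
    """
    wanted = set(partition_columns)
    if not wanted:
        # Nothing to promote when the table has no partition columns.
        return None
    prefix = table_columns[:len(wanted)]
    if set(prefix) != wanted:
        # The partition columns are not the beginning columns.
        return None
    return prefix


promoted = promote_partition_columns(["a", "b", "c"], ["a", "b"])
rejected = promote_partition_columns(["a", "b", "c"], ["b", "c"])
```

With the table (a INT, b STRING, c FLOAT) partitioned by HASH (a, b),
the sketch promotes a and b; partitioning by (b, c) yields no promotion.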

Examples of CREATE TABLE statements with a non unique primary key:
  CREATE TABLE tbl (i INT NON UNIQUE PRIMARY KEY, s STRING)
  PARTITION BY HASH (i) PARTITIONS 3
  STORED as KUDU;

  CREATE TABLE tbl (i INT, s STRING, NON UNIQUE PRIMARY KEY(i))
  PARTITION BY HASH (i) PARTITIONS 3
  STORED as KUDU;

  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY HASH (id) PARTITIONS 3
  STORED as KUDU
  AS SELECT id, string_col FROM functional.alltypes WHERE id = 10;

  CREATE TABLE tbl NON UNIQUE PRIMARY KEY(id)
  PARTITION BY RANGE (id)
  (PARTITION VALUES <= 1000,
   PARTITION 1000 < VALUES <= 2000,
   PARTITION 2000 < VALUES <= 3000,
   PARTITION 3000 < VALUES)
  STORED as KUDU
  AS SELECT id, int_col FROM functional.alltypestiny ORDER BY id ASC
   LIMIT 4000;

  CREATE TABLE tbl (id INT, name STRING, NON UNIQUE PRIMARY KEY(id))
  STORED as KUDU;

  CREATE TABLE tbl (a INT, b STRING, c FLOAT)
  PARTITION BY HASH (a, b) PARTITIONS 3
  STORED as KUDU;

A SELECT statement does not show the system generated auto-incrementing
column unless the column is explicitly specified in the select list.
The auto-incrementing column cannot be added, removed or renamed with
ALTER TABLE statements.
The UPSERT operation is currently not supported for Kudu tables with an
auto-incrementing column due to a limitation in the Kudu engine.
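
The SELECT behavior described above can be pictured with a small
sketch. It is an illustration only and does not reflect how
SelectStmt.java implements it: expand_select_list is a hypothetical
helper, and the select list is modeled as a plain list of column names.

```python
AUTO_INCREMENTING_COLUMN = "auto_incrementing_id"


def expand_select_list(select_list, table_columns):
    """Expand '*' to the visible columns; keep explicit names as-is."""
    expanded = []
    for item in select_list:
        if item == "*":
            # Star expansion skips the system generated column.
            expanded.extend(
                c for c in table_columns if c != AUTO_INCREMENTING_COLUMN)
        else:
            # An explicitly named column is always returned.
            expanded.append(item)
    return expanded


columns = ["i", AUTO_INCREMENTING_COLUMN, "s"]
star = expand_select_list(["*"], columns)
explicit = expand_select_list(["i", AUTO_INCREMENTING_COLUMN], columns)
```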

Testing:
 - Ran manual tests in impala-shell with queries to create Kudu tables
   with non unique primary key, and tested insert/update/delete
   operations for these tables.
 - Added front end tests and end to end unit tests for Kudu tables
   with non unique primary key.
 - Passed exhaustive tests.

Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
---
M bin/impala-config.sh
M common/thrift/CatalogObjects.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAlterColStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/FeDb.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu-scan-node.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_hms_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/custom_cluster/test_kudu.py
M tests/metadata/test_ddl_base.py
M tests/query_test/test_kudu.py
44 files changed, 1,353 insertions(+), 207 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/19383/18

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 18
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 18:

(2 comments)

Thanks Riza.

http://gerrit.cloudera.org:8080/#/c/19383/17/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
File fe/src/main/java/org/apache/impala/catalog/KuduColumn.java:

http://gerrit.cloudera.org:8080/#/c/19383/17/fe/src/main/java/org/apache/impala/catalog/KuduColumn.java@71
PS17, Line 71: if (isKey) {
             :       Preconditions.checkArgument(!isPrimaryKeyUnique || !isAutoIncrementing);
> May I suggest to split this with branch on isKey?
Changed Preconditions
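
For readers following the thread, one possible reading of the suggested
split is sketched below. It is not the code that landed in
KuduColumn.java: the function name check_kudu_column_flags is
hypothetical, and the non-key branch rests on the assumption (taken
from the commit message) that the auto-incrementing column is always
part of the primary key.

```python
def check_kudu_column_flags(is_key, is_primary_key_unique,
                            is_auto_incrementing):
    """Raise ValueError when the flag combination is inconsistent."""
    if is_key:
        # Mirrors the quoted check: a key column cannot belong to a
        # unique primary key and be auto-incrementing at the same time.
        if is_primary_key_unique and is_auto_incrementing:
            raise ValueError(
                "unique primary key cannot have an auto-incrementing column")
    else:
        # Assumption: only key columns can be auto-incrementing.
        if is_auto_incrementing:
            raise ValueError(
                "auto-incrementing column must be a key column")


def is_valid(is_key, is_primary_key_unique, is_auto_incrementing):
    """Convenience wrapper that reports validity as a boolean."""
    try:
        check_kudu_column_flags(
            is_key, is_primary_key_unique, is_auto_incrementing)
    except ValueError:
        return False
    return True
```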


http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/19383/16/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@391
PS16, Line 391: isPrimaryKeyUnique_ = kuduSchema_.isPrimaryKeyUnique();
              :     hasAutoIncrementingColumn_ = kuduSchema_.hasAutoIncrementingColumn();
> So is it valid to add this precondition here?
Yes, added precondition




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 18
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Sat, 04 Feb 2023 02:03:06 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Wenzhe Zhou (Code Review)" <ge...@cloudera.org>.
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/19383/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@2365
PS10, Line 2365: partition by hash
> Yeah, agree that with the tablet concept.
I'm not sure what test case you want to add here. Could you clarify a bit? I added test cases in kudu-scan-node.test with "order by auto_incrementing_id" and "group by auto_incrementing_id".




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 13
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Feb 2023 19:59:46 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11809: Support non unique primary key for Kudu

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19383 )

Change subject: IMPALA-11809: Support non unique primary key for Kudu
......................................................................


Patch Set 15:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12292/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d7882bf3d01a3492cc9827c072d1f3200d9eebd
Gerrit-Change-Number: 19383
Gerrit-PatchSet: 15
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Abhishek Chennaka <ac...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Marton Greber <gr...@gmail.com>
Gerrit-Reviewer: Qifan Chen <qf...@hotmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Thu, 02 Feb 2023 03:12:05 +0000
Gerrit-HasComments: No