Posted to reviews@impala.apache.org by "Tamas Mate (Code Review)" <ge...@cloudera.org> on 2022/06/21 18:54:49 UTC

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Tamas Mate has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18649 )


Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................

IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

INSERT OVERWRITE can be allowed in some circumstances:
 - no partition evolution is in place
 - static values are being inserted
 - the overwrite is executed on the same table

Testing:
 - Added e2e tests.

Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test
3 files changed, 60 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/18649/1
-- 
To view, visit http://gerrit.cloudera.org:8080/18649
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 1
Gerrit-Owner: Tamas Mate <tm...@apache.org>

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Zoltan Borok-Nagy (Code Review)" <ge...@cloudera.org>.
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18649/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

http://gerrit.cloudera.org:8080/#/c/18649/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@635
PS1, Line 635:         if (overwrite_ && iceTable.getPartitionSpecs().size() > 1) {
Multiple partition specs are also problematic if there's no BUCKET transform.
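
One way to read this comment is that the check on the number of partition specs should run before, and independently of, any BUCKET-transform check. Below is a minimal sketch of that ordering. The class and method names (`SpecCountCheckSketch`, `rejectionReason`) and the string-based transform list are hypothetical and do not mirror Impala's actual `InsertStmt` code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not Impala's InsertStmt implementation.
final class SpecCountCheckSketch {
  /** Returns a rejection reason, or null when no restriction applies. */
  static String rejectionReason(boolean overwrite, int partitionSpecCount,
      List<String> transforms) {
    if (!overwrite) return null;
    // Checked first, regardless of which transforms the table uses.
    if (partitionSpecCount > 1) {
      return "partition evolution is in place";
    }
    // Only reached when the table has a single partition spec.
    if (transforms.contains("BUCKET")) {
      return "bucket transform needs further checks";
    }
    return null;
  }

  public static void main(String[] args) {
    System.out.println(rejectionReason(true, 2, Arrays.asList("IDENTITY")));
    System.out.println(rejectionReason(true, 1, Arrays.asList("BUCKET")));
    System.out.println(rejectionReason(true, 1, Arrays.asList("IDENTITY")));
  }
}
```

With this ordering, a table with two partition specs and only an identity transform is still rejected, which is the case the comment points at.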




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 1
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 Jun 2022 14:39:40 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Zoltan Borok-Nagy (Code Review)" <ge...@cloudera.org>.
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................


Patch Set 1:

(2 comments)

Thanks for fixing this. Looks great, only had small comments.

http://gerrit.cloudera.org:8080/#/c/18649/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

http://gerrit.cloudera.org:8080/#/c/18649/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@647
PS1, Line 647: destination
nit: target table


http://gerrit.cloudera.org:8080/#/c/18649/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test
File testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test:

http://gerrit.cloudera.org:8080/#/c/18649/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test@177
PS1, Line 177: insert overwrite iceberg_overwrite_bucket values (1), (2), (3);
This could be 3 separate INSERTs; then we could also check

SELECT INPUT__FILE__NAME, count(*) FROM t GROUP BY INPUT__FILE__NAME;

It should show 3 different files. After the INSERT OVERWRITE, the same query would show a single file.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 1
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Wed, 22 Jun 2022 11:25:33 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................


Patch Set 8: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 8
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 01 Aug 2022 08:31:23 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Tamas Mate (Code Review)" <ge...@cloudera.org>.
Tamas Mate has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................

IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

This change applies only to Iceberg tables and is motivated mainly by
table maintenance. Iceberg table writes create new snapshots, and these
can accumulate over time. This commit allows a simple form of compaction
of these snapshots.

INSERT OVERWRITE is blocked when partition evolution is in place,
because it would be possible to overwrite a data file with a newer
schema that has fewer columns. This could cause unexpected data loss.

For bucketed tables, the following statement is allowed:
  INSERT OVERWRITE ice_tbl SELECT * FROM ice_tbl;
The source and target tables have to be the same and explicitly
specified, and only SELECT '*' queries are allowed. These requirements
are also in place to avoid unexpected data loss.
 - VALUES clauses are not allowed, because inserting a single record
   could overwrite a whole file in a bucket.
 - Only the target table itself is allowed as the source, because at
   the time of the insert it is unknown which files will be modified,
   as with VALUES.
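
The restrictions listed above can be sketched as a small decision function. This is only an illustrative model of the rules as worded in this commit message. The names (`BucketOverwriteRules`, `Source`, `isAllowed`) are hypothetical, and the assumption that tables without a bucket transform need no extra checks is inferred from the change title, not taken from Impala's code.

```java
// Hypothetical model of the rules in the commit message; names are
// illustrative and do not mirror Impala's InsertStmt.
final class BucketOverwriteRules {
  enum Source { VALUES, SELECT_STAR, OTHER_SELECT }

  /** Decides whether an INSERT OVERWRITE is allowed under the stated rules. */
  static boolean isAllowed(int partitionSpecCount, boolean hasBucketTransform,
      Source source, String sourceTable, String targetTable) {
    // Partition evolution (more than one spec) blocks every INSERT OVERWRITE.
    if (partitionSpecCount > 1) return false;
    // Assumption: without a bucket transform no extra restriction applies.
    if (!hasBucketTransform) return true;
    // Bucketed tables: only "INSERT OVERWRITE t SELECT * FROM t" is accepted.
    // sourceTable may be null (for VALUES); equals() handles that safely.
    return source == Source.SELECT_STAR && targetTable.equals(sourceTable);
  }

  public static void main(String[] args) {
    // INSERT OVERWRITE ice_tbl SELECT * FROM ice_tbl;
    System.out.println(isAllowed(1, true, Source.SELECT_STAR, "ice_tbl", "ice_tbl"));
    // INSERT OVERWRITE ice_tbl VALUES (1);
    System.out.println(isAllowed(1, true, Source.VALUES, null, "ice_tbl"));
    // INSERT OVERWRITE ice_tbl SELECT * FROM other_tbl;
    System.out.println(isAllowed(1, true, Source.SELECT_STAR, "other_tbl", "ice_tbl"));
  }
}
```

Keeping the partition-evolution test first means it applies to bucketed and non-bucketed tables alike.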

Testing:
 - Added e2e tests.

Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test
3 files changed, 109 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/18649/5

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 5
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8342/ DRY_RUN=false



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 6
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Wed, 20 Jul 2022 18:14:41 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11019/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 7
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Fri, 22 Jul 2022 12:11:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................


Patch Set 7: Verified+1



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 7
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Fri, 22 Jul 2022 16:36:01 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................


Patch Set 8: Verified+1



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 8
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 01 Aug 2022 13:36:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Gabor Kaszab (Code Review)" <ge...@cloudera.org>.
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................


Patch Set 7: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 7
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 01 Aug 2022 07:19:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10997/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 5
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Wed, 20 Jul 2022 18:06:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Tamas Mate (Code Review)" <ge...@cloudera.org>.
Tamas Mate has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................

IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

This change applies only to Iceberg tables and is motivated mainly by
table maintenance. Iceberg table writes create new snapshots, and these
can accumulate over time. This commit allows a simple form of compaction
of these snapshots.

INSERT OVERWRITE is blocked when partition evolution is in place,
because it would be possible to overwrite a data file with a newer
schema that has fewer columns. This could cause unexpected data loss.

For bucketed tables, the following statement is allowed:
  INSERT OVERWRITE ice_tbl SELECT * FROM ice_tbl;
The source and target tables have to be the same and explicitly
specified, and only SELECT '*' queries are allowed. These requirements
are also in place to avoid unexpected data loss.
 - VALUES clauses are not allowed, because inserting a single record
   could overwrite a whole file in a bucket.
 - Only the target table itself is allowed as the source, because at
   the time of the insert it is unknown which files will be modified,
   as with VALUES.

Testing:
 - Added e2e tests.

Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test
M tests/custom_cluster/test_events_custom_configs.py
4 files changed, 112 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/18649/7

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 7
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Tamas Mate (Code Review)" <ge...@cloudera.org>.
Tamas Mate has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................

IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

This change applies only to Iceberg tables and is motivated mainly by
table maintenance. Iceberg table writes create new snapshots, and these
can accumulate over time. This commit allows a simple form of compaction
of these snapshots; the old snapshots can be expired later.

All types of INSERT OVERWRITE are blocked when partition evolution is
in place, to avoid unexpected data loss.

For bucketed tables, the following statement is allowed:
  INSERT OVERWRITE ice_tbl SELECT * FROM ice_tbl;
The source and target tables have to be the same and explicitly
specified, and only SELECT '*' queries are allowed. These requirements
are also in place to avoid unexpected data loss.

Testing:
 - Added e2e tests.

Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test
3 files changed, 88 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/18649/3

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 3
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10825/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 1
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Jun 2022 19:14:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................

IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

This change applies only to Iceberg tables and is motivated mainly by
table maintenance. Iceberg table writes create new snapshots, and these
can accumulate over time. This commit allows a simple form of compaction
of these snapshots.

INSERT OVERWRITE is blocked when partition evolution is in place,
because it would be possible to overwrite a data file with a newer
schema that has fewer columns. This could cause unexpected data loss.

For bucketed tables, the following statement is allowed:
  INSERT OVERWRITE ice_tbl SELECT * FROM ice_tbl;
The source and target tables have to be the same and explicitly
specified, and only SELECT '*' queries are allowed. These requirements
are also in place to avoid unexpected data loss.
 - VALUES clauses are not allowed, because inserting a single record
   could overwrite a whole file in a bucket.
 - Only the target table itself is allowed as the source, because at
   the time of the insert it is unknown which files will be modified,
   as with VALUES.

Testing:
 - Added e2e tests.

Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Reviewed-on: http://gerrit.cloudera.org:8080/18649
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-overwrite.test
M tests/custom_cluster/test_events_custom_configs.py
4 files changed, 112 insertions(+), 15 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 9
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8385/ DRY_RUN=false



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 8
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 01 Aug 2022 08:31:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8342/



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 6
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Wed, 20 Jul 2022 23:07:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18649 )

Change subject: IMPALA-11378: Allow INSERT OVERWRITE for bucket tranforms in some cases
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10958/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd1bc19d839297246eadeb754cdeeec1e306098a
Gerrit-Change-Number: 18649
Gerrit-PatchSet: 3
Gerrit-Owner: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gf...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Wed, 13 Jul 2022 13:03:52 +0000
Gerrit-HasComments: No