Posted to reviews@impala.apache.org by "Zoltan Borok-Nagy (Code Review)" <ge...@cloudera.org> on 2022/03/31 08:02:12 UTC

[Impala-ASF-CR] IMPALA-11214: Impala reloads Iceberg tables per each data file

Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18371 )


Change subject: IMPALA-11214: Impala reloads Iceberg tables per each data file
......................................................................

IMPALA-11214: Impala reloads Iceberg tables per each data file

Due to a bug introduced by IMPALA-11053, Impala reloads the Iceberg table
once for every data file. This causes a serious performance regression for
table loads.

This patch avoids reloading the Iceberg tables for each data file.
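
As a rough illustration of this class of bug (not the actual patch; every
class and method name below is hypothetical and is not part of Impala's or
Iceberg's API), the following self-contained Java sketch counts how often a
table is loaded when the load sits inside the per-file loop versus when it
is hoisted out of the loop:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names are Impala or Iceberg APIs.
public class ReloadSketch {
  // Counts how many times the (pretend) expensive table load has run.
  static int loadCount = 0;

  // Stand-in for loaded table metadata.
  static final class Table {
    final String name;
    Table(String name) { this.name = name; }
  }

  // Stand-in for an expensive metadata load.
  static Table loadTable(String name) {
    loadCount++;
    return new Table(name);
  }

  // Regression pattern: the load is inside the loop, so it runs once per data file.
  static List<String> describeReloadingPerFile(String tableName, List<String> dataFiles) {
    List<String> out = new ArrayList<>();
    for (String file : dataFiles) {
      Table table = loadTable(tableName);
      out.add(table.name + ":" + file);
    }
    return out;
  }

  // Fixed pattern: load the table once and reuse it for every data file.
  static List<String> describeLoadingOnce(String tableName, List<String> dataFiles) {
    Table table = loadTable(tableName);
    List<String> out = new ArrayList<>();
    for (String file : dataFiles) {
      out.add(table.name + ":" + file);
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> files = Arrays.asList("a.parquet", "b.parquet", "c.parquet");

    loadCount = 0;
    describeReloadingPerFile("tbl", files);
    System.out.println("per-file loads: " + loadCount); // prints "per-file loads: 3"

    loadCount = 0;
    describeLoadingOnce("tbl", files);
    System.out.println("hoisted loads: " + loadCount); // prints "hoisted loads: 1"
  }
}
```

Both methods return the same result; only the number of loads differs, which
is why a regression like this tends to show up as slowness rather than as
wrong output.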

Testing:
 * added exhaustive e2e test

Change-Id: I0ed5a8c46c97aaa873dd1e925eed83d4573cf208
---
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M tests/query_test/test_iceberg.py
4 files changed, 35 insertions(+), 11 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/18371/1
-- 
To view, visit http://gerrit.cloudera.org:8080/18371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0ed5a8c46c97aaa873dd1e925eed83d4573cf208
Gerrit-Change-Number: 18371
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <bo...@cloudera.com>

[Impala-ASF-CR] IMPALA-11214: Impala reloads Iceberg tables per each data file

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18371 )

Change subject: IMPALA-11214: Impala reloads Iceberg tables per each data file
......................................................................


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7992/ DRY_RUN=true


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ed5a8c46c97aaa873dd1e925eed83d4573cf208
Gerrit-Change-Number: 18371
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 31 Mar 2022 08:03:45 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11214: Impala reloads Iceberg tables per each data file

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18371 )

Change subject: IMPALA-11214: Impala reloads Iceberg tables per each data file
......................................................................


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7998/ DRY_RUN=true


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ed5a8c46c97aaa873dd1e925eed83d4573cf208
Gerrit-Change-Number: 18371
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Thu, 31 Mar 2022 17:00:28 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11214: Impala reloads Iceberg tables per each data file

Posted by "Tamas Mate (Code Review)" <ge...@cloudera.org>.
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/18371 )

Change subject: IMPALA-11214: Impala reloads Iceberg tables per each data file
......................................................................


Patch Set 1: Code-Review+2

LGTM! Thank you for the quick fix!


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ed5a8c46c97aaa873dd1e925eed83d4573cf208
Gerrit-Change-Number: 18371
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Thu, 31 Mar 2022 08:25:17 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11214: Impala reloads Iceberg tables per each data file

Posted by "Zoltan Borok-Nagy (Code Review)" <ge...@cloudera.org>.
Zoltan Borok-Nagy has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18371 )

Change subject: IMPALA-11214: Impala reloads Iceberg tables per each data file
......................................................................

IMPALA-11214: Impala reloads Iceberg tables per each data file

Due to a bug introduced by IMPALA-11053, Impala reloads the Iceberg table
once for every data file. This causes a serious performance regression for
table loads.

This patch avoids reloading the Iceberg tables for each data file.

Testing:
 * added exhaustive e2e test

Change-Id: I0ed5a8c46c97aaa873dd1e925eed83d4573cf208
Reviewed-on: http://gerrit.cloudera.org:8080/18371
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M tests/query_test/test_iceberg.py
4 files changed, 35 insertions(+), 11 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0ed5a8c46c97aaa873dd1e925eed83d4573cf208
Gerrit-Change-Number: 18371
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>

[Impala-ASF-CR] IMPALA-11214: Impala reloads Iceberg tables per each data file

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18371 )

Change subject: IMPALA-11214: Impala reloads Iceberg tables per each data file
......................................................................


Patch Set 2: Verified+1


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ed5a8c46c97aaa873dd1e925eed83d4573cf208
Gerrit-Change-Number: 18371
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Thu, 31 Mar 2022 21:28:33 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11214: Impala reloads Iceberg tables per each data file

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18371 )

Change subject: IMPALA-11214: Impala reloads Iceberg tables per each data file
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10369/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ed5a8c46c97aaa873dd1e925eed83d4573cf208
Gerrit-Change-Number: 18371
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 31 Mar 2022 08:22:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11214: Impala reloads Iceberg tables per each data file

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18371 )

Change subject: IMPALA-11214: Impala reloads Iceberg tables per each data file
......................................................................


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7992/


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ed5a8c46c97aaa873dd1e925eed83d4573cf208
Gerrit-Change-Number: 18371
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Thu, 31 Mar 2022 12:29:20 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11214: Impala reloads Iceberg tables per each data file

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18371 )

Change subject: IMPALA-11214: Impala reloads Iceberg tables per each data file
......................................................................


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7995/ DRY_RUN=false


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ed5a8c46c97aaa873dd1e925eed83d4573cf208
Gerrit-Change-Number: 18371
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Thu, 31 Mar 2022 12:56:02 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11214: Impala reloads Iceberg tables per each data file

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18371 )

Change subject: IMPALA-11214: Impala reloads Iceberg tables per each data file
......................................................................


Patch Set 2: Code-Review+2


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ed5a8c46c97aaa873dd1e925eed83d4573cf208
Gerrit-Change-Number: 18371
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Thu, 31 Mar 2022 12:56:01 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11214: Impala reloads Iceberg tables per each data file

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18371 )

Change subject: IMPALA-11214: Impala reloads Iceberg tables per each data file
......................................................................


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7995/


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ed5a8c46c97aaa873dd1e925eed83d4573cf208
Gerrit-Change-Number: 18371
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@apache.org>
Gerrit-Comment-Date: Thu, 31 Mar 2022 17:25:35 +0000
Gerrit-HasComments: No