You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Michael Smith (Code Review)" <ge...@cloudera.org> on 2022/12/20 00:35:19 UTC

[Impala-ASF-CR] IMPALA-11476: Test Ozone with erasure coding

Michael Smith has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19324


Change subject: IMPALA-11476: Test Ozone with erasure coding
......................................................................

IMPALA-11476: Test Ozone with erasure coding

Tweaks to test Ozone with erasure coding.

Updates to a newer CDP build to pick up HDDS-7208.

Test updates:
- test_exclusive_coordinator_plan: Ozone+EC blocks are 768MB, which is
  larger than all tables in our test environment. Use tpch_parquet which
  we rely on having 3 files (by loading from snapshot in this case).
- test_new_file_shorter: receives an EOFException when seeking with EC
- test_local_read: erasure-coded-bytes-read is also tied to IMPALA-11697
- test_erasure_coding: Ozone doesn't report files as erasure-coded
  (HDDS-7603)

Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
---
M bin/impala-config.sh
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/ozone-site.xml.py
M tests/common/skip.py
M tests/custom_cluster/test_coordinators.py
M tests/query_test/test_io_metrics.py
M tests/query_test/test_scanners.py
7 files changed, 29 insertions(+), 10 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/19324/6
-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-11476: Support Ozone erasure coding

Posted by "Michael Smith (Code Review)" <ge...@cloudera.org>.
Hello Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19324

to look at the new patch set (#11).

Change subject: IMPALA-11476: Support Ozone erasure coding
......................................................................

IMPALA-11476: Support Ozone erasure coding

Adds support for identifying erasure coding policy with Ozone. Enables
testing Ozone with erasure coding.

Omits support for identifying erasure coding policy with the o3fs
protocol as that protocol is effectively deprecated and its classes
don't provide access to the ObjectStore.

Test updates:
- test_exclusive_coordinator_plan: Ozone+EC blocks are 768MB, which is
  larger than all tables in our test environment. Use tpch_parquet which
  we rely on having 3 files (by loading from snapshot in this case).
- test_new_file_shorter: receives an EOFException when seeking with EC
- test_local_read: erasure-coded-bytes-read is also tied to IMPALA-11697
- test_erasure_coding: Ozone doesn't report files as erasure-coded
  (HDDS-7603)

Testing:
- Passes core E2E and custom cluster tests with TARGET_FILESYSTEM=ozone
  and ERASURE_CODING=true.

Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
---
M bin/impala-config.sh
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/ozone-site.xml.py
M tests/common/impala_test_suite.py
M tests/common/skip.py
M tests/custom_cluster/test_coordinators.py
M tests/query_test/test_io_metrics.py
M tests/query_test/test_scanners.py
10 files changed, 103 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/19324/11
-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 11
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-11476: Support Ozone erasure coding

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19324 )

Change subject: IMPALA-11476: Support Ozone erasure coding
......................................................................


Patch Set 12: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 12
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>
Gerrit-Comment-Date: Wed, 25 Jan 2023 03:10:10 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11476: Support Ozone erasure coding

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19324 )

Change subject: IMPALA-11476: Support Ozone erasure coding
......................................................................


Patch Set 12:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12227/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 12
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Jan 2023 21:19:01 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11476: Test Ozone with erasure coding

Posted by "Michael Smith (Code Review)" <ge...@cloudera.org>.
Hello Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19324

to look at the new patch set (#7).

Change subject: IMPALA-11476: Test Ozone with erasure coding
......................................................................

IMPALA-11476: Test Ozone with erasure coding

Tweaks to test Ozone with erasure coding.

Test updates:
- test_exclusive_coordinator_plan: Ozone+EC blocks are 768MB, which is
  larger than all tables in our test environment. Use tpch_parquet which
  we rely on having 3 files (by loading from snapshot in this case).
- test_new_file_shorter: receives an EOFException when seeking with EC
- test_local_read: erasure-coded-bytes-read is also tied to IMPALA-11697
- test_erasure_coding: Ozone doesn't report files as erasure-coded
  (HDDS-7603)

Testing:
- Passes core E2E and custom cluster tests with TARGET_FILESYSTEM=ozone
  and ERASURE_CODING=true.

Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
---
M bin/impala-config.sh
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/ozone-site.xml.py
M tests/common/skip.py
M tests/custom_cluster/test_coordinators.py
M tests/query_test/test_io_metrics.py
M tests/query_test/test_scanners.py
7 files changed, 29 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/19324/7
-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-11476: Support Ozone erasure coding

Posted by "Michael Smith (Code Review)" <ge...@cloudera.org>.
Michael Smith has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/19324 )

Change subject: IMPALA-11476: Support Ozone erasure coding
......................................................................

IMPALA-11476: Support Ozone erasure coding

Adds support for identifying erasure coding policy with Ozone. Enables
testing Ozone with erasure coding.

Omits support for identifying erasure coding policy with the o3fs
protocol as that protocol is effectively deprecated and its classes
don't provide access to the ObjectStore.

Refactors volumeBucketPair to use StringTokenizer.

Test updates:
- test_exclusive_coordinator_plan: Ozone+EC blocks are 768MB, which is
  larger than all tables in our test environment. Use tpch_parquet which
  we rely on having 3 files (by loading from snapshot in this case).
- test_new_file_shorter: receives an EOFException when seeking with EC
- test_local_read: erasure-coded-bytes-read is also tied to IMPALA-11697
- test_erasure_coding: Ozone doesn't report files as erasure-coded
  (HDDS-7603)

Testing:
- Passes core E2E and custom cluster tests with TARGET_FILESYSTEM=ozone
  and ERASURE_CODING=true.

Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Reviewed-on: http://gerrit.cloudera.org:8080/19324
Reviewed-by: Joe McDonnell <jo...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M bin/impala-config.sh
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/ozone-site.xml.py
M tests/common/impala_test_suite.py
M tests/common/skip.py
M tests/custom_cluster/test_coordinators.py
M tests/query_test/test_io_metrics.py
M tests/query_test/test_scanners.py
10 files changed, 107 insertions(+), 31 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 13
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-11476: Support Ozone erasure coding

Posted by "Michael Smith (Code Review)" <ge...@cloudera.org>.
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/19324 )

Change subject: IMPALA-11476: Support Ozone erasure coding
......................................................................


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19324/11/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/19324/11/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@233
PS11, Line 233:       if (fs instanceof DistributedFileSystem) {
> Is it possible for us to use isDistributedFilesystem() and isOzoneFileSyste
It is, although we'll need a specialization for isOfsFileSystem or still have to test instanceof. Is it about performance or semantics?


http://gerrit.cloudera.org:8080/#/c/19324/11/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@316
PS11, Line 316:     switch (tokens.countTokens()) {
              :       case 0:
              :         return Pair.create("", "");
              :       case 1:
              :         return Pair.create(tokens.nextToken(), "");
              :       default:
              :         return Pair.create(tokens.nextToken(), tokens.nextToken());
              :     }
> Nit: Can we have local variables corresponding to volume and bucket and the
Sure. I don't really need to rewrite this, just noticed a StringTokenizer would cover the logic I already had when I was looking at more parsing for volume/bucket/key.



-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 11
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Jan 2023 20:17:05 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11476: Test Ozone with erasure coding

Posted by "Michael Smith (Code Review)" <ge...@cloudera.org>.
Hello Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19324

to look at the new patch set (#8).

Change subject: IMPALA-11476: Test Ozone with erasure coding
......................................................................

IMPALA-11476: Test Ozone with erasure coding

Tweaks to test Ozone with erasure coding.

In slower test environments the Ozone client may end up excluding slow
nodes, causing some tests to fail because there are not enough available
nodes to perform writes (erasure coding 3-2 can reconstruct data during
reads after 2 replica failures, but requires all replicas to write). In
our test environment, we don't provision enough nodes to afford
exclusion, so effectively disable it by setting
ozone.client.exclude.nodes.expiry.time=1 (ms).

Test updates:
- test_exclusive_coordinator_plan: Ozone+EC blocks are 768MB, which is
  larger than all tables in our test environment. Use tpch_parquet which
  we rely on having 3 files (by loading from snapshot in this case).
- test_new_file_shorter: receives an EOFException when seeking with EC
- test_local_read: erasure-coded-bytes-read is also tied to IMPALA-11697
- test_erasure_coding: Ozone doesn't report files as erasure-coded
  (HDDS-7603)

Testing:
- Passes core E2E and custom cluster tests with TARGET_FILESYSTEM=ozone
  and ERASURE_CODING=true.

Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
---
M bin/impala-config.sh
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/ozone-site.xml.py
M tests/common/skip.py
M tests/custom_cluster/test_coordinators.py
M tests/query_test/test_io_metrics.py
M tests/query_test/test_scanners.py
7 files changed, 32 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/19324/8
-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-11476: Support Ozone erasure coding

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19324 )

Change subject: IMPALA-11476: Support Ozone erasure coding
......................................................................


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19324/11/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/19324/11/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@233
PS11, Line 233:       if (fs instanceof DistributedFileSystem) {
Is it possible for us to use isDistributedFilesystem() and isOzoneFileSystem()?


http://gerrit.cloudera.org:8080/#/c/19324/11/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@316
PS11, Line 316:     switch (tokens.countTokens()) {
              :       case 0:
              :         return Pair.create("", "");
              :       case 1:
              :         return Pair.create(tokens.nextToken(), "");
              :       default:
              :         return Pair.create(tokens.nextToken(), tokens.nextToken());
              :     }
Nit: Can we have local variables corresponding to volume and bucket and then do Pair.create(volume, bucket)? This is just to make it very explicit.



-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 11
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Jan 2023 18:51:15 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11476: Support Ozone erasure coding

Posted by "Michael Smith (Code Review)" <ge...@cloudera.org>.
Hello Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19324

to look at the new patch set (#12).

Change subject: IMPALA-11476: Support Ozone erasure coding
......................................................................

IMPALA-11476: Support Ozone erasure coding

Adds support for identifying erasure coding policy with Ozone. Enables
testing Ozone with erasure coding.

Omits support for identifying erasure coding policy with the o3fs
protocol as that protocol is effectively deprecated and its classes
don't provide access to the ObjectStore.

Refactors volumeBucketPair to use StringTokenizer.

Test updates:
- test_exclusive_coordinator_plan: Ozone+EC blocks are 768MB, which is
  larger than all tables in our test environment. Use tpch_parquet which
  we rely on having 3 files (by loading from snapshot in this case).
- test_new_file_shorter: receives an EOFException when seeking with EC
- test_local_read: erasure-coded-bytes-read is also tied to IMPALA-11697
- test_erasure_coding: Ozone doesn't report files as erasure-coded
  (HDDS-7603)

Testing:
- Passes core E2E and custom cluster tests with TARGET_FILESYSTEM=ozone
  and ERASURE_CODING=true.

Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
---
M bin/impala-config.sh
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/ozone-site.xml.py
M tests/common/impala_test_suite.py
M tests/common/skip.py
M tests/custom_cluster/test_coordinators.py
M tests/query_test/test_io_metrics.py
M tests/query_test/test_scanners.py
10 files changed, 107 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/19324/12
-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 12
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-11476: Test Ozone with erasure coding

Posted by "Michael Smith (Code Review)" <ge...@cloudera.org>.
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/19324 )

Change subject: IMPALA-11476: Test Ozone with erasure coding
......................................................................


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19324/8/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/19324/8/bin/impala-config.sh@731
PS8, Line 731:     export OZONE_ERASURECODE_POLICY="RS-3-2-1024k"
Need to update tests to use OZONE_ERASURECODE_POLICY, and add the policy to show commands.



-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>
Gerrit-Comment-Date: Thu, 12 Jan 2023 00:58:15 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11476: Support Ozone erasure coding

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19324 )

Change subject: IMPALA-11476: Support Ozone erasure coding
......................................................................


Patch Set 9:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12220/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 9
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>
Gerrit-Comment-Date: Mon, 23 Jan 2023 23:36:41 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11476: Support Ozone erasure coding

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19324 )

Change subject: IMPALA-11476: Support Ozone erasure coding
......................................................................


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19324/11/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/19324/11/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@233
PS11, Line 233:       if (fs instanceof DistributedFileSystem) {
> It is, although we'll need a specialization for isOfsFileSystem or still ha
We switched to using the scheme in IMPALA-10266. https://github.com/apache/impala/commit/4b5c66f329cdd818dd11cd1a9c68b58c84bcf45c
https://issues.apache.org/jira/browse/IMPALA-10266

My main thought here is that it would be nice to stick with the scheme if we can. The concern would be that class names can change. That is less of a concern with HDFS and Ozone, because we have tests and would probably notice. So, I guess this is more of a consistency thing.


http://gerrit.cloudera.org:8080/#/c/19324/11/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@316
PS11, Line 316:     switch (tokens.countTokens()) {
              :       case 0:
              :         return Pair.create("", "");
              :       case 1:
              :         return Pair.create(tokens.nextToken(), "");
              :       default:
              :         return Pair.create(tokens.nextToken(), tokens.nextToken());
              :     }
> Sure. I don't really need to rewrite this, just noticed a StringTokenizer w
Refactoring is fine. My nit is that I find multiple arguments of tokens.nextToken() to make me think a bit about order of evaluation. We have unit tests and it must be correct, but using local variables would make it very obvious.



-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 11
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Jan 2023 20:49:12 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11476: Test Ozone with erasure coding

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19324 )

Change subject: IMPALA-11476: Test Ozone with erasure coding
......................................................................


Patch Set 8:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12101/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>
Gerrit-Comment-Date: Thu, 05 Jan 2023 00:05:35 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11476: Support erasure coding policy

Posted by "Michael Smith (Code Review)" <ge...@cloudera.org>.
Hello Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19324

to look at the new patch set (#9).

Change subject: IMPALA-11476: Support erasure coding policy
......................................................................

IMPALA-11476: Support erasure coding policy

Adds support for identifying erasure coding policy with Ozone. Enables
testing Ozone with erasure coding.

Test updates:
- test_exclusive_coordinator_plan: Ozone+EC blocks are 768MB, which is
  larger than all tables in our test environment. Use tpch_parquet which
  we rely on having 3 files (by loading from snapshot in this case).
- test_new_file_shorter: receives an EOFException when seeking with EC
- test_local_read: erasure-coded-bytes-read is also tied to IMPALA-11697
- test_erasure_coding: Ozone doesn't report files as erasure-coded
  (HDDS-7603)

Testing:
- Passes core E2E and custom cluster tests with TARGET_FILESYSTEM=ozone
  and ERASURE_CODING=true.

Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
---
M bin/impala-config.sh
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/ozone-site.xml.py
M tests/common/impala_test_suite.py
M tests/common/skip.py
M tests/custom_cluster/test_coordinators.py
M tests/query_test/test_io_metrics.py
M tests/query_test/test_scanners.py
10 files changed, 103 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/19324/9
-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 9
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-11476: Support Ozone erasure coding

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19324 )

Change subject: IMPALA-11476: Support Ozone erasure coding
......................................................................


Patch Set 12: Code-Review+2

Thanks, this looks good to me


-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 12
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Jan 2023 22:02:16 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11476: Support Ozone erasure coding

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19324 )

Change subject: IMPALA-11476: Support Ozone erasure coding
......................................................................


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8984/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 12
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Jan 2023 22:02:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11476: Test Ozone with erasure coding

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19324 )

Change subject: IMPALA-11476: Test Ozone with erasure coding
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12044/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 6
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>
Gerrit-Comment-Date: Tue, 20 Dec 2022 00:54:54 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11476: Test Ozone with erasure coding

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19324 )

Change subject: IMPALA-11476: Test Ozone with erasure coding
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12088/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>
Gerrit-Comment-Date: Tue, 03 Jan 2023 23:36:32 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11476: Support Ozone erasure coding

Posted by "Michael Smith (Code Review)" <ge...@cloudera.org>.
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/19324 )

Change subject: IMPALA-11476: Support Ozone erasure coding
......................................................................


Patch Set 12:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19324/11/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/19324/11/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@233
PS11, Line 233:         DistributedFileSystem dfs = (Distributed
> We switched to using the scheme in IMPALA-10266. https://github.com/apache/
Done


http://gerrit.cloudera.org:8080/#/c/19324/11/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@316
PS11, Line 316:     }
              :     return numFilesMoved;
              :   }
              : 
              :   // Returns the first two elements (volume, bucket) of the unqualified path.
              :   public static Pair<String, String> volumeBucketPair(Path p) {
              :     String path = Path.getPathWithoutSchemeAndAuthority(p).toString();
              :     S
> Refactoring is fine. My nit is that I find multiple arguments of tokens.nex
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 12
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>
Gerrit-Comment-Date: Tue, 24 Jan 2023 21:00:27 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11476: Support Ozone erasure coding

Posted by "Michael Smith (Code Review)" <ge...@cloudera.org>.
Hello Joe McDonnell, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19324

to look at the new patch set (#10).

Change subject: IMPALA-11476: Support Ozone erasure coding
......................................................................

IMPALA-11476: Support Ozone erasure coding

Adds support for identifying erasure coding policy with Ozone. Enables
testing Ozone with erasure coding.

Test updates:
- test_exclusive_coordinator_plan: Ozone+EC blocks are 768MB, which is
  larger than all tables in our test environment. Use tpch_parquet which
  we rely on having 3 files (by loading from snapshot in this case).
- test_new_file_shorter: receives an EOFException when seeking with EC
- test_local_read: erasure-coded-bytes-read is also tied to IMPALA-11697
- test_erasure_coding: Ozone doesn't report files as erasure-coded
  (HDDS-7603)

Testing:
- Passes core E2E and custom cluster tests with TARGET_FILESYSTEM=ozone
  and ERASURE_CODING=true.

Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
---
M bin/impala-config.sh
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/ozone-site.xml.py
M tests/common/impala_test_suite.py
M tests/common/skip.py
M tests/custom_cluster/test_coordinators.py
M tests/query_test/test_io_metrics.py
M tests/query_test/test_scanners.py
10 files changed, 103 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/19324/10
-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 10
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-11476: Test Ozone with erasure coding

Posted by "Joe McDonnell (Code Review)" <ge...@cloudera.org>.
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19324 )

Change subject: IMPALA-11476: Test Ozone with erasure coding
......................................................................


Patch Set 8: Code-Review+1

This is looking good to me so far


-- 
To view, visit http://gerrit.cloudera.org:8080/19324
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Gerrit-Change-Number: 19324
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Smith <mi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Michael Smith <mi...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Jan 2023 18:26:14 +0000
Gerrit-HasComments: No