Posted to reviews@impala.apache.org by "Andrew Sherman (Code Review)" <ge...@cloudera.org> on 2019/03/06 16:19:00 UTC
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Andrew Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12680 )
Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
......................................................................
IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
IMPALA-6658 changed RleEncoder to have the ability to use run lengths
other than 8. It seemed that a slightly more complex RleEncoder could
save a small amount of disk space by using the longer run lengths, in
particular for bit width of 1. We now see a performance regression on a
simple ETL query. Overall it seems that the costs of IMPALA-6658 exceed
the benefits. This change removes IMPALA-6658.
The change to rle-encoding.h, which contains the RleEncoder code, was
undone using 'git revert'. In rle-test.cc I removed only the test
changes that rely on different encoding lengths. This allows us to keep
some useful new tests that were written as part of IMPALA-6658.
TESTING:
Ran all end-to-end tests.
Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40
---
M be/src/exec/parquet/parquet-bool-decoder-test.cc
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
3 files changed, 139 insertions(+), 383 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/12680/1
--
To view, visit http://gerrit.cloudera.org:8080/12680
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40
Gerrit-Change-Number: 12680
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12680 )
Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
......................................................................
Patch Set 1: Code-Review+2
(1 comment)
http://gerrit.cloudera.org:8080/#/c/12680/1/be/src/util/rle-encoding.h
File be/src/util/rle-encoding.h:
http://gerrit.cloudera.org:8080/#/c/12680/1/be/src/util/rle-encoding.h@64
PS1, Line 64: For 1 bit-width values, that point is 8 values. They require 2 bytes
: /// for both the repeated encoding or the literal encoding. This value can always
: /// be computed based on the bit-width.
Maybe it could be mentioned that the optimal run length can be 16 or 24 in some cases, but that we did not implement this because we are unsure about its benefits.
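
For readers following the arithmetic behind this comment, here is a minimal sketch of the byte counts involved. It is not the code in rle-encoding.h. The helper names are hypothetical, and it assumes the Parquet RLE/bit-packed hybrid layout with runs short enough that every run header fits in a single byte. The 16/24 figures are one possible reading of the comment above, not something the thread spells out.

```cpp
#include <cstddef>

// Hypothetical helper: bytes for a repeated (RLE) run, i.e. a one-byte
// header (assumed) plus the repeated value padded to whole bytes.
constexpr std::size_t RepeatedRunBytes(int bit_width) {
  return 1 + (bit_width + 7) / 8;
}

// Hypothetical helper: bytes for a standalone literal (bit-packed) run,
// i.e. a one-byte header (assumed) plus the packed values. Literal runs
// hold a multiple of 8 values, so each group of 8 takes bit_width bytes.
constexpr std::size_t LiteralRunBytes(int bit_width, std::size_t num_values) {
  return 1 + (num_values / 8) * bit_width;
}

// Hypothetical helper: extra bytes when num_values are appended to a
// literal run that is already open, so no new header is written.
constexpr std::size_t ExtendLiteralBytes(int bit_width, std::size_t num_values) {
  return (num_values / 8) * bit_width;
}
```

With bit width 1, RepeatedRunBytes(1) and LiteralRunBytes(1, 8) are both 2, which is the 8-value break-even point the quoted comment describes. If a literal run is already open, appending 16 repeated values costs ExtendLiteralBytes(1, 16) = 2 bytes, the same as a new repeated run, so in that case the break-even moves to 16. If literal values also follow, switching to a repeated run forces one more literal header afterwards (3 bytes in total), which matches ExtendLiteralBytes(1, 24) = 3 and moves the break-even to 24.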
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40
Gerrit-Change-Number: 12680
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 06 Mar 2019 18:39:55 +0000
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Posted by "Andrew Sherman (Code Review)" <ge...@cloudera.org>.
Andrew Sherman has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/12680 )
Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
......................................................................
IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
IMPALA-6658 changed RleEncoder to have the ability to use run lengths
other than 8. It seemed that a slightly more complex RleEncoder could
save a small amount of disk space by using the longer run lengths, in
particular for bit width of 1. We now see a performance regression on a
simple ETL query. Overall it seems that the costs of IMPALA-6658 exceed
the benefits. This change removes IMPALA-6658.
The change to rle-encoding.h, which contains the RleEncoder code, was
undone using 'git revert'. In rle-test.cc I removed only the test
changes that rely on different encoding lengths. This allows us to keep
some useful new tests that were written as part of IMPALA-6658.
TESTING:
Ran all end-to-end tests.
Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40
---
M be/src/exec/parquet/parquet-bool-decoder-test.cc
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
3 files changed, 141 insertions(+), 383 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/12680/3
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40
Gerrit-Change-Number: 12680
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12680 )
Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
......................................................................
Patch Set 3:
Build Successful
https://jenkins.impala.io/job/gerrit-code-review-checks/2387/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40
Gerrit-Change-Number: 12680
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Thu, 07 Mar 2019 20:07:21 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12680 )
Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
......................................................................
Patch Set 4: Verified+1
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40
Gerrit-Change-Number: 12680
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Fri, 08 Mar 2019 21:18:38 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Posted by "Andrew Sherman (Code Review)" <ge...@cloudera.org>.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12680 )
Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
......................................................................
Patch Set 1:
Thanks Csaba for the review
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40
Gerrit-Change-Number: 12680
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 06 Mar 2019 21:41:31 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12680 )
Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
......................................................................
Patch Set 1:
Build Successful
https://jenkins.impala.io/job/gerrit-code-review-checks/2372/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40
Gerrit-Change-Number: 12680
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Wed, 06 Mar 2019 17:02:38 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12680 )
Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
......................................................................
Patch Set 4: Code-Review+2
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40
Gerrit-Change-Number: 12680
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Fri, 08 Mar 2019 16:53:01 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12680 )
Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
......................................................................
Patch Set 3: Code-Review+2
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40
Gerrit-Change-Number: 12680
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Fri, 08 Mar 2019 16:52:17 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12680 )
Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
......................................................................
Patch Set 4:
Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3894/ DRY_RUN=false
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40
Gerrit-Change-Number: 12680
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Comment-Date: Fri, 08 Mar 2019 16:53:02 +0000
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12680 )
Change subject: IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
......................................................................
IMPALA-8279: Revert IMPALA-6658 to avoid ETL performance regression.
IMPALA-6658 changed RleEncoder to have the ability to use run lengths
other than 8. It seemed that a slightly more complex RleEncoder could
save a small amount of disk space by using the longer run lengths, in
particular for bit width of 1. We now see a performance regression on a
simple ETL query. Overall it seems that the costs of IMPALA-6658 exceed
the benefits. This change removes IMPALA-6658.
The change to rle-encoding.h, which contains the RleEncoder code, was
undone using 'git revert'. In rle-test.cc I removed only the test
changes that rely on different encoding lengths. This allows us to keep
some useful new tests that were written as part of IMPALA-6658.
TESTING:
Ran all end-to-end tests.
Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40
Reviewed-on: http://gerrit.cloudera.org:8080/12680
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exec/parquet/parquet-bool-decoder-test.cc
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
3 files changed, 141 insertions(+), 383 deletions(-)
Approvals:
Impala Public Jenkins: Looks good to me, approved; Verified
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If6bcbaf564fbbe6dc83ba3afc100b4e5ccc7af40
Gerrit-Change-Number: 12680
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>