You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Tim Armstrong (Code Review)" <ge...@cloudera.org> on 2018/02/23 04:53:48 UTC

[Impala-ASF-CR](2.x) IMPALA-4835: Part 3: switch I/O buffers to buffer pool

Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9417


Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
......................................................................

IMPALA-4835: Part 3: switch I/O buffers to buffer pool

This is the final patch to switch the Disk I/O manager to allocate all
buffer from the buffer pool and to reserve the buffers required for
a query upfront.

* The planner reserves enough memory to run a single scanner per
  scan node.
* The multi-threaded scan node must increase reservation before
  spinning up more threads.
* The scanner implementations must be careful to stay within their
  assigned reservation.

The row-oriented scanners were most straightforward, since they only
have a single scan range active at a time. A single I/O buffer is
sufficient to scan the whole file but more I/O buffers can improve I/O
throughput.

Parquet is more complex because it issues a scan range per column and
the sizes of the columns on disk are not known during planning. To
deal with this, the reservation in the frontend is based on a
heuristic involving the file size and # columns. The Parquet scanner
can then divvy up reservation to columns based on the size of column
data on disk.

I adjusted how the 'mem_limit' is divided between buffer pool and non
buffer pool memory for low mem_limits to account for the increase in
buffer pool memory.

Testing:
* Added more planner tests to cover reservation calcs for scan node.
* Test scanners for all file formats with the reservation denial debug
  action, to test behaviour when the scanners hit reservation limits.
* Updated memory and buffer pool limits for tests.
* Added unit tests for dividing reservation between columns in parquet,
  since the algorithm is non-trivial.

Perf:
I ran TPC-H and targeted perf locally comparing with master. Both
showed small improvements of a few percent and no regressions of
note. Cluster perf tests showed no significant change.

Conflicts:
	be/src/exec/parquet-column-readers.cc
	testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
	testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test

Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
---
M be/src/exec/CMakeLists.txt
A be/src/exec/hdfs-parquet-scanner-test.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/bufferpool/reservation-tracker-test.cc
M be/src/runtime/bufferpool/reservation-util.cc
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-stress.h
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/util/BitUtil.java
M fe/src/test/java/org/apache/impala/util/BitUtilTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
M testdata/workloads/functional-query/queries/QueryTest/codegen-mem-limit.test
M testdata/workloads/functional-query/queries/QueryTest/disk-spill-encryption.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test
M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test
M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test
M testdata/workloads/functional-query/queries/QueryTest/scanners.test
M testdata/workloads/functional-query/queries/QueryTest/set.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-aggs.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-naaj-no-deny-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-naaj.test
A testdata/workloads/functional-query/queries/QueryTest/spilling-no-debug-action.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-sorts-exhaustive.test
M testdata/workloads/functional-query/queries/QueryTest/spilling.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M testdata/workloads/tpch/queries/sort-reservation-usage.test
M tests/common/test_dimensions.py
M tests/custom_cluster/test_scratch_disk.py
M tests/query_test/test_mem_usage_scaling.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/query_test/test_scanners_fuzz.py
M tests/query_test/test_sort.py
M tests/query_test/test_spilling.py
73 files changed, 3,181 insertions(+), 1,523 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/9417/1
-- 
To view, visit http://gerrit.cloudera.org:8080/9417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 9417
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR](2.x) IMPALA-4835: Part 3: switch I/O buffers to buffer pool

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9417 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
......................................................................


Patch Set 4: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/9417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 9417
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 23 Feb 2018 22:39:38 +0000
Gerrit-HasComments: No

[Impala-ASF-CR](2.x) IMPALA-4835: Part 3: switch I/O buffers to buffer pool

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9417

to look at the new patch set (#3).

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
......................................................................

IMPALA-4835: Part 3: switch I/O buffers to buffer pool

This is the final patch to switch the Disk I/O manager to allocate all
buffer from the buffer pool and to reserve the buffers required for
a query upfront.

* The planner reserves enough memory to run a single scanner per
  scan node.
* The multi-threaded scan node must increase reservation before
  spinning up more threads.
* The scanner implementations must be careful to stay within their
  assigned reservation.

The row-oriented scanners were most straightforward, since they only
have a single scan range active at a time. A single I/O buffer is
sufficient to scan the whole file but more I/O buffers can improve I/O
throughput.

Parquet is more complex because it issues a scan range per column and
the sizes of the columns on disk are not known during planning. To
deal with this, the reservation in the frontend is based on a
heuristic involving the file size and # columns. The Parquet scanner
can then divvy up reservation to columns based on the size of column
data on disk.

I adjusted how the 'mem_limit' is divided between buffer pool and non
buffer pool memory for low mem_limits to account for the increase in
buffer pool memory.

Testing:
* Added more planner tests to cover reservation calcs for scan node.
* Test scanners for all file formats with the reservation denial debug
  action, to test behaviour when the scanners hit reservation limits.
* Updated memory and buffer pool limits for tests.
* Added unit tests for dividing reservation between columns in parquet,
  since the algorithm is non-trivial.

Perf:
I ran TPC-H and targeted perf locally comparing with master. Both
showed small improvements of a few percent and no regressions of
note. Cluster perf tests showed no significant change.

Conflicts:
	be/src/exec/parquet-column-readers.cc
	testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
	testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test

Other backport notes:
File sizes of functional_parquet.alltypes are slightly less than 16kb on
2.x but slightly more than 16kb on master because sorts are not clustered
by default. This changes resource requirements in some plans.

Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
---
M be/src/exec/CMakeLists.txt
A be/src/exec/hdfs-parquet-scanner-test.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/bufferpool/reservation-tracker-test.cc
M be/src/runtime/bufferpool/reservation-util.cc
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-stress.h
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/util/BitUtil.java
M fe/src/test/java/org/apache/impala/util/BitUtilTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
M testdata/workloads/functional-query/queries/QueryTest/codegen-mem-limit.test
M testdata/workloads/functional-query/queries/QueryTest/disk-spill-encryption.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test
M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test
M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test
M testdata/workloads/functional-query/queries/QueryTest/scanners.test
M testdata/workloads/functional-query/queries/QueryTest/set.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-aggs.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-naaj-no-deny-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-naaj.test
A testdata/workloads/functional-query/queries/QueryTest/spilling-no-debug-action.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-sorts-exhaustive.test
M testdata/workloads/functional-query/queries/QueryTest/spilling.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M testdata/workloads/tpch/queries/sort-reservation-usage.test
M tests/common/test_dimensions.py
M tests/custom_cluster/test_scratch_disk.py
M tests/query_test/test_mem_usage_scaling.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/query_test/test_scanners_fuzz.py
M tests/query_test/test_sort.py
M tests/query_test/test_spilling.py
73 files changed, 3,181 insertions(+), 1,523 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/9417/3
-- 
To view, visit http://gerrit.cloudera.org:8080/9417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 9417
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR](2.x) IMPALA-4835: Part 3: switch I/O buffers to buffer pool

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9417 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
......................................................................


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1996/


-- 
To view, visit http://gerrit.cloudera.org:8080/9417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 9417
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Fri, 23 Feb 2018 04:54:06 +0000
Gerrit-HasComments: No

[Impala-ASF-CR](2.x) IMPALA-4835: Part 3: switch I/O buffers to buffer pool

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9417 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
......................................................................


Patch Set 1:

There were some conflicts here, so resolving them before the cherry-pick job gets to it. Just doing an initial test to see if things are clean.


-- 
To view, visit http://gerrit.cloudera.org:8080/9417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 9417
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 23 Feb 2018 05:01:29 +0000
Gerrit-HasComments: No

[Impala-ASF-CR](2.x) IMPALA-4835: Part 3: switch I/O buffers to buffer pool

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9417 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
......................................................................


Patch Set 4: Code-Review+2

The code conflicts were trivial. The planner test conflicts around file sizes were expected (I flagged this on the original review).


-- 
To view, visit http://gerrit.cloudera.org:8080/9417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 9417
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 23 Feb 2018 17:52:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR](2.x) IMPALA-4835: Part 3: switch I/O buffers to buffer pool

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9417 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
......................................................................


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1996/


-- 
To view, visit http://gerrit.cloudera.org:8080/9417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 9417
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 23 Feb 2018 08:40:58 +0000
Gerrit-HasComments: No

[Impala-ASF-CR](2.x) IMPALA-4835: Part 3: switch I/O buffers to buffer pool

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9417 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
......................................................................


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1999/


-- 
To view, visit http://gerrit.cloudera.org:8080/9417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 9417
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 23 Feb 2018 08:49:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR](2.x) IMPALA-4835: Part 3: switch I/O buffers to buffer pool

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9417 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
......................................................................

IMPALA-4835: Part 3: switch I/O buffers to buffer pool

This is the final patch to switch the Disk I/O manager to allocate all
buffer from the buffer pool and to reserve the buffers required for
a query upfront.

* The planner reserves enough memory to run a single scanner per
  scan node.
* The multi-threaded scan node must increase reservation before
  spinning up more threads.
* The scanner implementations must be careful to stay within their
  assigned reservation.

The row-oriented scanners were most straightforward, since they only
have a single scan range active at a time. A single I/O buffer is
sufficient to scan the whole file but more I/O buffers can improve I/O
throughput.

Parquet is more complex because it issues a scan range per column and
the sizes of the columns on disk are not known during planning. To
deal with this, the reservation in the frontend is based on a
heuristic involving the file size and # columns. The Parquet scanner
can then divvy up reservation to columns based on the size of column
data on disk.

I adjusted how the 'mem_limit' is divided between buffer pool and non
buffer pool memory for low mem_limits to account for the increase in
buffer pool memory.

Testing:
* Added more planner tests to cover reservation calcs for scan node.
* Test scanners for all file formats with the reservation denial debug
  action, to test behaviour when the scanners hit reservation limits.
* Updated memory and buffer pool limits for tests.
* Added unit tests for dividing reservation between columns in parquet,
  since the algorithm is non-trivial.

Perf:
I ran TPC-H and targeted perf locally comparing with master. Both
showed small improvements of a few percent and no regressions of
note. Cluster perf tests showed no significant change.

Conflicts:
	be/src/exec/parquet-column-readers.cc
	testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
	testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test

Other backport notes:
File sizes of functional_parquet.alltypes are slightly less than 16kb on
2.x but slightly more than 16kb on master because sorts are not clustered
by default. This changes resource requirements in some plans.

Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Reviewed-on: http://gerrit.cloudera.org:8080/9417
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M be/src/exec/CMakeLists.txt
A be/src/exec/hdfs-parquet-scanner-test.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/bufferpool/reservation-tracker-test.cc
M be/src/runtime/bufferpool/reservation-util.cc
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-stress.h
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/util/BitUtil.java
M fe/src/test/java/org/apache/impala/util/BitUtilTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
M testdata/workloads/functional-query/queries/QueryTest/codegen-mem-limit.test
M testdata/workloads/functional-query/queries/QueryTest/disk-spill-encryption.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test
M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test
M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test
M testdata/workloads/functional-query/queries/QueryTest/scanners.test
M testdata/workloads/functional-query/queries/QueryTest/set.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-aggs.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-naaj-no-deny-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-naaj.test
A testdata/workloads/functional-query/queries/QueryTest/spilling-no-debug-action.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-sorts-exhaustive.test
M testdata/workloads/functional-query/queries/QueryTest/spilling.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M testdata/workloads/tpch/queries/sort-reservation-usage.test
M tests/common/test_dimensions.py
M tests/custom_cluster/test_scratch_disk.py
M tests/query_test/test_mem_usage_scaling.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/query_test/test_scanners_fuzz.py
M tests/query_test/test_sort.py
M tests/query_test/test_spilling.py
73 files changed, 3,181 insertions(+), 1,523 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/9417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 9417
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR](2.x) IMPALA-4835: Part 3: switch I/O buffers to buffer pool

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9417 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
......................................................................


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2001/


-- 
To view, visit http://gerrit.cloudera.org:8080/9417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 9417
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 23 Feb 2018 17:29:00 +0000
Gerrit-HasComments: No

[Impala-ASF-CR](2.x) IMPALA-4835: Part 3: switch I/O buffers to buffer pool

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9417 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
......................................................................


Patch Set 3: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/9417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 9417
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 23 Feb 2018 12:26:28 +0000
Gerrit-HasComments: No