Posted to reviews@impala.apache.org by "Qifan Chen (Code Review)" <ge...@cloudera.org> on 2021/04/09 19:12:06 UTC

[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator

Qifan Chen has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/17252 )

Change subject: IMPALA-10647 Improve always-true min/max filter handling in coordinator
......................................................................

IMPALA-10647 Improve always-true min/max filter handling in coordinator

This change improves how the coordinator behaves when a newly
arrived min/max filter is the last one to arrive or is always true.
Previously, the coordinator disabled the corresponding filter
representation by setting it to AlwaysTrue, which made it
impossible to distinguish a genuine AlwaysTrue filter (say, one set in
the hash join build step) from a disabled one. A dedicated
Boolean variable, minmaxDisabled_, is introduced to record the disabled
state, and the AlwaysTrue state of a filter is no longer altered. The
enhancement also improves the display of the min and max columns in the
"Filter routing table" and "Final filter table" in the profile. These
two columns now display one of the following values:
  1. 'PartialUpdates' - The min and the max are partially updated;
  2. 'AlwaysTrue'     - The filter is always true;
  3. 'AlwaysFalse'    - The filter is always false;
  4. Real values      - The filter is neither always true nor always
                        false, and is fully updated with the real
                        min/max values.
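
The separation of the disabled state from the AlwaysTrue state can be
sketched as below. This is a minimal illustration, not the code in the
patch: the struct, its fields, the functions Disable() and
DisplayBound(), and the order in which the four cases are checked are
all assumptions made for this example. Only the idea of a dedicated
disabled flag and the four display values come from the description
above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, simplified stand-in for the coordinator's per-filter state.
struct MinMaxStateSketch {
  bool always_true = false;   // set by a producer, never by disabling
  bool always_false = false;  // the filter rejects every row
  bool disabled = false;      // plays the role of minmaxDisabled_
  int pending_count = 0;      // producers that have not reported yet (assumed)
  int64_t min = 0;
  int64_t max = 0;
};

// Disabling only sets the dedicated flag; always_true is left untouched.
void Disable(MinMaxStateSketch* s) { s->disabled = true; }

// Text for the min or max profile column. 'bound' is s.min or s.max.
// The order of the checks is an assumption of this sketch.
std::string DisplayBound(const MinMaxStateSketch& s, int64_t bound) {
  if (s.always_true) return "AlwaysTrue";
  if (s.pending_count > 0) return "PartialUpdates";
  if (s.always_false) return "AlwaysFalse";
  assert(s.min <= s.max);  // a fully updated filter has an ordered range
  return std::to_string(bound);
}
```

With this layout, a disabled filter whose producers have not all
reported shows 'PartialUpdates' rather than 'AlwaysTrue', so
'AlwaysTrue' in the profile always means a producer really sent an
always-true filter.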

A second change records, in the profile of each scan node, the
arrival time of min/max filters (as elapsed time since the system was
booted, obtained by calling MonotonicMillis()). Comparing this value
with the elapsed time at which a row group is filtered with these
filters helps diagnose filters that arrive late.
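
The comparison described above can be sketched as follows. This is an
illustration under stated assumptions, not the code in the patch:
MonotonicMillisSketch() uses std::chrono::steady_clock as a stand-in
for Impala's MonotonicMillis(), and the FilterArrivalSketch struct and
its members are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Stand-in for MonotonicMillis(): milliseconds on a monotonic clock.
int64_t MonotonicMillisSketch() {
  using namespace std::chrono;
  return duration_cast<milliseconds>(
      steady_clock::now().time_since_epoch()).count();
}

// Hypothetical per-filter record kept by a scan node.
struct FilterArrivalSketch {
  bool arrived = false;
  int64_t arrival_ms = 0;  // valid only when 'arrived' is true

  // Record the arrival time once, when the filter first shows up.
  void MarkArrived() {
    if (!arrived) {
      arrival_ms = MonotonicMillisSketch();
      arrived = true;
    }
  }

  // True if the filter was already present at time 't_ms'.
  bool AvailableAt(int64_t t_ms) const {
    return arrived && arrival_ms <= t_ms;
  }

  // Milliseconds by which the filter missed a row group processed at
  // 'row_group_ms'. A value <= 0 means the filter was there in time.
  int64_t LatenessMs(int64_t row_group_ms) const {
    assert(arrived);
    return arrival_ms - row_group_ms;
  }
};
```

Because both timestamps come from the same monotonic clock, their
difference is meaningful even if the wall clock is adjusted while the
query runs.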

Testing:
  1. Ran unit tests;
  2. Ran core tests.

Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964
---
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/scan-node.cc
M be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
M tests/query_test/test_runtime_filters.py
7 files changed, 111 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/17252/11
-- 
To view, visit http://gerrit.cloudera.org:8080/17252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964
Gerrit-Change-Number: 17252
Gerrit-PatchSet: 11
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>