You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Tianyi Wang (Code Review)" <ge...@cloudera.org> on 2017/08/12 02:49:52 UTC

[Impala-ASF-CR] IMPALA-4794: Grouping distinct agg plan robust to data skew

Tianyi Wang has uploaded a new patch set (#2).

Change subject: IMPALA-4794: Grouping distinct agg plan robust to data skew
......................................................................

IMPALA-4794: Grouping distinct agg plan robust to data skew

This patch changes the query plan for grouping distinct aggregations to
be more robust to data skew in the grouping expressions. The existing
plan partitions data between phase-1 and phase-2 by grouping expr and
the data skewness on grouping expr directly impacts performance. The
new plan partitions data by both grouping expr and distinct aggregation
expr, then adds one more aggregation and exchange node. It is supposed
to be faster with data skew but slower otherwise.

Teting: Modified existing planner tests which already provide sufficient
coverage. The pattern is that the distinct aggregation expr is added to
exchange node, followed by an additional merge aggregation and exchange
node.

Change-Id: I7bdada0e328b555900c7b7ff8aabc8eb15ae8fa9
---
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
7 files changed, 213 insertions(+), 128 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/7643/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bdada0e328b555900c7b7ff8aabc8eb15ae8fa9
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>