You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Thomas Marshall (Code Review)" <ge...@cloudera.org> on 2018/07/05 21:20:13 UTC

[Impala-ASF-CR] IMPALA-7251: Fix QueryMaintenance calls in Aggregators

Thomas Marshall has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10871


Change subject: IMPALA-7251: Fix QueryMaintenance calls in Aggregators
......................................................................

IMPALA-7251: Fix QueryMaintenance calls in Aggregators

A recent change, IMPALA-110 (part 2), refactored
PartitionedAggregationNode into several classes, including a new type
'Aggregator'. During this refactor, code that makes local allocations
while evaluating exprs was moved from the ExecNode (now
AggregationNode/StreamingAggregationNode) into the Aggregators, but
code related to cleaning these allocations up (ie QueryMaintenance())
was not, resulting in some queries using an excessive amount of
memory.

This patch removes all calls to QueryMaintenance() from the exec nodes
and moves them into the Aggregators.

Testing:
- Added a new test case with a mem limit that fails if the expr
  allocations aren't released in a timely manner.
- Passed a full exhaustive run.

Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5
---
M be/src/exec/aggregation-node.cc
M be/src/exec/grouping-aggregator.cc
M be/src/exec/streaming-aggregation-node.cc
M testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test
4 files changed, 26 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/10871/1
-- 
To view, visit http://gerrit.cloudera.org:8080/10871
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5
Gerrit-Change-Number: 10871
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>

[Impala-ASF-CR] IMPALA-7251: Fix QueryMaintenance calls in Aggregators

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10871 )

Change subject: IMPALA-7251: Fix QueryMaintenance calls in Aggregators
......................................................................


Patch Set 3: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/10871
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5
Gerrit-Change-Number: 10871
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 13 Jul 2018 03:04:33 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7251: Fix QueryMaintenance calls in Aggregators

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10871 )

Change subject: IMPALA-7251: Fix QueryMaintenance calls in Aggregators
......................................................................


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2808/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/10871
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5
Gerrit-Change-Number: 10871
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 12 Jul 2018 23:48:03 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7251: Fix QueryMaintenance calls in Aggregators

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10871 )

Change subject: IMPALA-7251: Fix QueryMaintenance calls in Aggregators
......................................................................

IMPALA-7251: Fix QueryMaintenance calls in Aggregators

A recent change, IMPALA-110 (part 2), refactored
PartitionedAggregationNode into several classes, including a new type
'Aggregator'. During this refactor, code that makes local allocations
while evaluating exprs was moved from the ExecNode (now
AggregationNode/StreamingAggregationNode) into the Aggregators, but
code related to cleaning these allocations up (ie QueryMaintenance())
was not, resulting in some queries using an excessive amount of
memory.

This patch removes all calls to QueryMaintenance() from the exec nodes
and moves them into the Aggregators.

Testing:
- Added new test cases with a mem limit that fails if the expr
  allocations aren't released in a timely manner.
- Passed a full exhaustive run.

Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5
Reviewed-on: http://gerrit.cloudera.org:8080/10871
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exec/aggregation-node.cc
M be/src/exec/aggregator.cc
M be/src/exec/aggregator.h
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/exec/non-grouping-aggregator.cc
M be/src/exec/streaming-aggregation-node.cc
M testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test
8 files changed, 51 insertions(+), 15 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/10871
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5
Gerrit-Change-Number: 10871
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-7251: Fix QueryMaintenance calls in Aggregators

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10871 )

Change subject: IMPALA-7251: Fix QueryMaintenance calls in Aggregators
......................................................................


Patch Set 2: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/10871
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5
Gerrit-Change-Number: 10871
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 12 Jul 2018 00:27:47 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-7251: Fix QueryMaintenance calls in Aggregators

Posted by "Thomas Marshall (Code Review)" <ge...@cloudera.org>.
Hello Tim Armstrong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10871

to look at the new patch set (#2).

Change subject: IMPALA-7251: Fix QueryMaintenance calls in Aggregators
......................................................................

IMPALA-7251: Fix QueryMaintenance calls in Aggregators

A recent change, IMPALA-110 (part 2), refactored
PartitionedAggregationNode into several classes, including a new type
'Aggregator'. During this refactor, code that makes local allocations
while evaluating exprs was moved from the ExecNode (now
AggregationNode/StreamingAggregationNode) into the Aggregators, but
code related to cleaning these allocations up (ie QueryMaintenance())
was not, resulting in some queries using an excessive amount of
memory.

This patch removes all calls to QueryMaintenance() from the exec nodes
and moves them into the Aggregators.

Testing:
- Added new test cases with a mem limit that fails if the expr
  allocations aren't released in a timely manner.
- Passed a full exhaustive run.

Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5
---
M be/src/exec/aggregation-node.cc
M be/src/exec/aggregator.cc
M be/src/exec/aggregator.h
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/exec/non-grouping-aggregator.cc
M be/src/exec/streaming-aggregation-node.cc
M testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test
8 files changed, 51 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/10871/2
-- 
To view, visit http://gerrit.cloudera.org:8080/10871
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5
Gerrit-Change-Number: 10871
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-7251: Fix QueryMaintenance calls in Aggregators

Posted by "Thomas Marshall (Code Review)" <ge...@cloudera.org>.
Thomas Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/10871 )

Change subject: IMPALA-7251: Fix QueryMaintenance calls in Aggregators
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10871/1/testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test
File testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test:

http://gerrit.cloudera.org:8080/#/c/10871/1/testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test@253
PS1, Line 253: select count(distinct concat(cast(l_comment as char(120)), cast(l_comment as char(120)),
> I think there's still a similar bug in NonGroupingAggregator if the express
I fixed the NonGroupingAggregator part.

For the mem_tracker stuff, turns out its not quite trivial to fix - if you just initialize Aggregator::mem_tracker_ with ExecNode::mem_tracker as its parent you end up with issues with mismatched parent links when calling ReservationTracker::InitChildTracker() during ClaimBufferReservation(). I filed IMPALA-7256



-- 
To view, visit http://gerrit.cloudera.org:8080/10871
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5
Gerrit-Change-Number: 10871
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 06 Jul 2018 21:02:07 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7251: Fix QueryMaintenance calls in Aggregators

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10871 )

Change subject: IMPALA-7251: Fix QueryMaintenance calls in Aggregators
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10871/1/testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test
File testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test:

http://gerrit.cloudera.org:8080/#/c/10871/1/testdata/workloads/functional-query/queries/QueryTest/spilling-regression-exhaustive.test@253
PS1, Line 253: select count(distinct concat(cast(l_comment as char(120)), cast(l_comment as char(120)),
I think there's still a similar bug in NonGroupingAggregator if the expressions evaluated for the aggregate function accumulate memory, e.g.

  select min(regexp_replace(l_comment, ".", "x"))
from tpch.lineitem

I also noticed that the memory accounting in the exec summary is wrong because Aggregator::mem_tracker_'s parent isn't the ExecNodes.



-- 
To view, visit http://gerrit.cloudera.org:8080/10871
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5
Gerrit-Change-Number: 10871
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 06 Jul 2018 00:54:08 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7251: Fix QueryMaintenance calls in Aggregators

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10871 )

Change subject: IMPALA-7251: Fix QueryMaintenance calls in Aggregators
......................................................................


Patch Set 3: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/10871
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5
Gerrit-Change-Number: 10871
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <th...@cmu.edu>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 12 Jul 2018 23:48:02 +0000
Gerrit-HasComments: No