You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@impala.apache.org by "Huaisi Xu (Code Review)" <ge...@cloudera.org> on 2016/03/03 01:31:42 UTC

[Impala-CR](cdh5-2.2.0_5.4.x) IMPALA-3034: Verify all consumed memory of a MemTracker is always released at destruction time.

Hello Henry Robinson, Internal Jenkins,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/2423

to review the following change.

Change subject: IMPALA-3034: Verify all consumed memory of a MemTracker is always released at destruction time.
......................................................................

IMPALA-3034: Verify all consumed memory of a MemTracker is always released at destruction time.

By adding a DCHECK at the destructor of MemTracker to verify
that all its consumed memory has been released, various problems
are found in the code. This change fixes all these issues.

The first problem has to do with a memory leak in DataStreamRecvr.
DataStreamRecvr::Close() is responsible for shutting down all the
sender queues and all the row batches in them. However, it tears
down the queues in the wrong order. Specifically, it calls
DataStreamRecvr::SenderQueue::Close() on each sender queue before
unregistering them. There exists a window between the shut down
of the queues and their unregistration which allows other threads
to enqueue into already closed queues. This change fixes the problem
by first unregistering the sender queues before shutting them down.
This guarantees that no new row batch will be created once it has
been unregistered so DataStreamRecvr::SenderQueue::Close() will not
leak any row batches.

The second problem is an accounting problem in which RuntimeFilter
fails to call Release() on query_mem_tracker_ for all the memory it
has consumed. Specifically, if ProcessBuildInput() fails in
ConstructBuildSide() of a PHJ node, the outstanding bloom filters
allocated in AllocatedRuntimeFilters() will not be added to
produced_filters_ set. Since RunFilterBank::Close() only calls
MemTracker::Release() on filters in produced_filters_ and
consumed_filters_ sets, it will miss these 'orphaned' filters.
Fortunately, all allocated filters are kept in the object pool so
there is no memory leak. Since the amount of memory consumed is
already tracked by 'memory_allocated_', this change fixes the
problem by calling MemTracker::Release() with 'memory_allocated_'
in RuntimeFilterbank::Close().

The third problem is that DataStreamSender::Send() frees local
allocations before the expression evaluation instead of after
the evaluation. This leaves some local allocations hanging around
longer than needed. This change moves the location in which local
allocations are freed. In addition, this change also fixes the order
of two fields 'runtime_state_' and 'sink_' in PlanFragmentExecutor.
The wrong order can lead lead to freed memory being accessed in the
destructor of MemTracker and potentially other places.

Change-Id: Ie87fd4cd10a7c7da806ed95eeb262ed51e74ec7b
Reviewed-on: http://gerrit.cloudera.org:8080/2269
Reviewed-by: Henry Robinson <he...@cloudera.com>
Tested-by: Internal Jenkins
(cherry picked from commit 903920ef66fd7c209dacc970c7ef0321c6180093)
---
M be/src/runtime/data-stream-recvr.cc
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/mem-tracker-test.cc
M be/src/runtime/mem-tracker.cc
M be/src/runtime/plan-fragment-executor.h
6 files changed, 32 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/23/2423/1
-- 
To view, visit http://gerrit.cloudera.org:8080/2423
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie87fd4cd10a7c7da806ed95eeb262ed51e74ec7b
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.2.0_5.4.x
Gerrit-Owner: Huaisi Xu <hx...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>