You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@impala.apache.org by ta...@apache.org on 2017/08/23 06:40:15 UTC

[5/6] incubator-impala git commit: IMPALA-5815: right outer join returns invalid memory

IMPALA-5815: right outer join returns invalid memory

The bug is that OutputAllBuild() called BufferedTupleStream::GetNext()
while 'out_batch' still referenced data from the current page
of the stream. When iterating over an unpinned stream, GetNext()
sets the 'needs_deep_copy' flag when it hits the end of a page,
so that the caller has an opportunity to flush or deep copy the
data. On the next call to GetNext(), that page may be deleted
or unpinned.

The fix is to check whether the batch is at capacity before
calling BTS::GetNext().

This issue was masked by not using the 'delete_on_read' mode of the
stream, which would have freed the stream's buffers earlier and
increased the odds of ASAN detecting the problem.

Testing:
Running TestTPCHJoinQueries.test_outer_joins() reliably reproduced this
for me locally under ASAN. After the fix the problem does not reoccur.

Change-Id: Ia14148499ddaec41c2e70fef5d53e5d06ea0538d
Reviewed-on: http://gerrit.cloudera.org:8080/7772
Reviewed-by: Dan Hecht <dh...@cloudera.com>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/3b1a1df7
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/3b1a1df7
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/3b1a1df7

Branch: refs/heads/master
Commit: 3b1a1df7e30de7d21cc30eba8bc7d318ee063d5a
Parents: 5f32312
Author: Tim Armstrong <ta...@cloudera.com>
Authored: Fri Aug 18 13:10:51 2017 -0700
Committer: Impala Public Jenkins <im...@gerrit.cloudera.org>
Committed: Wed Aug 23 06:17:17 2017 +0000

----------------------------------------------------------------------
 be/src/exec/partitioned-hash-join-node.cc | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3b1a1df7/be/src/exec/partitioned-hash-join-node.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/partitioned-hash-join-node.cc b/be/src/exec/partitioned-hash-join-node.cc
index 2c656b0..13d660f 100644
--- a/be/src/exec/partitioned-hash-join-node.cc
+++ b/be/src/exec/partitioned-hash-join-node.cc
@@ -387,7 +387,7 @@ Status PartitionedHashJoinNode::PrepareSpilledPartitionForProbe(
     DCHECK(NeedToProcessUnmatchedBuildRows());
     bool got_read_buffer = false;
     RETURN_IF_ERROR(input_partition_->build_partition()->build_rows()->PrepareForRead(
-        false, &got_read_buffer));
+        true, &got_read_buffer));
     if (!got_read_buffer) {
       return mem_tracker()->MemLimitExceeded(
           runtime_state_, Substitute(PREPARE_FOR_READ_FAILED_ERROR_MSG, id_));
@@ -682,6 +682,12 @@ Status PartitionedHashJoinNode::OutputAllBuild(RowBatch* out_batch) {
     if (output_unmatched_batch_iter_->AtEnd()) {
       output_unmatched_batch_->TransferResourceOwnership(out_batch);
       output_unmatched_batch_->Reset();
+      // Need to flush any resources attached to 'out_batch' before calling
+      // build_rows()->GetNext().  E.g. if the previous call to GetNext() set the
+      // 'needs_deep_copy' flag, then calling GetNext() before we return the current
+      // batch leave the batch referencing invalid memory (see IMPALA-5815).
+      if (out_batch->AtCapacity()) break;
+
       RETURN_IF_ERROR(output_build_partitions_.front()->build_rows()->GetNext(
           output_unmatched_batch_.get(), &eos));
       output_unmatched_batch_iter_.reset(