You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@impala.apache.org by ta...@apache.org on 2017/04/19 01:20:29 UTC

[2/6] incubator-impala git commit: IMPALA-5173: crash with hash join feeding directly into nlj

IMPALA-5173: crash with hash join feeding directly into nlj

The background for this bug is that we can't transfer ownership
of BufferdBlockMgr::Blocks that are attached to RowBatches.

The NestedLoopJoinNode accumulates row batches on its right
side and tries to take ownership of the memory, which doesn't
work as expected in this case.

The fix is to copy the data when we encounter one of these
(likely very rare) cases.

Testing:
Added a regression test that produces a crash before the fix and
succeeds after the fix.

Change-Id: I0c04952e591d17e5ff7e994884be4c4c899ae192
Reviewed-on: http://gerrit.cloudera.org:8080/6568
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/96316e3b
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/96316e3b
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/96316e3b

Branch: refs/heads/master
Commit: 96316e3b34e1102d0df2714e617d9847b853a4c4
Parents: 1fa3315
Author: Tim Armstrong <ta...@cloudera.com>
Authored: Wed Apr 5 16:26:44 2017 -0700
Committer: Impala Public Jenkins <im...@gerrit.cloudera.org>
Committed: Tue Apr 18 20:11:23 2017 +0000

----------------------------------------------------------------------
 be/src/exec/nested-loop-join-builder.cc         |  8 ++++-
 .../queries/QueryTest/spilling.test             | 38 +++++++++++++++++++-
 2 files changed, 44 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/96316e3b/be/src/exec/nested-loop-join-builder.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/nested-loop-join-builder.cc b/be/src/exec/nested-loop-join-builder.cc
index 5631df4..86848f5 100644
--- a/be/src/exec/nested-loop-join-builder.cc
+++ b/be/src/exec/nested-loop-join-builder.cc
@@ -45,10 +45,16 @@ Status NljBuilder::Send(RuntimeState* state, RowBatch* batch) {
   build_batch->AcquireState(batch);
 
   AddBuildBatch(build_batch);
-  if (build_batch->needs_deep_copy()) {
+  if (build_batch->needs_deep_copy() || build_batch->num_blocks() > 0
+      || build_batch->num_buffers() > 0) {
     // This batch and earlier batches may refer to resources passed from the child
     // that aren't owned by the row batch itself. Deep copying ensures that the row
     // batches are backed by memory owned by this node that is safe to hold on to.
+    //
+    // Acquiring ownership of attached Blocks or Buffers does not correctly update the
+    // accounting, so also copy data in that cases to avoid stealing reservation
+    // from whoever created the Block/Buffer. TODO: remove workaround when IMPALA-4179
+    // is fixed.
     RETURN_IF_ERROR(DeepCopyBuildBatches(state));
   }
   return Status::OK();

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/96316e3b/testdata/workloads/functional-query/queries/QueryTest/spilling.test
----------------------------------------------------------------------
diff --git a/testdata/workloads/functional-query/queries/QueryTest/spilling.test b/testdata/workloads/functional-query/queries/QueryTest/spilling.test
index 89668e8..aa524a3 100644
--- a/testdata/workloads/functional-query/queries/QueryTest/spilling.test
+++ b/testdata/workloads/functional-query/queries/QueryTest/spilling.test
@@ -689,4 +689,40 @@ BIGINT
 5995258
 ---- RUNTIME_PROFILE
 row_regex: .*NumHashTableBuildsSkipped: .* \([1-9][0-9]*\)
-====
\ No newline at end of file
+====
+---- QUERY
+# IMPALA-5171: spilling hash join feeding into right side of nested loop join.
+# Equivalent to:
+#   select *
+#   from lineitem
+#   where l1.l_quantity = 31.0 and l1.l_tax = 0.03 and l1.l_orderkey <= 100000
+# order by l_orderkey, l_partkey, l_suppkey, l_linenumber
+# limit 5
+set max_block_mgr_memory=7m;
+set num_nodes=1;
+select straight_join l.*
+from
+   (select *
+    from tpch_parquet.orders limit 1) o,
+   (select l2.*
+     from tpch_parquet.lineitem l1
+        inner join tpch_parquet.lineitem l2 on l1.l_orderkey = l2.l_orderkey
+            and l1.l_partkey = l2.l_partkey
+            and l1.l_suppkey = l2.l_suppkey and l1.l_linenumber = l2.l_linenumber
+     where
+        # Include a selective post-join predicate so that the RHS of the nested loop join
+        # doesn't consume too much memory.
+        (l1.l_quantity != l2.l_quantity or l1.l_quantity = 31.0 and l1.l_tax = 0.03)
+        # Reduce the data size to get the test to execute quicker
+         and l1.l_orderkey <= 100000) l
+order by l_orderkey, l_partkey, l_suppkey, l_linenumber
+limit 5;
+---- TYPES
+bigint,bigint,bigint,int,decimal,decimal,decimal,decimal,string,string,string,string,string,string,string,string
+---- RESULTS
+288,50641,8157,1,31.00,49340.84,0.00,0.03,'N','O','1997-03-17','1997-04-28','1997-04-06','TAKE BACK RETURN','AIR','instructions wa'
+418,18552,1054,1,31.00,45587.05,0.00,0.03,'N','F','1995-06-05','1995-06-18','1995-06-26','COLLECT COD','FOB','final theodolites. fluffil'
+482,61141,6154,3,31.00,34166.34,0.04,0.03,'N','O','1996-06-01','1996-05-06','1996-06-17','NONE','MAIL',' blithe pin'
+1382,156162,6163,5,31.00,37762.96,0.07,0.03,'R','F','1993-10-26','1993-10-15','1993-11-09','TAKE BACK RETURN','FOB','hely regular dependencies. f'
+1509,186349,3904,6,31.00,44495.54,0.04,0.03,'A','F','1993-07-14','1993-08-21','1993-08-06','COLLECT COD','SHIP','ic deposits cajole carefully. quickly bold '
+====