You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@impala.apache.org by "Tim Armstrong (Code Review)" <ge...@cloudera.org> on 2016/06/02 22:26:54 UTC

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3282

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................

IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Refactor DataSink interface to be more generic. We need more flexibility
in setting up MemTrackers, so that memory is accounted against the right
ExecNode. The NestedLoopJoinNode also differs from other joins in that
it recycles row batches to avoid copies, so we need to add a
NextBatchToSend() method to the DataSink that returns a batch. Also
removes some redundancy between DataSink subclasses in setting up
RuntimeProfiles, etc.

Remove the redundancy in the DataSink between passing eos to GetNext()
and FlushFinal(). This simplifies HdfsTableSink quite a bit and makes
handling empty batches simpler.

Partially refactor join nodes that so control flow between
BlockingJoinNode::Open() and its subclasses is easier to follow.
BlockingJoinNode now only calls one virtual function on its
subclasses: ConstructBuildSide(). Once we convert all join nodes
to use the DataSink interface, we will also be able to remove that
as well.

As a minor optimisation, avoid updating a timer that is ignored for
non-async builds.

As a proof of concept, this patch separates out the build side of
NestedLoopJoinNode into a class that implements the DataSink
interface. Refactoring the hash join is left for Part 2.

Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/kudu-table-sink-test.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
A be/src/exec/nested-loop-join-builder.cc
A be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/util/stopwatch.h
27 files changed, 656 insertions(+), 428 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/82/3282/5
-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/util/stopwatch.h
File be/src/util/stopwatch.h:

Line 241:   ScopedStopWatch(T* sw, bool enabled = true) :
> I was thinking along the lines of a more straightforward solution to the sp
That doesn't work if you do any kind of early return. E.g. RETURN_IF_ERROR() will leave the stopwatch running. It's hard to find timing bugs with profile counters so I'd rather stick with a scoped timer pattern that avoids the possibility.


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 11
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has uploaded a new patch set (#12).

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................

IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Refactor DataSink interface to be more generic. We need more flexibility
in setting up MemTrackers, so that memory is accounted against the right
ExecNode. The NestedLoopJoinNode also differs from other joins in that
it recycles row batches to avoid copies, so we need to add a
NextBatchToSend() method to the DataSink that returns a batch. Also
removes some redundancy between DataSink subclasses in setting up
RuntimeProfiles, etc.

Remove the redundancy in the DataSink between passing eos to GetNext()
and FlushFinal(). This simplifies HdfsTableSink quite a bit and makes
handling empty batches simpler.

Partially refactor join nodes that so control flow between
BlockingJoinNode::Open() and its subclasses is easier to follow.
BlockingJoinNode now only calls one virtual function on its
subclasses: ConstructBuildSide(). Once we convert all join nodes
to use the DataSink interface, we will also be able to remove that
as well.

As a minor optimisation, avoid updating a timer that is ignored for
non-async builds.

As a proof of concept, this patch separates out the build side of
NestedLoopJoinNode into a class that implements the DataSink
interface. Refactoring the hash join is left for Part 2.

Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/kudu-table-sink-test.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
A be/src/exec/nested-loop-join-builder.cc
A be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/row-batch-cache.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/runtime/row-batch.cc
M be/src/util/stopwatch.h
29 files changed, 735 insertions(+), 442 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/82/3282/12
-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 12
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3282/16/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

Line 101:   /// Constructs the join build from the right child's output. Note that this can be
sorry to chime in late, but i find this terminology confusing. 'build' doesn't really have a meaning in this context, other than being a verb. likewise, i would find 'probe' as a term for referring to the probe side confusing.

maybe we should find a better term for the in-memory state of a join (the build side/rhs is really only the input to that).


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 16
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3282/16/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

Line 101:   /// Constructs the join build from the right child's output. Note that this can be
> sorry to chime in late, but i find this terminology confusing. 'build' does
Aren't build side and probe side standard terms?

The 'build' is not necessarily just in-memory state.

ConstructBuildSideState()?


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 16
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has uploaded a new patch set (#13).

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................

IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Refactor DataSink interface to be more generic. We need more flexibility
in setting up MemTrackers, so that memory is accounted against the right
ExecNode. The NestedLoopJoinNode also differs from other joins in that
it recycles row batches to avoid copies, so we need to add a
NextBatchToSend() method to the DataSink that returns a batch. Also
removes some redundancy between DataSink subclasses in setting up
RuntimeProfiles, etc.

Remove the redundancy in the DataSink between passing eos to GetNext()
and FlushFinal(). This simplifies HdfsTableSink quite a bit and makes
handling empty batches simpler.

Partially refactor join nodes that so control flow between
BlockingJoinNode::Open() and its subclasses is easier to follow.
BlockingJoinNode now only calls one virtual function on its
subclasses: ConstructBuildSide(). Once we convert all join nodes
to use the DataSink interface, we will also be able to remove that
as well.

As a minor optimisation, avoid updating a timer that is ignored for
non-async builds.

As a proof of concept, this patch separates out the build side of
NestedLoopJoinNode into a class that implements the DataSink
interface. Refactoring the hash join is left for Part 2.

Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-table-sink-test.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
A be/src/exec/nested-loop-join-builder.cc
A be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/row-batch-cache.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/runtime/row-batch.cc
M be/src/util/stopwatch.h
28 files changed, 748 insertions(+), 448 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/82/3282/13
-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 13
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has uploaded a new patch set (#16).

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................

IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Refactor DataSink interface to be more generic. We need more flexibility
in setting up MemTrackers, so that memory is accounted against the right
ExecNode. Also removes some redundancy between DataSink subclasses in
setting up RuntimeProfiles, etc.

Remove the redundancy in the DataSink between passing eos to GetNext()
and FlushFinal(). This simplifies HdfsTableSink quite a bit and makes
handling empty batches simpler.

Partially refactor join nodes that so control flow between
BlockingJoinNode::Open() and its subclasses is easier to follow.
BlockingJoinNode now only calls one virtual function on its
subclasses: ConstructBuildSide(). Once we convert all join nodes
to use the DataSink interface, we will also be able to remove that
as well.

As a minor optimisation, avoid updating a timer that is ignored for
non-async builds.

As a proof of concept, this patch separates out the build side of
NestedLoopJoinNode into a class that implements the DataSink
interface. The main challenge here is that NestedLoopJoinNode
recycles row batches to avoid reallocations and copies of row
batches in subplans. The solution to this is:
- Retain the special-case optimisation for SingularRowSrc
- Use the row batch cache and RowBatch::AcquireState() to copy
  the state of row batches passed to Send(), while recycling the
  RowBatch objects.

Refactoring the partitioned hash join is left for Part 2.

Testing:
Ran exhaustive, core ASAN, and exhaustive non-partioned agg/join builds.
Also ran a local stress test.

Performance:
Ran TPC-H nested locally. The results show that the change is perf-neutral.

+------------------+-----------------------+---------+------------+------------+----------------+
| Workload         | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+------------------+-----------------------+---------+------------+------------+----------------+
| TPCH_NESTED(_20) | parquet / none / none | 7.75    | +0.19%     | 5.64       | -0.34%         |
+------------------+-----------------------+---------+------------+------------+----------------+

+------------------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
| Workload         | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters |
+------------------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
| TPCH_NESTED(_20) | TPCH-Q17 | parquet / none / none | 18.96  | 17.95       |   +5.61%   |   4.97%   |   0.71%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q14 | parquet / none / none | 3.61   | 3.56        |   +1.25%   |   0.97%   |   1.19%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q8  | parquet / none / none | 6.25   | 6.23        |   +0.44%   |   0.44%   |   0.90%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q10 | parquet / none / none | 5.84   | 5.83        |   +0.30%   |   1.21%   |   1.82%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q5  | parquet / none / none | 4.91   | 4.90        |   +0.11%   |   0.18%   |   0.78%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q21 | parquet / none / none | 17.82  | 17.81       |   +0.07%   |   0.66%   |   0.58%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q4  | parquet / none / none | 5.12   | 5.12        |   -0.02%   |   0.97%   |   1.23%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q9  | parquet / none / none | 23.85  | 23.88       |   -0.15%   |   0.72%   |   0.22%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q12 | parquet / none / none | 6.15   | 6.16        |   -0.16%   |   1.60%   |   1.72%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q3  | parquet / none / none | 5.46   | 5.47        |   -0.23%   |   1.28%   |   0.90%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q16 | parquet / none / none | 3.61   | 3.62        |   -0.26%   |   1.00%   |   1.36%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q19 | parquet / none / none | 20.19  | 20.31       |   -0.58%   |   1.67%   |   0.65%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q7  | parquet / none / none | 9.42   | 9.48        |   -0.68%   |   0.87%   |   0.71%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q18 | parquet / none / none | 12.94  | 13.06       |   -0.90%   |   0.59%   |   0.48%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q22 | parquet / none / none | 1.09   | 1.10        |   -0.92%   |   2.26%   |   2.22%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q13 | parquet / none / none | 3.75   | 3.78        |   -0.94%   |   2.04%   |   2.86%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q20 | parquet / none / none | 4.33   | 4.37        |   -1.10%   |   3.00%   |   2.43%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q2  | parquet / none / none | 2.39   | 2.42        |   -1.38%   |   1.54%   |   1.30%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q11 | parquet / none / none | 1.43   | 1.46        |   -1.78%   |   2.05%   |   2.77%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q6  | parquet / none / none | 2.29   | 2.33        |   -1.79%   |   0.56%   |   1.23%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q15 | parquet / none / none | 5.04   | 5.13        |   -1.84%   |   0.61%   |   2.01%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q1  | parquet / none / none | 5.98   | 6.12        |   -2.30%   |   1.84%   |   3.19%        | 1           | 10    |
+------------------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+

Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-table-sink-test.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
A be/src/exec/nested-loop-join-builder.cc
A be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/row-batch-cache.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/runtime/row-batch.cc
M be/src/util/stopwatch.h
28 files changed, 704 insertions(+), 458 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/82/3282/16
-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 16
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3282/5/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

Line 496:     return Status::OK();
This drops 'status' in the error case. Need to return 'status'.


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 16:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3282/16//COMMIT_MSG
Commit Message:

Line 18: Partially refactor join nodes that so control flow between
> so that the
Done


Line 21: subclasses: ConstructBuildSide(). Once we convert all join nodes
> ConstructJoinBuild()
Done


http://gerrit.cloudera.org:8080/#/c/3282/16/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

Line 101:   /// Constructs the join build from the right child's output. Note that this can be
> Aren't build side and probe side standard terms?
Something like: ProcessBuildInput() or ConsumeBuildInput()?

That focuses on the fact that it consumes the whole build-side input, leaving the fact that it builds some data structures implicit.


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 16
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has uploaded a new patch set (#8).

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................

IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Refactor DataSink interface to be more generic. We need more flexibility
in setting up MemTrackers, so that memory is accounted against the right
ExecNode. The NestedLoopJoinNode also differs from other joins in that
it recycles row batches to avoid copies, so we need to add a
NextBatchToSend() method to the DataSink that returns a batch. Also
removes some redundancy between DataSink subclasses in setting up
RuntimeProfiles, etc.

Remove the redundancy in the DataSink between passing eos to GetNext()
and FlushFinal(). This simplifies HdfsTableSink quite a bit and makes
handling empty batches simpler.

Partially refactor join nodes that so control flow between
BlockingJoinNode::Open() and its subclasses is easier to follow.
BlockingJoinNode now only calls one virtual function on its
subclasses: ConstructBuildSide(). Once we convert all join nodes
to use the DataSink interface, we will also be able to remove that
as well.

As a minor optimisation, avoid updating a timer that is ignored for
non-async builds.

As a proof of concept, this patch separates out the build side of
NestedLoopJoinNode into a class that implements the DataSink
interface. Refactoring the hash join is left for Part 2.

Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/kudu-table-sink-test.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
A be/src/exec/nested-loop-join-builder.cc
A be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/runtime/row-batch.cc
M be/src/util/stopwatch.h
28 files changed, 683 insertions(+), 435 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/82/3282/8
-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3282/9/be/src/exec/data-sink.h
File be/src/exec/data-sink.h:

Line 71:   virtual Status NextBatchToSend(RowBatch** batch);
I figured out a reasonably clean way to get rid of this. The approach is to add a JoinBuildSide class that has an extended interface that can be used by BlockingJoinNode.


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 9
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3282/16/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

Line 101:   /// Constructs the join build from the right child's output. Note that this can be
> i think either one of the last 3 works.
I went with ProcessBuildInput() and SendBuildInputToSink() and updated comments accordingly.


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 16
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has uploaded a new patch set (#14).

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................

IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Refactor DataSink interface to be more generic. We need more flexibility
in setting up MemTrackers, so that memory is accounted against the right
ExecNode. The NestedLoopJoinNode also differs from other joins in that
it recycles row batches to avoid copies, so we need to add a
NextBatchToSend() method to the DataSink that returns a batch. Also
removes some redundancy between DataSink subclasses in setting up
RuntimeProfiles, etc.

Remove the redundancy in the DataSink between passing eos to GetNext()
and FlushFinal(). This simplifies HdfsTableSink quite a bit and makes
handling empty batches simpler.

Partially refactor join nodes that so control flow between
BlockingJoinNode::Open() and its subclasses is easier to follow.
BlockingJoinNode now only calls one virtual function on its
subclasses: ConstructBuildSide(). Once we convert all join nodes
to use the DataSink interface, we will also be able to remove that
as well.

As a minor optimisation, avoid updating a timer that is ignored for
non-async builds.

As a proof of concept, this patch separates out the build side of
NestedLoopJoinNode into a class that implements the DataSink
interface. Refactoring the hash join is left for Part 2.

Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-table-sink-test.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
A be/src/exec/nested-loop-join-builder.cc
A be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/row-batch-cache.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/runtime/row-batch.cc
M be/src/util/stopwatch.h
28 files changed, 748 insertions(+), 448 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/82/3282/14
-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 14
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 7:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

Line 115:   template <bool ASYNC_BUILD>
> why does this need to be a template?
It's to avoid the overhead of timers that are only used for async builds. See IMPALA-2407 for some of the issues they caused. Updated comment.


http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/data-sink.h
File be/src/exec/data-sink.h:

Line 40: /// Superclass of all data sinks.
> i think it's time to provide this with a real class comment (what is the ab
I added a short description. I feel like the second paragraph is redundant with the method comments but I'll let you decide if you think it's useful (I typically read method comments before class comments).


Line 47:   virtual std::string Name() = 0;
> GetName(), otherwise it sounds like a getter
Done


Line 51:   virtual Status Prepare(RuntimeState* state, MemTracker* mem_tracker);
> explain param; also, why does it get created outside this class? possibly e
Just for flexibility. We don't want to add another layer of memtrackers in join nodes by giving each sink its own memtracker.


Line 59:   virtual Status NextBatchToSend(RowBatch** batch);
> i don't understand the desired control flow, please explain in class commen
I added to the class comment, this comment and the Send() comment also mention the relationship between the two methods.

I'm not in love with this interface but nested types is sensitive to the overhead of creating these batches so I wanted to avoid making perf-sensitive changes in this patch.

I had trouble finding a alternative way to deal with this that wouldn't require larger-scale changes.


http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/hdfs-table-sink.h
File be/src/exec/hdfs-table-sink.h:

Line 131:   /// Prepares output_exprs and partition_key_exprs, and connects to HDFS.
> leave blank line between function declaration and the comment for the next 
Done


Line 142:   /// TODO: Move calls to functions that can fail in Close() to FlushFinal()
> is this todo still appropriate?
It's IMPALA-2988


http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/nested-loop-join-builder.h
File be/src/exec/nested-loop-join-builder.h:

Line 25: /// Builder for the NestedLoopJoinNode operator.
> explain builder (or use different term; you're basically just repeating the
Done.


Line 32: class NestedLoopJoinBuilder : public DataSink {
> fine to abbreviate to Nlj
Done


Line 49:   /// Returns the next batch from 'build_batch_cache_'
> don't reference internal state
Done.


Line 50:   inline RowBatch* GetNextBatchForBuild() { return build_batch_cache_.GetNextBatch(); }
> this is going beyond DataSink functionality, and should be described in the
I extended the method comments to explain the context.


http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/nested-loop-join-node.h
File be/src/exec/nested-loop-join-node.h:

Line 35: /// null-aware left anti-join.
> missing reference to builder class
Done.

I'd personally lean towards not cross-referencing classes like this. Seems likely to get out of sync with the code over time. It doesn't bother me since I don't tend to read these comments.


http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/runtime/plan-fragment-executor.h
File be/src/runtime/plan-fragment-executor.h:

Line 196:   boost::scoped_ptr<RowBatch> row_batch_;
> separate sink-related items visually w/ blank line
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 16: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3282/16//COMMIT_MSG
Commit Message:

Line 18: Partially refactor join nodes that so control flow between
so that the


Line 21: subclasses: ConstructBuildSide(). Once we convert all join nodes
ConstructJoinBuild()
(thanks for renaming!)


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 16
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has uploaded a new patch set (#11).

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................

IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Refactor DataSink interface to be more generic. We need more flexibility
in setting up MemTrackers, so that memory is accounted against the right
ExecNode. The NestedLoopJoinNode also differs from other joins in that
it recycles row batches to avoid copies, so we need to add a
NextBatchToSend() method to the DataSink that returns a batch. Also
removes some redundancy between DataSink subclasses in setting up
RuntimeProfiles, etc.

Remove the redundancy in the DataSink between passing eos to GetNext()
and FlushFinal(). This simplifies HdfsTableSink quite a bit and makes
handling empty batches simpler.

Partially refactor join nodes that so control flow between
BlockingJoinNode::Open() and its subclasses is easier to follow.
BlockingJoinNode now only calls one virtual function on its
subclasses: ConstructBuildSide(). Once we convert all join nodes
to use the DataSink interface, we will also be able to remove that
as well.

As a minor optimisation, avoid updating a timer that is ignored for
non-async builds.

As a proof of concept, this patch separates out the build side of
NestedLoopJoinNode into a class that implements the DataSink
interface. Refactoring the hash join is left for Part 2.

Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/kudu-table-sink-test.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
A be/src/exec/nested-loop-join-builder.cc
A be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/row-batch-cache.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/runtime/row-batch.cc
M be/src/util/stopwatch.h
29 files changed, 733 insertions(+), 442 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/82/3282/11
-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 11
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 19: Code-Review+1

Works for me. Still lgtm.

-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 19
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3282/5/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

Line 244:   RETURN_IF_ERROR(Expr::Open(build_expr_ctxs_, state));
> Lines 244-245 are redundant since we already do that below, I'll remove in 
Done


http://gerrit.cloudera.org:8080/#/c/3282/5/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

Line 496:     return status;
> This drops 'status' in the error case. Need to return 'status'.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 6
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 7:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

Line 115:   template <bool ASYNC_BUILD>
why does this need to be a template?


http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/data-sink.h
File be/src/exec/data-sink.h:

Line 40: /// Superclass of all data sinks.
i think it's time to provide this with a real class comment (what is the abstraction behind it, where is it used, etc.)


Line 47:   virtual std::string Name() = 0;
GetName(), otherwise it sounds like a getter


Line 51:   virtual Status Prepare(RuntimeState* state, MemTracker* mem_tracker);
explain param; also, why does it get created outside this class? possibly explain in class comment


Line 59:   virtual Status NextBatchToSend(RowBatch** batch);
i don't understand the desired control flow, please explain in class comment


http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/hdfs-table-sink.h
File be/src/exec/hdfs-table-sink.h:

Line 131:   /// Prepares output_exprs and partition_key_exprs, and connects to HDFS.
leave blank line between function declaration and the comment for the next function


Line 142:   /// TODO: Move calls to functions that can fail in Close() to FlushFinal()
is this todo still appropriate?


http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/nested-loop-join-builder.h
File be/src/exec/nested-loop-join-builder.h:

Line 25: /// Builder for the NestedLoopJoinNode operator.
explain builder (or use different term; you're basically just repeating the class name).

i should be able to get a rough understanding of what this class does and where it's being used from the class comment.


Line 32: class NestedLoopJoinBuilder : public DataSink {
fine to abbreviate to Nlj


Line 49:   /// Returns the next batch from 'build_batch_cache_'
don't reference internal state


Line 50:   inline RowBatch* GetNextBatchForBuild() { return build_batch_cache_.GetNextBatch(); }
this is going beyond DataSink functionality, and should be described in the class comment.


http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/exec/nested-loop-join-node.h
File be/src/exec/nested-loop-join-node.h:

Line 35: /// null-aware left anti-join.
missing reference to builder class


http://gerrit.cloudera.org:8080/#/c/3282/7/be/src/runtime/plan-fragment-executor.h
File be/src/runtime/plan-fragment-executor.h:

Line 196:   boost::scoped_ptr<RowBatch> row_batch_;
separate sink-related items visually w/ blank line


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 16:

I ended up simplifying the interface by removing DataAsink::GetNextRowBatch() and removing the JoinBuildSink interface entirely. TPC-H benchmark results were neutral, so it doesn't look like performance is negatively affected (I think skipping the timers probably compensates for the slight inefficiency of RowBatch::AcquireState())

I put the latest patchset through a battery of tests described in the commit message.

-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 16
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 11:

(37 comments)

Addressed most of the comments. There is still an unresolved question around whether the DataSink interface should allocate row batches. 

I spoke to Marcel about it and we didn't reach a conclusion about which solution would be better, so I've left that unchanged in this patch but cleaned up the other issues.

http://gerrit.cloudera.org:8080/#/c/3282/8/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

Line 23: #include "exec/data-sink.h"
> forward declare DataSink
This became necessary in a later iteration of the patch, since we declare a subclass of DataSink in the header.


Line 98:   /// Prepare() work, e.g. codegen.
> Nit, but I don't think we construct the "build side". The build side is an 
Done. Renamed ConstructBuildSide to ConstructJoinBuild


Line 104:   /// Subclasses should close any other structures and then call
> ... the Open() and build ...
Not quite sure what you were suggesting, tried to make the wording a bit better.


Line 105:   /// BlockingJoinNode::Close().
> permit (plural)
Done


Line 106:   virtual void Close(RuntimeState* state);
> Suggest: If the build_sink is non-NULL, uses it to construct the join build
Done


PS8, Line 109: 
Changed to JoinBuildSink to be more accurate.


Line 112:   TJoinOp::type join_op_;
> ... to construct the join build via a data sink.
Done


Line 118: 
> left child (probe side).
Done


Line 123:   TupleRow* current_probe_row_;  // The row currently being probed
> typo: extra '.'
Done


Line 196: 
> nit: I don't think we use the DataSink interface to construct the join buil
Done


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

Line 42: /// BlockingJoinNode. Reset() is added to support subplans and more efficient allocation
> remove "is added"
Done


Line 43: /// of RowBatch objects is enabled with the GetNextBatchForBuild() interface.
> remove "is enabled"
Reworded to avoid it.


Line 51:   /// Frees resources associated with cache row batches.
> cached row batches
Done


Line 52:   /// Subclasses must call JoinBuildSink::Close().
> is the order of Close() calls important? If so, mention which Close() shoul
Not really, but documented it for consistency.


Line 55:   /// Returns the next build batch that should be filled and passed to AddBuildBatch()
> Returns an empty build batch that should be filled and passed to Send().
Done


Line 56:   /// or Send(). For join types that accumulates batches, this helps avoid additional
> accumulate
Done


Line 60:   inline RowBatch* GetNextBatchForBuild() { return build_batch_cache_.GetNextBatch(); }
> GetEmptyBatch() or GetNextEmptyBatch()? The important part seems to be that
Renamed to GetNextEmptyBatch().

Still looking at the alternative with the DataSink interface and going back-and-forth. I think adding it to the DataSink interface results in less code, but prevents some usage patterns of the DataSink interface that may make sense in the future. E.g. if the caller of the DataSink methods wants to have multiple batches in flight - maybe it's pulling data from multiple queues?


Line 70:   const bool using_builder_interface_;
> Isn't the accumulates_batches_ flag sufficient? The comment on accumulates_
No, they're different. The DataSink interface doesn't have a GetNextBatchForBuild() method, so we can't assume that the DataSink client allocated a batch in that way - it can pass in an arbitrary row batch. This flag implies that we can assume that the caller allocated the batch from the cache.

E.g. for PartitionedHashJoinNode, using_builder_interface_ will be true, but accumulates_batches_ will be false, since the contents of the batches are always copied.


PS11, Line 70: using_builder_interface_
I renamed this to using_join_build_interface for consistency.


Line 73:   /// If true, the caller should call NextBuildBatch() to get an empty batch before every
Updated this comment to reflect that only BlockingJoinNode can call GetNextEmptyBatch().


Line 76:   const bool accumulates_batches_;
> Couldn't this be done by making GetNextBatchForBuild() virtual? My thinking
I did that in an earlier version of the patch. The concern with that approach was that a subplan-specific optimisation was complicating the interface for all data sinks, most of which can never appear in a subplan.

The virtue of the current approach is that the complication for this optimisation is isolated to blocking-join-node.{cc,h}


Line 150:   virtual Status ConstructBuildSide(RuntimeState* state) = 0;
> Terminology nit that is not your fault: To me "Build Side" means "the exec 
Did this.


Line 154:   /// If build_sink is non-NULL, the sink interface will be used to construct the build
> If build_sink is non-NULL, uses it to construct the build side.
Done


http://gerrit.cloudera.org:8080/#/c/3282/8/be/src/exec/data-sink.cc
File be/src/exec/data-sink.cc:

Line 166
> single line
This was removed in a later version of the patch.


http://gerrit.cloudera.org:8080/#/c/3282/8/be/src/exec/data-sink.h
File be/src/exec/data-sink.h:

Line 71:   virtual Status FlushFinal(RuntimeState* state) = 0;
> The important piece seems to be that the returned batch is empty and is exp
The method was removed in the later patchset. Will do this if we decide to revert to this approach.


Line 104:   /// Owned by the RuntimeState.
> const methods
Done


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/hash-join-node.cc
File be/src/exec/hash-join-node.cc:

Line 207:   RETURN_IF_ERROR(BlockingJoinNode::GetFirstProbeRow(state));
> Seems a little weird that we prepare for probe after getting the first prob
ResetForProbe() looks up the first probe row in the hash table for GetNext(), so it needs to be called after GetFirstProbeRow().

This keeps the flow more analogous to PHJ (although ResetForProbe() is called there once per batch).


http://gerrit.cloudera.org:8080/#/c/3282/8/be/src/exec/hash-join-node.h
File be/src/exec/hash-join-node.h:

Line 70:   /// holds everything referenced from build side
> Can you spell out 'everything' a little more?
I don't actually know. This was a mechanical refactor to move this from BlockingJoinNode since it's not used by any other subclass. Hoping to delete this module soon.


Line 120:   void ResetForProbe();
> PrepareForProbe()?
Renamed to InitGetNext().


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/hash-join-node.h
File be/src/exec/hash-join-node.h:

Line 70:   /// holds everything referenced from build side
> can you spell out 'everything' a little more? is it memory referenced from 
Replied on the other patchset, but this was just a mechanical refactor to move it from BlockingJoinNode. Doesn't seem worth it to clean this up.


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/hdfs-table-writer.cc
File be/src/exec/hdfs-table-writer.cc:

Line 20: #include "runtime/row-batch.h"
> remove?
Done. Maybe was an editing error.


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/nested-loop-join-builder.h
File be/src/exec/nested-loop-join-builder.h:

Line 24: /// Builder for the NestedLoopJoinNode operator that accumulates the build-side rows
> remove "operator"
Done


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/nested-loop-join-node.cc
File be/src/exec/nested-loop-join-node.cc:

Line 59:   SCOPED_TIMER(runtime_profile_->total_time_counter());
> won't this double count the time spent in BlockingJoinNode::Open()? I think
The general pattern (E.g. ExecNode::Open(), BlockingJoinNode::Open()) is that the abstract base classes don't use timers, so I think this is right as-is.


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/nested-loop-join-node.h
File be/src/exec/nested-loop-join-node.h:

Line 59:   boost::scoped_ptr<NljBuilder> build_side_;
> build_sink_?
Renamed to builder_ to match the class name.


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

Line 242:   SCOPED_TIMER(runtime_profile_->total_time_counter());
> I think the counter should start after BlockingJoinNode::Open()
Why is that? It seems strange not to include the time spent in the parent class.


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/row-batch-cache.h
File be/src/exec/row-batch-cache.h:

Line 61:     row_batches_.clear(); // unique_ptr automatically calls all destructors.
> Add a DCHECK in the RowBatchCache d'tor to make sure row_batches_ is empty?
Done


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/util/stopwatch.h
File be/src/util/stopwatch.h:

Line 241:   ScopedStopWatch(T* sw, bool enabled = true) :
> The enabled flag seems awkward because it's not clear why anyone would use 
If you used ScopedStopWatch directly it would still be useful in any situation where you want to have a scoped timer but also make a decision about whether to enable a timer at runtime (or at compile time with a template argument). Otherwise you have to mess around ifdefs or very carefully turn the timer on and off at the right places.

You could mess about with template types and have a DummyScopedStopWatch type, but that seems clunkier and less general (since you can't make a runtime decision to enable the timer then).


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 11
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3282/16/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

Line 101:   /// Constructs the join build from the right child's output. Note that this can be
> Something like: ProcessBuildInput() or ConsumeBuildInput()?
i think either one of the last 3 works.


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 16
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/util/stopwatch.h
File be/src/util/stopwatch.h:

Line 241:   ScopedStopWatch(T* sw, bool enabled = true) :
> If you used ScopedStopWatch directly it would still be useful in any situat
I was thinking along the lines of a more straightforward solution to the specific problem this is trying to solve in blocking-join-node.cc. The following seems simpler and less invasive:

if (ASYNC_BUILD) built_probe_overlap_stop_watch_.Start();
// Do the work here
if (ASYNC_BUILD) built_probe_overlap_stop_watch_.Stop();

It's not clear we're going to need this conditional timer in many other places, so I'd prefer the straightforward solution.


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 11
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3282/5/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

Line 244:   RETURN_IF_CANCELLED(state);
Lines 244-245 are redundant since we already do that below, I'll remove in next iteration.


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has abandoned this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Abandoned

Moved to https://gerrit.cloudera.org/#/c/3842/1

-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 19
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 15:

Patchset 15 is a draft, need to look it over myself before publishing. For some reason gerrit seems to send out email notifications for drafts in some cases.

-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 15
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Marcel Kornacker (Code Review)" <ge...@cloudera.org>.
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 5:

I'll have time for this review when I get back on 6/20.

-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has uploaded a new patch set (#15).

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................

IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Refactor DataSink interface to be more generic. We need more flexibility
in setting up MemTrackers, so that memory is accounted against the right
ExecNode. Also removes some redundancy between DataSink subclasses in
setting up RuntimeProfiles, etc.

Remove the redundancy in the DataSink between passing eos to GetNext()
and FlushFinal(). This simplifies HdfsTableSink quite a bit and makes
handling empty batches simpler.

Partially refactor join nodes that so control flow between
BlockingJoinNode::Open() and its subclasses is easier to follow.
BlockingJoinNode now only calls one virtual function on its
subclasses: ConstructBuildSide(). Once we convert all join nodes
to use the DataSink interface, we will also be able to remove that
as well.

As a minor optimisation, avoid updating a timer that is ignored for
non-async builds.

As a proof of concept, this patch separates out the build side of
NestedLoopJoinNode into a class that implements the DataSink
interface. The main challenge here is that NestedLoopJoinNode
recycles row batches to avoid reallocations and copies of row
batches in subplans. The solution to this is:
- Retain the special-case optimisation for SingularRowSrc
- Use the row batch cache and std::swap() to efficiently recycle
  and copy the state of row batches passed to Send()

Refactoring the partitioned hash join is left for Part 2.

Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-table-sink-test.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
A be/src/exec/nested-loop-join-builder.cc
A be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/row-batch-cache.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/util/stopwatch.h
29 files changed, 701 insertions(+), 459 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/82/3282/15
-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 15
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has uploaded a new patch set (#10).

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................

IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Refactor DataSink interface to be more generic. We need more flexibility
in setting up MemTrackers, so that memory is accounted against the right
ExecNode. The NestedLoopJoinNode also differs from other joins in that
it recycles row batches to avoid copies, so we need to add a
NextBatchToSend() method to the DataSink that returns a batch. Also
removes some redundancy between DataSink subclasses in setting up
RuntimeProfiles, etc.

Remove the redundancy in the DataSink between passing eos to GetNext()
and FlushFinal(). This simplifies HdfsTableSink quite a bit and makes
handling empty batches simpler.

Partially refactor join nodes that so control flow between
BlockingJoinNode::Open() and its subclasses is easier to follow.
BlockingJoinNode now only calls one virtual function on its
subclasses: ConstructBuildSide(). Once we convert all join nodes
to use the DataSink interface, we will also be able to remove that
as well.

As a minor optimisation, avoid updating a timer that is ignored for
non-async builds.

As a proof of concept, this patch separates out the build side of
NestedLoopJoinNode into a class that implements the DataSink
interface. Refactoring the hash join is left for Part 2.

Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/kudu-table-sink-test.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
A be/src/exec/nested-loop-join-builder.cc
A be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/row-batch-cache.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/runtime/row-batch.cc
M be/src/util/stopwatch.h
29 files changed, 731 insertions(+), 442 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/82/3282/10
-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 10
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 5:

This passed an exhaustive build aside from one test. I fixed that and restarted the exhaustive build, but that will take a while so I thought I would just post the patch.

-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 11:

Fixed a couple of minor stylistic issues with the previous patchset.

-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 11
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: No

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has uploaded a new patch set (#6).

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................

IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Refactor DataSink interface to be more generic. We need more flexibility
in setting up MemTrackers, so that memory is accounted against the right
ExecNode. The NestedLoopJoinNode also differs from other joins in that
it recycles row batches to avoid copies, so we need to add a
NextBatchToSend() method to the DataSink that returns a batch. Also
removes some redundancy between DataSink subclasses in setting up
RuntimeProfiles, etc.

Remove the redundancy in the DataSink between passing eos to GetNext()
and FlushFinal(). This simplifies HdfsTableSink quite a bit and makes
handling empty batches simpler.

Partially refactor join nodes that so control flow between
BlockingJoinNode::Open() and its subclasses is easier to follow.
BlockingJoinNode now only calls one virtual function on its
subclasses: ConstructBuildSide(). Once we convert all join nodes
to use the DataSink interface, we will also be able to remove that
as well.

As a minor optimisation, avoid updating a timer that is ignored for
non-async builds.

As a proof of concept, this patch separates out the build side of
NestedLoopJoinNode into a class that implements the DataSink
interface. Refactoring the hash join is left for Part 2.

Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/kudu-table-sink-test.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
A be/src/exec/nested-loop-join-builder.cc
A be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/util/stopwatch.h
27 files changed, 654 insertions(+), 428 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/82/3282/6
-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 6
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................


Patch Set 11:

(35 comments)

http://gerrit.cloudera.org:8080/#/c/3282/8/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

Line 23: #include "exec/data-sink.h"
forward declare DataSink


Line 98:   /// Prepare() work, e.g. codegen.
Nit, but I don't think we construct the "build side". The build side is an exec node child. We consume the build side and construct a join build.

Imo, the original ConstructBuildSide() seems to be somewhat of a misnomer also. What do you think?

Suggest:
Constructs the join build from the right child's output...


Line 104:   /// Subclasses should close any other structures and then call
... the Open() and build ...


Line 105:   /// BlockingJoinNode::Close().
permit (plural)


Line 106:   virtual void Close(RuntimeState* state);
Suggest: If the build_sink is non-NULL, uses it to construct the join build. Otherwise, ...


Line 112:   TJoinOp::type join_op_;
... to construct the join build via a data sink.

(the "build side" does not implement the DataSink interface)


Line 118: 
left child (probe side).


Line 123:   TupleRow* current_probe_row_;  // The row currently being probed
typo: extra '.'


Line 196: 
nit: I don't think we use the DataSink interface to construct the join build. We use the data sink.

Suggest: Constructs the join build via 'build_sink", if it is non-NULL. Otherwise, ...


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

Line 42: /// BlockingJoinNode. Reset() is added to support subplans and more efficient allocation
remove "is added"


Line 43: /// of RowBatch objects is enabled with the GetNextBatchForBuild() interface.
remove "is enabled"


Line 51:   /// Frees resources associated with cache row batches.
cached row batches


Line 52:   /// Subclasses must call JoinBuildSink::Close().
is the order of Close() calls important? If so, mention which Close() should be called first.


Line 55:   /// Returns the next build batch that should be filled and passed to AddBuildBatch()
Returns an empty build batch that should be filled and passed to Send().

(I don't see an AddBuildBatch() anywhere)


Line 56:   /// or Send(). For join types that accumulates batches, this helps avoid additional
accumulate


Line 60:   inline RowBatch* GetNextBatchForBuild() { return build_batch_cache_.GetNextBatch(); }
GetEmptyBatch() or GetNextEmptyBatch()? The important part seems to be that the batch is empty.

Probably something to be discussed f2f:
I'm thinking that it actually makes sense to make this part of the DataSink interface. The join build can implement this via a cache for the join types that need it.


Line 70:   const bool using_builder_interface_;
Isn't the accumulates_batches_ flag sufficient? The comment on accumulates_batches_ even states that GetNextBatchForBuild() should either be called just once, or once for every Send().


Line 76:   const bool accumulates_batches_;
Couldn't this be done by making GetNextBatchForBuild() virtual? My thinking is that we could hide those details completely in the sink itself. We could also move the row batch cache to only the relevant build sinks.


Line 150:   virtual Status ConstructBuildSide(RuntimeState* state) = 0;
Terminology nit that is not your fault: To me "Build Side" means "the exec node child used for constructing the join build". So imo this function should be called ConstructJoinBuild(), and similar changes elsewhere.
What do you think?


Line 154:   /// If build_sink is non-NULL, the sink interface will be used to construct the build
If build_sink is non-NULL, uses it to construct the build side.

(the sink "interface" can't really construct anything, we use the build_sink instance)


http://gerrit.cloudera.org:8080/#/c/3282/8/be/src/exec/data-sink.cc
File be/src/exec/data-sink.cc:

Line 166
single line


http://gerrit.cloudera.org:8080/#/c/3282/8/be/src/exec/data-sink.h
File be/src/exec/data-sink.h:

Line 44: /// Clients of the DataSink interface drive the data sink by repeatedly calling Send()
These changes make a lot of sense to me


Line 71:   virtual Status FlushFinal(RuntimeState* state) = 0;
The important piece seems to be that the returned batch is empty and is expected to be populated, so how about:

CreateEmptyBatch() or GetEmptyBatch() or NextEmptyBatch()

Seems to make more sense when explaining the flow:
b = sink.CreateEmptyBatch();
plan.GetNext(b);
sink.Send(b);


Line 104:   /// Owned by the RuntimeState.
const methods


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/hash-join-node.cc
File be/src/exec/hash-join-node.cc:

Line 207:   RETURN_IF_ERROR(BlockingJoinNode::GetFirstProbeRow(state));
Seems a little weird that we prepare for probe after getting the first probe row.


http://gerrit.cloudera.org:8080/#/c/3282/8/be/src/exec/hash-join-node.h
File be/src/exec/hash-join-node.h:

Line 70:   /// holds everything referenced from build side
Can you spell out 'everything' a little more?


Line 120:   void ResetForProbe();
PrepareForProbe()?


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/hash-join-node.h
File be/src/exec/hash-join-node.h:

Line 70:   /// holds everything referenced from build side
can you spell out 'everything' a little more? is it memory referenced from build batches?


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/hdfs-table-writer.cc
File be/src/exec/hdfs-table-writer.cc:

Line 20: #include "runtime/row-batch.h"
remove?


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/nested-loop-join-builder.h
File be/src/exec/nested-loop-join-builder.h:

Line 24: /// Builder for the NestedLoopJoinNode operator that accumulates the build-side rows
remove "operator"


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/nested-loop-join-node.cc
File be/src/exec/nested-loop-join-node.cc:

Line 59:   SCOPED_TIMER(runtime_profile_->total_time_counter());
won't this double count the time spent in BlockingJoinNode::Open()? I think should be moved down a line


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/nested-loop-join-node.h
File be/src/exec/nested-loop-join-node.h:

Line 59:   boost::scoped_ptr<NljBuilder> build_side_;
build_sink_?


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

Line 242:   SCOPED_TIMER(runtime_profile_->total_time_counter());
I think the counter should start after BlockingJoinNode::Open()


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/exec/row-batch-cache.h
File be/src/exec/row-batch-cache.h:

Line 61:     row_batches_.clear(); // unique_ptr automatically calls all destructors.
Add a DCHECK in the RowBatchCache d'tor to make sure row_batches_ is empty?


http://gerrit.cloudera.org:8080/#/c/3282/11/be/src/util/stopwatch.h
File be/src/util/stopwatch.h:

Line 241:   ScopedStopWatch(T* sw, bool enabled = true) :
The enabled flag seems awkward because it's not clear why anyone would use it (outside of a macro). Any alternative solutions?


-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 11
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/3282

to look at the new patch set (#17).

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................

IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Refactor DataSink interface to be more generic. We need more flexibility
in setting up MemTrackers, so that memory is accounted against the right
ExecNode. Also removes some redundancy between DataSink subclasses in
setting up RuntimeProfiles, etc.

Remove the redundancy in the DataSink between passing eos to GetNext()
and FlushFinal(). This simplifies HdfsTableSink quite a bit and makes
handling empty batches simpler.

Partially refactor join nodes so that the control flow between
BlockingJoinNode::Open() and its subclasses is easier to follow.
BlockingJoinNode now only calls one virtual function on its
subclasses: ConstructJoinBuild(). Once we convert all join nodes
to use the DataSink interface, we will also be able to remove that
as well.

As a minor optimisation, avoid updating a timer that is ignored for
non-async builds.

As a proof of concept, this patch separates out the build side of
NestedLoopJoinNode into a class that implements the DataSink
interface. The main challenge here is that NestedLoopJoinNode
recycles row batches to avoid reallocations and copies of row
batches in subplans. The solution to this is:
- Retain the special-case optimisation for SingularRowSrc
- Use the row batch cache and RowBatch::AcquireState() to copy
  the state of row batches passed to Send(), while recycling the
  RowBatch objects.

Refactoring the partitioned hash join is left for Part 2.

Testing:
Ran exhaustive, core ASAN, and exhaustive non-partioned agg/join builds.
Also ran a local stress test.

Performance:
Ran TPC-H nested locally. The results show that the change is perf-neutral.

+------------------+-----------------------+---------+------------+------------+----------------+
| Workload         | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+------------------+-----------------------+---------+------------+------------+----------------+
| TPCH_NESTED(_20) | parquet / none / none | 7.75    | +0.19%     | 5.64       | -0.34%         |
+------------------+-----------------------+---------+------------+------------+----------------+

+------------------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
| Workload         | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters |
+------------------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
| TPCH_NESTED(_20) | TPCH-Q17 | parquet / none / none | 18.96  | 17.95       |   +5.61%   |   4.97%   |   0.71%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q14 | parquet / none / none | 3.61   | 3.56        |   +1.25%   |   0.97%   |   1.19%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q8  | parquet / none / none | 6.25   | 6.23        |   +0.44%   |   0.44%   |   0.90%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q10 | parquet / none / none | 5.84   | 5.83        |   +0.30%   |   1.21%   |   1.82%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q5  | parquet / none / none | 4.91   | 4.90        |   +0.11%   |   0.18%   |   0.78%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q21 | parquet / none / none | 17.82  | 17.81       |   +0.07%   |   0.66%   |   0.58%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q4  | parquet / none / none | 5.12   | 5.12        |   -0.02%   |   0.97%   |   1.23%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q9  | parquet / none / none | 23.85  | 23.88       |   -0.15%   |   0.72%   |   0.22%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q12 | parquet / none / none | 6.15   | 6.16        |   -0.16%   |   1.60%   |   1.72%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q3  | parquet / none / none | 5.46   | 5.47        |   -0.23%   |   1.28%   |   0.90%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q16 | parquet / none / none | 3.61   | 3.62        |   -0.26%   |   1.00%   |   1.36%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q19 | parquet / none / none | 20.19  | 20.31       |   -0.58%   |   1.67%   |   0.65%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q7  | parquet / none / none | 9.42   | 9.48        |   -0.68%   |   0.87%   |   0.71%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q18 | parquet / none / none | 12.94  | 13.06       |   -0.90%   |   0.59%   |   0.48%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q22 | parquet / none / none | 1.09   | 1.10        |   -0.92%   |   2.26%   |   2.22%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q13 | parquet / none / none | 3.75   | 3.78        |   -0.94%   |   2.04%   |   2.86%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q20 | parquet / none / none | 4.33   | 4.37        |   -1.10%   |   3.00%   |   2.43%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q2  | parquet / none / none | 2.39   | 2.42        |   -1.38%   |   1.54%   |   1.30%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q11 | parquet / none / none | 1.43   | 1.46        |   -1.78%   |   2.05%   |   2.77%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q6  | parquet / none / none | 2.29   | 2.33        |   -1.79%   |   0.56%   |   1.23%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q15 | parquet / none / none | 5.04   | 5.13        |   -1.84%   |   0.61%   |   2.01%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q1  | parquet / none / none | 5.98   | 6.12        |   -2.30%   |   1.84%   |   3.19%        | 1           | 10    |
+------------------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+

Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-table-sink-test.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
A be/src/exec/nested-loop-join-builder.cc
A be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/row-batch-cache.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/runtime/row-batch.cc
M be/src/util/stopwatch.h
28 files changed, 704 insertions(+), 458 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/82/3282/17
-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 17
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-CR](cdh5-trunk) IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/3282

to look at the new patch set (#19).

Change subject: IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks
......................................................................

IMPALA-3567: Part 1: groundwork to make Join build sides DataSinks

Refactor DataSink interface to be more generic. We need more flexibility
in setting up MemTrackers, so that memory is accounted against the right
ExecNode. Also removes some redundancy between DataSink subclasses in
setting up RuntimeProfiles, etc.

Remove the redundancy in the DataSink between passing eos to GetNext()
and FlushFinal(). This simplifies HdfsTableSink quite a bit and makes
handling empty batches simpler.

Partially refactor join nodes so that the control flow between
BlockingJoinNode::Open() and its subclasses is easier to follow.
BlockingJoinNode now only calls one virtual function on its
subclasses: ConstructJoinBuild(). Once we convert all join nodes
to use the DataSink interface, we will also be able to remove that
as well.

As a minor optimisation, avoid updating a timer that is ignored for
non-async builds.

As a proof of concept, this patch separates out the build side of
NestedLoopJoinNode into a class that implements the DataSink
interface. The main challenge here is that NestedLoopJoinNode
recycles row batches to avoid reallocations and copies of row
batches in subplans. The solution to this is:
- Retain the special-case optimisation for SingularRowSrc
- Use the row batch cache and RowBatch::AcquireState() to copy
  the state of row batches passed to Send(), while recycling the
  RowBatch objects.

Refactoring the partitioned hash join is left for Part 2.

Testing:
Ran exhaustive, core ASAN, and exhaustive non-partioned agg/join builds.
Also ran a local stress test.

Performance:
Ran TPC-H nested locally. The results show that the change is perf-neutral.

+------------------+-----------------------+---------+------------+------------+----------------+
| Workload         | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+------------------+-----------------------+---------+------------+------------+----------------+
| TPCH_NESTED(_20) | parquet / none / none | 7.75    | +0.19%     | 5.64       | -0.34%         |
+------------------+-----------------------+---------+------------+------------+----------------+

+------------------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
| Workload         | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters |
+------------------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
| TPCH_NESTED(_20) | TPCH-Q17 | parquet / none / none | 18.96  | 17.95       |   +5.61%   |   4.97%   |   0.71%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q14 | parquet / none / none | 3.61   | 3.56        |   +1.25%   |   0.97%   |   1.19%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q8  | parquet / none / none | 6.25   | 6.23        |   +0.44%   |   0.44%   |   0.90%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q10 | parquet / none / none | 5.84   | 5.83        |   +0.30%   |   1.21%   |   1.82%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q5  | parquet / none / none | 4.91   | 4.90        |   +0.11%   |   0.18%   |   0.78%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q21 | parquet / none / none | 17.82  | 17.81       |   +0.07%   |   0.66%   |   0.58%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q4  | parquet / none / none | 5.12   | 5.12        |   -0.02%   |   0.97%   |   1.23%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q9  | parquet / none / none | 23.85  | 23.88       |   -0.15%   |   0.72%   |   0.22%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q12 | parquet / none / none | 6.15   | 6.16        |   -0.16%   |   1.60%   |   1.72%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q3  | parquet / none / none | 5.46   | 5.47        |   -0.23%   |   1.28%   |   0.90%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q16 | parquet / none / none | 3.61   | 3.62        |   -0.26%   |   1.00%   |   1.36%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q19 | parquet / none / none | 20.19  | 20.31       |   -0.58%   |   1.67%   |   0.65%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q7  | parquet / none / none | 9.42   | 9.48        |   -0.68%   |   0.87%   |   0.71%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q18 | parquet / none / none | 12.94  | 13.06       |   -0.90%   |   0.59%   |   0.48%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q22 | parquet / none / none | 1.09   | 1.10        |   -0.92%   |   2.26%   |   2.22%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q13 | parquet / none / none | 3.75   | 3.78        |   -0.94%   |   2.04%   |   2.86%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q20 | parquet / none / none | 4.33   | 4.37        |   -1.10%   |   3.00%   |   2.43%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q2  | parquet / none / none | 2.39   | 2.42        |   -1.38%   |   1.54%   |   1.30%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q11 | parquet / none / none | 1.43   | 1.46        |   -1.78%   |   2.05%   |   2.77%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q6  | parquet / none / none | 2.29   | 2.33        |   -1.79%   |   0.56%   |   1.23%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q15 | parquet / none / none | 5.04   | 5.13        |   -1.84%   |   0.61%   |   2.01%        | 1           | 10    |
| TPCH_NESTED(_20) | TPCH-Q1  | parquet / none / none | 5.98   | 6.12        |   -2.30%   |   1.84%   |   3.19%        | 1           | 10    |
+------------------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+

Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-table-sink-test.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
A be/src/exec/nested-loop-join-builder.cc
A be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/row-batch-cache.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/runtime/row-batch.cc
M be/src/util/stopwatch.h
28 files changed, 715 insertions(+), 460 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/82/3282/19
-- 
To view, visit http://gerrit.cloudera.org:8080/3282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9d7608181eeacfe706a09c1e153d0a3e1ee9b475
Gerrit-PatchSet: 19
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>