You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Henry Robinson (Code Review)" <ge...@cloudera.org> on 2017/06/16 18:50:26 UTC

[Impala-ASF-CR] IMPALA-5481: Clarify RowDescriptor ownership

Henry Robinson has uploaded a new patch set (#2).

Change subject: IMPALA-5481: Clarify RowDescriptor ownership
......................................................................

IMPALA-5481: Clarify RowDescriptor ownership

RowDescriptors are originally allocated in-line with the exec node that
uses them. Their lifetime is therefore guaranteed to be as long as the
parent fragment instance. However, we copy the RowDescriptors into
RowBatches, which adds allocation pressure (RowDescriptors contain
vectors), and is unnecessary as a) the descriptor is constant and b)
RowBatches get destroyed before exec nodes.

This patch standardises ownership of RowDescriptor objects, by changing
members that were copies or const references to RowDescriptors to be
const RowDescriptor*. Method arguments are either const* to convey that
ownership is to be shared, or const& to convey that the descriptor is to
be used but not mutated by the callee.

The tradeoff of fewer allocations appears to outweigh any loss of cache
locality due to sharing the RowDescriptor. On a 16-node cluster that
previously spend ~20% of its tcmalloc time allocating RowDescriptors,
this patch reduced that time to 0%.

Change-Id: I2fc39170f775581d406b6a97445046f508d8d75f
---
M be/src/exec/aggregation-node.cc
M be/src/exec/analytic-eval-node.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exchange-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/exec/row-batch-cache.h
M be/src/exec/row-batch-list-test.cc
M be/src/exec/sort-node.cc
M be/src/exec/topn-node.cc
M be/src/exec/union-node.cc
M be/src/exec/unnest-node.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream-v2-test.cc
M be/src/runtime/buffered-tuple-stream-v2.cc
M be/src/runtime/buffered-tuple-stream-v2.h
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-recvr.cc
M be/src/runtime/data-stream-recvr.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/row-batch-serialize-test.cc
M be/src/runtime/row-batch-test.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/sorted-run-merger.cc
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
M be/src/util/debug-util.cc
55 files changed, 309 insertions(+), 318 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/7206/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7206
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2fc39170f775581d406b6a97445046f508d8d75f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>