Posted to dev@impala.apache.org by "Michael Ho (Code Review)" <ge...@cloudera.org> on 2016/05/05 07:34:35 UTC

[Impala-CR](cdh5-trunk) IMPALA-3286: Prefetching for PHJ probing.

Michael Ho has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/2959

Change subject: IMPALA-3286: Prefetching for PHJ probing.
......................................................................

IMPALA-3286: Prefetching for PHJ probing.

This change pipelines the code that probes the hash tables
for matches, based on the idea Mostafa presented earlier.
Similar to prefetching for the build side, it modifies the
hash table probing code to prefetch all the data the probe
will reference before the probing actually happens. The code
is structured as follows:

1. Go over all the probe rows, evaluate the probe expressions
   against them, and hash the resulting values.
2. Go over all the hash values and prefetch the corresponding
   hash table buckets.
3. Go over all the referenced hash table buckets and prefetch
   the tuples they point to.
4. Do the actual probing.
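
The four steps above can be sketched as a batched probe loop. This is
only a minimal illustration, not the code in this change: Tuple, Bucket,
SimpleHashTable and ProbeBatch are hypothetical names, the table uses
simple linear probing rather than Impala's actual hash table layout, and
__builtin_prefetch is assumed to be available (GCC or Clang).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified types for illustration; not Impala's classes.
struct Tuple {
  int64_t key;
  int64_t payload;
};

struct Bucket {
  const Tuple* tuple = nullptr;
  bool filled = false;
};

// Open-addressing table with linear probing. It must never become full,
// otherwise Insert() and the probe loop below would not terminate.
struct SimpleHashTable {
  std::vector<Bucket> buckets;

  explicit SimpleHashTable(size_t num_buckets) : buckets(num_buckets) {
    // The slot computation relies on a power-of-two bucket count.
    assert(num_buckets > 0 && (num_buckets & (num_buckets - 1)) == 0);
  }

  static uint64_t Hash(int64_t key) {
    uint64_t h = static_cast<uint64_t>(key) * 0x9E3779B97F4A7C15ULL;
    return h ^ (h >> 32);
  }

  size_t Slot(uint64_t hash) const { return hash & (buckets.size() - 1); }

  void Insert(const Tuple* t) {
    size_t i = Slot(Hash(t->key));
    while (buckets[i].filled) i = (i + 1) & (buckets.size() - 1);
    buckets[i].tuple = t;
    buckets[i].filled = true;
  }
};

// Probes a whole batch of keys and returns the number of matching
// (probe key, build tuple) pairs.
size_t ProbeBatch(const SimpleHashTable& ht,
                  const std::vector<int64_t>& probe_keys) {
  const size_t n = probe_keys.size();
  const size_t mask = ht.buckets.size() - 1;
  std::vector<uint64_t> hashes(n);

  // Step 1: evaluate and hash every probe row in the batch.
  for (size_t i = 0; i < n; ++i) {
    hashes[i] = SimpleHashTable::Hash(probe_keys[i]);
  }

  // Step 2: prefetch the home bucket of every hash value.
  for (size_t i = 0; i < n; ++i) {
    __builtin_prefetch(&ht.buckets[ht.Slot(hashes[i])]);
  }

  // Step 3: prefetch the tuple each home bucket points to.
  for (size_t i = 0; i < n; ++i) {
    const Bucket& b = ht.buckets[ht.Slot(hashes[i])];
    if (b.filled) __builtin_prefetch(b.tuple);
  }

  // Step 4: do the actual probing; the data should now be in cache.
  size_t matches = 0;
  for (size_t i = 0; i < n; ++i) {
    size_t s = ht.Slot(hashes[i]);
    while (ht.buckets[s].filled) {
      if (ht.buckets[s].tuple->key == probe_keys[i]) ++matches;
      s = (s + 1) & mask;
    }
  }
  return matches;
}
```

Splitting the work into separate passes gives the memory system many
independent loads to overlap, instead of stalling on one cache miss per
probe row. The sketch only prefetches the home bucket of each hash
value; buckets reached by further probing are not prefetched.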

Combined with the build-side prefetching, a self-join of the table
lineitem went from 35.68s to 21.66s on average, a 39% decrease in
query time.

The self-join used is:

select count(*)
from lineitem o1, lineitem o2
where o1.l_orderkey = o2.l_orderkey and
      o1.l_linenumber = o2.l_linenumber;

Change-Id: Ib42b93d99d09c833571e39d20d58c11ef73f3cc0
---
M be/src/exec/hash-table-test.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/exec/partitioned-hash-join-node-ir.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/buffered-tuple-stream.inline.h
8 files changed, 185 insertions(+), 54 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/59/2959/1
-- 
To view, visit http://gerrit.cloudera.org:8080/2959
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib42b93d99d09c833571e39d20d58c11ef73f3cc0
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Ho <kw...@cloudera.com>