Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2019/12/02 22:37:00 UTC

[jira] [Commented] (IMPALA-9126) Cleanly separate build and probe state in hash join node

    [ https://issues.apache.org/jira/browse/IMPALA-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986431#comment-16986431 ] 

ASF subversion and git services commented on IMPALA-9126:
---------------------------------------------------------

Commit b421741d7f17868958c50b4b19f22d255071472a in impala's branch refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b421741 ]

IMPALA-9126: part 3: move more logic to PhjBuilder

The general flavour of this patch is to move code
that orchestrates the top-level spilling hash join
algorithm to PhjBuilder, and better encapsulate
state in PhjBuilder by reducing the number of
public methods.

Specific changes include:
* Move HashJoinState to PhjBuilder, which is necessary
  for the shared join build since the builder will
  be orchestrating the spilling.
* Reduce the public methods of PhjBuilder. The goal is
  for the builder to hand off pointers to hash tables,
  partitions, etc. only during transitions of the
  state machine (i.e. at synchronization points when
  we have a shared build).
* Highlight methods of PhjBuilder that will be
  synchronization points for the shared join build
  where hash tables are built and destroyed.
* Highlight other future changes to PhjBuilder.
* Move some of the output build partition logic
  from NextSpilledProbeRowBatch() to the builder,
  which required a few other changes - e.g. explicit
  *eos output arguments for some functions.
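
The hand-off pattern described in the list above can be sketched roughly as
follows. This is a minimal illustration, not the code from the patch: the
class SketchBuilder, its methods, the HandOff struct, and the JoinState
values are all hypothetical stand-ins, and "building" a hash table is
reduced to recording a row count. It only shows a builder that owns the
join state, exposes hash table pointers solely at state transitions, and
reports exhaustion through an explicit *eos output argument.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT Impala's real classes.
struct HashTable {
  int num_rows = 0;
};

struct BuildPartition {
  int num_rows = 0;
  std::unique_ptr<HashTable> hash_tbl;  // null while the partition is spilled
  bool is_spilled() const { return hash_tbl == nullptr; }
};

// Illustrative join states; the names are assumptions made for this sketch.
enum class JoinState { PARTITIONING_BUILD, PARTITIONING_PROBE, PROBING_SPILLED, DONE };

class SketchBuilder {
 public:
  // What the probe side receives at a transition: non-owning pointers.
  struct HandOff {
    std::vector<HashTable*> hash_tables;
  };

  // Build phase: add a partition, either in memory or spilled.
  void AddPartition(int num_rows, bool spilled) {
    assert(state_ == JoinState::PARTITIONING_BUILD);
    BuildPartition p;
    p.num_rows = num_rows;
    if (!spilled) {
      p.hash_tbl.reset(new HashTable());
      p.hash_tbl->num_rows = num_rows;
    }
    partitions_.push_back(std::move(p));
  }

  // Transition PARTITIONING_BUILD -> PARTITIONING_PROBE: hand off the hash
  // tables of all in-memory partitions.
  HandOff BeginInitialProbe() {
    assert(state_ == JoinState::PARTITIONING_BUILD);
    state_ = JoinState::PARTITIONING_PROBE;
    HandOff out;
    for (BuildPartition& p : partitions_) {
      if (!p.is_spilled()) out.hash_tables.push_back(p.hash_tbl.get());
    }
    return out;
  }

  // Transition to the next spilled partition. Hash tables handed off earlier
  // are destroyed here, and one is built for the next spilled partition.
  // Sets *eos to true when no spilled partitions remain.
  HandOff BeginSpilledProbe(bool* eos) {
    assert(state_ == JoinState::PARTITIONING_PROBE ||
           state_ == JoinState::PROBING_SPILLED);
    // Keep only partitions that are still spilled; the rest were probed.
    std::vector<BuildPartition> remaining;
    for (BuildPartition& p : partitions_) {
      if (p.is_spilled()) remaining.push_back(std::move(p));
    }
    partitions_.swap(remaining);

    HandOff out;
    if (partitions_.empty()) {
      state_ = JoinState::DONE;
      *eos = true;
      return out;
    }
    BuildPartition& next = partitions_.front();
    next.hash_tbl.reset(new HashTable());  // stands in for rebuilding the table
    next.hash_tbl->num_rows = next.num_rows;
    state_ = JoinState::PROBING_SPILLED;
    *eos = false;
    out.hash_tables.push_back(next.hash_tbl.get());
    return out;
  }

  JoinState state() const { return state_; }

 private:
  JoinState state_ = JoinState::PARTITIONING_BUILD;
  std::vector<BuildPartition> partitions_;
};
```

In this sketch a caller would invoke BeginInitialProbe() once, then call
BeginSpilledProbe(&eos) repeatedly until eos is true. Pointers from a
previous hand-off must not be used after the next transition, which is why
the transitions are the natural synchronization points.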

This does *not* include a change to have the builder pick
the spilled partition to process. That will be a follow-on,
because it requires more refactoring of the relationship
between PhjBuilder::Partition and ProbePartition.

Testing:
* An earlier version of the patch passed exhaustive tests.

Change-Id: I0e233468de1eeae86651ab96df207de19e091053
Reviewed-on: http://gerrit.cloudera.org:8080/14787
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Cleanly separate build and probe state in hash join node
> --------------------------------------------------------
>
>                 Key: IMPALA-9126
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9126
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>              Labels: multithreading
>
> As a precursor to IMPALA-4224, we should clean up the hash join implementation so that the build and probe state are better separated. The builder should not deal with probe-side data structures (like the probe streams that it allocates), and all accesses to the build-side data structures should go through APIs that are as narrow as possible.
> The nested loop join is already pretty clean.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
