You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Tim Armstrong (Code Review)" <ge...@cloudera.org> on 2019/11/26 01:08:48 UTC

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14790


Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................

IMPALA-9126: part 4: hash join builder manages spilling

This is the final patch for IMPALA-9126.

This will allow the many:1 relationship of probe:build
partitions that we need for the shared join build.

Key changes:
* Builder picks the next spilled partition to process.
* Partitions are identified by unique ID so can be
  decoupled between build and probe.
* unique_ptr is used to manage build partitions. This
  helps document the lifecycle of the partitions better,
  particularly when they are handed off to
  PartitionedHashJoinNode.

Testing:
* Ran exhaustive tests.
* Ran a single node TPC-H and TPC-DS stress test with 1000 queries.

Perf:
Ran a single node TPC-H 30 test against master from
before IMPALA-9126 changes. No significant perf
change.

Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
---
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node-ir.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
6 files changed, 186 insertions(+), 121 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/14790/6
-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14790 )

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14790/6/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/14790/6/be/src/exec/partitioned-hash-join-builder.h@210
PS6, Line 210:   /// 'output_partitions' for build modes like right outer join that output unmatched rows.
line too long (91 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Nov 2019 01:09:24 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Thomas Tauber-Marshall (Code Review)" <ge...@cloudera.org>.
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14790 )

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................


Patch Set 8: Code-Review+2

(4 comments)

http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-builder.h@200
PS8, Line 200: when *repartitioning as true
I think this is supposed to be "when *repartitioned is true"?


http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-builder.h@213
PS8, Line 213: repartitioning
same - repartitioned?


http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-builder.cc@641
PS8, Line 641: Pick the next spilled partition the input partition will stay in
             :   // 'spilled_partitions_' until we are done probing it or repartitioning its probe.
Having some trouble parsing this sentence. I think you missed a period after "Pick the next spilled partition"?


http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-node.cc@1098
PS8, Line 1098: // The build partitions we need to retain for further processing.
stale



-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 05 Dec 2019 19:35:15 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14790 )

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14790/7/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/14790/7/be/src/exec/partitioned-hash-join-builder.h@183
PS7, Line 183: itit
> typo
Done


http://gerrit.cloudera.org:8080/#/c/14790/7/be/src/exec/partitioned-hash-join-builder.h@615
PS7, Line 615:   /// Spilled_partitions_.back() is the spilled partition being processed, if one is
             :   /// currently being processed (i.e. between BeginSpilledProbe() and the corresponding
             :   /// DoneProbing*() call).
> Maybe using a separate member like active_spilled_partition_ would be clear
I think I'd prefer to avoid adding another member - it looks like it adds more complexity than it saves. The current logic around spilled_partitions_ is definitely a little subtle but I felt like it was well-contained and isn't really that much code.



-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 04 Dec 2019 21:34:15 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14790 )

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................


Patch Set 8:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-builder.h@200
PS8, Line 200: when *repartitioning as true
> I think this is supposed to be "when *repartitioned is true"?
Done


http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-builder.h@205
PS8, Line 205:   /// 'output_partitions' for build modes like right outer join that output unmatched rows.
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-builder.h@213
PS8, Line 213: repartitioning
> same - repartitioned?
Done


http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-builder.cc@641
PS8, Line 641: Pick the next spilled partition the input partition will stay in
             :   // 'spilled_partitions_' until we are done probing it or repartitioning its probe.
> Having some trouble parsing this sentence. I think you missed a period afte
Done


http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-node.cc@1091
PS8, Line 1091:     // Walk the partitions that had hash tables built for the probe phase and close them.
This was also stale


http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-node.cc@1098
PS8, Line 1098: // The build partitions we need to retain for further processing.
> stale
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 05 Dec 2019 20:59:45 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/14790 )

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................


Patch Set 7: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14790/7/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/14790/7/be/src/exec/partitioned-hash-join-builder.h@183
PS7, Line 183: itit
typo


http://gerrit.cloudera.org:8080/#/c/14790/7/be/src/exec/partitioned-hash-join-builder.h@615
PS7, Line 615:   /// Spilled_partitions_.back() is the spilled partition being processed, if one is
             :   /// currently being processed (i.e. between BeginSpilledProbe() and the corresponding
             :   /// DoneProbing*() call).
Maybe using a separate member like active_spilled_partition_ would be clearer.



-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Comment-Date: Thu, 28 Nov 2019 19:07:55 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Hello Thomas Tauber-Marshall, Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14790

to look at the new patch set (#9).

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................

IMPALA-9126: part 4: hash join builder manages spilling

This is the final patch for IMPALA-9126.

This will allow the many:1 relationship of probe:build
partitions that we need for the shared join build.

Key changes:
* Builder picks the next spilled partition to process.
* Partitions are identified by unique ID so can be
  decoupled between build and probe.
* unique_ptr is used to manage build partitions. This
  helps document the lifecycle of the partitions better,
  particularly when they are handed off to
  PartitionedHashJoinNode.

Testing:
* Ran exhaustive tests.
* Ran a single node TPC-H and TPC-DS stress test with 1000 queries.

Perf:
Ran a single node TPC-H 30 test against master from
before IMPALA-9126 changes. No significant perf
change.

Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
---
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node-ir.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
6 files changed, 182 insertions(+), 120 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/14790/9
-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14790 )

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................


Patch Set 8: Code-Review+1

carry csaba's +1


-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 04 Dec 2019 21:34:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14790 )

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................


Patch Set 10: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Fri, 06 Dec 2019 01:28:17 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14790 )

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................


Patch Set 9:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/5215/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 05 Dec 2019 21:29:02 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14790 )

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/5140/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Tue, 26 Nov 2019 01:54:04 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14790 )

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................


Patch Set 10: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 05 Dec 2019 21:00:09 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14790 )

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................


Patch Set 9: Code-Review+2

carry from thomas


-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 05 Dec 2019 20:59:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14790 )

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................


Patch Set 8:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/5201/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 04 Dec 2019 22:03:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14790 )

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5317/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 05 Dec 2019 21:00:10 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Hello Thomas Tauber-Marshall, Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14790

to look at the new patch set (#8).

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................

IMPALA-9126: part 4: hash join builder manages spilling

This is the final patch for IMPALA-9126.

This will allow the many:1 relationship of probe:build
partitions that we need for the shared join build.

Key changes:
* Builder picks the next spilled partition to process.
* Partitions are identified by unique ID so can be
  decoupled between build and probe.
* unique_ptr is used to manage build partitions. This
  helps document the lifecycle of the partitions better,
  particularly when they are handed off to
  PartitionedHashJoinNode.

Testing:
* Ran exhaustive tests.
* Ran a single node TPC-H and TPC-DS stress test with 1000 queries.

Perf:
Ran a single node TPC-H 30 test against master from
before IMPALA-9126 changes. No significant perf
change.

Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
---
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node-ir.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
6 files changed, 179 insertions(+), 113 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/14790/8
-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14790 )

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................

IMPALA-9126: part 4: hash join builder manages spilling

This is the final patch for IMPALA-9126.

This will allow the many:1 relationship of probe:build
partitions that we need for the shared join build.

Key changes:
* Builder picks the next spilled partition to process.
* Partitions are identified by unique ID so can be
  decoupled between build and probe.
* unique_ptr is used to manage build partitions. This
  helps document the lifecycle of the partitions better,
  particularly when they are handed off to
  PartitionedHashJoinNode.

Testing:
* Ran exhaustive tests.
* Ran a single node TPC-H and TPC-DS stress test with 1000 queries.

Perf:
Ran a single node TPC-H 30 test against master from
before IMPALA-9126 changes. No significant perf
change.

Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Reviewed-on: http://gerrit.cloudera.org:8080/14790
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node-ir.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
6 files changed, 182 insertions(+), 120 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-9126: part 4: hash join builder manages spilling

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14790 )

Change subject: IMPALA-9126: part 4: hash join builder manages spilling
......................................................................


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/14790/8/be/src/exec/partitioned-hash-join-builder.h@205
PS8, Line 205:   /// 'output_partitions' for build modes like right outer join that output unmatched rows.
line too long (91 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/14790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6de5f62e3eacf80f72c8ea0ed8cba012f0f53c90
Gerrit-Change-Number: 14790
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tm...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 04 Dec 2019 21:34:47 +0000
Gerrit-HasComments: Yes