You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Noemi Pap-Takacs (Code Review)" <ge...@cloudera.org> on 2022/09/14 16:44:44 UTC

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Noemi Pap-Takacs has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18824


Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................

IMPALA-11477: Adding Codegen to sorted-run-merger

SortedRunMerger is used to merge multiple, already sorted runs.
It is used for external merge in the sorter (SortNode, PartialSortNode
and TopNNode), and in KRPC data stream receiver (ExchangeNode).

SortedRunMerger builds and maintains a min heap of the sorted input runs.
Re-writed SortedRunMerger::Heapify from recursive to iterative
and moved to a separate new source file: sorted-run-merger-ir.cc.
Added a static Codegen() to SortedRunMerger and call it from the
corresponding ExecNodes: SortNode, PartialSortNode, TopNPNode and
ExchangeNode.

This change lets the merger use the codegened version of TupleRowComparator
instead of the interpreted one, which can increase the speed especially
in case of complex comparison expressions. This change also serves as a
base for further codegen-related optimizations in the merger.

Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/partial-sort-node.cc
M be/src/exec/partial-sort-node.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/exec/topn-node.cc
M be/src/exec/topn-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A be/src/runtime/sorted-run-merger-ir.cc
M be/src/runtime/sorted-run-merger.cc
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
18 files changed, 227 insertions(+), 69 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/18824/4
-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 4
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 4:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/11352/ : Initial code review checks failed. See linked job for details on the failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 4
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Wed, 14 Sep 2022 16:55:53 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 8: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8704/


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 8
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Fri, 14 Oct 2022 15:28:58 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11557/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 7
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Fri, 07 Oct 2022 13:12:51 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 12: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 12
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Mar 2023 14:08:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 9:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12424/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 9
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Feb 2023 10:14:33 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 10:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12490/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 10
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Tue, 28 Feb 2023 14:55:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 11: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 11
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Mar 2023 14:08:13 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Noemi Pap-Takacs (Code Review)" <ge...@cloudera.org>.
Noemi Pap-Takacs has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................

IMPALA-11477: Adding Codegen to sorted-run-merger

SortedRunMerger is used to merge multiple, already sorted runs.
It is used for external merge in the sorter (SortNode), and in
KRPC data stream receiver (ExchangeNode).

SortedRunMerger builds and maintains a min heap of the sorted input
runs. Rewrote SortedRunMerger::Heapify from recursive to iterative
and moved to a separate new source file: sorted-run-merger-ir.cc.
Added a static Codegen() to SortedRunMerger and call it from the
corresponding ExecNodes: SortNode and ExchangeNode.

This change lets the merger use the codegened version of
TupleRowComparator instead of the interpreted one, which can increase
the speed, especially in case of complex comparison expressions.
This change also serves as a base for further codegen-related
optimizations in the merger.

Testing:
 - run existing E2E sort tests (test-sort.py)
 - manual testing: run queries that instantiate sort nodes and
   merging exchange nodes
Benchmarking:
 - did not cause regression on TPCH query set
 - made merge-intensive queries and IMPALA-4530 (in-memory merge of
   quicksorted small runs) faster

Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A be/src/runtime/sorted-run-merger-ir.cc
M be/src/runtime/sorted-run-merger.cc
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
15 files changed, 260 insertions(+), 106 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/18824/9
-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 9
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Noemi Pap-Takacs (Code Review)" <ge...@cloudera.org>.
Noemi Pap-Takacs has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................

IMPALA-11477: Adding Codegen to sorted-run-merger

SortedRunMerger is used to merge multiple, already sorted runs.
It is used for external merge in the sorter (SortNode), and in
KRPC data stream receiver (ExchangeNode).

SortedRunMerger builds and maintains a min heap of the sorted input
runs. Rewrote SortedRunMerger::Heapify from recursive to iterative
and moved to a separate new source file: sorted-run-merger-ir.cc.
Added a static Codegen() to SortedRunMerger and call it from the
corresponding ExecNodes: SortNode and ExchangeNode.

This change lets the merger use the codegened version of
TupleRowComparator instead of the interpreted one, which can increase
the speed, especially in case of complex comparison expressions.
This change also serves as a base for further codegen-related
optimizations in the merger.

Testing:
 - run existing E2E sort tests (test-sort.py)
 - manual testing: run queries that instantiate sort nodes and
   merging exchange nodes
Benchmarking:
 - did not cause regression on TPCH query set
 - made merge-intensive queries and IMPALA-4530 (in-memory merge of
   quicksorted small runs) faster

Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A be/src/runtime/sorted-run-merger-ir.cc
M be/src/runtime/sorted-run-merger.cc
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
15 files changed, 254 insertions(+), 106 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/18824/11
-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 11
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Daniel Becker (Code Review)" <ge...@cloudera.org>.
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 7:

Thanks, let's wait a bit and see if Csaba has any further comments on the new patch set.


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 7
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Fri, 07 Oct 2022 15:38:51 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8704/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 8
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Fri, 14 Oct 2022 10:43:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 8:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11624/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 8
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Thu, 13 Oct 2022 14:05:23 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 8:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18824/8/be/src/codegen/gen_ir_descriptions.py
File be/src/codegen/gen_ir_descriptions.py:

http://gerrit.cloudera.org:8080/#/c/18824/8/be/src/codegen/gen_ir_descriptions.py@261
PS8, Line 261:    ["TUPLE_SORTER_SORT_HELPER",
nit: can you fix the indentation of the last one?


http://gerrit.cloudera.org:8080/#/c/18824/8/be/src/exec/partial-sort-node.cc
File be/src/exec/partial-sort-node.cc:

http://gerrit.cloudera.org:8080/#/c/18824/8/be/src/exec/partial-sort-node.cc@101
PS8, Line 101:   AddCodegenStatus(
             :       SortedRunMerger::Codegen(state, compare_fn, &codegend_heapify_helper_fn_));
Without the minirun patch, partial sort will not use a merger. I would add this only in the minirun patch.


http://gerrit.cloudera.org:8080/#/c/18824/8/be/src/runtime/sorted-run-merger.cc
File be/src/runtime/sorted-run-merger.cc:

http://gerrit.cloudera.org:8080/#/c/18824/8/be/src/runtime/sorted-run-merger.cc@30
PS8, Line 30:   SortedRunMerger::SortedRunWrapper::SortedRunWrapper
nit: the indentation of this functions should be removed as they are not inside a class block anymore



-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 8
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Tue, 18 Oct 2022 12:38:55 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Noemi Pap-Takacs (Code Review)" <ge...@cloudera.org>.
Noemi Pap-Takacs has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................

IMPALA-11477: Adding Codegen to sorted-run-merger

SortedRunMerger is used to merge multiple, already sorted runs.
It is used for external merge in the sorter (SortNode), and in
KRPC data stream receiver (ExchangeNode).

SortedRunMerger builds and maintains a min heap of the sorted input
runs. Rewrote SortedRunMerger::Heapify from recursive to iterative
and moved to a separate new source file: sorted-run-merger-ir.cc.
Added a static Codegen() to SortedRunMerger and call it from the
corresponding ExecNodes: SortNode and ExchangeNode.

This change lets the merger use the codegened version of
TupleRowComparator instead of the interpreted one, which can increase
the speed, especially in case of complex comparison expressions.
This change also serves as a base for further codegen-related
optimizations in the merger.

Testing:
 - run existing E2E sort tests (test-sort.py)
 - manual testing: run queries that instantiate sort nodes and
   merging exchange nodes
Benchmarking:
 - did not cause regression on TPCH query set
 - made merge-intensive queries and IMPALA-4530 (in-memory merge of
   quicksorted small runs) faster

Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A be/src/runtime/sorted-run-merger-ir.cc
M be/src/runtime/sorted-run-merger.cc
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
15 files changed, 258 insertions(+), 108 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/18824/10
-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 10
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 11:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12504/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 11
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Mar 2023 13:59:11 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Noemi Pap-Takacs (Code Review)" <ge...@cloudera.org>.
Noemi Pap-Takacs has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................

IMPALA-11477: Adding Codegen to sorted-run-merger

SortedRunMerger is used to merge multiple, already sorted runs.
It is used for external merge in the sorter (SortNode, PartialSortNode
and TopNNode), and in KRPC data stream receiver (ExchangeNode).

SortedRunMerger builds and maintains a min heap of the sorted input runs.
Re-writed SortedRunMerger::Heapify from recursive to iterative
and moved to a separate new source file: sorted-run-merger-ir.cc.
Added a static Codegen() to SortedRunMerger and call it from the
corresponding ExecNodes: SortNode, PartialSortNode, TopNPNode and
ExchangeNode.

This change lets the merger use the codegened version of TupleRowComparator
instead of the interpreted one, which can increase the speed especially
in case of complex comparison expressions. This change also serves as a
base for further codegen-related optimizations in the merger.

Testing:
 - run existing E2E sort tests (test-sort.py)
 - manual testing: run queries that instantiate sort and
   merging exchange nodes
Benchmarking:
 - did not cause regression on TPCH query set
 - made merge-intensive queries and IMPALA-4530 (in-memory merge of
   small runs) faster

Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/partial-sort-node.cc
M be/src/exec/partial-sort-node.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/exec/topn-node.cc
M be/src/exec/topn-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A be/src/runtime/sorted-run-merger-ir.cc
M be/src/runtime/sorted-run-merger.cc
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
19 files changed, 234 insertions(+), 70 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/18824/6
-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 6
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18824/4/be/src/runtime/sorted-run-merger.cc
File be/src/runtime/sorted-run-merger.cc:

http://gerrit.cloudera.org:8080/#/c/18824/4/be/src/runtime/sorted-run-merger.cc@30
PS4, Line 30:   SortedRunMerger::SortedRunWrapper::SortedRunWrapper(SortedRunMerger* parent, const RunBatchSupplierFn& sorted_run)
line too long (116 > 90)


http://gerrit.cloudera.org:8080/#/c/18824/4/be/src/runtime/sorted-run-merger.cc@169
PS4, Line 169:   llvm::Function* fn = codegen->GetFunction(IRFunction::SORTED_RUN_MERGER_HEAPIFY_HELPER, true);
line too long (96 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 4
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Wed, 14 Sep 2022 16:45:32 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8667/


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 7
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Fri, 07 Oct 2022 16:06:40 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8589/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 6
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Mon, 19 Sep 2022 12:52:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8667/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 7
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Fri, 07 Oct 2022 14:17:46 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Noemi Pap-Takacs (Code Review)" <ge...@cloudera.org>.
Noemi Pap-Takacs has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................

IMPALA-11477: Adding Codegen to sorted-run-merger

SortedRunMerger is used to merge multiple, already sorted runs.
It is used for external merge in the sorter (SortNode, PartialSortNode
and TopNNode), and in KRPC data stream receiver (ExchangeNode).

SortedRunMerger builds and maintains a min heap of the sorted input runs.
Rewrote SortedRunMerger::Heapify from recursive to iterative
and moved to a separate new source file: sorted-run-merger-ir.cc.
Added a static Codegen() to SortedRunMerger and call it from the
corresponding ExecNodes: SortNode, PartialSortNode, TopNPNode and
ExchangeNode.

This change lets the merger use the codegened version of TupleRowComparator
instead of the interpreted one, which can increase the speed especially
in case of complex comparison expressions. This change also serves as a
base for further codegen-related optimizations in the merger.

Testing:
 - run existing E2E sort tests (test-sort.py)
 - manual testing: run queries that instantiate sort and
   merging exchange nodes
Benchmarking:
 - did not cause regression on TPCH query set
 - made merge-intensive queries and IMPALA-4530 (in-memory merge of
   small runs) faster

Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/partial-sort-node.cc
M be/src/exec/partial-sort-node.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/exec/topn-node.cc
M be/src/exec/topn-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A be/src/runtime/sorted-run-merger-ir.cc
M be/src/runtime/sorted-run-merger.cc
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
19 files changed, 228 insertions(+), 69 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/18824/7
-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 7
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Noemi Pap-Takacs (Code Review)" <ge...@cloudera.org>.
Noemi Pap-Takacs has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................

IMPALA-11477: Adding Codegen to sorted-run-merger

SortedRunMerger is used to merge multiple, already sorted runs.
It is used for external merge in the sorter (SortNode, PartialSortNode
and TopNNode), and in KRPC data stream receiver (ExchangeNode).

SortedRunMerger builds and maintains a min heap of the sorted input
runs. Rewrote SortedRunMerger::Heapify from recursive to iterative
and moved to a separate new source file: sorted-run-merger-ir.cc.
Added a static Codegen() to SortedRunMerger and call it from the
corresponding ExecNodes: SortNode, PartialSortNode, TopNPNode and
ExchangeNode.

This change lets the merger use the codegened version of
TupleRowComparator instead of the interpreted one, which can increase
the speed, especially in case of complex comparison expressions.
This change also serves as a base for further codegen-related
optimizations in the merger.

Testing:
 - run existing E2E sort tests (test-sort.py)
 - manual testing: run queries that instantiate sort nodes and
   merging exchange nodes
Benchmarking:
 - did not cause regression on TPCH query set
 - made merge-intensive queries and IMPALA-4530 (in-memory merge of
   quicksorted small runs) faster

Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/partial-sort-node.cc
M be/src/exec/partial-sort-node.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/exec/topn-node.cc
M be/src/exec/topn-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A be/src/runtime/sorted-run-merger-ir.cc
M be/src/runtime/sorted-run-merger.cc
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
19 files changed, 228 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/18824/8
-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 8
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Daniel Becker (Code Review)" <ge...@cloudera.org>.
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 7: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 7
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Fri, 07 Oct 2022 15:38:59 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 6: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 6
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Thu, 15 Sep 2022 15:02:14 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8589/


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 6
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Mon, 19 Sep 2022 17:36:03 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 10: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18824/10/be/src/exec/exchange-node.cc
File be/src/exec/exchange-node.cc:

http://gerrit.cloudera.org:8080/#/c/18824/10/be/src/exec/exchange-node.cc@133
PS10, Line 133: 
nit: remove line


http://gerrit.cloudera.org:8080/#/c/18824/10/be/src/runtime/sorted-run-merger.cc
File be/src/runtime/sorted-run-merger.cc:

http://gerrit.cloudera.org:8080/#/c/18824/10/be/src/runtime/sorted-run-merger.cc@77
PS10, Line 77: Status
We always return OK, so the Status could be void instead



-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 10
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Tue, 28 Feb 2023 15:32:26 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 12: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 12
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Mar 2023 19:24:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Daniel Becker (Code Review)" <ge...@cloudera.org>.
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 6:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/18824/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18824/6//COMMIT_MSG@14
PS6, Line 14: Re-writed
Nit: Re-wrote


http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/exec/exchange-node.cc
File be/src/exec/exchange-node.cc:

http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/exec/exchange-node.cc@141
PS6, Line 141:   const ExchangePlanNode& pnode = static_cast<const ExchangePlanNode&>(plan_node_);
Could be moved within the IF below to limit its scope.

Also, optionally we could add a DCHECK with a dynamic cast:
DCHECK(dynamic_cast<const ExchangePlanNode*>(&plan_node_) != nullptr);


http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/exec/partial-sort-node.cc
File be/src/exec/partial-sort-node.cc:

http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/exec/partial-sort-node.cc@101
PS6, Line 101:   AddCodegenStatus(
Is there a reason why in ExchangePlanNode::Codegen() we check whether 'compare_fn' is NULL and only proceed if it isn't, but here we don't?
If 'compare_fn' should never be NULL here, we could add a DCHECK checking that here.


http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/exec/sort-node.cc
File be/src/exec/sort-node.cc:

http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/exec/sort-node.cc@121
PS6, Line 121:         SortedRunMerger::Codegen(state, compare_fn, &codegend_heapify_helper_fn_);
See comment at partial-sort-node.cc:101.


http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/exec/topn-node.cc
File be/src/exec/topn-node.cc:

http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/exec/topn-node.cc@225
PS6, Line 225:     codegen_status =
See comment at partial-sort-node.cc:101 (it also applies to Sorter::TupleSorter::Codegen on L221).


http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/runtime/krpc-data-stream-recvr.h
File be/src/runtime/krpc-data-stream-recvr.h:

http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/runtime/krpc-data-stream-recvr.h@108
PS6, Line 108: _
Parameter shouldn't end in an underscore.


http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/runtime/krpc-data-stream-recvr.h@196
PS6, Line 196: codegend_heapify_helper_fn_
Do we use it anywhere? In CreateMerger() we use the parameter, right?


http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/runtime/krpc-data-stream-recvr.cc@643
PS6, Line 643: _
Parameter shouldn't end in an underscore.


http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/runtime/sorted-run-merger-ir.cc
File be/src/runtime/sorted-run-merger-ir.cc:

http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/runtime/sorted-run-merger-ir.cc@23
PS6, Line 23:  
Nit: extra space.


http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/runtime/sorted-run-merger-ir.cc@25
PS6, Line 25: Status
We always return Status::OK(), do we need to change the return type from void to Status in this patch?


http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/runtime/sorted-run-merger.cc
File be/src/runtime/sorted-run-merger.cc:

http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/runtime/sorted-run-merger.cc@17
PS6, Line 17: //#include "runtime/row-batch.h"
Remove commented lines, see also L21.


http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/runtime/sorted-run-merger.cc@85
PS6, Line 85:     RETURN_IF_ERROR(heapify_helper_fn(this, parent_index));
Nit: you could simply return in both branches of the IF statement, no need for RETURN_IF_ERROR.


http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/runtime/sorted-run-merger.cc@172
PS6, Line 172: NULL
Nit: nullptr is better.


http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/runtime/sorted-run-merger.cc@175
PS6, Line 175: compare_fn
We could also DCHECK that 'compare_fn' is not null.


http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/runtime/sorter.h
File be/src/runtime/sorter.h:

http://gerrit.cloudera.org:8080/#/c/18824/6/be/src/runtime/sorter.h@123
PS6, Line 123: _
Parameter shouldn't end with an underscore.



-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 6
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Mon, 19 Sep 2022 13:50:43 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11362/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 6
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Thu, 15 Sep 2022 14:36:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................

IMPALA-11477: Adding Codegen to sorted-run-merger

SortedRunMerger is used to merge multiple, already sorted runs.
It is used for external merge in the sorter (SortNode), and in
KRPC data stream receiver (ExchangeNode).

SortedRunMerger builds and maintains a min heap of the sorted input
runs. Rewrote SortedRunMerger::Heapify from recursive to iterative
and moved to a separate new source file: sorted-run-merger-ir.cc.
Added a static Codegen() to SortedRunMerger and call it from the
corresponding ExecNodes: SortNode and ExchangeNode.

This change lets the merger use the codegened version of
TupleRowComparator instead of the interpreted one, which can increase
the speed, especially in case of complex comparison expressions.
This change also serves as a base for further codegen-related
optimizations in the merger.

Testing:
 - run existing E2E sort tests (test-sort.py)
 - manual testing: run queries that instantiate sort nodes and
   merging exchange nodes
Benchmarking:
 - did not cause regression on TPCH query set
 - made merge-intensive queries and IMPALA-4530 (in-memory merge of
   quicksorted small runs) faster

Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Reviewed-on: http://gerrit.cloudera.org:8080/18824
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/data-stream-test.cc
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A be/src/runtime/sorted-run-merger-ir.cc
M be/src/runtime/sorted-run-merger.cc
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
15 files changed, 254 insertions(+), 106 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 13
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9087/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 12
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Mar 2023 14:08:53 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11477: Adding Codegen to sorted-run-merger

Posted by "Daniel Becker (Code Review)" <ge...@cloudera.org>.
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/18824 )

Change subject: IMPALA-11477: Adding Codegen to sorted-run-merger
......................................................................


Patch Set 9:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/18824/9/be/src/runtime/data-stream-test.cc
File be/src/runtime/data-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/18824/9/be/src/runtime/data-stream-test.cc@452
PS9, Line 452:     ///TODO: codegend_heapify_helper_fn currently stores a nullptr. Should call codegen
Can we resolve this TODO in this change or should it be another commit?


http://gerrit.cloudera.org:8080/#/c/18824/9/be/src/runtime/sorted-run-merger.h
File be/src/runtime/sorted-run-merger.h:

http://gerrit.cloudera.org:8080/#/c/18824/9/be/src/runtime/sorted-run-merger.h@132
PS9, Line 132: and
Nit: wouldn't "or" be better? See also sorter.h#{261,265}.


http://gerrit.cloudera.org:8080/#/c/18824/9/be/src/runtime/sorted-run-merger.h@142
PS9, Line 142: class SortedRunMerger::SortedRunWrapper {
Do we need to move this class here from the .cc file? I haven't found any places where we use it outside of the .cc.


http://gerrit.cloudera.org:8080/#/c/18824/9/be/src/runtime/sorted-run-merger.cc
File be/src/runtime/sorted-run-merger.cc:

http://gerrit.cloudera.org:8080/#/c/18824/9/be/src/runtime/sorted-run-merger.cc@33
PS9, Line 33: NULL
Nit: nullptr would be better.


http://gerrit.cloudera.org:8080/#/c/18824/9/be/src/runtime/sorted-run-merger.cc@42
PS9, Line 42: NULL
Nit: nullptr would be better.


http://gerrit.cloudera.org:8080/#/c/18824/9/be/src/runtime/sorted-run-merger.cc@46
PS9, Line 46: NULL
Nit: nullptr would be better.


http://gerrit.cloudera.org:8080/#/c/18824/9/be/src/runtime/sorted-run-merger.cc@53
PS9, Line 53: NULL
Nit: nullptr would be better.


http://gerrit.cloudera.org:8080/#/c/18824/9/be/src/runtime/sorted-run-merger.cc@63
PS9, Line 63: NULL
Nit: nullptr would be better.


http://gerrit.cloudera.org:8080/#/c/18824/9/be/src/runtime/sorted-run-merger.cc@74
PS9, Line 74: NULL
Nit: nullptr would be better.


http://gerrit.cloudera.org:8080/#/c/18824/9/be/src/runtime/sorted-run-merger.cc@76
PS9, Line 76: NULL
Nit: nullptr would be better.



-- 
To view, visit http://gerrit.cloudera.org:8080/18824
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Gerrit-Change-Number: 18824
Gerrit-PatchSet: 9
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 Feb 2023 15:11:04 +0000
Gerrit-HasComments: Yes