Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/03/02 00:22:00 UTC

[jira] [Commented] (IMPALA-11477) Codegen Heapify in SortedRunMerger

    [ https://issues.apache.org/jira/browse/IMPALA-11477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17695381#comment-17695381 ] 

ASF subversion and git services commented on IMPALA-11477:
----------------------------------------------------------

Commit 939a6ae14e5416845a43d3d41d01c43e8123a0b9 in impala's branch refs/heads/master from noemi
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=939a6ae14 ]

IMPALA-11477: Adding Codegen to sorted-run-merger

SortedRunMerger is used to merge multiple, already sorted runs.
It is used for the external merge in the sorter (SortNode) and in
the KRPC data stream receiver (ExchangeNode).
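
As a rough illustration of what merging multiple, already sorted runs
means, here is a minimal standalone sketch. It is not Impala code: the
function name MergeSortedRuns, the use of plain int values and
std::priority_queue are all assumptions made for the example. The real
merger works on tuple rows and uses a TupleRowComparator.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Impala): merges already sorted runs of
// ints into one sorted output using a min heap keyed on each run's head.
std::vector<int> MergeSortedRuns(const std::vector<std::vector<int>>& runs) {
  // Each heap entry is (current value, index of the run it came from).
  using Entry = std::pair<int, std::size_t>;
  std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

  // Next unread position in every run.
  std::vector<std::size_t> pos(runs.size(), 0);

  // Seed the heap with the first element of every non-empty run.
  for (std::size_t r = 0; r < runs.size(); ++r) {
    if (!runs[r].empty()) heap.emplace(runs[r][0], r);
  }

  std::vector<int> out;
  while (!heap.empty()) {
    const Entry smallest = heap.top();
    heap.pop();
    out.push_back(smallest.first);

    // Advance the run that produced the smallest value and, if it still
    // has elements, put its new head back into the heap.
    const std::size_t r = smallest.second;
    if (++pos[r] < runs[r].size()) heap.emplace(runs[r][pos[r]], r);
  }
  return out;
}
```

Each output element costs one heap pop and at most one push, so the
comparison function sits on the hot path of the merge.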

SortedRunMerger builds and maintains a min heap of the sorted input
runs. Rewrote SortedRunMerger::Heapify from recursive to iterative
and moved it to a new source file: sorted-run-merger-ir.cc.
Added a static Codegen() to SortedRunMerger and called it from the
corresponding ExecNodes: SortNode and ExchangeNode.
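
The recursive-to-iterative rewrite described above can be sketched
roughly as follows. This is a simplified standalone illustration, not the
code from sorted-run-merger-ir.cc: the Run struct, the Less comparator
and BuildMinHeap are hypothetical stand-ins, and only the general
sift-down technique is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a sorted input run; only the value at the
// head of the run matters for heap ordering in this sketch.
struct Run {
  int head;
};

// Stand-in comparator (assumption): true if a orders before b. In Impala
// this role is played by TupleRowComparator.
inline bool Less(const Run& a, const Run& b) { return a.head < b.head; }

// Iterative sift-down: restores the min-heap property for the subtree
// rooted at 'parent', assuming both child subtrees are already heaps.
void Heapify(std::vector<Run>& heap, std::size_t parent) {
  assert(heap.empty() || parent < heap.size());
  const std::size_t n = heap.size();
  while (true) {
    const std::size_t left = 2 * parent + 1;
    const std::size_t right = left + 1;
    std::size_t smallest = parent;
    if (left < n && Less(heap[left], heap[smallest])) smallest = left;
    if (right < n && Less(heap[right], heap[smallest])) smallest = right;
    if (smallest == parent) return;  // Heap property holds; done.
    std::swap(heap[parent], heap[smallest]);
    parent = smallest;  // Continue one level down instead of recursing.
  }
}

// Builds a min heap bottom-up by sifting down every non-leaf node.
void BuildMinHeap(std::vector<Run>& heap) {
  for (std::size_t i = heap.size() / 2; i-- > 0;) Heapify(heap, i);
}
```

The loop replaces the tail call of the recursive form, which keeps the
function body flat and the comparator call easy to inline.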

This change lets the merger use the codegened version of
TupleRowComparator instead of the interpreted one, which can speed up
merging, especially for complex comparison expressions.
This change also serves as a base for further codegen-related
optimizations in the merger.

Testing:
 - ran existing E2E sort tests (test-sort.py)
 - manual testing: ran queries that instantiate sort nodes and
   merging exchange nodes
Benchmarking:
 - did not cause a regression on the TPC-H query set
 - made merge-intensive queries and IMPALA-4530 (in-memory merge of
   quicksorted small runs) faster

Change-Id: Ic35c7460bdbd54b8ec5872a83680e2f41ceae9fd
Reviewed-on: http://gerrit.cloudera.org:8080/18824
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Codegen Heapify in SortedRunMerger
> ----------------------------------
>
>                 Key: IMPALA-11477
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11477
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Noemi Pap-Takacs
>            Assignee: Noemi Pap-Takacs
>            Priority: Minor
>
> SortedRunMerger is used to merge multiple, already sorted runs. It is used for the external merge in the sorter (SortNode, PartialSortNode and TopNNode) and in the KRPC data stream receiver (ExchangeNode).
> SortedRunMerger builds and maintains a min heap of the sorted input runs. Codegening the Heapify function that maintains this heap, along with the comparator, could improve the performance of the merger.


