Posted to reviews@impala.apache.org by "Noemi Pap-Takacs (Code Review)" <ge...@cloudera.org> on 2022/08/04 14:56:44 UTC

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Noemi Pap-Takacs has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18802 )


Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................

IMPALA-11474: Codegen Tuple size in sorter-ir.cc

The number of bytes in a tuple is known before execution because it
is available in the query plan. However, the tuple size is currently
treated as a member variable of the TupleSorter.
Using codegen to replace this member variable with a constant in
sorter-ir.cc can speed up the quicksort phase by up to 20% in the
simplest queries.
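
To illustrate why a constant tuple size can help, here is a minimal sketch. It is not the code from this change: the patch substitutes the constant at query time through LLVM codegen, whereas the sketch uses a C++ template as a compile-time analogy. The names RuntimeSizeSorter, ConstSizeSorter and VariantsAgree are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a sorter that keeps the tuple size in a member.
// The size is only known at run time, so every copy is a generic memcpy.
struct RuntimeSizeSorter {
  std::size_t tuple_size;

  void Swap(uint8_t* a, uint8_t* b, uint8_t* tmp) const {
    std::memcpy(tmp, a, tuple_size);
    std::memcpy(a, b, tuple_size);
    std::memcpy(b, tmp, tuple_size);
  }
};

// The same swap with the size fixed at compile time. With a known, small
// size a compiler is free to turn each copy into a few loads and stores.
template <std::size_t kTupleSize>
struct ConstSizeSorter {
  void Swap(uint8_t* a, uint8_t* b, uint8_t* tmp) const {
    std::memcpy(tmp, a, kTupleSize);
    std::memcpy(a, b, kTupleSize);
    std::memcpy(b, tmp, kTupleSize);
  }
};

// Swaps the first two 8-byte tuples of a buffer with both variants and
// reports whether they leave identical bytes behind.
inline bool VariantsAgree() {
  constexpr std::size_t kSize = 8;
  std::vector<uint8_t> x = {1, 2, 3, 4, 5, 6, 7, 8,
                            9, 10, 11, 12, 13, 14, 15, 16};
  std::vector<uint8_t> y = x;
  uint8_t tmp[kSize];
  RuntimeSizeSorter{kSize}.Swap(x.data(), x.data() + kSize, tmp);
  ConstSizeSorter<kSize>{}.Swap(y.data(), y.data() + kSize, tmp);
  return x == y && x[0] == 9;
}
```

Both variants do the same work; the difference is only in how much the compiler knows about the copy size, which is the effect the measurements in this message are about.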

Some examples using tpch_parquet lineitem, scale factor=8:
disable_outermost_topn=1;
Query: select _ from lineitem order by _ limit 1;
+---------------+-------------------------+------+----------+----------+-------------+
|   Order by    |          Tuple          | NDV  | Constant | Variable | Improvement |
+---------------+-------------------------+------+----------+----------+-------------+
| rand()        | int, rand               | 48M  | 12.21s   | 14.21s   | 14%         |
| l_linenumber  | int                     | 7    | 1.81s    | 2.34s    | 22%         |
| l_orderkey    | bigint                  | 12M  | 5.38s    | 6.69s    | 19%         |
| l_receiptdate | string, decimal, bigint | 2600 | 17.3s    | 18.7s    | 7%          |
+---------------+-------------------------+------+----------+----------+-------------+
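
The message does not say how the Improvement column was computed. As an assumption, the values are consistent with (Variable - Constant) / Variable, truncated to a whole percent. A small sketch, with the hypothetical helper name ImprovementPercent:

```cpp
#include <cassert>

// Relative improvement in percent, truncated toward zero.
// Assumed formula (not stated in the message):
//   (variable_s - constant_s) / variable_s * 100
inline int ImprovementPercent(double constant_s, double variable_s) {
  return static_cast<int>((variable_s - constant_s) / variable_s * 100.0);
}
```

For example, the l_linenumber row gives (2.34 - 1.81) / 2.34, which is about 22.6% and truncates to 22.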

Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
---
M be/src/exec/partial-sort-node.cc
M be/src/exec/sort-node.cc
M be/src/exec/topn-node.cc
M be/src/runtime/sorter-internal.h
M be/src/runtime/sorter-ir.cc
M be/src/runtime/sorter.cc
6 files changed, 20 insertions(+), 8 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/18802/3
-- 
To view, visit http://gerrit.cloudera.org:8080/18802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 3
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Noemi Pap-Takacs (Code Review)" <ge...@cloudera.org>.
Noemi Pap-Takacs has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................

IMPALA-11474: Codegen Tuple size in sorter-ir.cc

The number of bytes in a tuple is known before execution because it
is available in the query plan. However, the tuple size is currently
treated as a member variable of the TupleSorter.
Using codegen to replace this member variable with a constant in
sorter-ir.cc can speed up the quicksort phase by up to 20% in the
simplest queries with small tuples.

Some examples using tpch_parquet lineitem, scale factor=8:
disable_outermost_topn=1;
Query: select _ from lineitem order by _ limit 1;
+---------------+-------------------------+------+----------+----------+-------------+
|   Order by    |          Tuple          | NDV  | Constant | Variable | Improvement |
+---------------+-------------------------+------+----------+----------+-------------+
| rand()        | int, rand               | 48M  | 12.21s   | 14.21s   | 14%         |
| l_linenumber  | int                     | 7    | 1.81s    | 2.34s    | 22%         |
| l_orderkey    | bigint                  | 12M  | 5.38s    | 6.69s    | 19%         |
| l_receiptdate | string, decimal, bigint | 2600 | 17.3s    | 18.7s    | 7%          |
+---------------+-------------------------+------+----------+----------+-------------+

Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
---
M be/src/exec/partial-sort-node.cc
M be/src/exec/sort-node.cc
M be/src/exec/topn-node.cc
M be/src/runtime/sorter-internal.h
M be/src/runtime/sorter-ir.cc
M be/src/runtime/sorter.cc
6 files changed, 20 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/18802/6

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 6
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11097/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 4
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Thu, 04 Aug 2022 15:26:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11096/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 3
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Thu, 04 Aug 2022 15:17:29 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................


Patch Set 6: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 6
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Aug 2022 11:21:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................


Patch Set 7: Verified+1



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 7
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Aug 2022 16:14:15 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................


Patch Set 4: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 4
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Thu, 04 Aug 2022 15:18:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18802/5/be/src/exec/partial-sort-node.cc
File be/src/exec/partial-sort-node.cc:

http://gerrit.cloudera.org:8080/#/c/18802/5/be/src/exec/partial-sort-node.cc@98
PS5, Line 98:  children_[0]->row_descriptor_->tuple_descriptors()[0]->byte_size()
I looked at this after seeing that the tests failed in Kudu inserts (which use partial sort).

This doesn't look right to me, as the child's tuple size can differ from the materialized sort tuple size. The child can also have multiple tuples.

We should use row_descriptor_'s tuple size, similarly to SortNode.
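
A minimal sketch of the problem described in this comment, using heavily simplified, hypothetical descriptor types (TupleDesc, RowDesc, SortNodeExample and the byte sizes are illustrative and are not Impala's real classes or values):

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified descriptors for illustration only.
struct TupleDesc { int byte_size; };
struct RowDesc { std::vector<TupleDesc> tuples; };

// Byte size of the first tuple in a row descriptor.
inline int FirstTupleSize(const RowDesc& row) {
  return row.tuples[0].byte_size;
}

// A sort node sees its child's row layout as input, but sorts its own
// materialized tuple, which can have a different size.
struct SortNodeExample {
  RowDesc child_row;  // input: may hold several tuples
  RowDesc own_row;    // output: the single materialized sort tuple
};

inline SortNodeExample MakeExample() {
  return SortNodeExample{RowDesc{{TupleDesc{24}, TupleDesc{8}}},
                         RowDesc{{TupleDesc{12}}}};
}
```

In this example, reading the size from the child's first tuple yields 24, while the tuple that is actually sorted has 12 bytes, so a constant taken from the child would be wrong.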




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 5
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Fri, 05 Aug 2022 07:13:44 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................


Patch Set 5: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 5
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Thu, 04 Aug 2022 15:19:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................


Patch Set 7: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 7
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Aug 2022 11:22:18 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8411/ DRY_RUN=false



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 7
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Aug 2022 11:22:19 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................

IMPALA-11474: Codegen Tuple size in sorter-ir.cc

The number of bytes in a tuple is known before execution because it
is available in the query plan. However, the tuple size is currently
treated as a member variable of the TupleSorter.
Using codegen to replace this member variable with a constant in
sorter-ir.cc can speed up the quicksort phase by up to 20% in the
simplest queries with small tuples.

Some examples using tpch_parquet lineitem, scale factor=8:
disable_outermost_topn=1;
Query: select _ from lineitem order by _ limit 1;
+---------------+-------------------------+------+----------+----------+-------------+
|   Order by    |          Tuple          | NDV  | Constant | Variable | Improvement |
+---------------+-------------------------+------+----------+----------+-------------+
| rand()        | int, rand               | 48M  | 12.21s   | 14.21s   | 14%         |
| l_linenumber  | int                     | 7    | 1.81s    | 2.34s    | 22%         |
| l_orderkey    | bigint                  | 12M  | 5.38s    | 6.69s    | 19%         |
| l_receiptdate | string, decimal, bigint | 2600 | 17.3s    | 18.7s    | 7%          |
+---------------+-------------------------+------+----------+----------+-------------+

Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Reviewed-on: http://gerrit.cloudera.org:8080/18802
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exec/partial-sort-node.cc
M be/src/exec/sort-node.cc
M be/src/exec/topn-node.cc
M be/src/runtime/sorter-internal.h
M be/src/runtime/sorter-ir.cc
M be/src/runtime/sorter.cc
6 files changed, 20 insertions(+), 8 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 8
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18802/3/be/src/runtime/sorter-internal.h
File be/src/runtime/sorter-internal.h:

http://gerrit.cloudera.org:8080/#/c/18802/3/be/src/runtime/sorter-internal.h@446
PS3, Line 446:   static Status Codegen(FragmentState* state, llvm::Function* compare_fn, int tuple_byte_size,
line too long (94 > 90)




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 3
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Thu, 04 Aug 2022 14:57:34 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11112/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 6
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Mon, 08 Aug 2022 11:35:32 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Noemi Pap-Takacs (Code Review)" <ge...@cloudera.org>.
Noemi Pap-Takacs has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................

IMPALA-11474: Codegen Tuple size in sorter-ir.cc

The number of bytes in a tuple is known before execution because it
is available in the query plan. However, the tuple size is currently
treated as a member variable of the TupleSorter.
Using codegen to replace this member variable with a constant in
sorter-ir.cc can speed up the quicksort phase by up to 20% in the
simplest queries.

Some examples using tpch_parquet lineitem, scale factor=8:
disable_outermost_topn=1;
Query: select _ from lineitem order by _ limit 1;
+---------------+-------------------------+------+----------+----------+-------------+
|   Order by    |          Tuple          | NDV  | Constant | Variable | Improvement |
+---------------+-------------------------+------+----------+----------+-------------+
| rand()        | int, rand               | 48M  | 12.21s   | 14.21s   | 14%         |
| l_linenumber  | int                     | 7    | 1.81s    | 2.34s    | 22%         |
| l_orderkey    | bigint                  | 12M  | 5.38s    | 6.69s    | 19%         |
| l_receiptdate | string, decimal, bigint | 2600 | 17.3s    | 18.7s    | 7%          |
+---------------+-------------------------+------+----------+----------+-------------+

Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
---
M be/src/exec/partial-sort-node.cc
M be/src/exec/sort-node.cc
M be/src/exec/topn-node.cc
M be/src/runtime/sorter-internal.h
M be/src/runtime/sorter-ir.cc
M be/src/runtime/sorter.cc
6 files changed, 20 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/18802/4

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 4
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8399/ DRY_RUN=false



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 5
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Thu, 04 Aug 2022 15:19:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11474: Codegen Tuple size in sorter-ir.cc

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18802 )

Change subject: IMPALA-11474: Codegen Tuple size in sorter-ir.cc
......................................................................


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8399/



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4161a61db1782dc448dae9a1d4c1d120b055b3c
Gerrit-Change-Number: 18802
Gerrit-PatchSet: 5
Gerrit-Owner: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <da...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <np...@cloudera.com>
Gerrit-Comment-Date: Thu, 04 Aug 2022 18:10:55 +0000
Gerrit-HasComments: No