You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/11/10 04:57:00 UTC
[jira] [Commented] (IMPALA-3816) Codegen perf-critical loops in
Sorter
[ https://issues.apache.org/jira/browse/IMPALA-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17228970#comment-17228970 ]
ASF subversion and git services commented on IMPALA-3816:
---------------------------------------------------------
Commit 1337adff6cc0c59a0f066b5c9a64352d929627f2 in impala's branch refs/heads/master from Qifan Chen
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1337adf ]
IMPALA-3816: Codegen perf critical loops in Sorter
This fix added the functionality to codegen recursive method
Sorter::TupleSorter::SortHelper() in sorter, which improves the
performance for both the sort and the partial sort operators.
In one unit test to order 7300 rows from table functional.alltypes,
the speedup of the code-gen version over non-code-gen version of
the method is around 65%. In another unit test to partially
order 2880404 rows, the speedup is around 61%.
Testing:
1. Unit testing;
2. Ran Core tests successfully.
Change-Id: Ie08137449d4a7b554ca8b8650260f8bd72e0a81b
Reviewed-on: http://gerrit.cloudera.org:8080/16621
Reviewed-by: Csaba Ringhofer <cs...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
> Codegen perf-critical loops in Sorter
> -------------------------------------
>
> Key: IMPALA-3816
> URL: https://issues.apache.org/jira/browse/IMPALA-3816
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 2.7.0
> Reporter: Tim Armstrong
> Assignee: Qifan Chen
> Priority: Minor
> Labels: codegen
> Attachments: 7e455a394bf2622b_adcea7ad00000000_opt.ll, percentile query profile.txt, tpch_30.txt
>
>
> In the sorter, we codegen the comparator function but call it indirectly via a function pointer. We should consider codegening the perf-critical loops so that we can make the comparator function call direct and inlinable. Inlining the comparison will be very beneficial if it is trivial, e.g. order by a numeric column: I expect sorts on simple keys will get noticably faster.
> We should also be able to get rid of FreeLocalAllocations() calls for most comparators, although I'm not sure what the best way to approach that is.
> The Partition() loop is the most perf-critical, followed by InsertionSort().
> We also don't do this yet for the TopN node, see IMPALA-3815.
> Mostafa's analysis:
> While evaluating Sort performance I noticed that the codegened compare function is not inlined which results in large overhead per row.
> Expected speedup is 10-15%
> {code}
> /// Returns a negative value if lhs is less than rhs, a positive value if lhs is
> /// greater than rhs, or 0 if they are equal. All exprs (ordering_exprs_lhs_ and
> /// ordering_exprs_rhs_) must have been prepared and opened before calling this,
> /// i.e. 'sort_key_exprs' in the constructor must have been opened.
> int ALWAYS_INLINE Compare(const TupleRow* lhs, const TupleRow* rhs) const {
> return codegend_compare_fn_ == NULL ?
> CompareInterpreted(lhs, rhs) :
> (*codegend_compare_fn_)(ordering_expr_evals_lhs_.data(),
> ordering_expr_evals_rhs_.data(), lhs, rhs);
> }
> {code}
> From Perf
> {code}
> │ bool Sorter::TupleSorter::Less(const TupleRow* lhs, const TupleRow* rhs) { ▒
> 7.43 │ push %rbp ▒
> 3.23 │ mov %rsp,%rbp ▒
> 9.44 │ push %r12 ▒
> 2.69 │ push %rbx ▒
> 3.89 │ mov %rsi,%r12 ▒
> 2.98 │ mov %rdi,%rbx ▒
> 6.06 │ sub $0x10,%rsp ◆
> │ --num_comparisons_till_free_; ▒
> │ DCHECK_GE(num_comparisons_till_free_, 0); ▒
> │ if (UNLIKELY(num_comparisons_till_free_ == 0)) { ▒
> 3.75 │ subl $0x1,0x18(%rdi) ▒
> 9.42 │ ↓ je 58 ▒
> │ parent_->expr_results_pool_.Clear(); ▒
> │ num_comparisons_till_free_ = state_->batch_size(); ▒
> │ } ▒
> │ return comparator_.Less(lhs, rhs); ▒
> 1.09 │17: mov 0x10(%rbx),%rdi ▒
> │ /// Returns a negative value if lhs is less than rhs, a positive value if lhs is ▒
> │ /// greater than rhs, or 0 if they are equal. All exprs (ordering_exprs_lhs_ and ▒
> │ /// ordering_exprs_rhs_) must have been prepared and opened before calling this, ▒
> │ /// i.e. 'sort_key_exprs' in the constructor must have been opened. ▒
> │ int ALWAYS_INLINE Compare(const TupleRow* lhs, const TupleRow* rhs) const { ▒
> │ return codegend_compare_fn_ == NULL ? ▒
> 2.69 │ mov 0x58(%rdi),%rax ▒
> │ CompareInterpreted(lhs, rhs) : ▒
> │ (*codegend_compare_fn_)(ordering_expr_evals_lhs_.data(), ▒
> │ ordering_expr_evals_rhs_.data(), lhs, rhs); ▒
> 5.43 │ test %rax,%rax ▒
> │ ↓ je 40 ▒
> 6.85 │ mov 0x20(%rdi),%rsi ▒
> 0.86 │ mov %rdx,%rcx ▒
> 2.55 │ mov 0x8(%rdi),%rdi ▒
> 3.38 │ mov %r12,%rdx ▒
> 6.10 │ → callq *(%rax) ▒
> │ } ▒
> 5.84 │ add $0x10,%rsp ▒
> │ /// All exprs (ordering_exprs_lhs_ and ordering_exprs_rhs_) must have been prepared ▒
> │ /// and opened before calling this. ▒
> │ /// Force inlining because it tends not to be always inlined at callsites, even in ▒
> │ /// hot loops. ▒
> │ bool ALWAYS_INLINE Less(const TupleRow* lhs, const TupleRow* rhs) const { ▒
> │ return Compare(lhs, rhs) < 0; ▒
> 1.77 │ shr $0x1f,%eax ▒
> 7.91 │ pop %rbx ▒
> 4.11 │ pop %r12 ▒
> 0.49 │ pop %rbp ▒
> 1.75 │ ← retq ▒
> │ /// i.e. 'sort_key_exprs' in the constructor must have been opened. ▒
> │ int ALWAYS_INLINE Compare(const TupleRow* lhs, const TupleRow* rhs) const { ▒
> │ return codegend_compare_fn_ == NULL ? ▒
> │ CompareInterpreted(lhs, rhs) : ▒
> │ (*codegend_compare_fn_)(ordering_expr_evals_lhs_.data(), ▒
> │ ordering_expr_evals_rhs_.data(), lhs, rhs); ▒
> │40: mov %r12,%rsi ▒
> │ → callq impala::TupleRowComparator::CompareInterpreted(impala::TupleRow const*, impala::TupleRow const*) const ▒
> │ add $0x10,%rsp ▒
> │ /// All exprs (ordering_exprs_lhs_ and ordering_exprs_rhs_) must have been prepared ▒
> Press 'h' for help on key bindings
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org