Posted to issues-all@impala.apache.org by "Riza Suminto (Jira)" <ji...@apache.org> on 2022/11/22 21:10:00 UTC

[jira] [Commented] (IMPALA-10001) Find good default value for SORT_RUN_BYTES_LIMIT

    [ https://issues.apache.org/jira/browse/IMPALA-10001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17637456#comment-17637456 ] 

Riza Suminto commented on IMPALA-10001:
---------------------------------------

Setting SORT_RUN_BYTES_LIMIT carries a risk of spilling unnecessarily when the query could actually fit all of its data in memory.
We have been using 512MB in our impala-tpcds-kit script for some time now:
[https://github.com/cloudera/impala-tpcds-kit/blob/d829fc392a70df8300a8d9fd265977fa078a2dab/scripts/impala-insert.sql#L8] 

I got to chat with [~noemi], who has been experimenting with the sort implementation a lot.
Generally, we don't want to set SORT_RUN_BYTES_LIMIT too low, as that causes overly frequent spilling. But we also don't want to set it so high that sorting in memory and then spilling an already oversized sort run blocks for minutes. SORT_RUN_BYTES_LIMIT=2G might be a good balance between in-memory sort time and spill time.
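
To make the trade-off a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It is a toy model, not Impala's sorter: it assumes a sort run is closed and spilled each time it reaches the limit, and the function name and the 10GB input size are made up for illustration.

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3


def estimate_sort_runs(input_bytes: int, run_bytes_limit: int) -> int:
    """Toy estimate of how many sorted runs a given limit produces.

    Assumption (not Impala's actual logic): a run is closed and spilled
    each time it reaches run_bytes_limit. A limit of 0 or less stands for
    "option disabled" and is modelled as a single in-memory run.
    """
    if input_bytes <= 0:
        return 0
    if run_bytes_limit <= 0:
        return 1
    return math.ceil(input_bytes / run_bytes_limit)


# Hypothetical 10GB sort input, compared under the two limits discussed above.
for label, limit in [("512MB", 512 * MB), ("2GB", 2 * GB)]:
    runs = estimate_sort_runs(10 * GB, limit)
    print(f"limit={label}: about {runs} runs")
```

Under this simplified model, a lower limit yields more, smaller runs (more frequent spills), while a higher limit yields fewer runs that each take longer to sort and write out. Real behavior also depends on the memory limit and row sizes, which the sketch ignores.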

> Find good default value for SORT_RUN_BYTES_LIMIT
> ------------------------------------------------
>
>                 Key: IMPALA-10001
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10001
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Perf Investigation
>            Reporter: Riza Suminto
>            Priority: Minor
>
> IMPALA-6692 added the query option SORT_RUN_BYTES_LIMIT to trigger an early sort before the query hits its memory limit.
> Currently, it is disabled by default. We need to find a good default value for this query option.


