Posted to issues-all@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/08/06 23:03:00 UTC

[jira] [Resolved] (IMPALA-10025) Avoid rebuilding in-memory heap during output phase of top-n

     [ https://issues.apache.org/jira/browse/IMPALA-10025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-10025.
------------------------------------
    Resolution: Invalid

I'm going to do this in the first version of the patch.

> Avoid rebuilding in-memory heap during output phase of top-n
> ------------------------------------------------------------
>
>                 Key: IMPALA-10025
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10025
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>
> In the patch for IMPALA-9853, we reuse some code in the output phase that requires building the in-memory heap from the sorter's output. This has inherent overhead that gets worse for larger limits and/or partition counts.
> It would be better to have the sorter do a full sort on the partition and ORDER BY columns and then apply the limit while streaming the results back from the sorter. In combination with IMPALA-10023, this would let us degrade gracefully to something closer to a regular sort, and would probably let us bump ANALYTIC_PUSHDOWN_THRESHOLD.
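
The approach described above (one full sort, then a per-partition limit applied while streaming) can be illustrated with a minimal Python sketch. This is not Impala's code: the function name, the key callables, and the use of Python's built-in sorted() as a stand-in for the sorter are all assumptions made for illustration.

```python
from itertools import groupby, islice
from operator import itemgetter


def streaming_partitioned_top_n(rows, partition_key, order_key, limit):
    """Yield at most `limit` rows per partition, in sorted order.

    Hypothetical illustration only; the names here are not Impala APIs.
    """
    # Stand-in for the sorter: a single full sort on
    # (partition columns, ORDER BY columns).
    sorted_rows = sorted(rows, key=lambda r: (partition_key(r), order_key(r)))

    # Stream the sorted output back. Rows of one partition are adjacent,
    # so taking the first `limit` rows of each group applies the limit
    # without building or rebuilding a heap.
    for _, group in groupby(sorted_rows, key=partition_key):
        yield from islice(group, limit)


# Usage: keep the two smallest values per partition.
rows = [("a", 3), ("b", 1), ("a", 1), ("a", 2), ("b", 5), ("c", 9)]
top2 = list(streaming_partitioned_top_n(rows, itemgetter(0), itemgetter(1), 2))
```

In this sketch the output phase keeps only constant per-partition state (how many rows have been taken from the current group), whereas a heap-based output phase has to hold and reorder up to `limit` rows for each partition.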



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
