Posted to issues@impala.apache.org by "Michael Ho (JIRA)" <ji...@apache.org> on 2018/11/26 22:27:00 UTC

[jira] [Resolved] (IMPALA-2885) Scanners store per-split objects in per-query object pool

     [ https://issues.apache.org/jira/browse/IMPALA-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Ho resolved IMPALA-2885.
--------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.7.0

The other patch I was working on was abandoned, so I am closing this for now. The code has changed a lot since then; I will reopen this JIRA if a cleanup opportunity comes up in the future.

> Scanners store per-split objects in per-query object pool
> ---------------------------------------------------------
>
>                 Key: IMPALA-2885
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2885
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.5.0
>            Reporter: Tim Armstrong
>            Assignee: Michael Ho
>            Priority: Minor
>              Labels: resource-management
>             Fix For: Impala 2.7.0
>
>
> Various scanners store control structures in RuntimeState::object_pool_, which is only cleaned up at the end of the query. Since some of these control structures are allocated for every input split, a small amount of memory is wasted on control structures that are no longer needed. If a query processes a large number of scan ranges and columns, this can add megabytes or tens of megabytes to the query's memory consumption.
> I added some logging and saw that for a largish scan there were 10000+ objects in the object pool.
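
To illustrate the lifetime issue described above, here is a minimal sketch in C++. It is not Impala's code: SimpleObjectPool, SplitContext and PeakLiveObjects are hypothetical names invented for this example, and the pool only mimics the general idea of "own objects until the pool is cleared". It compares putting per-split control objects into one long-lived (per-query) pool against a pool that lives only as long as the split.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an owning pool. NOT Impala's ObjectPool;
// it only frees its objects when cleared or destroyed.
class SimpleObjectPool {
 public:
  // Takes ownership of obj and hands the raw pointer back to the caller.
  template <typename T>
  T* Add(T* obj) {
    objects_.emplace_back(std::shared_ptr<T>(obj));
    return obj;
  }
  void Clear() { objects_.clear(); }
  std::size_t size() const { return objects_.size(); }

 private:
  // shared_ptr<void> remembers the correct deleter for each stored type.
  std::vector<std::shared_ptr<void>> objects_;
};

// Hypothetical per-split control structure.
struct SplitContext {
  std::size_t split_id;
};

// Simulates a scan over num_splits splits and returns the largest number
// of control objects alive at any one time.
std::size_t PeakLiveObjects(bool use_per_query_pool, std::size_t num_splits,
                            std::size_t objects_per_split) {
  SimpleObjectPool query_pool;  // lives until the "query" ends
  std::size_t peak = 0;
  for (std::size_t split = 0; split < num_splits; ++split) {
    SimpleObjectPool split_pool;  // destroyed at the end of this iteration
    SimpleObjectPool& pool = use_per_query_pool ? query_pool : split_pool;
    for (std::size_t i = 0; i < objects_per_split; ++i) {
      pool.Add(new SplitContext{split});
    }
    peak = std::max(peak, query_pool.size() + split_pool.size());
  }
  return peak;
}
```

With the per-query pool the number of live objects grows with the number of splits (PeakLiveObjects(true, 100, 3) gives 300), whereas with a per-split pool it stays bounded by the objects needed for one split (PeakLiveObjects(false, 100, 3) gives 3).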



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)