Posted to issues@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2017/09/06 00:16:00 UTC

[jira] [Resolved] (IMPALA-5885) Parquet scanner does not free local allocations in filter contexts

     [ https://issues.apache.org/jira/browse/IMPALA-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-5885.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.11.0


IMPALA-5885: free runtime filter allocations in Parquet

This fixes the Parquet scanner so that it frees local allocations in
runtime filter contexts after every batch.
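
The difference between freeing per scan range and freeing per batch can
be illustrated with a small Python model. This is only a sketch under
stated assumptions: FilterContext, eval, free_local_allocations and
scan_range are hypothetical names made up for illustration, not Impala's
actual C++ classes, and a byte counter stands in for real temporary
buffers.

```python
class FilterContext:
    """Hypothetical stand-in for a runtime filter's expression context."""

    def __init__(self):
        self.local_bytes = 0       # temporary memory currently held
        self.peak_local_bytes = 0  # high-water mark of local_bytes

    def eval(self, value):
        # Model a string function such as upper(): the result is a
        # temporary copy that stays allocated until it is explicitly freed.
        result = value.upper()
        self.local_bytes += len(result)
        self.peak_local_bytes = max(self.peak_local_bytes, self.local_bytes)
        return result

    def free_local_allocations(self):
        self.local_bytes = 0


def scan_range(batches, ctx, free_per_batch):
    """Evaluate the filter expression on every row; return peak temporary bytes."""
    for batch in batches:
        for row in batch:
            ctx.eval(row)
        if free_per_batch:
            # Behaviour described by the fix: release temporaries per batch.
            ctx.free_local_allocations()
    # Both variants release whatever is left when the scan range completes.
    ctx.free_local_allocations()
    return ctx.peak_local_bytes


# 5 batches of 10 rows, each row a 4-character string.
batches = [["abcd"] * 10 for _ in range(5)]
print(scan_range(batches, FilterContext(), free_per_batch=False))  # 200
print(scan_range(batches, FilterContext(), free_per_batch=True))   # 40
```

In this model the peak grows with the number of rows in the scan range
when temporaries are only freed at the end, and is bounded by the size of
one batch when they are freed after each batch.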

Testing:
Added a regression test that runs out of memory without this fix.

Ran core and ASAN builds.

Change-Id: Iecdda6af12d5ca578f7d2cb393e9cb9f49438f09
Reviewed-on: http://gerrit.cloudera.org:8080/7931
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins

> Parquet scanner does not free local allocations in filter contexts
> ------------------------------------------------------------------
>
>                 Key: IMPALA-5885
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5885
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>              Labels: resource-management
>             Fix For: Impala 2.11.0
>
>
> This problem can occur if runtime filter expressions that are evaluated in the scan allocate temporary memory ("local allocations"). These accumulate for each scan range and are only freed upon scan range completion.
> A contrived query that exhibits the problem is the following. If I keep adding upper() and lower() calls to the expression, the memory consumption of the scan node continues to grow, by up to 100MB for each extra function call!
> {code}
> set runtime_filter_wait_time_ms=1000000;
> select straight_join count(*) from tpch_parquet.lineitem l1 join tpch_parquet.lineitem l2 on upper(lower(upper(lower(l1.l_comment)))) = concat(l2.l_comment, 'foo');
> summary;
> {code}
> I think other conjuncts in the scanner may be affected by the same problem, e.g. the min_max conjuncts.


