Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/08/23 18:43:00 UTC

[jira] [Commented] (DRILL-5737) Hash Agg uses more than the allocated memory under certain low memory conditions

    [ https://issues.apache.org/jira/browse/DRILL-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138863#comment-16138863 ] 

ASF GitHub Bot commented on DRILL-5737:
---------------------------------------

GitHub user sohami opened a pull request:

    https://github.com/apache/drill/pull/920

    DRILL-5737: Hash Agg uses more than the allocated memory under certain low memory conditions
    
                Note: Adds a new config parameter, HASHAGG_FALLBACK_ENABLED, which is set to true by default. When 2-phase
                HashAgg doesn't have enough memory to hold 2 partitions, this flag decides whether it falls back to the old
                behavior of consuming unbounded memory or fails the query.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sohami/drill DRILL-5737

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/920.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #920
    
----
commit b534fd67908ea6cc5471c650bd6c3ae1c193d2f4
Author: Sorabh Hamirwasia <sh...@maprtech.com>
Date:   2017-08-23T01:20:51Z

    DRILL-5737: Hash Agg uses more than the allocated memory under certain low memory conditions
                Note: Adds a new config parameter, HASHAGG_FALLBACK_ENABLED, which is set to true by default. When 2-phase
                HashAgg doesn't have enough memory to hold 2 partitions, this flag decides whether it falls back to the old
                behavior of consuming unbounded memory or fails the query.

----


> Hash Agg uses more than the allocated memory under certain low memory conditions
> --------------------------------------------------------------------------------
>
>                 Key: DRILL-5737
>                 URL: https://issues.apache.org/jira/browse/DRILL-5737
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Sorabh Hamirwasia
>            Assignee: Sorabh Hamirwasia
>
> Reported by [~rkins]
> Based on its memory computations, Drill decides that there is not sufficient memory and falls back to the single-partition case. The single-partition case, however, does not respect the imposed memory constraints and completes the query using ~130MB of memory.
> {code:sql}
> alter session set `planner.width.max_per_node` = 1;
> alter session set `planner.memory.max_query_memory_per_node` = 117127360;
> select count(*) from (select max(nulls_col), max(length(nulls_col)), max(`filename`) from dfs.`/drill/testdata/hash-agg/data1` group by no_nulls_col) d;
> {code}
> Based on analysis by [~ben-zvi], this is by design. When the Hash Agg operator finds that there is not enough memory for at least two partitions, it falls back to the pre-1.11 behavior (using a 10GB limit).
> The solution is to provide a configuration option that determines whether the fallback is allowed or the query fails.
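
The decision described in the issue can be illustrated with a minimal Java sketch. This is not Drill's actual code: the class, enum, and method names are hypothetical, and the per-partition size estimate is an assumed input chosen only for illustration. Only the two-partition minimum, the fallback flag, and the 117127360-byte limit come from the issue text.

```java
public class HashAggFallbackSketch {

    /** Possible outcomes of the memory check (hypothetical names). */
    enum Decision { SPILL_WITH_PARTITIONS, FALLBACK_UNBOUNDED, FAIL_QUERY }

    // The issue states that at least two partitions must fit in memory.
    static final int MIN_PARTITIONS = 2;

    /**
     * Decides how the aggregation proceeds.
     *
     * @param memoryLimitBytes        memory allocated to the operator
     * @param estimatedPartitionBytes assumed size of one partition (illustrative input)
     * @param fallbackEnabled         stands in for the HASHAGG_FALLBACK_ENABLED flag
     */
    static Decision decide(long memoryLimitBytes,
                           long estimatedPartitionBytes,
                           boolean fallbackEnabled) {
        if (estimatedPartitionBytes <= 0) {
            throw new IllegalArgumentException("partition estimate must be positive");
        }
        long partitionsThatFit = memoryLimitBytes / estimatedPartitionBytes;
        if (partitionsThatFit >= MIN_PARTITIONS) {
            // Enough room: stay within the memory limit.
            return Decision.SPILL_WITH_PARTITIONS;
        }
        // Not enough room for two partitions: the flag picks the outcome.
        return fallbackEnabled ? Decision.FALLBACK_UNBOUNDED : Decision.FAIL_QUERY;
    }

    public static void main(String[] args) {
        long limit = 117_127_360L;              // limit used in the issue's repro
        long perPartition = 64L * 1024 * 1024;  // hypothetical 64 MiB estimate
        System.out.println(decide(limit, perPartition, true));
        System.out.println(decide(limit, perPartition, false));
    }
}
```

With the flag enabled, a query in this situation still runs but may exceed its memory allocation; with the flag disabled, it fails instead.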



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)