Posted to issues@drill.apache.org by "Julian Hyde (JIRA)" <ji...@apache.org> on 2016/03/01 20:32:18 UTC

[jira] [Commented] (DRILL-4460) Provide feature that allows fall back to sort aggregation

    [ https://issues.apache.org/jira/browse/DRILL-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15174268#comment-15174268 ] 

Julian Hyde commented on DRILL-4460:
------------------------------------

Falling back to external hashing would be another viable solution to the problem. Compared with sorting, though, it is a little more expensive to switch from in-memory hashing to external hashing once you discover that the data set is larger than you expected: hashing uses a different data structure for external data, whereas sorting uses essentially the same data structure.
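
To illustrate why sorting moves from memory to external storage so easily, here is a minimal Python sketch of a sort-based aggregation built from sorted runs. It is not Drill's implementation: the function name, the (key, value) row shape, the sum aggregate, and the row limit are all assumptions made for illustration, and each "spilled" run is kept as an in-memory list standing in for a spill file.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def sort_aggregate(rows, max_rows_in_memory):
    """Sum values per key using sorted runs (hypothetical sketch).

    Each sorted run stands in for a spill file. The same structure,
    a sorted run, is used whether the data fits in memory or not.
    """
    runs, buffer = [], []
    for row in rows:
        buffer.append(row)
        if len(buffer) >= max_rows_in_memory:
            # "Spill": sort the full buffer and set it aside as a run.
            runs.append(sorted(buffer, key=itemgetter(0)))
            buffer = []
    if buffer:
        runs.append(sorted(buffer, key=itemgetter(0)))

    # Merge the sorted runs by key, then aggregate adjacent equal keys.
    merged = heapq.merge(*runs, key=itemgetter(0))
    return [
        (key, sum(value for _, value in group))
        for key, group in groupby(merged, key=itemgetter(0))
    ]
```

With a generous row limit there is a single run and no merge work to speak of; with a tight limit the only change is that more runs are merged. An in-memory hash table has no such direct external counterpart, which is the extra switching cost described above.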

> Provide feature that allows fall back to sort aggregation
> ---------------------------------------------------------
>
>                 Key: DRILL-4460
>                 URL: https://issues.apache.org/jira/browse/DRILL-4460
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Flow
>    Affects Versions: 1.5.0
>            Reporter: John Omernik
>
> Currently, Drill defaults to an in-memory hash model for aggregations (planner.enable_hashagg = true by default).  This works well, but it is memory dependent, and an out-of-memory condition will cause the query to fail.  At that point, a user can run alter session set `planner.enable_hashagg` = false and run the query again. If memory is still a constraint, the sort-based approach will spill to disk, allowing the query to complete (more slowly).
> What I am requesting is a feature, off by default (so Drill's default behavior stays the same after the feature is added), that would allow a query that tried hash aggregation and failed due to an out-of-memory condition to be restarted with sort aggregation.  In other words, to let the query succeed, Drill would try hash first, then fall back to sort.  This would make for a better user experience in that the query would succeed. Perhaps a warning could be issued so that the user understands this occurred and can switch to sort-based aggregation by default in the future.
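
The requested behavior can be sketched roughly as follows, in Python. This is an illustration of the idea only, not Drill code: the exception class, the function names, the `fallback_enabled` flag, and the use of a group-count limit as a stand-in for a memory limit are all hypothetical.

```python
from itertools import groupby
from operator import itemgetter


class HashMemoryExceeded(Exception):
    """Hypothetical stand-in for an out-of-memory failure in hash aggregation."""


def hash_aggregate(rows, max_groups):
    """Sum values per key in a dict; fail once the group limit is exceeded."""
    totals = {}
    for key, value in rows:
        if key not in totals and len(totals) >= max_groups:
            raise HashMemoryExceeded(f"more than {max_groups} groups")
        totals[key] = totals.get(key, 0) + value
    return sorted(totals.items())


def sort_aggregate_in_memory(rows):
    """Sum values per key by sorting first (a real engine could spill here)."""
    ordered = sorted(rows, key=itemgetter(0))
    return [
        (key, sum(value for _, value in group))
        for key, group in groupby(ordered, key=itemgetter(0))
    ]


def aggregate_with_fallback(rows, max_groups, fallback_enabled=False):
    """Try hash aggregation; optionally restart with sort aggregation.

    Returns (result, strategy_used, warning_or_None).
    """
    rows = list(rows)  # the input must be replayable for the restart
    try:
        return hash_aggregate(rows, max_groups), "hash", None
    except HashMemoryExceeded as exc:
        if not fallback_enabled:
            raise  # default behavior is unchanged: the query fails
        warning = f"hash aggregation failed ({exc}); restarted with sort aggregation"
        return sort_aggregate_in_memory(rows), "sort", warning
```

With `fallback_enabled=False` the failure propagates exactly as before, matching the request that the feature be off by default. With it enabled, the caller receives the result together with a warning explaining that the sort path was used.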


