Posted to issues@drill.apache.org by "Aman Sinha (JIRA)" <ji...@apache.org> on 2015/02/27 17:53:04 UTC

[jira] [Updated] (DRILL-2327) Upper bound on join's row_count_estimate_factor should be increased to handle expanding joins

     [ https://issues.apache.org/jira/browse/DRILL-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aman Sinha updated DRILL-2327:
------------------------------
    Attachment: 0001-DRILL-2327-Raise-the-max-value-allowed-for-join-s-ca.patch

The row_count_estimate_factor parameter is of type double, so I raised its upper bound to Double.MAX_VALUE. Note that even if NDV (number of distinct values) statistics for the join columns were available, there are certain joins (particularly expanding joins and joins whose columns are correlated with one another) for which good estimates are difficult to obtain. Hence, a knob to control the estimated cardinality is useful for the optimizer.
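
As a rough illustration of why the old cap of 100 is too small for expanding joins, here is a minimal Java sketch. It is not the code from the attached patch: the class name, the method names, and the simple "factor times baseline" model are assumptions made for illustration only, and the actual estimate formula used by the planner is not shown in this message.

```java
// Hypothetical sketch; names and the scaling model are illustrative assumptions.
final class JoinEstimateSketch {
    // Lower bound taken from the issue description (0); upper bound as described
    // in the comment above (Double.MAX_VALUE instead of 100).
    static final double MIN_FACTOR = 0.0;
    static final double MAX_FACTOR = Double.MAX_VALUE;

    /** Returns the factor unchanged, or throws if it is NaN or outside the allowed range. */
    static double validateFactor(double factor) {
        if (Double.isNaN(factor) || factor < MIN_FACTOR || factor > MAX_FACTOR) {
            throw new IllegalArgumentException("factor out of range: " + factor);
        }
        return factor;
    }

    /** Assumed model: the join's estimated row count is the baseline scaled by the factor. */
    static double scaledEstimate(double baselineRows, double factor) {
        return validateFactor(factor) * baselineRows;
    }

    public static void main(String[] args) {
        double baselineRows = 1_000_000.0;
        // Default factor of 1.0 leaves the baseline estimate untouched.
        System.out.println(scaledEstimate(baselineRows, 1.0));
        // A factor above the old cap of 100, as a hugely expanding join might need.
        System.out.println(scaledEstimate(baselineRows, 5000.0));
    }
}
```

With the old upper bound, the second call would have been rejected at validation time. In Drill itself the option would normally be changed with a statement such as ALTER SESSION SET `planner.join.row_count_estimate_factor` = 5000.0, where the value 5000.0 is only an example.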

[~jni] could you please take a look at this minor change?

> Upper bound on join's row_count_estimate_factor should be increased to handle expanding joins
> ---------------------------------------------------------------------------------------------
>
>                 Key: DRILL-2327
>                 URL: https://issues.apache.org/jira/browse/DRILL-2327
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Query Planning & Optimization
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>             Fix For: 0.8.0
>
>         Attachments: 0001-DRILL-2327-Raise-the-max-value-allowed-for-join-s-ca.patch
>
>
> The current bounds for planner.join.row_count_estimate_factor are 0 to 100. The default value is 1.0. This parameter determines the estimated output cardinality of a join. For hugely expanding joins, this range is inadequate and we need to allow a substantially larger upper bound.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)