You are viewing a plain text version of this content. The canonical link for it is here.

Posted to derby-dev@db.apache.org by "Mamta A. Satoor (JIRA)" <ji...@apache.org> on 2012/10/02 07:25:07 UTC

[jira] [Updated] (DERBY-2482) Improve optimizer costing for "probe predicates" so that index multi-probing is more likely to occur.

     [ https://issues.apache.org/jira/browse/DERBY-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mamta A. Satoor updated DERBY-2482:
-----------------------------------

    Urgency: Normal
     Labels: derby_triage10_10  (was: )
    
> Improve optimizer costing for "probe predicates" so that index multi-probing is more likely to occur.
> -----------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-2482
>                 URL: https://issues.apache.org/jira/browse/DERBY-2482
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.3.1.4
>            Reporter: A B
>            Priority: Minor
>              Labels: derby_triage10_10
>
> Pulled from the write-up for DERBY-47:
> <start>
> The costing logic for probing as implemented in DERBY-47 is not perfect. It works great in situations where the IN list is significantly smaller than the number of rows in the table--ex. we see good results for the test program attached to that issue when there are 100k rows in the target table. However, I discovered that if we just run with 10,000 rows, then once we hit 1,000 values in the IN list the costing of probe predicates causes the optimizer to think that it would be too expensive, so it (the optimizer) ends up doing a table scan. In truth the table scan is still far slower than index probing, but the relative size of the IN list with respect to the number of rows in the table throws the costing off. To confirm this I just removed the probing cost logic (so that it effectively becomes the cost of a single "col = ?" predicate) and then the optimizer chose to do index probing for the 10,000 row scenario, which was much, much faster (as expected). 
> <end>
> So it would be great if the costing could be improved such that the optimizer will recognize cases like this and thus can choose to do multi-probing.  DERBY-47 showed that multi-probing can significantly improve execution times, so the more we can get the optimizer to (correctly) choose to do it, the better.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira