You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "Mike Matrigali (JIRA)" <ji...@apache.org> on 2014/01/22 19:27:21 UTC
[jira] [Comment Edited] (DERBY-6301) SQL layer should push down IN list predicates to store when doing a scan

    [ https://issues.apache.org/jira/browse/DERBY-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13804635#comment-13804635 ] 

Mike Matrigali edited comment on DERBY-6301 at 1/22/14 6:26 PM:
----------------------------------------------------------------

I believe multi-probe is only picked when internally we have an in-list.  We will get internal in-list either in the case of a user specified
IN-list or the code may (maybe always - i am not sure) convert a set of equality OR's to an internal IN-list.

I did not mean to say there is a current bound.  I just remember that a long time ago there were derby compiler problems with very long
IN-lists, and Dan did work in this area to avoid them.  So whatever we do in this area we should think about how it might affect the size of
a compiled query plan with 1000's of terms in an IN-list.  

There is a point when the optimizer will not pick multi-probe on an IN-list even if a good index exists, but it is more of a costing issue.  The cost to
 probe each term is
more than a normal scan so as the number of terms gets close to or exceeds the number of rows in the table then multi-probe will not be
picked.

In general I think the more we can get the sql layer to push predicates into store the better.   Read committed locking works best when this
is the case, and I assume it also fixes issues with virtual tables and predicates.


was (Author: mikem):
I believe multi-probe is only picked when internally we have an in-list.  We will get internal in-list either in the case of a user specified
IN-list or the code may (maybe always - i am not sure) convert a set of equality OR's to an internal IN-list.

I did not mean to say there is a current bound.  I just remember that a long time ago there were derby compiler problems with very long
IN-lists, and Dan did work in this area to avoid them.  So whatever we do in this area we should think about how it might affect the size of
a compiled query plan with 1000's of terms in an IN-list.  

There is a point when the optimizer will not pick multi-probe on an IN-list even if a good index exists, but it is more of a costing issue.  The cost to
 probe each term is
more than a normal scan so as the number of terms gets close to or exceeds the number of rows in the table then multi-probe will not be
picked.

In general I think the more we can get the sql layer to push predicates into store the better.   Read committed locking worst best when this
is the case, and I assume it also fixes issues with virtual tables and predicates.

> SQL layer should push down IN list predicates to store when doing a scan
> ------------------------------------------------------------------------
>
>                 Key: DERBY-6301
>                 URL: https://issues.apache.org/jira/browse/DERBY-6301
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.10.1.1
>            Reporter: Mike Matrigali
>
> The store interface allows for OR and AND qualifiers to be passed down to store as part of either
> a heap or btree scan.  It is more efficient to qualify the rows at the lowest levels.  The SQL level
> does not seem to  push any qualifier in the case of IN lists.
> This does not matter if the optimizer choses the multi-probe execution strategy for the IN list as that also
> qualifies the row at the lowest level.
> The problem arises when the optimizer chooses not to do multi-probe, for instance if it determines there
> are too many terms in the in-list relative to the size of the table and the cardinality of the terms.  In this
> case it chooses a scan with no qualifiers which results in all rows being returned to the sql layer and qualified there.  
> In addition to performance considerations this presents a locking problem with respect to the repeatable read isolation level.   It is optimal in repeatable read to not maintain locks on those
> rows that do not qualify.  Currently this locking optimization only takes place for those rows that
> are qualified in the store vs. those qualified in the upper SQL layer.  So in the case of a non-multi-probe IN-LIST plan all non-qualified rows looked at as part of the execution will remain locked in repeatable 
> read.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)