You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "Mamta A. Satoor (JIRA)" <ji...@apache.org> on 2014/01/28 18:56:38 UTC

[jira] [Commented] (DERBY-6301) SQL layer should push down IN list predicates to store when doing a scan

    [ https://issues.apache.org/jira/browse/DERBY-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884363#comment-13884363 ] 

Mamta A. Satoor commented on DERBY-6301:
----------------------------------------

If a query is using multi-probe, will it be reflected in the query plan? 

I have a test db with following table
create table t1(c11 int primary key, c12 int, c13 int, c14 int);
Table t1 has 200 rows with c11 values going from 1-200 but rest of the columns have 1 in them for all 200 rows. 

I ran two queries against this table with in list as follows
select c11 from t1 where c11 in (1,2,30,100,186);
select c11 from t1 where c11  in (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30);

The first query uses index scan but I do not see any direct indication of multi-probe use. The query plan for the first query is as shown below
(SESSIONID = 1), select c11 from t1 where c11 in (1,2,30,100,186) ******* Index Scan ResultSet for T1 using constraint SQL140123220605150 at read committed isolation level using share row locking chosen by the optimizer
Number of opens = 5
Rows seen = 5
Rows filtered = 0
Fetch Size = 1
	constructor time (milliseconds) = 0
	open time (milliseconds) = 0
	next time (milliseconds) = 0
	close time (milliseconds) = 0
	next time in milliseconds/row = 0

scan information:
	Bit set of columns fetched={0}
	Number of columns fetched=1
	Number of deleted rows visited=0
	Number of pages visited=11
	Number of rows qualified=5
	Number of rows visited=10
	Scan type=btree
	Tree height=-1
	start position:
		>= on first 1 column(s).
		Ordered null semantics on the following columns: 
	stop position:
		> on first 1 column(s).
		Ordered null semantics on the following columns: 
	qualifiers:
		None
	optimizer estimated row count: 5.12
	optimizer estimated cost: 17.64

The query plan for 2nd query is as shown below(it is using table scan)
Tue Jan 28 08:33:34 PST 2014 Thread[main,5,main] (XID = 637), (SESSIONID = 1), select c11 from t1 where c11  in (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30) ******* Project-Restrict ResultSet (2):
Number of opens = 1
Rows seen = 200
Rows filtered = 170
restriction = true
projection = false
	constructor time (milliseconds) = 0
	open time (milliseconds) = 0
	next time (milliseconds) = 0
	close time (milliseconds) = 0
	restriction time (milliseconds) = 0
	projection time (milliseconds) = 0
	optimizer estimated row count: 20.50
	optimizer estimated cost: 78.76
Source result set:
	Table Scan ResultSet for T1 at read committed isolation level using instantaneous share row locking chosen by the optimizer
	Number of opens = 1
	Rows seen = 200
	Rows filtered = 0
	Fetch Size = 16
		constructor time (milliseconds) = 0
		open time (milliseconds) = 0
		next time (milliseconds) = 0
		close time (milliseconds) = 0
		next time in milliseconds/row = 0

	scan information:
		Bit set of columns fetched={0}
		Number of columns fetched=1
		Number of pages visited=3
		Number of rows qualified=200
		Number of rows visited=200
		Scan type=heap
		start position:
			null
		stop position:
			null
		qualifiers:
			None
		optimizer estimated row count: 20.50
		optimizer estimated cost: 78.76


> SQL layer should push down IN list predicates to store when doing a scan
> ------------------------------------------------------------------------
>
>                 Key: DERBY-6301
>                 URL: https://issues.apache.org/jira/browse/DERBY-6301
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.10.1.1
>            Reporter: Mike Matrigali
>
> The store interface allows for OR and AND qualifiers to be passed down to store as part of either
> a heap or btree scan.  It is more efficient to qualify the rows at the lowest levels.  The SQL level
> does not seem to  push any qualifier in the case of IN lists.
> This does not matter if the optimizer choses the multi-probe execution strategy for the IN list as that also
> qualifies the row at the lowest level.
> The problem arises when the optimizer chooses not to do multi-probe, for instance if it determines there
> are too many terms in the in-list relative to the size of the table and the cardinality of the terms.  In this
> case it chooses a scan with no qualifiers which results in all rows being returned to the sql layer and qualified there.  
> In addition to performance considerations this presents a locking problem with respect to the repeatable read isolation level.   It is optimal in repeatable read to not maintain locks on those
> rows that do not qualify.  Currently this locking optimization only takes place for those rows that
> are qualified in the store vs. those qualified in the upper SQL layer.  So in the case of a non-multi-probe IN-LIST plan all non-qualified rows looked at as part of the execution will remain locked in repeatable 
> read.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)