You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "Alex Vasiliev (JIRA)" <ji...@apache.org> on 2017/10/05 03:40:00 UTC
[jira] [Created] (DERBY-6967) Query Optimizer picks Table Scan
based on WHERE predicate, ignores expensive ORDER BY.
Alex Vasiliev created DERBY-6967:
------------------------------------
Summary: Query Optimizer picks Table Scan based on WHERE predicate, ignores expensive ORDER BY.
Key: DERBY-6967
URL: https://issues.apache.org/jira/browse/DERBY-6967
Project: Derby
Issue Type: Bug
Affects Versions: 10.13.1.1
Reporter: Alex Vasiliev
h5. Problem:
It is appeared that query optimizer mises dramatically for a fairly simple query:
{code:sql}
SELECT * FROM Customer
WHERE ID >= 0
ORDER BY ID
-- optionally: FETCH NEXT 2 ROWS ONLY;
{code}
The "ID >= 0" predicate is expected to return almost entire table (true), so optimizer decides to utilize Table Scan to avoid having to resolve index row references, but it should take into account that then it need to order the result set (sort), which is a way more expensive operation. Especially if it's followed by, let's say, "FETCH NEXT 2 ROWS ONLY" which would explicitly indicate that there is no need to iterate through the entire table.
h5. Steps to recreate:
{code:sql}
CREATE TABLE Customer (
ID BIGINT NOT NULL GENERATED ALWAYS AS IDENTITY,
Name VARCHAR(1024) NOT NULL
);
INSERT INTO Customer (Name) VALUES ('Alex'), ('Peter'), ('Bob'), ('Fred');
CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1);
SELECT * FROM Customer WHERE ID >= 0 ORDER BY ID;
VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();
{code}
h5. Result:
{code}
Statement Name:
null
Statement Text:
SELECT * FROM Customer WHERE ID >= 0 ORDER BY ID
Parse Time: 0
Bind Time: 0
Optimize Time: 0
Generate Time: 0
Compile Time: 0
Execute Time: 0
Begin Compilation Timestamp : null
End Compilation Timestamp : null
Begin Execution Timestamp : null
End Execution Timestamp : null
Statement Execution Plan Text:
Sort ResultSet:
Number of opens = 1
Rows input = 4
Rows returned = 4
Eliminate duplicates = false
In sorted order = false
Sort information:
Number of rows input=4
Number of rows output=4
Sort type=internal
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
optimizer estimated row count: 2.97
optimizer estimated cost: 39.32
Source result set:
Table Scan ResultSet for CUSTOMER at read committed isolation level using instantaneous share row locking chosen by the optimizer
Number of opens = 1
Rows seen = 4
Rows filtered = 0
Fetch Size = 16
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
next time in milliseconds/row = 0
scan information:
Bit set of columns fetched=All
Number of columns fetched=2
Number of pages visited=1
Number of rows qualified=4
Number of rows visited=4
Scan type=heap
start position:
null
stop position:
null
qualifiers:
Column[0][0] Id: 0
Operator: <
Ordered nulls: false
Unknown return value: true
Negate comparison result: true
optimizer estimated row count: 2.97
optimizer estimated cost: 39.32
{code}
^ Note that Table Scan is picked and not an index over ID.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)