You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by Harri Pesonen <fu...@nic.fi> on 2006/01/26 19:35:27 UTC
A few questions about index usage
I have been testing Derby for a few days. I noticed a couple of odd
things. Table definition is
create table Customer (
Id bigint generated by default as identity primary key,
Name varchar(128),
Modified timestamp default current_timestamp
)
There are no extra indexes, but Id gets automatically an index. I have
inserted a million rows.
1) "SELECT MIN(Id) FROM Customer" and "SELECT MAX(Id) FROM Customer" are
both fast, but "SELECT MIN(Id), MAX(Id) FROM Customer" is slow, taking 5
seconds. Why?
2) "SELECT * FROM Customer ORDER BY Id" is fast but "SELECT * FROM
Customer ORDER BY Id DESC" is slow. Why? Can't it scroll the index
backwards?
I am wondering if Derby is being improved in this area, and if I could
help. I have developed database systems in the past.
Thanks, Harri
Re: A few questions about index usage
Posted by Rajesh Kartha <ka...@gmail.com>.
HI Harri,
Allowing reverse scans in Derby would be a nice enhancement. In addition,
the CREATE INDEX statement
could be modified to provide the 'ALLOW/DISALLOW REVERSE SCANS' option
during index creation.
You may begin by logging an 'Enhancement' issue in JIRA and
proposing/discussing your ideas
to achieve this, on the derby-dev mailing list.
The Derby wiki has some useful information if you plan to get involved.
http://wiki.apache.org/db-derby/
-Rajesh
On 1/26/06, Jeffrey Lichtman <sw...@rcn.com> wrote:
>
>
> >1) "SELECT MIN(Id) FROM Customer" and "SELECT MAX(Id) FROM Customer"
> >are both fast, but "SELECT MIN(Id), MAX(Id) FROM Customer" is slow,
> >taking 5 seconds. Why?
>
> Derby knows how to use an index to quickly find a minimum or a
> maximum (by traversing down one side of the B-tree or the other). It
> doesn't know how to do both in the same query, which would take two
> traversals.
>
> >2) "SELECT * FROM Customer ORDER BY Id" is fast but "SELECT * FROM
> >Customer ORDER BY Id DESC" is slow. Why? Can't it scroll the index
> backwards?
>
> Nope - Derby doesn't do backwards scrolling. I think the physical
> data layout could support it (i.e. there are backwards pointers), but
> backwards scrolling has never been implemented. Could someone more
> familiar with the store confirm this (Mike?).
>
>
> - Jeff Lichtman
> swazoo@rcn.com
> Check out Swazoo Koolak's Web Jukebox at
> http://swazoo.com/
>
>
Re: A few questions about index usage
Posted by Jeffrey Lichtman <sw...@rcn.com>.
>>Derby knows how to use an index to quickly find a minimum or a
>>maximum (by traversing down one side of the B-tree or the other).
>>It doesn't know how to do both in the same query, which would take
>>two traversals.
>
>Is the work to fix this the same as making IN list use multiple probes,
>and/or makeing OR lists do multiple probes? There are existing JIRA
>items for those, or is it different enough to have a separate JIRA?
I believe they're different. While each case would use multiple
probes, in the MIN/MAX case one of the probes would go down the right
side of the BTree, while in the IN/OR case it would do "normal"
scans. Also, the MIN/MAX case and the IN/OR case would require
different logic to recognize when the optimizations are possible. The
costing logic in the optimizer would be different, too. It's possible
the two cases could share some execution code, but the rest of it
would require different implementations.
- Jeff Lichtman
swazoo@rcn.com
Check out Swazoo Koolak's Web Jukebox at
http://swazoo.com/
Re: A few questions about index usage
Posted by Mike Matrigali <mi...@sbcglobal.net>.
Jeffrey Lichtman wrote:
>
>> 1) "SELECT MIN(Id) FROM Customer" and "SELECT MAX(Id) FROM Customer"
>> are both fast, but "SELECT MIN(Id), MAX(Id) FROM Customer" is slow,
>> taking 5 seconds. Why?
>
>
> Derby knows how to use an index to quickly find a minimum or a maximum
> (by traversing down one side of the B-tree or the other). It doesn't
> know how to do both in the same query, which would take two traversals.
Is the work to fix this the same as making IN list use multiple probes,
and/or makeing OR lists do multiple probes? There are existing JIRA
items for those, or is it different enough to have a separate JIRA?
>
>> 2) "SELECT * FROM Customer ORDER BY Id" is fast but "SELECT * FROM
>> Customer ORDER BY Id DESC" is slow. Why? Can't it scroll the index
>> backwards?
The structure could support this, the work just has not been done. With
row level locking, concurrent tree splitting, and current assumptions
throughout the access method that scans go top/down, left to right the
work to do a backward scan is harder than doing a forward scan. Also
once the store portion is done, there would be necessary changes to:
optimizer, execution engine, and scan interface to allow the backward
scan. This would be a hard first project.
The current workaround is that you can create both an ascending and
a descending index if you need ordering in both directions to go fast.
You should check out the access/btree white papers Dibyendo (sp?)
produced, on the web site.
I will go ahead and log an enhancement in JIRA (DERBY-884)
>
>
> Nope - Derby doesn't do backwards scrolling. I think the physical data
> layout could support it (i.e. there are backwards pointers), but
> backwards scrolling has never been implemented. Could someone more
> familiar with the store confirm this (Mike?).
>
>
> - Jeff Lichtman
> swazoo@rcn.com
> Check out Swazoo Koolak's Web Jukebox at
> http://swazoo.com/
>
>
Re: A few questions about index usage
Posted by Jeffrey Lichtman <sw...@rcn.com>.
>1) "SELECT MIN(Id) FROM Customer" and "SELECT MAX(Id) FROM Customer"
>are both fast, but "SELECT MIN(Id), MAX(Id) FROM Customer" is slow,
>taking 5 seconds. Why?
Derby knows how to use an index to quickly find a minimum or a
maximum (by traversing down one side of the B-tree or the other). It
doesn't know how to do both in the same query, which would take two traversals.
>2) "SELECT * FROM Customer ORDER BY Id" is fast but "SELECT * FROM
>Customer ORDER BY Id DESC" is slow. Why? Can't it scroll the index backwards?
Nope - Derby doesn't do backwards scrolling. I think the physical
data layout could support it (i.e. there are backwards pointers), but
backwards scrolling has never been implemented. Could someone more
familiar with the store confirm this (Mike?).
- Jeff Lichtman
swazoo@rcn.com
Check out Swazoo Koolak's Web Jukebox at
http://swazoo.com/