You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-user@db.apache.org by Lou DeGenaro <lo...@gmail.com> on 2014/01/05 16:12:53 UTC

slow query? (from a practical newbie)

My table: Job

(
 primaryKey int primary key generated always as identity,
 id int not null,
 startDate bigint not null,
 endDate bigint not null,
 duration int not null,
 userId varchar( 64 ) not null,
 submitter varchar( 64 ) not null,
 schedulingClass varchar( 64 ) not null,
 debugPortDriver int not null,
 debugPortProcess int not null,
 state varchar( 64 ) not null,
 stateIndex int not null,
 completionType varchar( 64 ) not null,
 completionRationale varchar( 1024 ) not null,
 services int not null,
 processes int not null,
 initFails int not null,
 runFails int not null,
 pgin int not null,
 swapGbMax double not null,
 memorySizeGb int not null,
 workItemsTotal int not null,
 workItemsDone int not null,
 workItemsError int not null,
 workItemsRetry int not null,
 workItemsPreempt int not null,
 workItemsStatMax int not null,
 workItemsStatMin int not null,
 workItemsStatMean int not null,
 workItemsStatStdDev int not null,
 description varchar( 4096 ) not null,
 schedulingSharesMax int not null,
 logDirectory varchar( 4096 ) not null,
 jsonServiceDependencies clob not null,
 jsonJobDeployment clob not null,
 projection bigint not null,
 swapGbCurrent double not null,
 workItemsDispatch int not null,
 constraint uniqueColumns unique ( id )
)

=====

My indexes:

CREATE INDEX id_desc ON Job (id desc)
CREATE INDEX stateIndex_asc_id_desc ON Job (stateIndex asc, id desc)

=====

Query #1 - fast! meaning sub-second using prepared statement

     select * from Job order by stateIndex asc, id desc offset ? rows fetch
next ? rows only

=====

Query #2 - slow :-( meaning on the order of 30 seconds on a table with
55,000 records

     select * from Job where id < ? order by stateIndex asc, id desc offset
? rows fetch next ? rows only

=====

Ostensibly the where clause really slow does the query.  How come and how
do I remedy?

Thanx!

Lou.

Re: slow query? (from a practical newbie)

Posted by degenaro <lo...@gmail.com>.
Bryan Pendleton-3 wrote
> In both cases, I'm wondering if somehow the problem is in the output
> phase, like there's a huge amount of output because there's a BLOB
> column in your table or something like that, or the client side is
> just taking an eternity to process the output.

I can try, but I doubt the volume of data is the issue since the where
clause lessens the amount produced.

Lou.



--
View this message in context: http://apache-database.10148.n7.nabble.com/slow-query-from-a-practical-newbie-tp136285p136343.html
Sent from the Apache Derby Users mailing list archive at Nabble.com.

Re: slow query? (from a practical newbie)

Posted by Bryan Pendleton <bp...@gmail.com>.
On 1/7/2014 12:52 PM, degenaro wrote:
> Bryan Pendleton-3 wrote
>> Are the queries that we're discussing here running slowly for you?
>
> Yes, very slow.  With about 50,000 rows the query takes about 30 seconds
> give or take.  But if I remove the where clause, the time is sub second.

Wow, that's crazy! That query should take about .03 seconds, not 30!

What happens if you change the query from

	SELECT * FROM ...
to
	SELECT COUNT(*) FROM ...

That is, select the count rather than the actual rows.

Also, instead of

	SELECT * FROM ...
try
	SELECT ID FROM ...

In both cases, I'm wondering if somehow the problem is in the output
phase, like there's a huge amount of output because there's a BLOB
column in your table or something like that, or the client side is
just taking an eternity to process the output.

Because the query plan indicates that you're just reading 57 pages,
and 50K rows, and that's nothing that should even cause Derby to
break a sweat.

thanks,

bryan



Re: slow query? (from a practical newbie)

Posted by degenaro <lo...@gmail.com>.
Bryan Pendleton-3 wrote
> Are the queries that we're discussing here running slowly for you?

Yes, very slow.  With about 50,000 rows the query takes about 30 seconds
give or take.  But if I remove the where clause, the time is sub second.

Lou.



--
View this message in context: http://apache-database.10148.n7.nabble.com/slow-query-from-a-practical-newbie-tp136285p136327.html
Sent from the Apache Derby Users mailing list archive at Nabble.com.

Re: slow query? (from a practical newbie)

Posted by Bryan Pendleton <bp...@gmail.com>.
> So I tried:
>
> ij> select * from ducc.Job where id < 117000 order by stateIndex asc, id desc;

You're right; the ORDER BY has to be the final clause, not before the WHERE.
Thanks for correcting that.

> And I see this query plan:

OK, I'm afraid I've forgotten the overall context. This query plan
looks just like the previous query plan, and in both cases they appear
to be using the index to restrict the number of rows searched.

Are the queries that we're discussing here running slowly for you?

According to the query plan, they're running quite efficiently.

thanks,

bryan


Re: slow query? (from a practical newbie)

Posted by degenaro <lo...@gmail.com>.
ij> select * from ducc.Job order by stateIndex asc, id desc where id <
117000;
ERROR 42X01: Syntax error: Encountered "where" at line 1, column 57.
Issue the 'help' command for general information on IJ command syntax.
Any unrecognized commands are treated as potential SQL commands and executed
directly.
Consult your DBMS server reference documentation for details of the SQL
syntax supported by your server.

So I tried:

ij> select * from ducc.Job where id < 117000 order by stateIndex asc, id
desc;

And I see this query plan:

ue Jan 07 09:06:12 EST 2014 Thread[main,5,main] (XID = 10483), (SESSIONID =
1), select * from ducc.Job where id < 117000 order by stateIndex asc, id
desc ******* Sort ResultSet:
Number of opens = 1
Rows input = 5468
Rows returned = 5468
Eliminate duplicates = false
In sorted order = false
Sort information:
        Number of rows input=5468
        Number of rows output=5468
        Sort type=internal
        constructor time (milliseconds) = 0
        open time (milliseconds) = 0
        next time (milliseconds) = 0
        close time (milliseconds) = 0
        optimizer estimated row count: 5453.00
        optimizer estimated cost: 426177.87
Source result set:
        Index Row to Base Row ResultSet for JOB:
        Number of opens = 1
        Rows seen = 5468
        Columns accessed from heap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31, 32, 33, 34, 35, 36, 37}
                constructor time (milliseconds) = 0
                open time (milliseconds) = 0
                next time (milliseconds) = 0
                close time (milliseconds) = 0
                optimizer estimated row count: 5453.00
                optimizer estimated cost: 426177.87
                Index Scan ResultSet for JOB using constraint UNIQUECOLUMNS
at read committed isolation level using instantaneous share row locking
chosen by the optimizer
                Number of opens = 1
                Rows seen = 5468
                Rows filtered = 0
                Fetch Size = 16
                        constructor time (milliseconds) = 0
                        open time (milliseconds) = 0
                        next time (milliseconds) = 0
                        close time (milliseconds) = 0
                        next time in milliseconds/row = 0

                scan information:
                        Bit set of columns fetched=All
                        Number of columns fetched=2
                        Number of deleted rows visited=0
                        Number of pages visited=57
                        Number of rows qualified=5468
                        Number of rows visited=5469
                        Scan type=btree
                        Tree height=2
                        start position:
                                None
                        stop position:
                                >= on first 1 column(s).
                                Ordered null semantics on the following
columns:
                                0
                        qualifiers:
                                None
                        optimizer estimated row count: 5453.00
                        optimizer estimated cost: 426177.87






--
View this message in context: http://apache-database.10148.n7.nabble.com/slow-query-from-a-practical-newbie-tp136285p136320.html
Sent from the Apache Derby Users mailing list archive at Nabble.com.

Re: slow query? (from a practical newbie)

Posted by Bryan Pendleton <bp...@gmail.com>.
> select a.* from (select * from ducc.job order by stateIndex asc, id
> desc) a where id < 117000

Why is this query written like this, as opposed to, say:

   select * from ducc.job order by stateIndex asc, id desc
          where id < 117000

Do you get a different query plan / run time if you use a
single top-level query and eliminate the sub-query?

I think the optimizer does a pretty good job in general of
optimizing sub queries but maybe the presence of the ORDER BY
in the sub query is making it hard for the query optimizer
to perform the query efficiently?

thanks,

bryan


Re: slow query? (from a practical newbie)

Posted by degenaro <lo...@gmail.com>.
Bryan Pendleton-3 wrote
>> Ostensibly the where clause really slow does the query.  How come and how
>> do I remedy?
> 
> Here's a good place to start:
> 
> http://wiki.apache.org/db-derby/PerformanceDiagnosisTips
> 
> thanks,
> 
> bryan

>From ij show indexes:

DUCC                |JOB                 |PRIMARYKEY          |false |3   |A  
|NULL    |NULL
DUCC                |JOB                 |ID                  |false |3   |A  
|NULL    |NULL
DUCC                |JOB                 |ID                  |true  |3   |D  
|NULL    |NULL
DUCC                |JOB                 |STATEINDEX          |true  |3   |A  
|NULL    |NULL
DUCC                |JOB                 |ID                  |true  |3   |D  
|NULL    |NULL

from derby log:

Mon Jan 06 09:22:10 EST 2014 Thread[main,5,main] (XID = 10384), (SESSIONID =
1), select a.* from (select * from ducc.job order by stateIndex asc, id
desc) a where id < 117000 ******* Sort
ResultSet:
Number of opens = 1
Rows input = 5468
Rows returned = 5468
Eliminate duplicates = false
In sorted order = false
Sort information:
        Number of rows input=5468
        Number of rows output=5468
        Sort type=internal
        constructor time (milliseconds) = 0
        open time (milliseconds) = 0
        next time (milliseconds) = 0
        close time (milliseconds) = 0
        optimizer estimated row count: 5453.00
        optimizer estimated cost: 426177.87
Source result set:
        Index Row to Base Row ResultSet for JOB:
        Number of opens = 1
        Rows seen = 5468
        Columns accessed from heap = {0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37}
                constructor time (milliseconds) = 0
                open time (milliseconds) = 0
                next time (milliseconds) = 0
                close time (milliseconds) = 0
                optimizer estimated row count: 5453.00
                optimizer estimated cost: 426177.87
                Index Scan ResultSet for JOB using constraint UNIQUECOLUMNS
at read committed isolation level using share row locking chosen by the
optimizer
                Number of opens = 1
                Rows seen = 5468
                Rows filtered = 0
                Fetch Size = 1
                        constructor time (milliseconds) = 0
                        open time (milliseconds) = 0
                        next time (milliseconds) = 0
                        close time (milliseconds) = 0
                        next time in milliseconds/row = 0

                scan information:
                        Bit set of columns fetched=All
                        Number of columns fetched=2
                        Number of deleted rows visited=0
                        Number of pages visited=57
                        Number of rows qualified=5468
                        Number of rows visited=5469
                        Scan type=btree
                        Tree height=2
                        start position:
                                None
                        stop position:
                                >= on first 1 column(s).
                                Ordered null semantics on the following
columns:
                                0
                        qualifiers:
                                None
                        optimizer estimated row count: 5453.00
                        optimizer estimated cost: 426177.87

As best as I can tell from this, for this query the composite index
(stateIndex asc, id desc) is not used, which is why the query is so slow?  

If there is an example or strategies for how to accomplish "complex"
queries?  There are 2 columns of interest: id and stateIndex.  The desired
ordering is based upon the combined (stateIndex asc, id desc) and the
starting row is to be a specified id (e.g. the where clause).

Thanks.

Lou.





--
View this message in context: http://apache-database.10148.n7.nabble.com/slow-query-from-a-practical-newbie-tp136285p136295.html
Sent from the Apache Derby Users mailing list archive at Nabble.com.

Re: slow query? (from a practical newbie)

Posted by Bryan Pendleton <bp...@gmail.com>.
> Ostensibly the where clause really slow does the query.  How come and how do I remedy?

Here's a good place to start:

http://wiki.apache.org/db-derby/PerformanceDiagnosisTips

thanks,

bryan