Posted to derby-user@db.apache.org by ganest <ga...@ced.tuc.gr> on 2009/05/21 14:04:24 UTC

Performance Tuning Problem ?

Hi all,

I am trying to evaluate Derby (version 10.5.1.1 on a Linux machine, with java
version "1.6.0_07"),
using the embedded driver, but I am having difficulty getting Derby's
performance close to that of
mysql running on the same hardware.

Any suggestions about what I can do to close the gap would be appreciated. 

The situation is as follows:

I have created two tables (I attach the full DDL file, produced by dblook, at
the end):

big(int id, int num, varchar(256) name)
bigref(int id, int bigid, varchar(256) name)

and I inserted 10000000 tuples into table big and 1000000 tuples into table
bigref.

The data was inserted using the following code:

// Issue the INSERT operation for table big.
PreparedStatement ps = con.prepareStatement(
    "INSERT INTO big (num, name) VALUES (?, ?)");

Random r = new Random();
for (int i = 0; i < 10000000; ++i) {
    ps.setInt(1, r.nextInt(1000000));
    ps.setString(2, getRandom() + " " + getRandom());
    ps.execute();
}
ps.close();

// Issue the INSERT operation for table bigref (reusing the same statement
// variable; the original post declared it twice).
ps = con.prepareStatement(
    "INSERT INTO bigref (bigid, name) VALUES (?, ?)");

for (int i = 0; i < 1000000; ++i) {
    ps.setInt(1, r.nextInt(10000000));
    ps.setString(2, getRandom() + " " + getRandom());
    ps.execute();
}
ps.close();

String getRandom() {
    return UUID.randomUUID().toString();
}
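For reference, each generated name is two 36-character UUID strings joined by a space, 73 characters in total, so it fits comfortably in the VARCHAR(256) columns. A minimal self-contained sketch (the class name NameGen is just illustrative):

```java
import java.util.UUID;

public class NameGen {
    // Same helper as in the post: one random UUID rendered as a 36-character string.
    static String getRandom() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        // Each NAME value inserted above is two UUIDs separated by a space:
        // 36 + 1 + 36 = 73 characters, well within VARCHAR(256).
        String name = getRandom() + " " + getRandom();
        System.out.println(name.length()); // prints 73
    }
}
```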

Indicative row of table big:
ID         |NUM        |NAME
3569699    |766673     |0ffa0185-1270-43d4-ba59-8fa0d6bb84d1 38ce4fd0-d2ec-4f00-8fe7-f70325344f50

Indicative row of table bigref:
ID         |BIGID      |NAME
154507     |3569699    |e9ad27c0-f322-4f0e-b5a6-894eda12dd7b 08a8b797-bd5a-4f0c-a4c3-4fa1d056ce80

I created the indexes after the data insertion, and then ran (I read about
this on this mailing list):
call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('GANEST', 'BIG', 1)
call SYSCS_UTIL.SYSCS_COMPRESS_TABLE('GANEST', 'BIGREF', 1)

I set the database properties:
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.storage.pageSize',
'32768');
CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY('derby.storage.pageCacheSize',
'4000');

Using the ij tool (java -Dderby.language.maxMemoryPerTable=4096 -Xms256m
-Xmx256m -jar $DERBY_HOME/lib/derbyrun.jar ij), I am executing the following
query (I read about derby.language.maxMemoryPerTable on this mailing list):

select count(*) from  big inner join bigref on big.id=bigref.bigid and
big.name like '0ff%';

The result is 258, and the query takes more than 20 seconds to execute. Using
mysql with the same configuration, the result is produced in milliseconds.

It seems that the indexes are taken into account and the query plan is
correct.

Any idea about what is going wrong would be appreciated. 

Thanks in advance.
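For anyone reproducing this: the runtime statistics below were gathered in ij
roughly as follows (the exact session is not shown here):

ij> CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1);
ij> select count(*) from big inner join bigref on big.id=bigref.bigid and
big.name like '0ff%';
ij> VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();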

===================================================================================

VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();

Statement Name: 
	null
Statement Text: 
	select count(*) from  big inner join bigref on big.id=bigref.bigid and
big.name like '0ff%'
Parse Time: 0
Bind Time: 0
Optimize Time: 0
Generate Time: 0
Compile Time: 0
Execute Time: 0
Begin Compilation Timestamp : null
End Compilation Timestamp : null
Begin Execution Timestamp : null
End Execution Timestamp : null
Statement Execution Plan Text: 
Project-Restrict ResultSet (7):
Number of opens = 1
Rows seen = 1
Rows filtered = 0
restriction = false
projection = true
	constructor time (milliseconds) = 0
	open time (milliseconds) = 0
	next time (milliseconds) = 0
	close time (milliseconds) = 0
	restriction time (milliseconds) = 0
	projection time (milliseconds) = 0
	optimizer estimated row count:            1.00
	optimizer estimated cost:        17576.58

Source result set:
	Scalar Aggregate ResultSet:
	Number of opens = 1
	Rows input = 258
		constructor time (milliseconds) = 0
		open time (milliseconds) = 0
		next time (milliseconds) = 0
		close time (milliseconds) = 0
		optimizer estimated row count:         2522.46
		optimizer estimated cost:        17576.58

	Index Key Optimization = false
	Source result set:
		Project-Restrict ResultSet (6):
		Number of opens = 1
		Rows seen = 258
		Rows filtered = 0
		restriction = false
		projection = true
			constructor time (milliseconds) = 0
			open time (milliseconds) = 0
			next time (milliseconds) = 0
			close time (milliseconds) = 0
			restriction time (milliseconds) = 0
			projection time (milliseconds) = 0
			optimizer estimated row count:         2522.46
			optimizer estimated cost:        17576.58

		Source result set:
			Nested Loop Join ResultSet:
			Number of opens = 1
			Rows seen from the left = 2407
			Rows seen from the right = 258
			Rows filtered = 0
			Rows returned = 258
				constructor time (milliseconds) = 0
				open time (milliseconds) = 0
				next time (milliseconds) = 0
				close time (milliseconds) = 0
				optimizer estimated row count:         2522.46
				optimizer estimated cost:        17576.58

			Left result set:
				Index Row to Base Row ResultSet for BIG:
				Number of opens = 1
				Rows seen = 2407
				Columns accessed from heap = {0, 2}
					constructor time (milliseconds) = 0
					open time (milliseconds) = 0
					next time (milliseconds) = 0
					close time (milliseconds) = 0
					optimizer estimated row count:         2400.00
					optimizer estimated cost:         5717.82

					Index Scan ResultSet for BIG using index BIGNAME at read committed
isolation level using instantaneous share row locking chosen by the
optimizer
					Number of opens = 1
					Rows seen = 2407
					Rows filtered = 0
					Fetch Size = 16
						constructor time (milliseconds) = 0
						open time (milliseconds) = 0
						next time (milliseconds) = 0
						close time (milliseconds) = 0
						next time in milliseconds/row = 0

					scan information: 
						Bit set of columns fetched=All
						Number of columns fetched=2
						Number of deleted rows visited=0
						Number of pages visited=10
						Number of rows qualified=2407
						Number of rows visited=2408
						Scan type=btree
						Tree height=-1
						start position: 
	>= on first 1 column(s).
	Ordered null semantics on the following columns: 
0 
						stop position: 
	>= on first 1 column(s).
	Ordered null semantics on the following columns: 
0 
						qualifiers:
None
						optimizer estimated row count:         2400.00
						optimizer estimated cost:         5717.82


			Right result set:
				Index Scan ResultSet for BIGREF using constraint SQL090517120557051 at
read committed isolation level using instantaneous share row locking chosen
by the optimizer
				Number of opens = 2407
				Rows seen = 258
				Rows filtered = 0
				Fetch Size = 16
					constructor time (milliseconds) = 0
					open time (milliseconds) = 0
					next time (milliseconds) = 0
					close time (milliseconds) = 0
					next time in milliseconds/row = 0

				scan information: 
					Bit set of columns fetched={0}
					Number of columns fetched=1
					Number of deleted rows visited=0
					Number of pages visited=7231
					Number of rows qualified=258
					Number of rows visited=2665
					Scan type=btree
					Tree height=3
					start position: 
	>= on first 1 column(s).
	Ordered null semantics on the following columns: 
0 
					stop position: 
	> on first 1 column(s).
	Ordered null semantics on the following columns: 
0 
					qualifiers:
None
					optimizer estimated row count:         2522.46
					optimizer estimated cost:        11858.76

=============================================================================

-- This file was created using Derby's dblook utility.
-- Timestamp: 2009-05-21 14:05:58.306
-- Source database is: /home/ganest/tmp/derbyDBdir/test1
-- Connection URL is: jdbc:derby:/home/ganest/tmp/derbyDBdir/test1...
-- Specified schema is: ganest
-- appendLogs: false

-- ----------------------------------------------
-- DDL Statements for schemas
-- ----------------------------------------------

CREATE SCHEMA "GANEST";

-- ----------------------------------------------
-- DDL Statements for tables
-- ----------------------------------------------


CREATE TABLE "GANEST"."BIGREF" ("ID" INTEGER NOT NULL GENERATED ALWAYS AS
IDENTITY (START WITH 1, INCREMENT BY 1), "BIGID" INTEGER NOT NULL, "NAME"
VARCHAR(256) NOT NULL);

CREATE TABLE "GANEST"."BIG" ("ID" INTEGER NOT NULL GENERATED ALWAYS AS
IDENTITY (START WITH 1, INCREMENT BY 1), "NUM" INTEGER NOT NULL, "NAME"
VARCHAR(256) NOT NULL);


-- ----------------------------------------------
-- DDL Statements for indexes
-- ----------------------------------------------

CREATE INDEX "GANEST"."BIGNAME" ON "GANEST"."BIG" ("NAME");

CREATE INDEX "GANEST"."BIGREFNAME" ON "GANEST"."BIGREF" ("NAME");

-- ----------------------------------------------
-- DDL Statements for keys
-- ----------------------------------------------

-- primary/unique

ALTER TABLE "GANEST"."BIG" ADD CONSTRAINT "SQL090516111642550" PRIMARY KEY
("ID");

ALTER TABLE "GANEST"."BIGREF" ADD CONSTRAINT "SQL090517120557050" PRIMARY
KEY ("ID");


-- foreign
ALTER TABLE "GANEST"."BIGREF" ADD CONSTRAINT "SQL090517120557051" FOREIGN
KEY ("BIGID") REFERENCES "GANEST"."BIG" ("ID") ON DELETE NO ACTION ON UPDATE
NO ACTION;

-- 
View this message in context: http://www.nabble.com/Performance-Tuning-Problem---tp23652098p23652098.html
Sent from the Apache Derby Users mailing list archive at Nabble.com.


Re: Performance Tuning Problem ?

Posted by George Anestis <ga...@ced.tuc.gr>.
Hi,

Knut Anders Hatlen wrote:
> ganest <ga...@ced.tuc.gr> writes:
>
>   
>> I am executing using ij tool (java -Dderby.language.maxMemoryPerTable=4096
>> -Xms256m -Xmx256m -jar $DERBY_HOME/lib/derbyrun.jar ij) 
>> the following query: (I read about derby.language.maxMemoryPerTable in this
>> mailing list)
>>
>> select count(*) from  big inner join bigref on big.id=bigref.bigid and
>> big.name like '0ff%';
>>
>> The result is: 258 and it takes more than 20 seconds to be executed. Using
>> mysql with the same
>> configuration the result is produced in milliseconds.
>>     
>
> For the record, I ran the code you provided on my machine, using only
> the default settings for Derby and the JVM, and I see that the query
> takes less than 150 ms:
>
> ij> elapsedtime on;
> ij> select count(*) from  big inner join bigref on big.id=bigref.bigid and big.name like '0ff%'; 
> 1          
> -----------
> 255        
>
> 1 row selected
> ELAPSED TIME = 133 milliseconds
>
> This is with head of the Derby 10.5 branch, OpenSolaris 2009.06 snv_111a
> X86, and
>
> java version "1.6.0_13"
> Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
> Java HotSpot(TM) Server VM (build 11.3-b02, mixed mode)
>
> The runtime statistics on my machine are almost identical to the ones
> you provided (some small differences in page count and row count due to
> the randomness of the code that populates the tables).
>
>   
Obviously the cache was used; that is why you got such a time. When I run
exactly the same query twice, I get the same fast time too.
If you run the same query with a different value in the WHERE expression
(e.g. big.name like '5aa%'), it will take 20-30 seconds.

As I said, the problem remains. I hoped that it was a tuning issue, but it
seems to be an intrinsic problem related to the join implementation. I like
Derby, and I am really disappointed, especially when I compare it with mysql.

In any case, thanks again for your answers.

George

Re: Performance Tuning Problem ?

Posted by Knut Anders Hatlen <Kn...@Sun.COM>.
George Anestis <ga...@ced.tuc.gr> writes:

>> Another difference is that the previous query had to access the base
>> table (big) to get the id column, whereas the latest query finds all the
>> columns it needs in the indexes. You may want to try to add an extra
>> index to speed up the previous query:
>>
>>   CREATE UNIQUE INDEX BIGNAMEID ON BIG(NAME, ID)
>>
>>   
> If I understand correctly, the creation of the new index (BIGNAMEID) will
> solve the second problem you mentioned: the id column of table big will be
> taken from the index BIGNAMEID and not from the base table. But the first
> issue
>
> "the index on bigref.bigid was accessed almost completely
> randomly, and a large number of pages all over the table had to be
> visited."
>
>
> will remain? Is that correct?

Correct.

> For the record, the creation of the index you proposed speeds up the
> query in a spectacular way !!

That's great! :)

> Is it correct to assume that the main source of the problem was the
> access of the base table (big) to get the id
> column and not the almost random access of the index on bigref.bigid ?

Yes, that sounds reasonable. The accesses to the base table are random
too, since the base table and the BIGNAME index are ordered
differently. The index on BIGREF.BIGID is much more compact than the BIG
table, so the random reads there probably use the cache more efficiently
and don't hurt the performance that much.

-- 
Knut Anders

Re: Performance Tuning Problem ?

Posted by George Anestis <ga...@ced.tuc.gr>.
Dear Knut,

Thanks for your answer,

> The accessed data has higher locality in this query than in the previous
> query, and therefore the page cache is used more efficiently.
>
> The previous query used the indexes on big.name and bigref.bigid, which
> are ordered differently. So even if the index on big.name was scanned
> sequentially, the index on bigref.bigid was accessed almost completely
> randomly, and a large number of pages all over the table had to be
> visited.
>
> In the latest query, the indexes on big.id and bigref.bigid are
> used. Those indexes are ordered the same way, and all the interesting
> rows are located consecutively on a relatively small number of pages. So
> even if the number of page accesses is higher, the number of different
> pages accessed is probably much lower. And since the rows in both
> indexes are accessed in the same order that they are stored, the
> requested page is found in the page cache most of the time.
>
> Another difference is that the previous query had to access the base
> table (big) to get the id column, whereas the latest query finds all the
> columns it needs in the indexes. You may want to try to add an extra
> index to speed up the previous query:
>
>   CREATE UNIQUE INDEX BIGNAMEID ON BIG(NAME, ID)
>
>   
If I understand correctly, the creation of the new index (BIGNAMEID) will
solve the second problem you mentioned: the id column of table big will be
taken from the index BIGNAMEID and not from the base table. But the first
issue

"the index on bigref.bigid was accessed almost completely
randomly, and a large number of pages all over the table had to be
visited."


will remain? Is that correct?

For the record, the creation of the index you proposed speeds up the 
query in a spectacular way !!

Is it correct to assume that the main source of the problem was the 
access of the base table (big) to get the id
column and not the almost random access of the index on bigref.bigid ?

Anyway, I would not like to become (more :-)) annoying, thanks a lot for 
your answers.

Best regards,
George


Re: Performance Tuning Problem ?

Posted by Knut Anders Hatlen <Kn...@Sun.COM>.
ganest <ga...@ced.tuc.gr> writes:

> Hi all, 
>
> In an attempt to investigate the problem I mentioned in my previous post, I
> ran additional tests, and some interesting and weird results appeared.
> Maybe someone who has good knowledge of Derby's internal operation can give
> some explanation. Any suggestion is welcome. I describe what I have done:
>
> Fresh Start of ij to ensure that no cache is used:
>
> java -Dderby.language.maxMemoryPerTable=128
> -Dderby.storage.pageCacheSize=4000 -Xms256m -Xmx256m -jar
> $DERBY_HOME/lib/derbyrun.jar ij
>
> prepare ps1 as 'select count(*) from  ganest1.big inner join ganest1.bigref
> on ganest1.big.id=ganest1.bigref.bigid and ganest1.big.id > 4000000 and
> ganest1.big.id < 4100000';
> ELAPSED TIME = 334 milliseconds
> ij> execute ps1;
> 1          
> -----------
> 10128      
>
> 1 row selected
> ELAPSED TIME = 440 milliseconds
>
> From the query plan we can see that a nested loop join strategy is used and
> it
> opens 10128 index scans on the inner table. The situation looks similar to
> the one
> I described in my previous post, but the execution time is quite
> satisfactory!
> (milliseconds vs 20 seconds see below)

The accessed data has higher locality in this query than in the previous
query, and therefore the page cache is used more efficiently.

The previous query used the indexes on big.name and bigref.bigid, which
are ordered differently. So even if the index on big.name was scanned
sequentially, the index on bigref.bigid was accessed almost completely
randomly, and a large number of pages all over the table had to be
visited.

In the latest query, the indexes on big.id and bigref.bigid are
used. Those indexes are ordered the same way, and all the interesting
rows are located consecutively on a relatively small number of pages. So
even if the number of page accesses is higher, the number of different
pages accessed is probably much lower. And since the rows in both
indexes are accessed in the same order that they are stored, the
requested page is found in the page cache most of the time.

Another difference is that the previous query had to access the base
table (big) to get the id column, whereas the latest query finds all the
columns it needs in the indexes. You may want to try to add an extra
index to speed up the previous query:

  CREATE UNIQUE INDEX BIGNAMEID ON BIG(NAME, ID)
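In ij that could look like this (a sketch, using the prepared query from
earlier in this thread; the statement must be re-prepared so the optimizer can
pick the new index):

ij> CREATE UNIQUE INDEX BIGNAMEID ON BIG(NAME, ID);
ij> prepare ps2 as 'select count(*) from ganest1.big inner join ganest1.bigref
on ganest1.big.id=ganest1.bigref.bigid and ganest1.big.name like ''abc%'' ';
ij> execute ps2;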

-- 
Knut Anders

Re: Performance Tuning Problem ?

Posted by ganest <ga...@ced.tuc.gr>.
Hi all, 

In an attempt to investigate the problem I mentioned in my previous post, I
ran additional tests, and some interesting and weird results appeared.
Maybe someone who has good knowledge of Derby's internal operation can give
some explanation. Any suggestion is welcome. I describe what I have done:

Fresh Start of ij to ensure that no cache is used:

java -Dderby.language.maxMemoryPerTable=128
-Dderby.storage.pageCacheSize=4000 -Xms256m -Xmx256m -jar
$DERBY_HOME/lib/derbyrun.jar ij

prepare ps1 as 'select count(*) from  ganest1.big inner join ganest1.bigref
on ganest1.big.id=ganest1.bigref.bigid and ganest1.big.id > 4000000 and
ganest1.big.id < 4100000';
ELAPSED TIME = 334 milliseconds
ij> execute ps1;
1          
-----------
10128      

1 row selected
ELAPSED TIME = 440 milliseconds

From the query plan we can see that a nested loop join strategy is used and
it opens 10128 index scans on the inner table. The situation looks similar
to the one I described in my previous post, but the execution time is quite
satisfactory! (milliseconds vs 20 seconds; see below)

I attach the query plan at the end of the post. (in my previous post I
attached the DDL for the whole test)

After that I ran the following query (after restarting ij in an attempt to
prevent cache usage):

prepare ps3  as 'select count(*) from  ganest1.big where ganest1.big.name
like ''abc%'' ';
ij> ELAPSED TIME = 208 milliseconds
ij> execute ps3;
1          
-----------
2540       

1 row selected
ELAPSED TIME = 148 milliseconds

Again, quite a satisfactory time!!

And the problematic query:
prepare ps2 as 'select count(*) from  ganest1.big inner join ganest1.bigref
on ganest1.big.id=ganest1.bigref.bigid and ganest1.big.name like ''abc%'' ';
ELAPSED TIME = 51 milliseconds
ij> execute ps2;
1          
-----------
258        

1 row selected
ELAPSED TIME = 18072 milliseconds

It takes 18 seconds!! (I attach the query plan again.)


=============== Query Plans ==============================
Statement Name: 
	null
Statement Text: 
	select count(*) from  ganest1.big inner join ganest1.bigref on
ganest1.big.id=ganest1.bigref.bigid and ganest1.big.id > 4000000 and
ganest1.big.id < 4100000
Parse Time: 0
Bind Time: 0
Optimize Time: 0
Generate Time: 0
Compile Time: 0
Execute Time: 0
Begin Compilation Timestamp : null
End Compilation Timestamp : null
Begin Execution Timestamp : null
End Execution Timestamp : null
Statement Execution Plan Text: 
Project-Restrict ResultSet (6):
Number of opens = 1
Rows seen = 1
Rows filtered = 0
restriction = false
projection = true
	constructor time (milliseconds) = 0
	open time (milliseconds) = 0
	next time (milliseconds) = 0
	close time (milliseconds) = 0
	restriction time (milliseconds) = 0
	projection time (milliseconds) = 0
	optimizer estimated row count:            1.00
	optimizer estimated cost:        49737.71

Source result set:
	Scalar Aggregate ResultSet:
	Number of opens = 1
	Rows input = 10128
		constructor time (milliseconds) = 0
		open time (milliseconds) = 0
		next time (milliseconds) = 0
		close time (milliseconds) = 0
		optimizer estimated row count:        10126.00
		optimizer estimated cost:        49737.71

	Index Key Optimization = false
	Source result set:
		Project-Restrict ResultSet (5):
		Number of opens = 1
		Rows seen = 10128
		Rows filtered = 0
		restriction = false
		projection = true
			constructor time (milliseconds) = 0
			open time (milliseconds) = 0
			next time (milliseconds) = 0
			close time (milliseconds) = 0
			restriction time (milliseconds) = 0
			projection time (milliseconds) = 0
			optimizer estimated row count:        10126.00
			optimizer estimated cost:        49737.71

		Source result set:
			Nested Loop Exists Join ResultSet:
			Number of opens = 1
			Rows seen from the left = 10128
			Rows seen from the right = 10128
			Rows filtered = 0
			Rows returned = 10128
				constructor time (milliseconds) = 0
				open time (milliseconds) = 0
				next time (milliseconds) = 0
				close time (milliseconds) = 0
				optimizer estimated row count:        10126.00
				optimizer estimated cost:        49737.71

			Left result set:
				Index Scan ResultSet for BIGREF using constraint SQL090522104305131 at
read committed isolation level using instantaneous share row locking chosen
by the optimizer
				Number of opens = 1
				Rows seen = 10128
				Rows filtered = 0
				Fetch Size = 16
					constructor time (milliseconds) = 0
					open time (milliseconds) = 0
					next time (milliseconds) = 0
					close time (milliseconds) = 0
					next time in milliseconds/row = 0

				scan information: 
					Bit set of columns fetched={0}
					Number of columns fetched=1
					Number of deleted rows visited=0
					Number of pages visited=9
					Number of rows qualified=10128
					Number of rows visited=10129
					Scan type=btree
					Tree height=-1
					start position: 
	> on first 1 column(s).
	Ordered null semantics on the following columns: 
0 
					stop position: 
	>= on first 1 column(s).
	Ordered null semantics on the following columns: 
0 
					qualifiers:
None
					optimizer estimated row count:        10126.00
					optimizer estimated cost:         1998.68

			Right result set:
				Index Scan ResultSet for BIG using constraint SQL090522104249720 at read
committed isolation level using share row locking chosen by the optimizer
				Number of opens = 10128
				Rows seen = 10128
				Rows filtered = 0
				Fetch Size = 1
					constructor time (milliseconds) = 0
					open time (milliseconds) = 0
					next time (milliseconds) = 0
					close time (milliseconds) = 0
					next time in milliseconds/row = 0

				scan information: 
					Bit set of columns fetched={0}
					Number of columns fetched=1
					Number of deleted rows visited=0
					Number of pages visited=30384
					Number of rows qualified=10128
					Number of rows visited=10128
					Scan type=btree
					Tree height=3
					start position: 
	>= on first 1 column(s).
	Ordered null semantics on the following columns: 
0 
					stop position: 
	> on first 1 column(s).
	Ordered null semantics on the following columns: 
0 
					qualifiers:
Column[0][0] Id: 0
Operator: <
Ordered nulls: false
Unknown return value: false
Negate comparison result: false
Column[0][1] Id: 0
Operator: <=
Ordered nulls: false
Unknown return value: true
Negate comparison result: true

					optimizer estimated row count:        10126.00
					optimizer estimated cost:        47739.03

================================ query plan of the problematic query
===============
Statement Name: 
	null
Statement Text: 
	select count(*) from  ganest1.big inner join ganest1.bigref on
ganest1.big.id=ganest1.bigref.bigid and ganest1.big.name like 'abc%' 
Parse Time: 0
Bind Time: 0
Optimize Time: 0
Generate Time: 0
Compile Time: 0
Execute Time: 0
Begin Compilation Timestamp : null
End Compilation Timestamp : null
Begin Execution Timestamp : null
End Execution Timestamp : null
Statement Execution Plan Text: 
Project-Restrict ResultSet (7):
Number of opens = 1
Rows seen = 1
Rows filtered = 0
restriction = false
projection = true
	constructor time (milliseconds) = 0
	open time (milliseconds) = 0
	next time (milliseconds) = 0
	close time (milliseconds) = 0
	restriction time (milliseconds) = 0
	projection time (milliseconds) = 0
	optimizer estimated row count:            1.00
	optimizer estimated cost:        14540.85

Source result set:
	Scalar Aggregate ResultSet:
	Number of opens = 1
	Rows input = 258
		constructor time (milliseconds) = 0
		open time (milliseconds) = 0
		next time (milliseconds) = 0
		close time (milliseconds) = 0
		optimizer estimated row count:         2662.25
		optimizer estimated cost:        14540.85

	Index Key Optimization = false
	Source result set:
		Project-Restrict ResultSet (6):
		Number of opens = 1
		Rows seen = 258
		Rows filtered = 0
		restriction = false
		projection = true
			constructor time (milliseconds) = 0
			open time (milliseconds) = 0
			next time (milliseconds) = 0
			close time (milliseconds) = 0
			restriction time (milliseconds) = 0
			projection time (milliseconds) = 0
			optimizer estimated row count:         2662.25
			optimizer estimated cost:        14540.85

		Source result set:
			Nested Loop Join ResultSet:
			Number of opens = 1
			Rows seen from the left = 2540
			Rows seen from the right = 258
			Rows filtered = 0
			Rows returned = 258
				constructor time (milliseconds) = 0
				open time (milliseconds) = 0
				next time (milliseconds) = 0
				close time (milliseconds) = 0
				optimizer estimated row count:         2662.25
				optimizer estimated cost:        14540.85

			Left result set:
				Index Row to Base Row ResultSet for BIG:
				Number of opens = 1
				Rows seen = 2540
				Columns accessed from heap = {0, 2}
					constructor time (milliseconds) = 0
					open time (milliseconds) = 0
					next time (milliseconds) = 0
					close time (milliseconds) = 0
					optimizer estimated row count:         2533.00
					optimizer estimated cost:         6034.08

					Index Scan ResultSet for BIG using index BIGNAME1 at read committed
isolation level using instantaneous share row locking chosen by the
optimizer
					Number of opens = 1
					Rows seen = 2540
					Rows filtered = 0
					Fetch Size = 16
						constructor time (milliseconds) = 0
						open time (milliseconds) = 0
						next time (milliseconds) = 0
						close time (milliseconds) = 0
						next time in milliseconds/row = 0

					scan information: 
						Bit set of columns fetched=All
						Number of columns fetched=2
						Number of deleted rows visited=0
						Number of pages visited=10
						Number of rows qualified=2540
						Number of rows visited=2541
						Scan type=btree
						Tree height=-1
						start position: 
	>= on first 1 column(s).
	Ordered null semantics on the following columns: 
0 
						stop position: 
	>= on first 1 column(s).
	Ordered null semantics on the following columns: 
0 
						qualifiers:
None
						optimizer estimated row count:         2533.00
						optimizer estimated cost:         6034.08


			Right result set:
				Index Scan ResultSet for BIGREF using constraint SQL090522104305131 at
read committed isolation level using instantaneous share row locking chosen
by the optimizer
				Number of opens = 2540
				Rows seen = 258
				Rows filtered = 0
				Fetch Size = 16
					constructor time (milliseconds) = 0
					open time (milliseconds) = 0
					next time (milliseconds) = 0
					close time (milliseconds) = 0
					next time in milliseconds/row = 0

				scan information: 
					Bit set of columns fetched={0}
					Number of columns fetched=1
					Number of deleted rows visited=0
					Number of pages visited=5081
					Number of rows qualified=258
					Number of rows visited=2798
					Scan type=btree
					Tree height=2
					start position: 
	>= on first 1 column(s).
	Ordered null semantics on the following columns: 
0 
					stop position: 
	> on first 1 column(s).
	Ordered null semantics on the following columns: 
0 
					qualifiers:
None
					optimizer estimated row count:         2662.25
					optimizer estimated cost:         8506.77

==================================end of query plan
=============================

Knut Anders Hatlen wrote:
> 
> ganest <ga...@ced.tuc.gr> writes:
> 
>> I am executing using ij tool (java
>> -Dderby.language.maxMemoryPerTable=4096
>> -Xms256m -Xmx256m -jar $DERBY_HOME/lib/derbyrun.jar ij) 
>> the following query: (I read about derby.language.maxMemoryPerTable in
>> this
>> mailing list)
>>
>> select count(*) from  big inner join bigref on big.id=bigref.bigid and
>> big.name like '0ff%';
>>
>> The result is: 258 and it takes more than 20 seconds to be executed.
>> Using
>> mysql with the same
>> configuration the result is produced in milliseconds.
> 
> For the record, I ran the code you provided on my machine, using only
> the default settings for Derby and the JVM, and I see that the query
> takes less than 150 ms:
> 
> ij> elapsedtime on;
> ij> select count(*) from  big inner join bigref on big.id=bigref.bigid and
> big.name like '0ff%'; 
> 1          
> -----------
> 255        
> 
> 1 row selected
> ELAPSED TIME = 133 milliseconds
> 
> This is with head of the Derby 10.5 branch, OpenSolaris 2009.06 snv_111a
> X86, and
> 
> java version "1.6.0_13"
> Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
> Java HotSpot(TM) Server VM (build 11.3-b02, mixed mode)
> 
> The runtime statistics on my machine are almost identical to the ones
> you provided (some small differences in page count and row count due to
> the randomness of the code that populates the tables).
> 
> -- 
> Knut Anders
> 
> 


-----
George Anestis
-- 
View this message in context: http://www.nabble.com/Performance-Tuning-Problem---tp23652098p23703523.html
Sent from the Apache Derby Users mailing list archive at Nabble.com.


Re: Performance Tuning Problem ?

Posted by Knut Anders Hatlen <Kn...@Sun.COM>.
ganest <ga...@ced.tuc.gr> writes:

> I am executing using ij tool (java -Dderby.language.maxMemoryPerTable=4096
> -Xms256m -Xmx256m -jar $DERBY_HOME/lib/derbyrun.jar ij) 
> the following query: (I read about derby.language.maxMemoryPerTable in this
> mailing list)
>
> select count(*) from  big inner join bigref on big.id=bigref.bigid and
> big.name like '0ff%';
>
> The result is: 258 and it takes more than 20 seconds to be executed. Using
> mysql with the same
> configuration the result is produced in milliseconds.

For the record, I ran the code you provided on my machine, using only
the default settings for Derby and the JVM, and I see that the query
takes less than 150 ms:

ij> elapsedtime on;
ij> select count(*) from  big inner join bigref on big.id=bigref.bigid and big.name like '0ff%'; 
1          
-----------
255        

1 row selected
ELAPSED TIME = 133 milliseconds

This is with head of the Derby 10.5 branch, OpenSolaris 2009.06 snv_111a
X86, and

java version "1.6.0_13"
Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
Java HotSpot(TM) Server VM (build 11.3-b02, mixed mode)

The runtime statistics on my machine are almost identical to the ones
you provided (some small differences in page count and row count due to
the randomness of the code that populates the tables).

-- 
Knut Anders

Re: Performance Tuning Problem ?

Posted by Knut Anders Hatlen <Kn...@Sun.COM>.
ganest <ga...@ced.tuc.gr> writes:

> select count(*) from  big inner join bigref on big.id=bigref.bigid and
> big.name like '0ff%';
>
> The result is: 258 and it takes more than 20 seconds to be executed. Using
> mysql with the same
> configuration the result is produced in milliseconds.
>
> It seems that the indexes are taken into account and the query plan is
> correct.
>
> Any idea about what is going wrong would be appreciated. 

Hi George,

The query plan shows that the optimizer picks a nested loop join
strategy. This means that for every qualifying row in the outer table
(big) it opens a new scan on the inner table (bigref). In this case it
opens 2407 index scans on the inner table, which may take some time,
especially if the scans need to go to disk (not unlikely with such a
large database and small page cache).
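In pseudocode, the chosen nested loop strategy does roughly this (illustrative
only):

for each row b in BIG with b.name like '0ff%':     -- outer scan on index BIGNAME
    open a new scan on BIGREF's bigid index        -- one open per outer row (2407 here)
    for each row r with r.bigid = b.id:
        count the joined row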

A hash join, which is the only alternative join strategy in Derby
currently, isn't likely to be more efficient since there's no
restriction that can be used to limit the number of rows to read from
bigref, so the entire bigref table will have to be read in that case, I
think.

A couple of comments on the tuning:

- derby.storage.pageSize must be set prior to creating the tables and
  indexes, otherwise it has no effect

- derby.storage.pageCacheSize is a system-wide property, so setting it
  with SYSCS_SET_DATABASE_PROPERTY has no effect. Use a system property
  or set it in derby.properties instead
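For example (a sketch; derby.properties goes in the directory given by
derby.system.home):

  # derby.properties
  derby.storage.pageCacheSize=4000

or as a system property on the command line:

  java -Dderby.storage.pageCacheSize=4000 -jar $DERBY_HOME/lib/derbyrun.jar ij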

-- 
Knut Anders