Posted to derby-dev@db.apache.org by "David Van Couvering (JIRA)" <de...@db.apache.org> on 2006/08/11 00:29:14 UTC

[jira] Assigned: (DERBY-1664) Derby startup time is too slow

     [ http://issues.apache.org/jira/browse/DERBY-1664?page=all ]

David Van Couvering reassigned DERBY-1664:
------------------------------------------

    Assignee: David Van Couvering

> Derby startup time is too slow
> ------------------------------
>
>                 Key: DERBY-1664
>                 URL: http://issues.apache.org/jira/browse/DERBY-1664
>             Project: Derby
>          Issue Type: Improvement
>          Components: Store
>            Reporter: David Van Couvering
>         Assigned To: David Van Couvering
>             Fix For: 10.2.0.0
>
>
> I know it's hard to measure what "too slow" is, but this is a common complaint and this affects overall perception of Derby.  This appears to be related to another common complaint that it takes too long to create tables.  I am marking this as Urgent because of the impact it has to Derby perception and the fact that the 10.2 release is going to get such wide distribution through the Sun JDK.
> For background, see http://www.nabble.com/Startup-time-tf2012748.html#a5531684


Re: [jira] Assigned: (DERBY-1664) Derby startup time is too slow

Posted by David Van Couvering <Da...@Sun.COM>.
Ah, yes, the write cache, I wrote about it here:

http://weblogs.java.net/blog/davidvc/archive/2005/10/the_story_of_th.html

But I wrote this after hearing about this issue from folks in our 
performance team.  I have never personally had to investigate and 
enable/disable the write cache on a system, so I'm flying a bit blind. 
Checking with our Solaris guys, but if anybody has the answer for Linux 
that would be great.  Maybe we can put it on a Wiki page somewhere :)

David

Mike Matrigali wrote:
> 
> 
> David Van Couvering wrote:
>> Hi, Mike.  Thanks for wanting to participate.  My first step I was 
>> planning to do was to do some measurements, as you suggested.
>>
>> I was going to start with my own machine, which is a laptop running 
>> Solaris x86.  But I suspect a lot of folks care about XP and Linux.  I 
>> can create a test and we can run it on different machines and see what 
>> the variance is.
>>
>> I was thinking of doing a test that measures startup time with 
>> creating a new db and using an existing one as the first step.  I was 
>> then going to refine from there.
>>
>> Dumb sysadmin question: on Solaris, XP, and Linux, how do you find out 
>> if your system is syncing to disk or not?
> I am not sure on solaris/linux.  On XP it is a path through the hardware
> manager/device manager down the device drop downs - I will see if I have
> it. But probably the easiest is to write a test with one row keyed row
> in a table with an int data column and run an autocommit loop updating
> the single column.  If you get ~100 xacts a second (dependent on disk
> speed), then syncing is happening.  If you get much higher, like 1000 
> then no syncing.
> 
> Many of the db's we get compared to, don't even do syncs by default
> leading to the perception issue.
> 
> To understand the numbers this will catch also hardware where syncing
> is "correct" but abnormally fast.  we have a machine that hardware
> caches synced writes - so syncs are instantaneous unless the cache 
> fills, but since it has a battery backup and software
> to flush writes it is not "improper" - but still important to understand 
> as not a normal case for many low end systems.
>>
>> Thanks,
>>
>> David
>>
>> P.S. I'm not prepared to have the discussion about copying from a 
>> model database at this time.  Let's just first find out what's going 
>> on...
>>
>> Mike Matrigali wrote:
>>
> 

Re: [jira] Assigned: (DERBY-1664) Derby startup time is too slow

Posted by Mike Matrigali <mi...@sbcglobal.net>.

David Van Couvering wrote:
> Hi, Mike.  Thanks for wanting to participate.  My first step I was 
> planning to do was to do some measurements, as you suggested.
> 
> I was going to start with my own machine, which is a laptop running 
> Solaris x86.  But I suspect a lot of folks care about XP and Linux.  I 
> can create a test and we can run it on different machines and see what 
> the variance is.
> 
> I was thinking of doing a test that measures startup time with creating 
> a new db and using an existing one as the first step.  I was then going 
> to refine from there.
> 
> Dumb sysadmin question: on Solaris, XP, and Linux, how do you find out 
> if your system is syncing to disk or not?
I am not sure on Solaris/Linux.  On XP it is a path through the hardware
manager/device manager down the device drop-downs - I will see if I have
it.  But probably the easiest is to write a test with a single keyed row
in a table with an int data column and run an autocommit loop updating
that one column.  If you get ~100 xacts a second (dependent on disk
speed), then syncing is happening.  If you get much higher, like 1000,
then no syncing.

Many of the db's we get compared to don't even do syncs by default,
leading to the perception issue.

When reading the numbers, note that this test will also flag hardware
where syncing is "correct" but abnormally fast.  We have a machine whose
hardware caches synced writes - so syncs are instantaneous unless the
cache fills - but since it has a battery backup and software to flush
writes it is not "improper".  It is still important to understand that
this is not the normal case for many low-end systems.
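
The autocommit test described above could look roughly like the sketch
below.  It is only an illustration, not code from the Derby tree: the class
name SyncProbe, the table name sync_probe, the database name syncProbeDb and
the two cutoff values are all assumptions (the cutoffs are loosely taken from
the ~100 vs. ~1000 xacts/second figures, which describe the spinning disks of
the time and will not carry over to other hardware).  It assumes the Derby
embedded driver is on the classpath and registers itself with DriverManager.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

public class SyncProbe {
    // Assumed cutoffs, loosely based on the rough figures quoted in the thread.
    static final double LIKELY_SYNCING_MAX = 200.0;
    static final double LIKELY_NOT_SYNCING_MIN = 1000.0;

    /** Turns a measured commit rate into a rough verdict. */
    static String classify(double commitsPerSecond) {
        if (commitsPerSecond <= LIKELY_SYNCING_MAX) {
            return "likely syncing";
        }
        if (commitsPerSecond >= LIKELY_NOT_SYNCING_MIN) {
            return "likely not syncing";
        }
        return "inconclusive";
    }

    /** Updates one int column of a single keyed row, one commit per update. */
    static double commitsPerSecond(Connection conn, int iterations) throws SQLException {
        conn.setAutoCommit(true); // every UPDATE below is its own transaction
        try (Statement s = conn.createStatement()) {
            s.executeUpdate("CREATE TABLE sync_probe (id INT PRIMARY KEY, val INT)");
            s.executeUpdate("INSERT INTO sync_probe VALUES (1, 0)");
        }
        try (PreparedStatement ps = conn.prepareStatement(
                "UPDATE sync_probe SET val = ? WHERE id = 1")) {
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                ps.setInt(1, i);
                ps.executeUpdate();
            }
            double seconds = (System.nanoTime() - start) / 1e9;
            return iterations / seconds;
        } finally {
            // Clean up so the probe can be rerun against the same database.
            try (Statement s = conn.createStatement()) {
                s.executeUpdate("DROP TABLE sync_probe");
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        try (Connection conn =
                DriverManager.getConnection("jdbc:derby:syncProbeDb;create=true")) {
            double rate = commitsPerSecond(conn, 500);
            System.out.printf("%.0f commits/s: %s%n", rate, classify(rate));
        }
    }
}
```

A rate in the "inconclusive" band, or a very high rate on a machine with a
battery-backed cache, would still need a look at the hardware before drawing
any conclusion.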
> 
> Thanks,
> 
> David
> 
> P.S. I'm not prepared to have the discussion about copying from a model 
> database at this time.  Let's just first find out what's going on...
> 
> Mike Matrigali wrote:
> 


Re: [jira] Assigned: (DERBY-1664) Derby startup time is too slow

Posted by David Van Couvering <Da...@Sun.COM>.
Great, thanks!

David

Sunitha Kambhampati wrote:
> David Van Couvering wrote:
> 
>> Dumb sysadmin question: on Solaris, XP, and Linux, how do you find out 
>> if your system is syncing to disk or not?
>>
> For Linux:  Use hdparm -W0 on all your drives to turn write cache off.
> http://gentoo-wiki.com/HOWTO_Use_hdparm_to_improve_IDE_device_performance#Write-Caching_-W 
> 
> 
> For Windows (2k) I'd expect something similar for XP.
> go to My Computer -> Right click on Manage --> In computer management 
> --> Device Manager --> Pick Disk Drives ---> Right click on the Disk, go 
> to Properties --> Pick the disk properties tab and you should see a 
> checkbox with writecache enabled.
> 
> I dont know about Solaris.
> 
> HTH,
> Sunitha.
> 
> 
> 

Re: [jira] Assigned: (DERBY-1664) Derby startup time is too slow

Posted by Sunitha Kambhampati <ks...@gmail.com>.
David Van Couvering wrote:

> Dumb sysadmin question: on Solaris, XP, and Linux, how do you find out 
> if your system is syncing to disk or not?
>
For Linux:  Use hdparm -W0 on all your drives to turn the write cache off.
http://gentoo-wiki.com/HOWTO_Use_hdparm_to_improve_IDE_device_performance#Write-Caching_-W

For Windows (2k); I'd expect something similar for XP:
Go to My Computer -> right-click and choose Manage -> in Computer
Management pick Device Manager -> Disk Drives -> right-click on the disk
and go to Properties -> pick the disk properties tab and you should see
a checkbox for enabling the write cache.

I don't know about Solaris.

HTH,
Sunitha.
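
Independently of the per-OS settings, one rough way to see from Java whether
forced writes actually reach the disk is to time them directly, without
Derby in the picture.  The sketch below is an assumption-laden illustration
rather than anything Derby itself does: the class and method names are made
up, and it simply rewrites one 512-byte block and calls FileChannel.force
after each write.  A rate in the low hundreds per second on a plain disk
would suggest the writes are really being synced; a much higher rate would
suggest a write cache (or very fast storage) is absorbing them.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ForceProbe {
    /** Rewrites one 512-byte block and forces it to the device each time. */
    static double forcedWritesPerSecond(Path file, int iterations) {
        ByteBuffer buf = ByteBuffer.allocate(512);
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                buf.clear();       // reset position/limit so all 512 bytes are written
                buf.putInt(0, i);  // change the content on every pass
                ch.write(buf, 0);  // always overwrite the same block
                ch.force(true);    // ask the OS to flush data and metadata
            }
            double seconds = (System.nanoTime() - start) / 1e9;
            return iterations / seconds;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Convenience wrapper that probes a temporary file and removes it afterwards. */
    static double probeTempFile(int iterations) {
        try {
            Path file = Files.createTempFile("force-probe", ".dat");
            try {
                return forcedWritesPerSecond(file, iterations);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.printf("%.0f forced writes/s%n", probeTempFile(200));
    }
}
```

The temporary file lands wherever the JVM's default temp directory is, so to
probe a specific drive, forcedWritesPerSecond would need a path on that drive.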




Re: [jira] Assigned: (DERBY-1664) Derby startup time is too slow

Posted by David Van Couvering <Da...@Sun.COM>.
Hi, Mike.  Thanks for wanting to participate.  My first step was going
to be to do some measurements, as you suggested.

I was going to start with my own machine, which is a laptop running 
Solaris x86.  But I suspect a lot of folks care about XP and Linux.  I 
can create a test and we can run it on different machines and see what 
the variance is.

As a first step, I was thinking of a test that measures startup time
both when creating a new db and when booting an existing one.  I was
then going to refine from there.

Dumb sysadmin question: on Solaris, XP, and Linux, how do you find out 
if your system is syncing to disk or not?

Thanks,

David

P.S. I'm not prepared to have the discussion about copying from a model 
database at this time.  Let's just first find out what's going on...

Mike Matrigali wrote:
> I would like to participate in the discussion on this, let me know what
> you prefer (list, jira, wiki).  As dan said I believe there are many
> parts to this issue and would like to see them broken down.  I also do
> think there is a perception issue also and not quite sure how to
> address it, do the db's we are being compared to allow create and
> connect in the same jdbc statement?
> 
> My initial take is:
> 1) pick a platform and do some measurements, need to know if the 
> platform properly sync's the disk when asked.
> 2) compare creating empty database with copying an empty database
>    template from a jar file, with copying an empty database template
>    from disk.  I do not think derby should go to template based
>    approach for creating database.  I also witnessed a number of 
> bugs/harder to maintain code in 2 previous companies I worked at which 
> used this approach, where the current approach has been maintainable 
> from the start in derby.  I do think that if copying from a template is
> faster, then we should just document to users how/why they should do it
> if creating a db is a critical performance point for their application.
> Derby already fully supports this (at least from disk), it is exactly
> what restoring from a backup does and we already support restoring a 
> single db to multiple other data locations.
> 3) break up the "startup" cost issue into measureable understandable
>    pieces, as dan laid out -- just to make sure we are fixing the
>    right problem.  I really want to separate "startup" issues from
>    creating a new db issues.
> 

Re: [jira] Assigned: (DERBY-1664) Derby startup time is too slow

Posted by Mike Matrigali <mi...@sbcglobal.net>.

David Van Couvering wrote:
> Oh, one more thing, the database I have heard us compared with is HSQL. 
>  Folks don't expect us to be equivalent, given that HSQL is in-memory, 
> but they do complain that the difference is just a little too much, from 
> basically instantaneous to 10s of seconds...

HSQL is an in-memory db which does not do transaction-based syncing.
Derby is often not going to compare favorably when it does real
transaction-based syncing, guarantees recovery, can handle db's bigger
than memory, ...
> 
> The use cases I have heard of are unit testing (I think we've even 
> complained about this, our way of solving it is reusing the same 
> database :)), and during development where you're constantly rebooting 
> the server that has the database embedded.  Another area of concern for 
> me is when the database is being created in a browser-embedded 
> application.  Java already has a bad rap for slow boot-up time in those 
> environments...
> 
> David

I would definitely like to see the db create issue separated from
rebooting an existing db.  And in the reboot case we should look at
"clean shutdown" vs. "crash with recovery shutdown".  Last time we
looked at the reboot case there was a big difference between the 1st
load of the Derby classes and subsequent loads; there was a linear
penalty per Derby class (and unfortunately the # of Derby classes is
growing release to release).  But it has been quite a while since anyone
looked at this in detail.



Re: [jira] Assigned: (DERBY-1664) Derby startup time is too slow

Posted by Daniel John Debrunner <dj...@apache.org>.
David Van Couvering wrote:

> Well, yes, but one might argue that we're going about it the wrong way:
> "It takes too long to run derby tests because it takes so long to start
> up the database."  "Ok, well, let's fix it so that you don't have to
> restart the database for each  unit test."

I think for the tests it's really a case of "It takes too long to run
derby tests because we run a separate JVM for each test case".

My total guess is that making Derby boot/create databases faster could
bring the tests down from 5+ hours to 4+ hours. My guess is that moving
to JUnit will bring them down to under 2 hours.

It's all good work, complementing each other. It's just that when I see
someone complain about something in open source, my automatic reaction
is to get them involved; complaining won't fix anything, "doing" does.

Seems like the JUnit work should have the option of either re-using &
cleaning the current database, or creating a new one. Might be a useful
comparison across a set of tests.

Dan.


Re: [jira] Assigned: (DERBY-1664) Derby startup time is too slow

Posted by David Van Couvering <Da...@Sun.COM>.
Well, yes, but one might argue that we're going about it the wrong way: 
"It takes too long to run derby tests because it takes so long to start 
up the database."  "Ok, well, let's fix it so that you don't have to 
restart the database for each  unit test."

Not that I'm saying that's a bad idea, I think it's great, and maybe we 
can provide some template code for our users who are using Derby for 
unit testing.  But it doesn't remove the fact that database startup is 
slow (except for on Dag's machine :)).

David

Daniel John Debrunner wrote:
> David Van Couvering wrote:
> 
>> I personally haven't seen a problem either (except that derbyall takes
>> for bloody ever). 
> 
> Easy way to fix that, get involved in the effort to switch to JUnit. :-)
> 
> http://wiki.apache.org/db-derby/KillDerbyTestHarness
> 
> Dan.
> 
> 

Re: [jira] Assigned: (DERBY-1664) Derby startup time is too slow

Posted by Daniel John Debrunner <dj...@apache.org>.
David Van Couvering wrote:

> I personally haven't seen a problem either (except that derbyall takes
> for bloody ever). 

Easy way to fix that, get involved in the effort to switch to JUnit. :-)

http://wiki.apache.org/db-derby/KillDerbyTestHarness

Dan.



Re: [jira] Assigned: (DERBY-1664) Derby startup time is too slow

Posted by David Van Couvering <Da...@Sun.COM>.
I personally haven't seen a problem either (except that derbyall takes 
for bloody ever).  However, I do know that a number of our users have 
been complaining about this.  It could be a perception problem, but 
rather than just dismiss this I'd like to do some further investigation. 
  If it comes down to some tips like "turn on your write cache" and 
"increase the checkpoint interval" great, but we should definitely try 
to address this one way or the other.

Thanks!

David

Dag H. Wanvik wrote:
> David Van Couvering <Da...@Sun.COM> writes:
> 
>> Oh, one more thing, the database I have heard us compared with is
>> HSQL. Folks don't expect us to be equivalent, given that HSQL is
>> in-memory, but they do complain that the difference is just a little
>> too much, from basically instantaneous to 10s of seconds...
> 
> On my Solaris x86 box, the connect+create new db takes all of three
> seconds.. so where do the tens of seconds come from? Is that in a
> recovery situation?
> 
> Dag

Re: [jira] Assigned: (DERBY-1664) Derby startup time is too slow

Posted by "Dag H. Wanvik" <Da...@Sun.COM>.
David Van Couvering <Da...@Sun.COM> writes:

> Oh, one more thing, the database I have heard us compared with is
> HSQL. Folks don't expect us to be equivalent, given that HSQL is
> in-memory, but they do complain that the difference is just a little
> too much, from basically instantaneous to 10s of seconds...

On my Solaris x86 box, the connect+create new db takes all of three
seconds, so where do the tens of seconds come from? Is that in a
recovery situation?

Dag

Re: [jira] Assigned: (DERBY-1664) Derby startup time is too slow

Posted by David Van Couvering <Da...@Sun.COM>.
Oh, one more thing, the database I have heard us compared with is HSQL. 
  Folks don't expect us to be equivalent, given that HSQL is in-memory, 
but they do complain that the difference is just a little too much, from 
basically instantaneous to 10s of seconds...

The use cases I have heard of are unit testing (I think we've even 
complained about this, our way of solving it is reusing the same 
database :)), and during development where you're constantly rebooting 
the server that has the database embedded.  Another area of concern for 
me is when the database is being created in a browser-embedded 
application.  Java already has a bad rap for slow boot-up time in those 
environments...

David

Mike Matrigali wrote:
> I would like to participate in the discussion on this, let me know what
> you prefer (list, jira, wiki).  As dan said I believe there are many
> parts to this issue and would like to see them broken down.  I also do
> think there is a perception issue also and not quite sure how to
> address it, do the db's we are being compared to allow create and
> connect in the same jdbc statement?
> 
> My initial take is:
> 1) pick a platform and do some measurements, need to know if the 
> platform properly sync's the disk when asked.
> 2) compare creating empty database with copying an empty database
>    template from a jar file, with copying an empty database template
>    from disk.  I do not think derby should go to template based
>    approach for creating database.  I also witnessed a number of 
> bugs/harder to maintain code in 2 previous companies I worked at which 
> used this approach, where the current approach has been maintainable 
> from the start in derby.  I do think that if copying from a template is
> faster, then we should just document to users how/why they should do it
> if creating a db is a critical performance point for their application.
> Derby already fully supports this (at least from disk), it is exactly
> what restoring from a backup does and we already support restoring a 
> single db to multiple other data locations.
> 3) break up the "startup" cost issue into measureable understandable
>    pieces, as dan laid out -- just to make sure we are fixing the
>    right problem.  I really want to separate "startup" issues from
>    creating a new db issues.
> 

Re: [jira] Assigned: (DERBY-1664) Derby startup time is too slow

Posted by Mike Matrigali <mi...@sbcglobal.net>.
I would like to participate in the discussion on this; let me know what
you prefer (list, jira, wiki).  As Dan said, I believe there are many
parts to this issue and would like to see them broken down.  I also
think there is a perception issue and am not quite sure how to
address it.  Do the db's we are being compared to allow create and
connect in the same jdbc statement?

My initial take is:
1) Pick a platform and do some measurements; we need to know if the
   platform properly syncs the disk when asked.
2) Compare creating an empty database with copying an empty database
   template from a jar file, and with copying an empty database template
   from disk.  I do not think Derby should go to a template-based
   approach for creating databases.  I witnessed a number of bugs and
   harder-to-maintain code at 2 previous companies I worked at which
   used this approach, whereas the current approach has been maintainable
   from the start in Derby.  I do think that if copying from a template is
   faster, then we should just document for users how/why they should do it
   if creating a db is a critical performance point for their application.
   Derby already fully supports this (at least from disk); it is exactly
   what restoring from a backup does, and we already support restoring a
   single db to multiple other data locations.
3) Break up the "startup" cost issue into measurable, understandable
   pieces, as Dan laid out -- just to make sure we are fixing the
   right problem.  I really want to separate "startup" issues from
   creating-a-new-db issues.
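
As a starting point for the measurements in the list above, the sketch below
times three things separately: creating a new database, booting an existing
one, and (optionally) creating one from a backup copy used as a template.
It is a hedged illustration only.  The class name StartupTimer and the
database names are made up; the create=true, createFrom= and shutdown=true
URL attributes are Derby's documented connection attributes, and the sketch
assumes the embedded driver is on the classpath and self-registers.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.concurrent.Callable;

public class StartupTimer {
    static String createUrl(String db) {
        return "jdbc:derby:" + db + ";create=true";
    }

    static String bootUrl(String db) {
        return "jdbc:derby:" + db;
    }

    /** URL that creates db from a backup copy located at templatePath. */
    static String templateUrl(String db, String templatePath) {
        return "jdbc:derby:" + db + ";createFrom=" + templatePath;
    }

    /** Runs the action once and returns the elapsed wall-clock milliseconds. */
    static long timeMillis(Callable<Void> action) throws Exception {
        long start = System.nanoTime();
        action.call();
        return (System.nanoTime() - start) / 1_000_000;
    }

    static Void connectAndClose(String url) throws SQLException {
        try (Connection c = DriverManager.getConnection(url)) {
            // Opening the connection is the whole point; nothing else to do.
        }
        return null;
    }

    /** Shuts down one database; Derby signals success with SQLState 08006. */
    static void shutdown(String db) throws SQLException {
        try {
            DriverManager.getConnection("jdbc:derby:" + db + ";shutdown=true");
        } catch (SQLException e) {
            if (!"08006".equals(e.getSQLState())) {
                throw e;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String db = "startupTestDb"; // assumed name; must not exist yet
        long create = timeMillis(() -> connectAndClose(createUrl(db)));
        shutdown(db);
        long reboot = timeMillis(() -> connectAndClose(bootUrl(db)));
        shutdown(db);
        System.out.println("create new db: " + create + " ms");
        System.out.println("boot existing: " + reboot + " ms");
        if (args.length > 0) { // optional: path to a backup copy used as a template
            String copy = "startupTestDbFromTemplate";
            long fromTemplate = timeMillis(() -> connectAndClose(templateUrl(copy, args[0])));
            shutdown(copy);
            System.out.println("create from template: " + fromTemplate + " ms");
        }
    }
}
```

Because everything runs in one JVM, only the first connection pays for loading
the Derby classes; to compare cold starts, each case would have to be run in a
fresh JVM.  The clean-shutdown and crash-recovery boots would also need to be
timed separately.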
