Posted to derby-user@db.apache.org by Kristian Waagan <Kr...@Sun.COM> on 2009/03/03 13:19:17 UTC

Re: inserts slowing down after 2.5m rows

Brian Peterson wrote:
> I thought I read in the documentation that 1000 was the max initial 
> pages you could allocate, and after that, Derby allocates a page at a 
> time. Is there some other setting for getting it to allocate more at a time?

Another question is why the maximum is set to 1000 pages.
Any takers?

If the property can be set higher and controlled on a per conglomerate 
(table or index) basis, it can be a nice tool for those who require or 
want to use such tuning.
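
For concreteness, the property under discussion is derby.storage.initialPages.
Below is a minimal sketch of per-table control as it works today, assuming the
property is read at conglomerate creation time; the table name and columns are
invented for illustration.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class InitialPagesSketch {
        public static void main(String[] args) throws Exception {
            Connection conn = DriverManager.getConnection(
                    "jdbc:derby:testdb;create=true");
            Statement st = conn.createStatement();
            // derby.storage.initialPages is read when a conglomerate is
            // created, so setting the database-wide property just before
            // CREATE TABLE effectively scopes it to that table; 1000
            // pages is the documented maximum.
            st.execute("CALL SYSCS_UTIL.SYSCS_SET_DATABASE_PROPERTY("
                    + "'derby.storage.initialPages', '1000')");
            st.execute("CREATE TABLE big_table ("
                    + "id BIGINT PRIMARY KEY, payload VARCHAR(2000))");
            st.close();
            conn.close();
        }
    }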


-- 
Kristian

> 
>  
> 
> Brian
> 
>  
> 
> From: Michael Segel [mailto:msegel@segel.com] On Behalf Of derby@segel.com
> Sent: Friday, February 27, 2009 9:59 PM
> To: 'Derby Discussion'
> Subject: RE: inserts slowing down after 2.5m rows
> 
>  
> 
> Ok,
> 
>  
> 
> For testing, if you allocate 2000 pages, then, if my thinking is right, 
> you'll fly along until you get to about 2100 pages.
> 
>  
> 
> It sounds like you're hitting a bit of a snag: after your initial 
> allocation of pages, Derby only allocates a small number of pages 
> at a time.
> 
>  
> 
> I would hope that you could configure the number of pages to be 
> allocated in blocks as the table grows.
> 
>  
> 
>  
> 
> ------------------------------------------------------------------------
> 
> From: publicayers@verizon.net [mailto:publicayers@verizon.net]
> Sent: Friday, February 27, 2009 8:48 PM
> To: Derby Discussion
> Subject: Re: inserts slowing down after 2.5m rows
> 
>  
> 
>  I've increased the log size and the checkpoint interval, but it doesn't 
> seem to help.
> 
>  
> 
> It looks like the inserts begin to dramatically slow down once the table 
> reaches the initial allocation of pages. Things just fly along until it 
> gets to about 1100 pages (I've allocated an initial 1000 pages, pages 
> are 32k).
> 
>  
> 
> Any suggestions on how to keep the inserts moving quickly at this point?
> 
>  
> 
> Brian
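
For reference, the two knobs Brian mentions map to
derby.storage.logSwitchInterval (log file size) and
derby.storage.checkpointInterval (bytes of log written between checkpoints).
A minimal sketch with illustrative values, not recommendations:

    public class LogTuningSketch {
        public static void main(String[] args) throws Exception {
            // Both properties must be set before the Derby engine boots.
            // Write more log between checkpoints (default is about 10 MB).
            System.setProperty("derby.storage.checkpointInterval",
                    String.valueOf(64 * 1024 * 1024));
            // Switch to a fresh log file less often (default about 1 MB).
            System.setProperty("derby.storage.logSwitchInterval",
                    String.valueOf(16 * 1024 * 1024));
            java.sql.DriverManager
                    .getConnection("jdbc:derby:testdb;create=true")
                    .close();
        }
    }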
> 
>  
> 
> On Fri, Feb 27, 2009 at  3:41 PM, publicayers@verizon.net wrote:
> 
>  
> 
>  The application is running on a client machine. I'm not sure how to 
> tell if there's a different disk available that I could log to.
> 
>  
> 
> If checkpointing is causing this delay, how do I manage that? Can I turn 
> checkpointing off? I already have durability set to test; I'm not 
> concerned about recovering from a crashed db.
> 
>  
> 
> Brian
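
As far as I know, there is no supported switch that turns checkpointing off
outright; the closest levers are derby.system.durability=test, which Brian
already uses, and raising derby.storage.checkpointInterval as in the sketch
above. For completeness, a sketch of the durability setting:

    public class DurabilitySketch {
        public static void main(String[] args) throws Exception {
            // Must be set before the Derby engine class loads. "test"
            // skips I/O syncs on the transaction log, trading crash
            // recovery for insert speed; checkpoints themselves still run.
            System.setProperty("derby.system.durability", "test");
            java.sql.DriverManager
                    .getConnection("jdbc:derby:testdb;create=true")
                    .close();
        }
    }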
> 
>  
> 
> On Fri, Feb 27, 2009 at  9:34 AM, Peter Ondruška wrote:
> 
>  
> 
>>  Could be checkpoint. BTW, to speed up bulk load you may want to use
>> large log files located separately from the data disks.
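
A sketch of Peter's suggestion: the log location is fixed when the database
is created, via the logDevice connection attribute. The path below is
invented and would ideally point at a different physical disk than the data:

    public class LogDeviceSketch {
        public static void main(String[] args) throws Exception {
            // logDevice is honored only when the database is created; it
            // places the transaction log under the given directory instead
            // of inside the database directory.
            String url = "jdbc:derby:testdb;create=true"
                    + ";logDevice=D:/fastdisk/derbylog";
            java.sql.DriverManager.getConnection(url).close();
        }
    }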
> 
>  
> 
> 2009/2/27, Brian Peterson <dianeayers@verizon.net <ma...@verizon.net>>:
>>  I have a big table that gets a lot of inserts. Rows are inserted 10k at a
>>  time with a table function. At around 2.5 million rows, inserts slow down
>>  from 2-7s to around 15-20s. The table's dat file is around 800-900M.
>>
>>  I have durability set to "test", table-level locks, a primary key index,
>>  and another 2-column index on the table. Page size is at the max and the
>>  page cache is set to 4500 pages. The table gets compressed (in place)
>>  every 500,000 rows. I'm using Derby 10.4 with JDK 1.6.0_07, running on
>>  Windows XP. I've ruled out anything from the rest of the application,
>>  including GC (memory usage follows a consistent pattern during the whole
>>  load). It is a local file system. The database has a fixed number of
>>  tables (so there's a fixed number of dat files in the database directory
>>  the whole time). The logs are getting cleaned up, so there are only a few
>>  dat files in the log directory as well.
>>
>>  Any ideas what might be causing the big slowdown after so many loads?
>>
>>  Brian
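
Pulling Brian's configuration together, here is a hedged sketch of a
comparable setup. The schema, table, and index names are invented; the
properties, the LOCK TABLE statement, and
SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE are documented Derby APIs:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class BulkInsertSetupSketch {
        public static void main(String[] args) throws Exception {
            // All properties must be set before the engine boots.
            System.setProperty("derby.system.durability", "test");
            System.setProperty("derby.storage.pageSize", "32768");     // max page size
            System.setProperty("derby.storage.pageCacheSize", "4500"); // in pages
            Connection conn = DriverManager.getConnection(
                    "jdbc:derby:testdb;create=true");
            Statement st = conn.createStatement();
            st.execute("CREATE TABLE app.big_table (id BIGINT PRIMARY KEY, "
                    + "a INT, b INT, payload VARCHAR(2000))");
            st.execute("CREATE INDEX big_table_ab ON app.big_table(a, b)");
            // Table-level locking for the bulk load; the lock is released
            // at commit, so autocommit must be off.
            conn.setAutoCommit(false);
            st.execute("LOCK TABLE app.big_table IN EXCLUSIVE MODE");
            // ... insert 10k-row batches here ...
            conn.commit();
            // The periodic in-place compress: purge, defragment, and
            // truncate-end flags all enabled.
            st.execute("CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE("
                    + "'APP', 'BIG_TABLE', 1, 1, 1)");
            conn.commit();
            st.close();
            conn.close();
        }
    }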


RE: inserts slowing down after 2.5m rows

Posted by de...@segel.com.
> Another question is why the maximum is set to 1000 pages.
> Any takers?

Because it was a nice big round number?

And when I say big, I mean it was big enough at the time.

Remember that Cloudscape was designed to fill a niche: a small, embeddable
engine that was 100% Java.

It was not meant to compete with general-purpose RDBMS engines.

This is why I've asked those pushing Cloudscape development to consider
what they want this engine to be. Adding more features creates a larger
footprint, which hurts users who want to embed the engine in a small
downloadable app.

This is why I recommended that those behind Derby consider a more modular
approach to the engine: a sort of plug-and-play deployment of features.
(For example, if you're going to use the engine in an embedded format, you
don't load up a container class that has chunks, page spaces, table spaces,
etc. You don't allow containers that use raw disks. You don't do
partitioning. If someone is going to use this in a more traditional role,
however, you do create the engine and load those classes.)
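
Purely as an illustration of that plug-and-play idea (none of these
interfaces exist in Derby; the names are invented), features could be
discovered at deployment time with java.util.ServiceLoader, so an embedded
build simply ships without the optional jars:

    import java.util.ServiceLoader;

    // Invented interface, purely to illustrate the idea. An optional
    // storage feature (table spaces, raw-disk containers, ...) would
    // live in its own jar and be discovered at deployment time.
    interface StorageFeature {
        String name();
        void install();
    }

    public class ModularEngineSketch {
        public static void main(String[] args) {
            // An embedded deployment that ships without the optional jars
            // simply finds no implementations here; a full deployment
            // installs whatever features are on the classpath.
            for (StorageFeature f : ServiceLoader.load(StorageFeature.class)) {
                System.out.println("installing " + f.name());
                f.install();
            }
        }
    }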

HTH

-Mike
