You are viewing a plain text version of this content. The canonical link for it is here.

Posted to derby-user@db.apache.org by Mark Hellkamp <ma...@theplatform.com> on 2006/08/30 22:10:02 UTC

Concurrent access problems?

We are trying to use Derby embedded in our web application but are running into
some serious resource problems when processing concurrent requests. When we are
performing load testing on the application we are experiencing serious
slowdowns, even when running only a few concurrent threads. We are noticing that
the process handle count on our application keeps growing during our tests,
until the machine becomes completely unresponsive. We have isolated the problem
to Derby and I have moved Derby out of the application process by running it in
Network Server mode. In this configuration we can see that the handle count on
the Derby process will grow as we add more concurrent threads accessing the
application. Looking at the handles for the Derby process, it appears that the
number of Semaphores keeps increasing. I'm not sure if this is normal or this
indicates a problem, but it seems a bit suspicious. 

I also completely swapped out Derby with MySql to verify that the application
was not at fault (we are using Spring + Hibernate for data access). Under MySQl
we did not see a significant performance degradation with multiple threads, and
the handle count remained fairly steady on the database.

Does Derby have problems with concurrent access? Our tests are doing read only
queries against the database so read/write contention should not be a problem. I
am extremely perplexed by this problem. Anyone have any ideas?

Regards,
Mark

Re: Concurrent access problems?

Posted by Paul J DeCoursey <pa...@decoursey.net>.

I've seen this behavior with Hibrnate and DB2.  The memory usage would 
ramp up out of control if we didn't properly close our Hibernate 
sessions. We thought we were but they changed the way sessions were 
managed between the release we were using and the last time they updated 
their docs.  What was sad is we saw this behavior with as little as 2 
users connected at a time.  My guess is when he ran it with MySQL the 
sessions were getting closed becasue MySQL's driver was automaticaly 
closing old connections. I recall that from when I used MySQL. It's just 
lazy coding if you ask me, closing a connection should be the 
responsibility of the one who opened it.

I am certain that the issue is with Hibernate, or possibly 
Spring/Hibernate combination.  It can be difficult to track issues down 
with those frameworks.

p

derby@segel.com wrote:
> Well that's the thing.
>
> He's using Derby in conjunction with hibernate and spring.
>
> My guess is that its something in the combination of the tools that is
> causing the issue.
>
> That's why I was wondering if he just tried the queries against the database
> independent of the app.
>
>
>   
>> -----Original Message-----
>> From: Mike Matrigali [mailto:mikem_app@sbcglobal.net]
>> Sent: Thursday, August 31, 2006 6:55 PM
>> To: Derby Discussion
>> Subject: Re: Concurrent access problems?
>>
>> Does anyone know how the resource described below as a "process handle"
>> or "Semaphore", relates to actual things that Derby actively uses?
>> Actual Derby locking is done by user object allocation so I don't think
>> it would show as such in this tool.  It uses java synchronization
>> blocks, but it is up to the jvm to release resources based on this usage.
>>
>> Once the page cache fills up Derby should not really accumulate much
>> more resources no matter what the number of rows.  For instance
>> (again once the page cache is filled), the accumulated resources to
>> scan a one row table are expected to be about the same as to scan
>> a 10 million row table.  Depending on datatypes there may be object
>> allocation per row, but the objects should be freed after the row
>> is done.  For most of the datatypes derby is coded to reuse the
>> objects for each row so the normal case is once the scan starts no
>> object allocation is necessary.
>>
>> Just to understand where the problem is, do observe that the resources
>> are still allocated after the query is finished and the transaction
>> has committed?  How about after the connection is closed (but the db
>> still opened)?  And finally how about after the db is closed (using
>> shutdown=true?).
>>
>> Mark Hellkamp wrote:
>>     
>>>  <de...@...> writes:
>>>
>>>       
>>>> [mjs]
>>>> The person who created the JIRA issue is the one who said "Large amounts
>>>>         
>> of
>>     
>>>> data".
>>>>
>>>> 10K rows of 9 columns, is not a "large amount" of data.
>>>>
>>>> While Mark posted that he tried using MySQL instead of Derby, one has to
>>>>         
>> ask
>>     
>>>> if he tried running the query independent of his app.
>>>>
>>>> The point is that there are a lot of variables that can impact
>>>>         
>> performance.
>>     
>>>> Spring and Hibernate are two of them.
>>>>
>>>> So unless someone introduced a "bug" in the latest release, I'm a bit
>>>> skeptical of the issue.
>>>>
>>>>
>>>>         
>>> The application is querying about 24 tables in all for each request that
>>>       
>> comes
>>     
>>> in. The number of rows in the tables range from 0 to over 30,000 in my
>>>       
>> current
>>     
>>> testing. As I add more concurrent request threads the database continues
>>>       
>> to
>>     
>>> allocate additional process handles (this is on Windows). Using Process
>>>       
>> Explorer
>>     
>>> (from www.sysinternals.com) I can see that the Derby process is
>>>       
>> allocating
>>     
>>> Semaphores and does not appear to ever release them. As more data is
>>>       
>> added to
>>     
>>> the tables, the process handle count continues to grow even though the
>>>       
>> results
>>     
>>> of the queries remain the same. Derby appears to be allocating resources
>>> (locks?) but never cleaning them up.
>>>
>>>
>>>
>>>
>>>       
>
>
>
>
>

RE: Concurrent access problems?

Posted by de...@segel.com.

Well that's the thing.

He's using Derby in conjunction with hibernate and spring.

My guess is that its something in the combination of the tools that is
causing the issue.

That's why I was wondering if he just tried the queries against the database
independent of the app.


> -----Original Message-----
> From: Mike Matrigali [mailto:mikem_app@sbcglobal.net]
> Sent: Thursday, August 31, 2006 6:55 PM
> To: Derby Discussion
> Subject: Re: Concurrent access problems?
> 
> Does anyone know how the resource described below as a "process handle"
> or "Semaphore", relates to actual things that Derby actively uses?
> Actual Derby locking is done by user object allocation so I don't think
> it would show as such in this tool.  It uses java synchronization
> blocks, but it is up to the jvm to release resources based on this usage.
> 
> Once the page cache fills up Derby should not really accumulate much
> more resources no matter what the number of rows.  For instance
> (again once the page cache is filled), the accumulated resources to
> scan a one row table are expected to be about the same as to scan
> a 10 million row table.  Depending on datatypes there may be object
> allocation per row, but the objects should be freed after the row
> is done.  For most of the datatypes derby is coded to reuse the
> objects for each row so the normal case is once the scan starts no
> object allocation is necessary.
> 
> Just to understand where the problem is, do observe that the resources
> are still allocated after the query is finished and the transaction
> has committed?  How about after the connection is closed (but the db
> still opened)?  And finally how about after the db is closed (using
> shutdown=true?).
> 
> Mark Hellkamp wrote:
> >  <de...@...> writes:
> >
> >>[mjs]
> >>The person who created the JIRA issue is the one who said "Large amounts
> of
> >>data".
> >>
> >>10K rows of 9 columns, is not a "large amount" of data.
> >>
> >>While Mark posted that he tried using MySQL instead of Derby, one has to
> ask
> >>if he tried running the query independent of his app.
> >>
> >>The point is that there are a lot of variables that can impact
> performance.
> >>Spring and Hibernate are two of them.
> >>
> >>So unless someone introduced a "bug" in the latest release, I'm a bit
> >>skeptical of the issue.
> >>
> >>
> >
> >
> > The application is querying about 24 tables in all for each request that
> comes
> > in. The number of rows in the tables range from 0 to over 30,000 in my
> current
> > testing. As I add more concurrent request threads the database continues
> to
> > allocate additional process handles (this is on Windows). Using Process
> Explorer
> > (from www.sysinternals.com) I can see that the Derby process is
> allocating
> > Semaphores and does not appear to ever release them. As more data is
> added to
> > the tables, the process handle count continues to grow even though the
> results
> > of the queries remain the same. Derby appears to be allocating resources
> > (locks?) but never cleaning them up.
> >
> >
> >
> >

Re: Concurrent access problems?

Posted by Mike Matrigali <mi...@sbcglobal.net>.

Does anyone know how the resource described below as a "process handle"
or "Semaphore", relates to actual things that Derby actively uses?
Actual Derby locking is done by user object allocation so I don't think
it would show as such in this tool.  It uses java synchronization 
blocks, but it is up to the jvm to release resources based on this usage.

Once the page cache fills up Derby should not really accumulate much 
more resources no matter what the number of rows.  For instance
(again once the page cache is filled), the accumulated resources to
scan a one row table are expected to be about the same as to scan
a 10 million row table.  Depending on datatypes there may be object
allocation per row, but the objects should be freed after the row
is done.  For most of the datatypes derby is coded to reuse the
objects for each row so the normal case is once the scan starts no
object allocation is necessary.

Just to understand where the problem is, do observe that the resources
are still allocated after the query is finished and the transaction
has committed?  How about after the connection is closed (but the db
still opened)?  And finally how about after the db is closed (using
shutdown=true?).

Mark Hellkamp wrote:
>  <de...@...> writes:
> 
>>[mjs] 
>>The person who created the JIRA issue is the one who said "Large amounts of
>>data".
>>
>>10K rows of 9 columns, is not a "large amount" of data.
>>
>>While Mark posted that he tried using MySQL instead of Derby, one has to ask
>>if he tried running the query independent of his app.
>>
>>The point is that there are a lot of variables that can impact performance.
>>Spring and Hibernate are two of them.
>>
>>So unless someone introduced a "bug" in the latest release, I'm a bit
>>skeptical of the issue.
>>
>>
> 
> 
> The application is querying about 24 tables in all for each request that comes
> in. The number of rows in the tables range from 0 to over 30,000 in my current
> testing. As I add more concurrent request threads the database continues to
> allocate additional process handles (this is on Windows). Using Process Explorer
> (from www.sysinternals.com) I can see that the Derby process is allocating
> Semaphores and does not appear to ever release them. As more data is added to
> the tables, the process handle count continues to grow even though the results
> of the queries remain the same. Derby appears to be allocating resources
> (locks?) but never cleaning them up. 
> 
> 
> 
>

RE: Concurrent access problems?

Posted by de...@segel.com.

Something doesn't sound right.

What you're describing is that the query is doing a sequential scan of the
table. 


> -----Original Message-----
> From: news [mailto:news@sea.gmane.org] On Behalf Of Mark Hellkamp
> Sent: Thursday, August 31, 2006 2:22 PM
> To: derby-user@db.apache.org
> Subject: Re: Concurrent access problems?
> 
>  <de...@...> writes:
> >
> > [mjs]
> > The person who created the JIRA issue is the one who said "Large amounts
> of
> > data".
> >
> > 10K rows of 9 columns, is not a "large amount" of data.
> >
> > While Mark posted that he tried using MySQL instead of Derby, one has to
> ask
> > if he tried running the query independent of his app.
> >
> > The point is that there are a lot of variables that can impact
> performance.
> > Spring and Hibernate are two of them.
> >
> > So unless someone introduced a "bug" in the latest release, I'm a bit
> > skeptical of the issue.
> >
> >
> 
> The application is querying about 24 tables in all for each request that
> comes
> in. The number of rows in the tables range from 0 to over 30,000 in my
> current
> testing. As I add more concurrent request threads the database continues
> to
> allocate additional process handles (this is on Windows). Using Process
> Explorer
> (from www.sysinternals.com) I can see that the Derby process is allocating
> Semaphores and does not appear to ever release them. As more data is added
> to
> the tables, the process handle count continues to grow even though the
> results
> of the queries remain the same. Derby appears to be allocating resources
> (locks?) but never cleaning them up.
>

Re: Concurrent access problems?

Posted by Mark Hellkamp <ma...@theplatform.com>.

 <de...@...> writes:
> 
> [mjs] 
> The person who created the JIRA issue is the one who said "Large amounts of
> data".
> 
> 10K rows of 9 columns, is not a "large amount" of data.
> 
> While Mark posted that he tried using MySQL instead of Derby, one has to ask
> if he tried running the query independent of his app.
> 
> The point is that there are a lot of variables that can impact performance.
> Spring and Hibernate are two of them.
> 
> So unless someone introduced a "bug" in the latest release, I'm a bit
> skeptical of the issue.
> 
> 

The application is querying about 24 tables in all for each request that comes
in. The number of rows in the tables range from 0 to over 30,000 in my current
testing. As I add more concurrent request threads the database continues to
allocate additional process handles (this is on Windows). Using Process Explorer
(from www.sysinternals.com) I can see that the Derby process is allocating
Semaphores and does not appear to ever release them. As more data is added to
the tables, the process handle count continues to grow even though the results
of the queries remain the same. Derby appears to be allocating resources
(locks?) but never cleaning them up.

Re: Concurrent access problems?

Posted by Daniel John Debrunner <dj...@apache.org>.

Alan wrote:

> See https://issues.apache.org/jira/browse/DERBY-1781 for the bug report.  It's 
> an 9 column table with 10,000 rows.

Derby can easily support that amount of data. I create and use databases
with the smallest table having 100,000 rows and about nine columns and
the others in the millions range. I doubt this issue is related to data
size.

This question was asked on the developer list for DERBY-1781

> I am not familar with the terminology here, what kind of thing is a "process handle"?  Any link available to describe it.

Maybe the reporter of DERBY-1781 could answer the question and also add
a comment indicating what they are doing to monitor the number of
"process handles".

Thanks,
Dan.

RE: Concurrent access problems?

Posted by de...@segel.com.


> -----Original Message-----
> From: news [mailto:news@sea.gmane.org] On Behalf Of Alan
> Sent: Thursday, August 31, 2006 3:07 AM
> To: derby-user@db.apache.org
> Subject: Re: Concurrent access problems?
> 
> See https://issues.apache.org/jira/browse/DERBY-1781 for the bug report.
> It's
> an 9 column table with 10,000 rows.
> 
> >
> > Define "large amounts of data".
> >
> > While its great that you're abstracting the database, you still have the
> > issue that not all databases are the same and that some require more
> tuning
> > than others.

[mjs] 
The person who created the JIRA issue is the one who said "Large amounts of
data".

10K rows of 9 columns, is not a "large amount" of data.

While Mark posted that he tried using MySQL instead of Derby, one has to ask
if he tried running the query independent of his app.

The point is that there are a lot of variables that can impact performance.
Spring and Hibernate are two of them.

So unless someone introduced a "bug" in the latest release, I'm a bit
skeptical of the issue.

Re: Concurrent access problems?

Posted by Alan <al...@hotmail.com>.

See https://issues.apache.org/jira/browse/DERBY-1781 for the bug report.  It's 
an 9 column table with 10,000 rows.

> 
> Define "large amounts of data".
> 
> While its great that you're abstracting the database, you still have the
> issue that not all databases are the same and that some require more tuning
> than others.

RE: Concurrent access problems?

Posted by de...@segel.com.

Define "large amounts of data".

While its great that you're abstracting the database, you still have the
issue that not all databases are the same and that some require more tuning
than others.


> -----Original Message-----
> From: news [mailto:news@sea.gmane.org] On Behalf Of Mark Hellkamp
> Sent: Wednesday, August 30, 2006 5:39 PM
> To: derby-user@db.apache.org
> Subject: Re: Concurrent access problems?
> 
> Here is a quick follow-up to this issue. The problem appears to get worse
> as the
> number of rows increases in the tables used in the queries, even though
> the
> additional data should not be a factor in the queries. I did find a paper
> that
> discusses Derby
> (https://twiki.informatik.tu-
> chemnitz.de/pub/DVS/SeminarDBMSImplementation/
> Ressourcenverwaltung_Derby.pdf)that implies that Derby does not handle
> large
> amounts of data too well.
>

Re: Concurrent access problems?

Posted by Mark Hellkamp <ma...@theplatform.com>.

Here is a quick follow-up to this issue. The problem appears to get worse as the
number of rows increases in the tables used in the queries, even though the
additional data should not be a factor in the queries. I did find a paper that
discusses Derby
(https://twiki.informatik.tu-chemnitz.de/pub/DVS/SeminarDBMSImplementation/
Ressourcenverwaltung_Derby.pdf)that implies that Derby does not handle large
amounts of data too well.