Posted to slide-dev@jakarta.apache.org by Sven Steiniger <Sv...@newtron.net> on 2003/02/19 09:17:48 UTC

Performance issues

Hi!

I have used Slide in a project for half a year now, and as it
is nearing completion, I tried to tune it for better performance
(settings: content store on the local file system, all other stores in a database).
When inserting files into Slide, four hotspots consume
about 80% of the elapsed time:
  1 NodeLock
  2 ObjectNode
  3 NodeRevisionNumber
  4 JDBCDescriptorsStore
Optimizing these spots resulted in a threefold speedup!

Changes 1-3 are really simple and have no side effects at all. Shall
I simply check in the changes, or should someone review them
first?

The optimization of JDBCDescriptorsStore is also simple and
concerns the isConnected() method. I have read a lot of discussions
on this topic, but maybe it is worth reopening it.
This method causes a major slowdown because it executes a
statement for every single operation.
I understand that it is meant to ensure 100% data integrity,
but in a normal environment it is very unlikely that the database
connection is lost. So in general it should be safe enough to check
the connection about once a minute.
So why not introduce an option in the domain config that makes the
connection checking configurable? Then every project can choose
the appropriate settings.
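
To make the proposal concrete, here is a minimal sketch of such a
throttled check. Everything in it is an assumption for illustration:
the class name ThrottledConnectionCheck, the probe callback (standing
in for the existing test statement) and the intervalMillis value
(standing in for the proposed domain-config option) are not part of
Slide.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

/** Hypothetical helper, not part of Slide: limits how often an expensive connection probe runs. */
final class ThrottledConnectionCheck {
    private final BooleanSupplier probe;   // the expensive check, e.g. executing a test statement
    private final long intervalMillis;     // assumed to come from a configurable option
    private final LongSupplier clock;      // injectable time source, in milliseconds
    private long lastCheck;                // time of the most recent probe
    private boolean lastResult = false;    // false forces a probe on the first call

    ThrottledConnectionCheck(BooleanSupplier probe, long intervalMillis) {
        this(probe, intervalMillis, System::currentTimeMillis);
    }

    ThrottledConnectionCheck(BooleanSupplier probe, long intervalMillis, LongSupplier clock) {
        this.probe = probe;
        this.intervalMillis = intervalMillis;
        this.clock = clock;
    }

    /** Runs the probe only if the last probe failed or the interval has elapsed. */
    synchronized boolean isConnected() {
        long now = clock.getAsLong();
        if (!lastResult || now - lastCheck >= intervalMillis) {
            lastResult = probe.getAsBoolean();
            lastCheck = now;
        }
        return lastResult;
    }
}
```

With intervalMillis set to 0 the probe runs on every call, which
matches the current behaviour; 60000 gives the once-a-minute check.
A failed probe is never cached, so a broken connection is rechecked
on the next call.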

Sven.

---------------------------------------------------------------------
To unsubscribe, e-mail: slide-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: slide-dev-help@jakarta.apache.org


Re: Performance issues

Posted by Remy Maucherat <re...@apache.org>.
Sven Steiniger wrote:
> Hi!
> 
> I have used Slide in a project for half a year now, and as it
> is nearing completion, I tried to tune it for better performance
> (settings: content store on the local file system, all other stores in a database).
> When inserting files into Slide, four hotspots consume
> about 80% of the elapsed time:
>   1 NodeLock
>   2 ObjectNode
>   3 NodeRevisionNumber
>   4 JDBCDescriptorsStore
> Optimizing these spots resulted in a threefold speedup!
> 
> Changes 1-3 are really simple and have no side effects at all. Shall
> I simply check in the changes, or should someone review them
> first?
> 
> The optimization of JDBCDescriptorsStore is also simple and
> concerns the isConnected() method. I have read a lot of discussions
> on this topic, but maybe it is worth reopening it.
> This method causes a major slowdown because it executes a
> statement for every single operation.
> I understand that it is meant to ensure 100% data integrity,
> but in a normal environment it is very unlikely that the database
> connection is lost. So in general it should be safe enough to check
> the connection about once a minute.
> So why not introduce an option in the domain config that makes the
> connection checking configurable? Then every project can choose
> the appropriate settings.

All of these look like good ideas IMO.

Remy




Re: Performance issues

Posted by Michael Smith <ms...@speedlegal.com>.
Martin Holz wrote:
> "Sven Steiniger" <Sv...@newtron.net> writes:
> 
> 
>>The optimization of JDBCDescriptorsStore is also simple and
>>concerns the isConnected() method. I have read a lot of discussions
>>on this topic, but maybe it is worth reopening it.
>>This method causes a major slowdown because it executes a
>>statement for every single operation.
>>I understand that it is meant to ensure 100% data integrity,
>>but in a normal environment it is very unlikely that the database
>>connection is lost. So in general it should be safe enough to check
>>the connection about once a minute.
>>So why not introduce an option in the domain config that makes the
>>connection checking configurable? Then every project can choose
>>the appropriate settings.
> 
> 
> J2EEStore uses a much faster isConnected method. At FCH we use a
> subclass of JDBCDescriptorsStore, which overrides isConnected
> with J2EEStore's implementation. We found the original store
> too slow to be usable.

Right. It's known that this is slow and (as you say) broken anyway. It's 
there as a quick hack that makes it work reasonably reliably in 
practice. Someone who actually knows how to use JDBC effectively really 
needs to look at this, though the better route is probably to dump the 
existing JDBC stores and adapt the proposed ones that were sent to this 
list a while ago. They're much more efficient, but last time I looked 
they didn't actually work right in a few cases. Once fixed up, they'd 
be much better.

Mike




Re: Performance issues

Posted by Martin Holz <ho...@fiz-chemie.de>.
"Sven Steiniger" <Sv...@newtron.net> writes:

> The optimization of JDBCDescriptorsStore is also simple and
> concerns the isConnected() method. I have read a lot of discussions
> on this topic, but maybe it is worth reopening it.
> This method causes a major slowdown because it executes a
> statement for every single operation.
> I understand that it is meant to ensure 100% data integrity,
> but in a normal environment it is very unlikely that the database
> connection is lost. So in general it should be safe enough to check
> the connection about once a minute.
> So why not introduce an option in the domain config that makes the
> connection checking configurable? Then every project can choose
> the appropriate settings.

J2EEStore uses a much faster isConnected method. At FCH we use a
subclass of JDBCDescriptorsStore, which overrides isConnected
with J2EEStore's implementation. We found the original store
too slow to be usable.

How would the isConnected() method help with data integrity?
The connection could still be lost between a call
to isConnected and the next database statement. In that
case the database would roll back the open transaction, so data
integrity should be ensured anyway.
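
A minimal sketch of the alternative this points to: skip the
up-front check and handle the failure where it actually happens.
All names here (RetryOnConnectionLoss, SqlWork, Reconnector) are
hypothetical and not part of Slide. The sketch also assumes that
the work passed in is a complete transaction, so that running it
a second time after the database has rolled back is safe.

```java
import java.sql.SQLException;

/** Hypothetical helper, not part of Slide: retries a unit of work once after reconnecting. */
final class RetryOnConnectionLoss {

    /** A complete unit of database work (assumed to be one whole transaction). */
    public interface SqlWork<T> {
        T run() throws SQLException;
    }

    /** Re-establishes the database connection. */
    public interface Reconnector {
        void reconnect() throws SQLException;
    }

    private RetryOnConnectionLoss() {
    }

    /**
     * Runs the work; if it fails with an SQLException, reconnects and runs it
     * one more time. A second failure is propagated to the caller.
     */
    static <T> T execute(SqlWork<T> work, Reconnector reconnector) throws SQLException {
        try {
            return work.run();
        } catch (SQLException firstFailure) {
            // Simplification: every SQLException is treated as a possible lost
            // connection. Real code would inspect the SQL state or error code.
            reconnector.reconnect();
            return work.run();
        }
    }
}
```

The common path then costs no extra statement at all; the price is
paid only when a statement really fails.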

The database test statement "select 1 from objects where uri is null"
may not be as cheap as it looks. I got the impression that
PostgreSQL 7.2 does a complete sequential scan on the 'objects' table.
It looks like the query optimizer is not foolproof.
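
One way to avoid touching the 'objects' table at all, sketched under
the assumption of a JDBC 4.0 (Java 6) or later driver:
java.sql.Connection.isValid(int) leaves it to the driver to decide
how to probe the connection. ConnectionProbe and isAlive are
hypothetical names, not Slide code.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical helper, not part of Slide: probes a connection without querying a table. */
final class ConnectionProbe {

    private ConnectionProbe() {
    }

    /**
     * Returns true if the driver reports the connection as usable within the
     * given timeout (in seconds), false otherwise.
     */
    static boolean isAlive(Connection connection, int timeoutSeconds) {
        if (connection == null) {
            return false;
        }
        try {
            return connection.isValid(timeoutSeconds);
        } catch (SQLException e) {
            // Per the JDBC API, isValid throws only for a negative timeout;
            // this sketch reports that case as "not alive".
            return false;
        }
    }
}
```

Where isValid is not available, a table-free statement such as
"select 1" is accepted by PostgreSQL and avoids the scan, though it
is not portable to every database.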

Martin

  
 

