Posted to users@jackrabbit.apache.org by Christoph Kiehl <ki...@subshell.com> on 2006/04/28 17:11:54 UTC

Scalability and performance

Hi,

we are thinking about using Jackrabbit as the content repository for a CMS we are 
going to develop for a customer.

The current system, which has to be replaced, stores all content in an Oracle DB. 
The hierarchical structure of the content (properties, nodes) is normalized into 
separate tables for each property type (string, number, binary) plus one table 
for node references. This structure requires expensive queries (joins) to 
aggregate all the data needed to display a node.
The data structure used by Jackrabbit (or JSR 170) seems more appropriate for 
storing this type of content, because I can simply look up a node by UUID and 
then access the complete data structure directly (this might depend on the 
persistence manager used). Additionally, queries by properties should be much 
faster using Jackrabbit's integrated Lucene index.
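To make the two access patterns concrete, here is a rough sketch of what I have in mind, using the standard JCR API. The credentials, the UUID, and the "title" property are placeholders; the repository setup (TransientRepository) is just one possible configuration:

```java
import javax.jcr.Node;
import javax.jcr.NodeIterator;
import javax.jcr.Repository;
import javax.jcr.Session;
import javax.jcr.SimpleCredentials;
import javax.jcr.query.Query;
import javax.jcr.query.QueryResult;
import org.apache.jackrabbit.core.TransientRepository;

public class RepoAccess {
    public static void main(String[] args) throws Exception {
        // Placeholder setup; a real deployment would configure the
        // repository (and its persistence manager) explicitly.
        Repository repository = new TransientRepository();
        Session session = repository.login(
                new SimpleCredentials("admin", "admin".toCharArray()));
        try {
            // Direct access by UUID: a single call instead of
            // joining several property tables.
            Node node = session.getNodeByUUID("some-uuid"); // placeholder UUID
            System.out.println(node.getPath());

            // Property query, answered from the integrated Lucene index.
            Query query = session.getWorkspace().getQueryManager().createQuery(
                    "//element(*, nt:unstructured)[@title = 'foo']",
                    Query.XPATH);
            QueryResult result = query.execute();
            for (NodeIterator it = result.getNodes(); it.hasNext();) {
                System.out.println(it.nextNode().getPath());
            }
        } finally {
            session.logout();
        }
    }
}
```

If this is roughly how it works in practice, the join-heavy aggregation step of the old system would disappear entirely.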

My only concern is how scalable Jackrabbit is. The system currently holds about 
2.5 million nodes and is constantly growing, with another 500,000 nodes expected 
per year. For now I would use an Oracle DB as storage. Does anyone here have 
experience with Jackrabbit repositories this large, and if so, which storage 
backend do you use?

Thanks,
Christoph