Posted to users@jackrabbit.apache.org by Bruce Li <bl...@tirawireless.com> on 2007/06/06 18:29:04 UTC

Jackrabbit performance issue

 

 

Hi all,

I am using Jackrabbit v1.3 and have run into a performance issue.

The problem is that when I visit object nodes (for example, "Company"), it
takes more than 25 seconds to visit all the "Company" nodes.

 

The persistence manager is Oracle DB:

<PersistenceManager
class="org.apache.jackrabbit.core.state.db.OraclePersistenceManager">

The web container is Tomcat v5.5.15.

 

For the performance test, I list the following facts:

1. In my repository there are 339 "Company" object instances, organized as
nodes under "/root/repo/company-note/". Each field of "Company" is stored
as a property or a child node.

2. When I export data from "/root/repo/company-note/", the XML output for
"Company" is 1513 KB.

3. When I export data from "/", the XML output is 111,674 KB (I think that
is all the data in my repository).

4. The "Company" node is "mix:referenceable".

 

The CacheManager always does a "resize" when I am doing a "search" or any
other operation.

The console output is as follows:

 

..

177888 [http-8080-Processor18] INFO
org.apache.jackrabbit.core.state.CacheManager  - resizeAll size=8

178917 [http-8080-Processor18] INFO
org.apache.jackrabbit.core.state.CacheManager  - resizeAll size=8

179947 [http-8080-Processor18] INFO
org.apache.jackrabbit.core.state.CacheManager  - resizeAll size=8

180961 [http-8080-Processor18] INFO
org.apache.jackrabbit.core.state.CacheManager  - resizeAll size=10

...
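
For reference, the gaps between the resizeAll lines above can be worked out
with a small sketch like the one below. It assumes the leading number on each
log line is a millisecond counter, which the log itself does not state, and
the class and method names are made up for illustration.

```java
import java.util.Arrays;

final class LogGaps {

    // Returns the differences between consecutive timestamps.
    static long[] gaps(long[] timestamps) {
        long[] result = new long[Math.max(0, timestamps.length - 1)];
        for (int i = 1; i < timestamps.length; i++) {
            result[i - 1] = timestamps[i] - timestamps[i - 1];
        }
        return result;
    }

    public static void main(String[] args) {
        // Leading numbers copied from the four resizeAll lines quoted above
        // (assumed to be milliseconds).
        long[] stamps = {177888L, 178917L, 179947L, 180961L};
        System.out.println(Arrays.toString(gaps(stamps)));
    }
}
```

For the four lines quoted, the gaps are 1029, 1030 and 1014, so the messages
appear about one second apart. That may mean the resize runs on a regular
interval rather than once per operation, but this is only an inference from
the timestamps.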

 

The configuration in repository.xml uses the default values, as follows:

 

  <param name="useCompoundFile" value="true" /> 

  <param name="minMergeDocs" value="100" /> 

  <param name="volatileIdleTime" value="3" /> 

  <param name="maxMergeDocs" value="100000" /> 

  <param name="mergeFactor" value="10" /> 

  <param name="bufferSize" value="10" /> 

  <param name="cacheSize" value="1000" /> 

  <param name="forceConsistencyCheck" value="false" /> 

  <param name="autoRepair" value="true" /> 

  <param name="analyzer"
value="org.apache.lucene.analysis.standard.StandardAnalyzer" /> 

  <param name="queryClass"
value="org.apache.jackrabbit.core.query.QueryImpl" /> 

  <param name="idleTime" value="-1" /> 

 

 

This performance issue is critical for me. However, I have not seen other
people report a performance issue like this, so I am wondering what is
wrong with my configuration.

 

Thanks,

Bruce Li

t: 416.642.8472 ext. 241
f: 416.932.6299 

 


RE: Jackrabbit performance issue

Posted by Bruce Li <bl...@tirawireless.com>.
Hi Jukka,

Thank you very much for your feedback.

1) The DB latency applies to all applications, including the repository.
Other applications that run a lot of transactions show no performance
issues.

2) The performance test uses a local repository server. The "repository"
and "session" are constructed by the Spring Framework. The real repository
server runs in a Tomcat container, and all access to the repository goes
through a web service.

3) This "search" is not done with a query but by visiting every "Company"
node using "getNode".

I would like to share some of the "Company" code for further performance
digging:

public class Company {
	
	private String repositoryId;
	private int usage;
	private int totalVotes;
	private double ranking;
	private int type;
...
	
	private CompanyProfile[] applications;
	private WebReference[] webReferences;
	private CompanyProperty[] companyProperties;
...
}


All fields of a "Company" instance that have a built-in type are persisted
to the repository by a recursive method using "java.lang.reflect".

{
...
	String nodePath =
		itemNode.getCorrespondingNodePath(session.getWorkspace().getName());

	// Get the class name by naming convention
	String modelName = getNodeName(getSuffixName(nodePath, ":"));
	String itemClassName = getClassName(modelName);

	Class itemClass = Class.forName(itemClassName);
	itemObj = itemClass.newInstance();

	PropertyIterator pi = itemNode.getProperties();
	while (pi.hasNext())
	{
		Property property = pi.nextProperty();
		// recursive method to populate data from repository property
		populateItemField(itemNode.getUUID(), itemObj, property);
	}
}

When "Property property = pi.nextProperty()" executes, it always triggers
a CacheManager resize, which costs time.

I am using JDK 1.5.

Thanks again for your help.

Bruce Li







-----Original Message-----
From: Jukka Zitting [mailto:jukka.zitting@gmail.com] 
Sent: Saturday, June 09, 2007 4:23 AM
To: users@jackrabbit.apache.org
Subject: Re: Jackrabbit performance issue

Hi,

On 6/6/07, Bruce Li <bl...@tirawireless.com> wrote:
> I am using Jackrabbit v1.3 and I got the performance issue.
>
> The problem is when I visit object notes (for example, "Company"), it
> costs more than 25 second to visit all "Company" notes.

Could you share an example of the code you are using to access the
company nodes?

Some common performance pitfalls that could affect your setup are:

1) Database latency. Check that the network latency between Jackrabbit
and the backend database is small. Jackrabbit typically makes a number
of database calls, and the latencies of individual calls can easily add
up since at the moment Jackrabbit doesn't perform those calls in
parallel.

2) RMI access. Are you using the RMI layer to access the repository?
RMI access performance is also highly sensitive to network latency and
there are some inherent design choices that make the RMI layer perform
not that well.

3) Large search result sets. Are you using a query to access all the
company nodes? When you are simply iterating all nodes in a subtree
you get noticeably better performance if you simply traverse the tree
with Node.getNodes() instead of using a query. Query performance is
best when you are targeting a small subset of a large subtree or an
entire workspace.

BR,

Jukka Zitting

Re: Jackrabbit performance issue

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On 6/6/07, Bruce Li <bl...@tirawireless.com> wrote:
> I am using Jackrabbit v1.3 and I got the performance issue.
>
> The problem is when I visit object notes (for example, "Company"), it
> costs more than 25 second to visit all "Company" notes.

Could you share an example of the code you are using to access the
company nodes?

Some common performance pitfalls that could affect your setup are:

1) Database latency. Check that the network latency between Jackrabbit
and the backend database is small. Jackrabbit typically makes a number
of database calls, and the latencies of individual calls can easily add
up since at the moment Jackrabbit doesn't perform those calls in
parallel.

2) RMI access. Are you using the RMI layer to access the repository?
RMI access performance is also highly sensitive to network latency and
there are some inherent design choices that make the RMI layer perform
not that well.

3) Large search result sets. Are you using a query to access all the
company nodes? When you are simply iterating all nodes in a subtree
you get noticeably better performance if you simply traverse the tree
with Node.getNodes() instead of using a query. Query performance is
best when you are targeting a small subset of a large subtree or an
entire workspace.

BR,

Jukka Zitting
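
To put the database latency point in numbers, here is a rough sketch that
turns the figures reported earlier in the thread (more than 25 seconds for
339 "Company" nodes) into a per-node time budget, and then asks how many
sequential round trips would fit into that budget at a given latency. The
2 ms latency is a purely hypothetical value, and the class and method names
are invented for this example.

```java
final class LatencyBudget {

    // Average time spent per item when totalMillis covers itemCount items.
    static double millisPerItem(double totalMillis, int itemCount) {
        if (itemCount <= 0) {
            throw new IllegalArgumentException("itemCount must be positive");
        }
        return totalMillis / itemCount;
    }

    // Number of strictly sequential calls that fit into the budget
    // when each call costs latencyMillis.
    static long sequentialCallsThatFit(double budgetMillis, double latencyMillis) {
        if (latencyMillis <= 0) {
            throw new IllegalArgumentException("latencyMillis must be positive");
        }
        return (long) Math.floor(budgetMillis / latencyMillis);
    }

    public static void main(String[] args) {
        // 25 seconds and 339 nodes are the figures reported in the thread.
        double perCompany = millisPerItem(25000.0, 339);
        // 2.0 ms per round trip is a hypothetical latency, not a measurement.
        long calls = sequentialCallsThatFit(perCompany, 2.0);
        System.out.printf("%.1f ms per node, about %d sequential calls at 2 ms%n",
                perCompany, calls);
    }
}
```

Under these assumptions each "Company" node accounts for roughly 74 ms, so a
few dozen sequential database calls per node would be enough to explain the
total. Measuring the actual round-trip time to the Oracle instance would show
whether this is a plausible explanation.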