Posted to users@jackrabbit.apache.org by Nilay Parmar <ni...@cybage.com> on 2013/12/03 08:26:01 UTC

RE: Standalone JR takes long time to create indexes from Scratch

Though it is delayed, thank you for your reply, Marek.

Following up on that, can you (or anyone) share some ideas about indexing only some of the documents in a datastore (those uploaded in the last month)?
For example, I want to index only the documents uploaded after 20th Nov.

Please share your thoughts or experience.

Thanks and regards,
Nilay Parmar |Sr. Software Engineer
Cybage Software Pvt. Ltd. (An SEI-CMMI Level 5 assessed & ISO 27001 Company), India
Phone(O):91-79-66737000, Ext: 5336
Fax:91-79-66737001


-----Original Message-----
From: Marek Slama [mailto:mslama@email.cz] 
Sent: Saturday, November 23, 2013 3:04 PM
To: users@jackrabbit.apache.org
Subject: Re: Standalone JR takes long time to create indexes from Scratch

It seems JR also indexes the documents themselves, i.e. for full-text search.
Do you need this? For comparison, we have a repo of about 30GB and reindexing
takes about 2hrs. If you really need to reindex the documents as well, you can
investigate whether it is possible to improve IO somehow. If the datastore is
in a DB, you can try clustering the DB. If it is on the fs, then some RAID.
(You can also try increasing the bundle cache, but I am not sure it will help,
as I assume JR loads document content just once for indexing.)

Marek

"Hi All,

We have a client which is a UK-based legal firm. As part of their business,
they have a very large set of documents to manage, and it keeps growing. To
manage those documents, we use Apache Jackrabbit.

Now, as part of project maintenance and performance improvement, we
recommended upgrading Jackrabbit from 2.2.7 to 2.6.0. The Jackrabbit
upgrade was successful, but they still use the old indexes generated by
version 2.2.7.

The datastore size is approx. 700GB.
The problem is that Jackrabbit takes 7-8 days to complete the indexing
process. Until the indexing process has finished, we cannot use it. We
cannot afford to shut down production for that many days.
I have enabled debug-level logging in Jackrabbit and observed a lot of
occurrences of the following entries in the log file.

- DEBUG [WrapperSimpleAppMain] AbstractBundlePersistenceManager.java:765 Loading bundle e7abce77-578e-461a-8b9d-59ee4dfe5480

- DEBUG [WrapperSimpleAppMain] SessionState.java:213 Performing item.getPath()

- DEBUG [WrapperSimpleAppMain] SessionState.java:229 Performed item.getPath() in 63696us

- DEBUG [1562158685@qtp-875788435-2] SessionState.java:213 Performing node.getName()

- DEBUG [1562158685@qtp-875788435-2] SessionState.java:229 Performed node.getName() in 20114us

- DEBUG [1562158685@qtp-875788435-2] SessionState.java:213 Performing node.getProperty({internal}principalName)

- DEBUG [1562158685@qtp-875788435-2] SessionState.java:229 Performed node.getProperty({internal}principalName) in 39112us

Please advise: what are all these activities? What is their purpose? What
happens if we skip them, and how can we skip them?
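
One way to get a feel for these entries: the "Performed ... in Nus" lines look like per-call timings, with the "us" suffix most likely meaning microseconds. The sketch below is not part of Jackrabbit. It is a small, hypothetical helper (the names `PERFORMED` and `summarize` are made up for illustration) that totals the time per operation. It assumes each entry sits on a single line in the actual log file.

```python
import re
from collections import defaultdict

# Matches "Performed <operation> in <N>us" at the end of a log line.
# The line layout is assumed from the sample entries quoted above.
PERFORMED = re.compile(r"Performed (?P<op>.+?) in (?P<us>\d+)us\s*$")


def summarize(lines):
    """Return {operation: (call_count, total_microseconds)}."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = PERFORMED.search(line)
        if match:
            entry = totals[match.group("op")]
            entry[0] += 1
            entry[1] += int(match.group("us"))
    return {op: (count, total) for op, (count, total) in totals.items()}


sample = [
    "- DEBUG [WrapperSimpleAppMain] SessionState.java:213 Performing item.getPath()",
    "- DEBUG [WrapperSimpleAppMain] SessionState.java:229 Performed item.getPath() in 63696us",
    "- DEBUG [1562158685@qtp-875788435-2] SessionState.java:229 Performed node.getName() in 20114us",
    "- DEBUG [1562158685@qtp-875788435-2] SessionState.java:229 Performed node.getProperty({internal}principalName) in 39112us",
]

# Print operations ordered by total time, slowest first.
by_total = sorted(summarize(sample).items(), key=lambda kv: -kv[1][1])
for op, (count, total_us) in by_total:
    print(f"{op}: {count} call(s), {total_us / 1000:.1f} ms total")
```

Run against the full debug log (for example `summarize(open(path))`, with `path` pointing at your log file), this would show which session operations account for most of the logged time.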

Thanks and regards,
Nilay



"Legal Disclaimer: This electronic message and all contents contain 
information from Cybage Software Private Limited which may be privileged, 
confidential, or otherwise protected from disclosure. The information is 
intended to be for the addressee(s) only. If you are not an addressee, any 
disclosure, copy, distribution, or use of the contents of this message is 
strictly prohibited. If you have received this electronic message in error 
please notify the sender by reply e-mail to and destroy the original message
and all copies. Cybage has taken every reasonable precaution to minimize the
risk of malicious content in the mail, but is not liable for any damage you 
may sustain as a result of any malicious content in this e-mail. You should 
carry out your own malicious content checks before opening the e-mail or 
attachment." 
www.cybage.com"