You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@jackrabbit.apache.org by Michael Neale <mi...@gmail.com> on 2007/04/24 02:18:29 UTC

Index/data corruption

Hi All, kind of related to another issue I am working on (
https://issues.apache.org/jira/browse/JCR-778 )

When I get an index failure (as above), often afterwards I get:

javax.jcr.RepositoryException: already contains: _c: already contains: _c:
already contains: _c
    at org.apache.jackrabbit.core.SearchManager.initializeQueryHandler(
SearchManager.java:515)
    at org.apache.jackrabbit.core.SearchManager.<init>(SearchManager.java
:250)
    at org.apache.jackrabbit.core.RepositoryImpl.getSystemSearchManager(
RepositoryImpl.java:642)
    at org.apache.jackrabbit.core.RepositoryImpl.access$300(
RepositoryImpl.java:103)
    at
org.apache.jackrabbit.core.RepositoryImpl$WorkspaceInfo.getSearchManager(
RepositoryImpl.java:1638)
    at org.apache.jackrabbit.core.RepositoryImpl.initWorkspace(
RepositoryImpl.java:622)
    at org.apache.jackrabbit.core.RepositoryImpl.initStartupWorkspaces(
RepositoryImpl.java:375)
    at org.apache.jackrabbit.core.RepositoryImpl.<init>(RepositoryImpl.java
:293)
    at org.apache.jackrabbit.core.RepositoryImpl.create(RepositoryImpl.java
:573)
    at org.apache.jackrabbit.core.TransientRepository$2.getRepository(
TransientRepository.java:245)
    at org.apache.jackrabbit.core.TransientRepository.startRepository(
TransientRepository.java:265)
    at org.apache.jackrabbit.core.TransientRepository.login(
TransientRepository.java:333)
    at org.apache.jackrabbit.core.TransientRepository.login(
TransientRepository.java:363)
    at org.drools.repository.RulesRepositoryTest.getNoDelRepo(
RulesRepositoryTest.java:99)
    at org.drools.repository.RulesRepositoryTest.testLoadOnly(
RulesRepositoryTest.java:22)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(
NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(
DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at junit.framework.TestCase.runTest(TestCase.java:154)
    at junit.framework.TestCase.runBare(TestCase.java:127)
    at junit.framework.TestResult$1.protect(TestResult.java:106)
    at junit.framework.TestResult.runProtected(TestResult.java:124)
    at junit.framework.TestResult.run(TestResult.java:109)
    at junit.framework.TestCase.run(TestCase.java:118)
    at junit.framework.TestSuite.runTest(TestSuite.java:208)
    at junit.framework.TestSuite.run(TestSuite.java:203)
    at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(
JUnit3TestReference.java:128)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(
TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(
RemoteTestRunner.java:460)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(
RemoteTestRunner.java:673)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(
RemoteTestRunner.java:386)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(
RemoteTestRunner.java:196)
Caused by: java.lang.IllegalArgumentException: already contains: _c
    at org.apache.jackrabbit.core.query.lucene.IndexInfos.addName(
IndexInfos.java:170)
    at org.apache.jackrabbit.core.query.lucene.MultiIndex.deleteIndex(
MultiIndex.java:716)
    at
org.apache.jackrabbit.core.query.lucene.MultiIndex$DeleteIndex.execute(
MultiIndex.java:1553)
    at org.apache.jackrabbit.core.query.lucene.MultiIndex.executeAndLog(
MultiIndex.java:809)
    at org.apache.jackrabbit.core.query.lucene.MultiIndex.flush(
MultiIndex.java:740)
    at org.apache.jackrabbit.core.query.lucene.Recovery.run(Recovery.java
:160)
    at org.apache.jackrabbit.core.query.lucene.Recovery.run(Recovery.java
:85)
    at org.apache.jackrabbit.core.query.lucene.MultiIndex.<init>(
MultiIndex.java:268)
    at org.apache.jackrabbit.core.query.lucene.SearchIndex.doInit(
SearchIndex.java:245)
    at org.apache.jackrabbit.core.query.AbstractQueryHandler.init(
AbstractQueryHandler.java:44)
    at org.apache.jackrabbit.core.SearchManager.initializeQueryHandler(
SearchManager.java:513)
    ... 32 more
java.lang.IllegalArgumentException: already contains: _c
    at org.apache.jackrabbit.core.query.lucene.IndexInfos.addName(
IndexInfos.java:170)
    at org.apache.jackrabbit.core.query.lucene.MultiIndex.deleteIndex(
MultiIndex.java:716)
    at
org.apache.jackrabbit.core.query.lucene.MultiIndex$DeleteIndex.execute(
MultiIndex.java:1553)
    at org.apache.jackrabbit.core.query.lucene.MultiIndex.executeAndLog(
MultiIndex.java:809)
    at org.apache.jackrabbit.core.query.lucene.MultiIndex.flush(
MultiIndex.java:740)
    at org.apache.jackrabbit.core.query.lucene.Recovery.run(Recovery.java
:160)
    at org.apache.jackrabbit.core.query.lucene.Recovery.run(Recovery.java
:85)
    at org.apache.jackrabbit.core.query.lucene.MultiIndex.<init>(
MultiIndex.java:268)
    at org.apache.jackrabbit.core.query.lucene.SearchIndex.doInit(
SearchIndex.java:245)
    at org.apache.jackrabbit.core.query.AbstractQueryHandler.init(
AbstractQueryHandler.java:44)
    at org.apache.jackrabbit.core.SearchManager.initializeQueryHandler(
SearchManager.java:513)
    at org.apache.jackrabbit.core.SearchManager.<init>(SearchManager.java
:250)
    at org.apache.jackrabbit.core.RepositoryImpl.getSystemSearchManager(
RepositoryImpl.java:642)
    at org.apache.jackrabbit.core.RepositoryImpl.access$300(
RepositoryImpl.java:103)
    at
org.apache.jackrabbit.core.RepositoryImpl$WorkspaceInfo.getSearchManager(
RepositoryImpl.java:1638)
    at org.apache.jackrabbit.core.RepositoryImpl.initWorkspace(
RepositoryImpl.java:622)
    at org.apache.jackrabbit.core.RepositoryImpl.initStartupWorkspaces(
RepositoryImpl.java:375)
    at org.apache.jackrabbit.core.RepositoryImpl.<init>(RepositoryImpl.java
:293)
    at org.apache.jackrabbit.core.RepositoryImpl.create(RepositoryImpl.java
:573)
    at org.apache.jackrabbit.core.TransientRepository$2.getRepository(
TransientRepository.java:245)
    at org.apache.jackrabbit.core.TransientRepository.startRepository(
TransientRepository.java:265)
    at org.apache.jackrabbit.core.TransientRepository.login(
TransientRepository.java:333)
    at org.apache.jackrabbit.core.TransientRepository.login(
TransientRepository.java:363)
...

when trying to login to the repository the next time. This is extremely
alarming to me. If I go in and blow away the contents of the index
directories, it will then work for a while after that (once it has recreted)
but it only takes a dozen or so "cycles" (startup, run a query, shutdown) to
get back into this state.

This got me thinking, how are the lucene indexes kept in sync - they can't
be part of the transaction can they? I mean they can try to be. This worries
me, I don't like being worried.

Any advice much appreciated, this is pretty much stopping the show for me
now.

Also, I have uploaded a repository+test to show the "corrption":
http://www.users.on.net/~michaelneale/work/jackrabbit-JCR-778.-corrupt-index-or-data.zip(mentioned
in the JIRA)

Michael

Re: Index/data corruption

Posted by Willis Morse <wi...@mac.com>.
great, thanks.

I'm working on a few canonical test cases like this to see exactly  
what I can and can't get away with. If I find any problems I'll post  
them.

- will


On Apr 25, 2007, at 1:05 PM, Alexandru Popescu ☀ wrote:

> On 4/25/07, Willis Morse <wi...@mac.com> wrote:
>> Is the indexing done asynchronously?
>>
>
> Afaik indexing is not done asynch, so your way of handling the
> repository should be safe.
>
> ./alex
> --
> .w( the_mindstorm )p.
>
>> In my Swing app, I might do something like this:
>>
>>         session.save();
>>         session.logout();
>>         System.exit();
>>
>> Is this safe?
>>
>> If not, how do we programmatically determine when it is safe to exit
>> the app?
>>
>> - Willis Morse
>>
>>
>> On Apr 25, 2007, at 3:49 AM, Marcel Reutegger wrote:
>>
>> > the lucene indexes are updated after a jackrabbit transaction is
>> > committed. if the jvm process is killed during that update, the
>> > index may be ouf of sync. See: http://issues.apache.org/jira/ 
>> browse/
>> > JCR-204
>>
>>


Re: Index/data corruption

Posted by Alexandru Popescu ☀ <th...@gmail.com>.
On 4/25/07, Willis Morse <wi...@mac.com> wrote:
> Is the indexing done asynchronously?
>

Afaik indexing is not done asynch, so your way of handling the
repository should be safe.

./alex
--
.w( the_mindstorm )p.

> In my Swing app, I might do something like this:
>
>         session.save();
>         session.logout();
>         System.exit();
>
> Is this safe?
>
> If not, how do we programmatically determine when it is safe to exit
> the app?
>
> - Willis Morse
>
>
> On Apr 25, 2007, at 3:49 AM, Marcel Reutegger wrote:
>
> > the lucene indexes are updated after a jackrabbit transaction is
> > committed. if the jvm process is killed during that update, the
> > index may be ouf of sync. See: http://issues.apache.org/jira/browse/
> > JCR-204
>
>

Re: Index/data corruption

Posted by Marcel Reutegger <ma...@gmx.net>.
Michael Neale wrote:
> yeah it looks like it is a problem that happens when it is not shutdown
> properly (which mostly happens in my unit tests) - so once I make sure it
> shuts down, its ok.
> 
> but there is a problem with the recovery code not working - but that can be
> corrected by blowing away the indexes.

this is now fixed, see: https://issues.apache.org/jira/browse/JCR-778

please note that the problem during index recovery only occurs in a certain 
situation when an action from the index merger is present in the redo log.

michael, thank you for all your help in resolving this issue.

regards
  marcel

Re: Index/data corruption

Posted by Michael Neale <mi...@gmail.com>.
yeah it looks like it is a problem that happens when it is not shutdown
properly (which mostly happens in my unit tests) - so once I make sure it
shuts down, its ok.

but there is a problem with the recovery code not working - but that can be
corrected by blowing away the indexes.

On 4/26/07, Willis Morse <wi...@mac.com> wrote:
>
> Great, thanks for the info Marcel.
>
> - Will
>
>
> On Apr 25, 2007, at 4:19 PM, Marcel Reutegger wrote:
>
> > Willis Morse wrote:
> >> Is the indexing done asynchronously?
> >
> > no, but it is also not completely synchronously ;)
> >
> > the query handler writes a forward log with the index changes. then
> > the changes are indexed but not immediately written to disk. this
> > is done synchronously. you can configure an idle time for the in
> > memory part of the index to be written to disk.
> >
> > some part of the indexing may be asynchronous, but not by default:
> > text extraction from nt:resource nodes. if you configure the
> > extractorPoolSize parameter in the SearchIndex element.
> >
> > See: http://issues.apache.org/jira/browse/JCR-390
> >
> >> In my Swing app, I might do something like this:
> >>     session.save();
> >>     session.logout();
> >>     System.exit();
> >> Is this safe?
> >
> > this depends on how you instantiated the repository. if you are
> > using a TransientRepository it should be safe because once the
> > session (assuming it is the only session your application uses)
> > logs out the underlying repository is also shutdown.
> >
> > if you started jackrabbit manually you should also shut it down
> > before you exit the jvm. I say should because jackrabbit is able to
> > handle this case and also the derby database. even though this is
> > not the preferred way, because it forces  derby but also the query
> > handler to potentially process a forward log when jackrabbit is
> > started the next time. so, whenever possible you should shut down
> > jackrabbit properly before you exit the jvm (under the assumption
> > that you are using jackrabbit in-process and not via rmi).
> >
> >> If not, how do we programmatically determine when it is safe to
> >> exit the app?
> >
> > Call RepositoryImpl.shutdown() when you are finished using
> > jackrabbit and then exit.
> >
> > regards
> >  marcel
>
>

Re: Index/data corruption

Posted by Willis Morse <wi...@mac.com>.
Great, thanks for the info Marcel.

- Will


On Apr 25, 2007, at 4:19 PM, Marcel Reutegger wrote:

> Willis Morse wrote:
>> Is the indexing done asynchronously?
>
> no, but it is also not completely synchronously ;)
>
> the query handler writes a forward log with the index changes. then  
> the changes are indexed but not immediately written to disk. this  
> is done synchronously. you can configure an idle time for the in  
> memory part of the index to be written to disk.
>
> some part of the indexing may be asynchronous, but not by default:  
> text extraction from nt:resource nodes. if you configure the  
> extractorPoolSize parameter in the SearchIndex element.
>
> See: http://issues.apache.org/jira/browse/JCR-390
>
>> In my Swing app, I might do something like this:
>>     session.save();
>>     session.logout();
>>     System.exit();
>> Is this safe?
>
> this depends on how you instantiated the repository. if you are  
> using a TransientRepository it should be safe because once the  
> session (assuming it is the only session your application uses)  
> logs out the underlying repository is also shutdown.
>
> if you started jackrabbit manually you should also shut it down  
> before you exit the jvm. I say should because jackrabbit is able to  
> handle this case and also the derby database. even though this is  
> not the preferred way, because it forces  derby but also the query  
> handler to potentially process a forward log when jackrabbit is  
> started the next time. so, whenever possible you should shut down  
> jackrabbit properly before you exit the jvm (under the assumption  
> that you are using jackrabbit in-process and not via rmi).
>
>> If not, how do we programmatically determine when it is safe to  
>> exit the app?
>
> Call RepositoryImpl.shutdown() when you are finished using  
> jackrabbit and then exit.
>
> regards
>  marcel


Re: Index/data corruption

Posted by Marcel Reutegger <ma...@gmx.net>.
Willis Morse wrote:
> Is the indexing done asynchronously?

no, but it is also not completely synchronously ;)

the query handler writes a forward log with the index changes. then the changes 
are indexed but not immediately written to disk. this is done synchronously. you 
can configure an idle time for the in memory part of the index to be written to 
disk.

some part of the indexing may be asynchronous, but not by default: text 
extraction from nt:resource nodes. if you configure the extractorPoolSize 
parameter in the SearchIndex element.

See: http://issues.apache.org/jira/browse/JCR-390

> In my Swing app, I might do something like this:
> 
>     session.save();
>     session.logout();
>     System.exit();
> 
> Is this safe?

this depends on how you instantiated the repository. if you are using a 
TransientRepository it should be safe because once the session (assuming it is 
the only session your application uses) logs out the underlying repository is 
also shutdown.

if you started jackrabbit manually you should also shut it down before you exit 
the jvm. I say should because jackrabbit is able to handle this case and also 
the derby database. even though this is not the preferred way, because it forces 
  derby but also the query handler to potentially process a forward log when 
jackrabbit is started the next time. so, whenever possible you should shut down 
jackrabbit properly before you exit the jvm (under the assumption that you are 
using jackrabbit in-process and not via rmi).

> If not, how do we programmatically determine when it is safe to exit the 
> app?

Call RepositoryImpl.shutdown() when you are finished using jackrabbit and then exit.

regards
  marcel

Re: Index/data corruption

Posted by Willis Morse <wi...@mac.com>.
Is the indexing done asynchronously?

In my Swing app, I might do something like this:

	session.save();
	session.logout();
	System.exit();

Is this safe?

If not, how do we programmatically determine when it is safe to exit  
the app?

- Willis Morse


On Apr 25, 2007, at 3:49 AM, Marcel Reutegger wrote:

> the lucene indexes are updated after a jackrabbit transaction is  
> committed. if the jvm process is killed during that update, the  
> index may be ouf of sync. See: http://issues.apache.org/jira/browse/ 
> JCR-204


Re: Index/data corruption

Posted by Marcel Reutegger <ma...@gmx.net>.
Hi Michael,

the stack trace indicate that jackrabbit is not properly shutdown. when 
jackrabbit is started up again the query handler runs recovery code as can be 
seen in the stack trace:

Michael Neale wrote:
>    at org.apache.jackrabbit.core.query.lucene.Recovery.run(Recovery.java
> :160)
>    at org.apache.jackrabbit.core.query.lucene.Recovery.run(Recovery.java
> :85)
>    at org.apache.jackrabbit.core.query.lucene.MultiIndex.<init>(
> MultiIndex.java:268)

the following IllegalArgumentException is probably a bug. I'll create a jira 
issue and fix it.

> java.lang.IllegalArgumentException: already contains: _c
>    at org.apache.jackrabbit.core.query.lucene.IndexInfos.addName(
> IndexInfos.java:170)
>    at org.apache.jackrabbit.core.query.lucene.MultiIndex.deleteIndex(
> MultiIndex.java:716)
>    at
> org.apache.jackrabbit.core.query.lucene.MultiIndex$DeleteIndex.execute(
> MultiIndex.java:1553)
>    at org.apache.jackrabbit.core.query.lucene.MultiIndex.executeAndLog(
> MultiIndex.java:809)
>    at org.apache.jackrabbit.core.query.lucene.MultiIndex.flush(
> MultiIndex.java:740)
>    at org.apache.jackrabbit.core.query.lucene.Recovery.run(Recovery.java
> :160)
[...]
> when trying to login to the repository the next time. This is extremely
> alarming to me. If I go in and blow away the contents of the index
> directories, it will then work for a while after that (once it has 
> recreted)
> but it only takes a dozen or so "cycles" (startup, run a query, 
> shutdown) to
> get back into this state.

Can you please make sure that Jackrabbit is shutdown properly in your cycles and 
check if you still see those exceptions? I could imagine that there's an issue 
in the index recovery code that only comes into play when one repeatedly kills a 
jackrabbit instance and then starts it again.

> This got me thinking, how are the lucene indexes kept in sync - they can't
> be part of the transaction can they? I mean they can try to be. This 
> worries
> me, I don't like being worried.

the lucene indexes are updated after a jackrabbit transaction is committed. if 
the jvm process is killed during that update, the index may be ouf of sync. See: 
http://issues.apache.org/jira/browse/JCR-204

this is a consequence of the decoupled design of the persistence manager and the 
query handler. one solution would be to have a two-phase commit for both changes 
to the persistence manager and the query handler. but this would require a more 
sophisticated interface for the persistence manager.

> Any advice much appreciated, this is pretty much stopping the show for me
> now.

The more details I see from your issue, the more I think it is caused by your 
test setup. Please make sure that you do a clean shutdown in your test cycle.

regards
  marcel