Posted to dev@jackrabbit.apache.org by David Nuescheler <da...@gmail.com> on 2004/10/02 15:06:22 UTC

search index corruption?

hi guys,

i was looking into doing some very simple benchmarks to see 
how fast jackrabbit can create nodes.

my test script:
--------
import org.apache.jackrabbit.core.RepositoryFactory;
import javax.jcr.*;

public class PerfTest {

    public static void main (String[] args) {
        try {
            RepositoryFactory repof = RepositoryFactory.create(args[0]);
            Repository repo=repof.getRepository("localfs");
            Session session=repo.login(
                new SimpleCredentials("uncled","".toCharArray()),"default");
            Node root=session.getRootNode();
            System.out.println(root.getProperty("jcr:primaryType").getString());
            if (root.hasNode("perftest")) {
                root.remove("perftest");
            }
            Node testroot=root.addNode("perftest", "nt:unstructured");
            root.save();
            long start=System.currentTimeMillis();
            int i=0;
            while (i<10) {
                Node testnode=testroot.addNode("test"+i,"nt:unstructured");
                System.out.println(testnode.getPath());
                i++;
            }
            testroot.save();
            long done=System.currentTimeMillis();
            System.out.println("time:"+(done-start)+"ms");
            session.logout();
        } catch (RepositoryException ex) {
            System.err.println(ex.toString());
        }
    }
}

----

if i clear (rm -rf) my "repositories" directory and start this test 
for the first time everything seems to work. if i start the same 
test repeatedly i get the following exception:

02.10.2004 14:53:54 *ERROR* [main] SearchManager: error indexing node.
(SearchManager.java, line 206)
java.io.IOException: Lock obtain timed out
	at org.apache.lucene.store.Lock.obtain(Lock.java:97)
	at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:173)
	at org.apache.jackrabbit.core.search.lucene.AbstractIndex.getIndexWriter(AbstractIndex.java:101)

am i missing something? probably ;)

regards,
david
----------------------------------------------------------------------
standardize your content-repository !
                               http://www.jcp.org/en/jsr/detail?id=170
---------------------------------------< david.nuescheler@day.com >---

This message is a private communication. If you are not the intended
recipient, please do not read, copy, or use it, and do not disclose it
to others. Please notify the sender of the delivery error by replying
to this message, and then delete it from your system. Thank you.

The sender does not assume any liability for timely, trouble free,
complete, virus free, secure, error free or uninterrupted arrival of
this e-mail. For verification please request a hard copy version.


mailto:david.nuescheler@day.com
http://www.day.com

David Nuescheler
Chief Technology Officer
Day Software AG
Barfuesserplatz 6 / Postfach
4001 Basel
Switzerland

T  41 61 226 98 98
F  41 61 226 98 97

Re: search index corruption?

Posted by David Nuescheler <da...@gmail.com>.
hi paul,

thanks a lot for the post.

> ... or rather less elegantly by adding a 
> 'shutdown-hook' to the JVM using 
> Runtime.addShutdownHook(Thread).
> Using the 'finally' block would be the better way, and 
> would ensure that the repository is properly shutdown 
> regardless of what happens in the main loop of your test. 
> Even an OutOfMemoryError would trigger the repository being 
> shutdown -- that said, with an OOME, all bets are 
> off when it comes to running any code at all ;) The only time 
> this wouldn't shut the repository down properly is if the JVM 
> itself crashed -- does happen, but not very often, I hope! 
> Hopefully this will mean that the repository is not shutdown 
> properly so rarely that you don't /need/ (at least in the short-term) 
> to have an automatically recovering repository index.
i agree that there is no short-term need for the index to
recover, and certainly the "finally" or the "jvm shutdown hook"
would both have solved my issue easily (or even, as marcel
suggested, just having repof.shutdown() outside the try/catch).

unfortunately, i have recently been burnt by vm's that crash or
hang, and by applications even more so. one application
deadlock or endless loop and a trigger-happy sysadmin stopping
your jvm with kill -9 is enough.

so i am just generally suspicious towards "infrastructure" like a 
repository that can get into an inconsistent state just because
it has been abnormally terminated.
especially if that "infrastructure" runs in the same jvm as 
the application and therefore restarting the jvm might happen 
very frequently and under unexpected circumstances.

anyhow, no immediate action necessary anyway ;)

regards,
david

Re: search index corruption?

Posted by Paul Russell <pr...@apache.org>.
On 3 Oct 2004, at 08:59, David Nuescheler wrote:
> is there a way to restart the index gracefully
> after abnormal termination, since something tells me that
> abnormal termination might happen quite frequently ;)

I'm not quite answering the question you're asking here, because I 
think it may be better to avoid the index not being shut down 
properly in the first place. I could well be teaching grandma to suck 
eggs here, so apologies for that, but for the purposes of your 
test, you could substantially increase the chances of the 
shutdown method being executed either by putting the repository 
shutdown method call in a finally block:

         // Initialise to null so the finally block compiles and can test it.
         RepositoryFactory repof = null;
         try {
             repof = RepositoryFactory.create(args[0]);
             Repository repo=repof.getRepository("localfs");
             Session session=repo.login(
                 new SimpleCredentials("uncled","".toCharArray()),"default");
             Node root=session.getRootNode();
             System.out.println(root.getProperty("jcr:primaryType").getString());
             if (root.hasNode("perftest")) {
                 root.remove("perftest");
             }
             Node testroot=root.addNode("perftest", "nt:unstructured");
             root.save();
             long start=System.currentTimeMillis();
             int i=0;
             while (i<10) {
                 Node testnode=testroot.addNode("test"+i,"nt:unstructured");
                 System.out.println(testnode.getPath());
                 i++;
             }
             testroot.save();
             long done=System.currentTimeMillis();
             System.out.println("time:"+(done-start)+"ms");
             session.logout();
         } catch (RepositoryException ex) {
             System.err.println(ex.toString());
         } finally {
             // Runs on normal completion and on any exception.
             if ( repof != null ) { repof.shutdown(); }
         }

... or rather less elegantly by adding a 'shutdown-hook' to the JVM 
using Runtime.addShutdownHook(Thread).

Using the 'finally' block would be the better way, and would ensure 
that the repository is properly shutdown regardless of what happens in 
the main loop of your test. Even an OutOfMemoryError would trigger the 
repository being shutdown -- that said, with an OOME, all bets are off 
when it comes to running any code at all ;) The only time this wouldn't 
shut the repository down properly is if the JVM itself crashed -- does 
happen, but not very often, I hope! Hopefully this will mean that the 
repository is not shutdown properly so rarely that you don't /need/ (at 
least in the short-term) to have an automatically recovering repository 
index.
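
For comparison, here is a minimal sketch of the shutdown-hook variant. The class FakeRepositoryFactory is a hypothetical stand-in (it is not part of Jackrabbit) so that the example runs on its own; only Runtime.addShutdownHook(Thread) is a real API here. The stand-in's shutdown() is assumed to be idempotent, so calling it from both a finally block and the hook would be harmless.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class ShutdownHookSketch {

    /** Hypothetical stand-in for the repository factory; only shutdown() matters here. */
    static class FakeRepositoryFactory {
        private final AtomicBoolean shutDown = new AtomicBoolean(false);

        /** Idempotent: only the first call does any work. */
        void shutdown() {
            if (shutDown.compareAndSet(false, true)) {
                System.out.println("repository shut down");
            }
        }

        boolean isShutDown() {
            return shutDown.get();
        }
    }

    /** Builds (but does not register) the hook thread, so it can be exercised directly. */
    static Thread newShutdownHook(FakeRepositoryFactory repof) {
        return new Thread(repof::shutdown, "repository-shutdown");
    }

    public static void main(String[] args) {
        FakeRepositoryFactory repof = new FakeRepositoryFactory();
        Runtime.getRuntime().addShutdownHook(newShutdownHook(repof));
        System.out.println("doing work ...");
        // No explicit shutdown() call: the hook runs when the JVM exits normally
        // or is interrupted (e.g. Ctrl-C), but not on kill -9 or a JVM crash.
    }
}
```

As with the finally block, this only narrows the window: a hard kill still skips the hook.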

Hope that helps, and that I'm not wasting your time by telling you 
something you already knew!

Cheers,


Paul
-- 
Paul Russell
E-mail: prussell@apache.org
iChat/AIM: russelp@mac.com

Re: search index corruption?

Posted by David Nuescheler <da...@gmail.com>.
> you have to add the following line at the end to your test:
> repof.shutdown();

i thought it would be something like that.
thanks. 

is there a way to restart the index gracefully
after abnormal termination, since something tells me that
abnormal termination might happen quite frequently ;)

regards,
david

Re: search index corruption?

Posted by Marcel Reutegger <ma...@gmail.com>.
The search is currently implemented in a quite basic fashion,
which means that if the repository is not shut down properly, lucene might leave
a lock file in your temp folder.
that's why the lock timeout exception occurs when you start your test
the second time. lucene assumes that another process is still using the index
and then gives up after some time.
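
The effect can be reproduced with a simplified simulation of a file-based lock. This is only a sketch of the general idea and not Lucene's actual code: the class LockFileSketch, its method names, the file name and the timeout values are all made up for illustration. The first "run" creates the lock file and never releases it; the second "run" polls for a while and then gives up.

```java
import java.io.File;
import java.io.IOException;

class LockFileSketch {

    /** Tries to create the lock file, polling until timeoutMs has elapsed. */
    static void obtain(File lockFile, long timeoutMs, long pollMs)
            throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        // createNewFile() atomically creates the file and returns false if it already exists.
        while (!lockFile.createNewFile()) {
            if (System.currentTimeMillis() >= deadline) {
                throw new IOException("Lock obtain timed out: " + lockFile);
            }
            Thread.sleep(pollMs);
        }
    }

    /** Releases the lock by deleting the file; this is what a clean shutdown would do. */
    static void release(File lockFile) {
        lockFile.delete();
    }

    public static void main(String[] args) throws Exception {
        File lock = new File(System.getProperty("java.io.tmpdir"),
                "sketch-" + System.nanoTime() + ".lck");
        obtain(lock, 200, 50);      // first run: succeeds
        // Simulated abnormal exit: release(lock) is never called.
        try {
            obtain(lock, 200, 50);  // second run: the leftover file blocks us
        } catch (IOException e) {
            System.out.println(e.getMessage());
        } finally {
            release(lock);          // clean up after the demo
        }
    }
}
```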

you have to add the following line at the end to your test:

repof.shutdown();

this will shut down the repository properly. 

On a windows machine the lock file is usually created in:
Documents and Settings/<userid>/Local Settings/Temp/luceneXXXX.lck
 
after you delete the file the repository will start again.
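
If you would rather not hunt for the file by hand, the manual step can be sketched in a few lines. This is a hypothetical helper and not part of jackrabbit or lucene: the class StaleLockCleanup and the "--delete" argument are made up, and the glob "lucene*.lck" is simply taken from the file name pattern above. It assumes no other process is really using the index, since it cannot tell a stale lock from a live one.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class StaleLockCleanup {

    /** Lists regular files in dir whose names match lucene*.lck. */
    static List<Path> findLuceneLocks(Path dir) throws IOException {
        List<Path> found = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "lucene*.lck")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    found.add(p);
                }
            }
        }
        return found;
    }

    /** Deletes the given files and returns how many were actually removed. */
    static int deleteAll(List<Path> locks) throws IOException {
        int deleted = 0;
        for (Path p : locks) {
            if (Files.deleteIfExists(p)) {
                deleted++;
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the JVM temp directory; only deletes when "--delete" is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        boolean delete = args.length > 1 && args[1].equals("--delete");
        List<Path> locks = findLuceneLocks(dir);
        for (Path p : locks) {
            System.out.println((delete ? "deleting " : "found ") + p);
        }
        if (delete) {
            deleteAll(locks);
        }
    }
}
```

Run it without "--delete" first to see what it would touch.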

I will change the location where the lock file is created. it probably 
makes sense to create it in the index directory itself...

regards,
 marcel

-- 
-----------------------------------------< marcel.reutegger@gmail.com >---
You can have it good, cheap, or fast. Any two. --Arthur C. Clarke
-----------------------------------------------< http://www.day.com >---