Posted to java-user@lucene.apache.org by Trevor Boicey <tb...@brit.ca> on 2002/08/28 23:40:35 UTC
Is my index corrupt?
I have a typical app that uses Lucene to index web pages. It has been
working fine for a few months.
Lately I've noticed that a lot of the Lucene methods are throwing
exceptions, and it seems to always be on the same document. It is as if
there is a document in my index that is internally broken.
If I call optimize, it throws:
java.lang.ArrayIndexOutOfBoundsException: 110 >= 6
...and I suspect it doesn't actually optimize; more on that below.
I also have an IndexReader loop that goes from 0 to reader.maxDoc() and
looks at one of the fields. It throws the same exception when it
attempts to read document #12367, although it works for documents below
and above that number.
(i.e. Document myDocument = reader.document(i); // throws when i=12367)
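For reference, a minimal sketch of that kind of scan, written so that a
failing read is recorded instead of aborting the loop. DocSource is a
hypothetical stand-in for the three IndexReader calls involved
(maxDoc(), isDeleted(int) and document(int)), and fakeSource is likewise
made up for the demonstration, so that the sketch runs without Lucene on
the classpath. It is not my actual code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class BrokenDocScan {

    /** Hypothetical stand-in for the IndexReader calls used by the scan. */
    interface DocSource {
        int maxDoc();
        boolean isDeleted(int docId);
        Object document(int docId) throws IOException;
    }

    /** Returns the ids of all non-deleted documents that fail to load. */
    static List<Integer> findUnreadable(DocSource source) {
        List<Integer> broken = new ArrayList<>();
        for (int i = 0; i < source.maxDoc(); i++) {
            if (source.isDeleted(i)) {
                continue; // deleted slots are skipped, not reported
            }
            try {
                source.document(i);
            } catch (Exception e) {
                // Covers both IOException ("read past EOF") and runtime
                // failures such as ArrayIndexOutOfBoundsException.
                broken.add(i);
            }
        }
        return broken;
    }

    /** Made-up source: `size` documents, of which `brokenId` throws. */
    static DocSource fakeSource(final int size, final int brokenId) {
        return new DocSource() {
            public int maxDoc() { return size; }
            public boolean isDeleted(int docId) { return false; }
            public Object document(int docId) throws IOException {
                if (docId == brokenId) {
                    throw new ArrayIndexOutOfBoundsException("110 >= 6");
                }
                return "doc " + docId;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(findUnreadable(fakeSource(5, 3))); // prints [3]
    }
}
```

Against a real index, the same loop would presumably wrap an IndexReader
in place of the fake source; the point is only that catching per document
shows whether #12367 is the only unreadable one.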
It doesn't really affect my code, since I can read every other
document, but I have the feeling that my index can never be optimized,
since it keeps failing on that record whenever it touches it, whether
to optimize or to read it.
Am I correct in guessing that this document is corrupt?
Anyway, I tried hard-coding a delete for that document, and it did
remove it, but now optimize fails with "java.io.IOException: read past EOF".
I think my index is getting messed up: it should be shrinking quickly,
because my search scope is shrinking, but instead it is getting larger,
likely due to all the failed optimize attempts.
Is there a solution? And is there any way to stop it from happening again?
--
Trevor Boicey, P. Eng.
Ottawa, Canada, tboicey@brit.ca
ICQ #17432933 http://www.brit.ca/~tboicey/
"I saw the Dipsy, but WHERE WAS THE DOODLE?" - Phil