Posted to java-user@lucene.apache.org by Rafael Rossini <ra...@gmail.com> on 2007/07/24 16:46:52 UTC

ArrayIndexOutOfBoundsException on TermScorer

Hello all,

I'm using Solr in an app, but I'm getting an error that might be a Lucene
problem. When I perform a simple query like q=brasil, I get this
exception:

java.lang.ArrayIndexOutOfBoundsException: 1226511
   at org.apache.lucene.search.TermScorer.score(TermScorer.java:74)
   at org.apache.lucene.search.TermScorer.score(TermScorer.java:61)
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:146)
   at org.apache.lucene.search.Searcher.search(Searcher.java:118)
   at org.apache.lucene.search.Searcher.search(Searcher.java:97)

I'm using a very recent build of Lucene. In TermScorer.java, line 74
is:

score *= normDecoder[norms[doc] & 0xFF]; // normalize for field

Thanks for any help, and sorry for cross-posting.
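
For readers following the trace: the quoted line 74 indexes the norms array with the current doc id, so any doc id at or beyond the array length raises exactly this kind of exception. The sketch below reproduces that failure mode in isolation. It is a stand-alone illustration, not Lucene code: the class name, the array size of 10, the bad id 12, and the all-ones decode table are assumptions made up for the example.

```java
import java.util.Arrays;

// Stand-alone illustration; none of these names are Lucene classes.
class NormLookupSketch {

    // Same shape as the quoted line: one norm byte per document,
    // decoded through a 256-entry table.
    static float applyNorm(float score, byte[] norms, float[] normDecoder, int doc) {
        return score * normDecoder[norms[doc] & 0xFF];
    }

    // A doc id is usable only if it indexes into the norms array.
    static boolean inRange(byte[] norms, int doc) {
        return doc >= 0 && doc < norms.length;
    }

    public static void main(String[] args) {
        byte[] norms = new byte[10];          // pretend the index holds 10 documents
        float[] normDecoder = new float[256]; // hypothetical decode table
        Arrays.fill(normDecoder, 1.0f);

        int badDoc = 12; // larger than any valid id (0..9)
        System.out.println("in range? " + inRange(norms, badDoc));
        try {
            applyNorm(2.0f, norms, normDecoder, badDoc);
        } catch (ArrayIndexOutOfBoundsException e) {
            // The exact message text depends on the JVM.
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only that the exception says nothing about the norms themselves; it means the doc id handed to the lookup was not a valid position in the array.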

Re: ArrayIndexOutOfBoundsException on TermScorer

Posted by Rafael Rossini <ra...@gmail.com>.
Got it,

    I don't have a clue whether this corruption was caused by a hardware
failure, but that is possible because we suffer frequent power failures.
Still, I've been using Lucene for a long time and I have never seen this
kind of exception.

    I'd like to delete this document, but now I get another exception:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 106577
   at org.apache.lucene.util.BitVector.set(BitVector.java:53)
   at org.apache.lucene.index.SegmentReader.doDelete(SegmentReader.java:301)
   at org.apache.lucene.index.IndexReader.deleteDocument(IndexReader.java:674)
   at org.apache.lucene.index.MultiReader.doDelete(MultiReader.java:125)
   at org.apache.lucene.index.IndexReader.deleteDocument(IndexReader.java:674)
   at teste.DeleteError.main(DeleteError.java:9)

Is there a way to fix my index without rebuilding it from scratch?
Re-indexing my whole collection takes many hours.
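
One plausible reading of this second trace, offered as an assumption rather than a diagnosis: the call passes from MultiReader.doDelete to SegmentReader.doDelete to BitVector.set, which suggests the global doc id is translated to a segment-local id before the deletion bit is set. Under that assumption, an id past the end of the index also lands past the end of the last segment. The sketch below models only that translation. It is not Lucene source; the class name, the `starts` array, and the segment sizes are invented for illustration.

```java
// Illustrative only: a simplified model, not Lucene source code.
class SegmentOffsetSketch {

    // starts[i] is the first global doc id of segment i;
    // the final entry of starts is the total document count.
    static int segmentFor(int[] starts, int globalDoc) {
        int seg = 0;
        // Advance while a later segment still starts at or before globalDoc.
        while (seg + 1 < starts.length - 1 && starts[seg + 1] <= globalDoc) {
            seg++;
        }
        return seg;
    }

    // Segment-local id: global id minus the segment's starting offset.
    static int localId(int[] starts, int globalDoc) {
        return globalDoc - starts[segmentFor(starts, globalDoc)];
    }

    // Number of documents in a segment.
    static int segmentSize(int[] starts, int seg) {
        return starts[seg + 1] - starts[seg];
    }

    public static void main(String[] args) {
        int[] starts = {0, 100, 250, 300}; // three made-up segments, 300 docs total
        int bogus = 420;                   // an id beyond the total
        int seg = segmentFor(starts, bogus);
        int local = localId(starts, bogus);
        System.out.println("segment " + seg + ", local id " + local
                + ", segment size " + segmentSize(starts, seg));
    }
}
```

If the local id is not smaller than the segment size, any per-segment structure sized to the segment would be indexed out of range, which is consistent with (but not proven by) the trace above.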

Re: ArrayIndexOutOfBoundsException on TermScorer

Posted by Yonik Seeley <yo...@apache.org>.
On 7/24/07, Rafael Rossini <ra...@gmail.com> wrote:
> I did a little debugging and found that in TermScorer the byte[] norms array
> has size 1,119,933, which is the number of docs in my index, and there is a
> docID = 1226511 (assuming the "doc" variable in the method is the docID).
>
> I tried to access this document with reader.document() and got a
> java.io.IOException: read past EOF.
>
> Any ideas how to fix or delete this document?

That document does not exist: docids are just the index into the array
of documents, which only goes up to 1,119,933 (if that's maxDoc()).
So the big question is how the "doc" variable got set to 1226511.

It sounds like it could be index corruption to me.  The question is
whether it's due to a bug or a hardware fault.

-Yonik

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: ArrayIndexOutOfBoundsException on TermScorer

Posted by Rafael Rossini <ra...@gmail.com>.
I did a little debugging and found that in TermScorer the byte[] norms array
has size 1,119,933, which is the number of docs in my index, and there is a
docID = 1226511 (assuming the "doc" variable in the method is the docID).

I tried to access this document with reader.document() and got a
java.io.IOException: read past EOF.

Any ideas how to fix or delete this document?
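
The check described in this message (compare each doc id against the size of the norms array) can be sketched as a small range scan. This is a hypothetical helper working on plain int arrays, not a Lucene API. With a real index the ids would have to come from the term's postings, which is not shown here. The value 1119933 and the id 1226511 are taken from the message; the other two ids are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: operates on plain int arrays.
class DocIdRangeCheck {

    // Returns every id that falls outside the valid range 0..maxDoc-1.
    static List<Integer> outOfRange(int[] docIds, int maxDoc) {
        List<Integer> bad = new ArrayList<>();
        for (int id : docIds) {
            if (id < 0 || id >= maxDoc) {
                bad.add(id);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        int maxDoc = 1119933;                  // norms array size reported in the thread
        int[] postings = {17, 52340, 1226511}; // first two ids are invented
        System.out.println("suspect ids: " + outOfRange(postings, maxDoc));
    }
}
```

A scan like this only identifies which ids are impossible; it does not say how they got into the postings or whether other parts of the index are damaged.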


Re: ArrayIndexOutOfBoundsException on TermScorer

Posted by Rafael Rossini <ra...@gmail.com>.
I don't know the exact date of the build, but it is certainly before July 4,
and before the LUCENE-843 patch was committed. My index has 1,119,934 docs
and is about 8.2 GB.

I really don't know how to reproduce this. So far, the only query that
triggers this error is "brasil". I'm also not sure the docID is too large,
because my app indexes more than 2000 docs daily and I can access the newest
ones with no problems.

Do you have any idea how I can debug this better, or how I can solve it?

Thanks a lot


Re: ArrayIndexOutOfBoundsException on TermScorer

Posted by Michael McCandless <lu...@mikemccandless.com>.
That looks spooky.  It looks like either the norms array is not
large enough or that docID is too large.  Do you know how many
docs you have in your index?

Is this easy to reproduce, maybe on a smaller index?

There was a very large change recently (LUCENE-843) to speed
up indexing and it's possible that this introduced a bug.  Is
the build you are using after July 4?

Mike
