Posted to java-user@lucene.apache.org by Joe Berkovitz <jb...@ruckusnetwork.com> on 2004/04/27 21:00:15 UTC
Read past EOF and negative bufferLength problem (1.4 rc2)
Using Lucene 1.4 rc2 I've run into a fatal problem: certain
PhraseQueries cause a "Read Past EOF" exception (see below), while other
PhraseQueries enter an infinite loop due to a negative bufferLength
field in CSInputStream. The environment is WinXP with JDK 1.4.2. The index is
large: 1,000,000 documents, each with 3 stored,
indexed fields of 10-100 characters.
The problem does not occur with Lucene 1.3 indexing the exact same set
of Documents. Nor does it occur with 1.4 rc2 using various smaller sets
of documents. Right now my workaround is to use Lucene 1.3.
For the PhraseQuery "a y" (that's right, two single-letter terms), the
read-past-EOF exception is as follows:
java.io.IOException: read past EOF
at org.apache.lucene.store.InputStream.refill(InputStream.java:154)
at org.apache.lucene.store.InputStream.readByte(InputStream.java:43)
at org.apache.lucene.store.InputStream.readVInt(InputStream.java:83)
at org.apache.lucene.index.SegmentTermPositions.next(SegmentTermPositions.java:59)
at org.apache.lucene.index.SegmentTermDocs.skipTo(SegmentTermDocs.java:187)
at org.apache.lucene.search.PhrasePositions.skipTo(PhrasePositions.java:47)
at org.apache.lucene.search.PhraseScorer.next(PhraseScorer.java:69)
at org.apache.lucene.search.Scorer.score(Scorer.java:37)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:81)
at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:64)
at org.apache.lucene.search.Hits.<init>(Hits.java:43)
at org.apache.lucene.search.Searcher.search(Searcher.java:33)
at org.apache.lucene.search.Searcher.search(Searcher.java:27)
at...
For the PhraseQuery "z y", an infinite loop is entered. The cause
is similar to the read-past-EOF case: at line 153 of
org.apache.lucene.store.InputStream, bufferLength goes
negative because the value of start exceeds the value of end. This in
turn seems to be the consequence of a seek to a position past the end of
the stream.
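
To make the arithmetic concrete, here is a minimal sketch of how a refill step could end up with a negative bufferLength after a seek past the end of the stream. It is a simplified model, not the actual Lucene source: the class RefillSketch, its method names, and the 1024-byte buffer size are all hypothetical, and the idea that the EOF check compares bufferLength against exactly zero is an assumption inferred from the symptoms described above.

```java
import java.io.IOException;

// Simplified, hypothetical model of a buffered stream's refill arithmetic.
// This is NOT the Lucene 1.4 rc2 source; all names and sizes are assumptions.
class RefillSketch {
    static final int BUFFER_SIZE = 1024; // assumed buffer size

    // Number of bytes a refill would try to read, given the absolute
    // position to read from (start) and the total stream length.
    static int bufferLength(long start, long streamLength) {
        long end = Math.min(start + BUFFER_SIZE, streamLength); // clamp to EOF
        return (int) (end - start); // negative when start > streamLength
    }

    // Assumed behaviour: EOF is detected only when the length is exactly zero,
    // so a negative length slips through to the caller.
    static int refillEqualityCheck(long start, long streamLength) throws IOException {
        int len = bufferLength(start, streamLength);
        if (len == 0) {
            throw new IOException("read past EOF");
        }
        return len;
    }

    // Defensive variant: treats any non-positive length as EOF.
    static int refillDefensive(long start, long streamLength) throws IOException {
        int len = bufferLength(start, streamLength);
        if (len <= 0) {
            throw new IOException("read past EOF");
        }
        return len;
    }

    public static void main(String[] args) {
        long streamLength = 4096;
        long[] positions = {0, 4000, 4096, 5000}; // the last one is past the end
        for (long start : positions) {
            try {
                int len = refillEqualityCheck(start, streamLength);
                System.out.println("start=" + start + " bufferLength=" + len);
            } catch (IOException e) {
                System.out.println("start=" + start + " " + e.getMessage());
            }
        }
        try {
            refillDefensive(5000, streamLength);
        } catch (IOException e) {
            System.out.println("defensive check at start=5000: " + e.getMessage());
        }
    }
}
```

In this model a seek to exactly the end of the stream raises the exception, while a seek beyond the end yields a negative length that an equality check does not catch. That would match seeing an exception for one query and a loop for another, but it does not explain why the seek target is wrong in the first place.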
Something is clearly corrupt somewhere in the index structure. I'd love
to post the files that reproduce the problem, but it's about 100 MB of
data. If someone on the Lucene dev team wants to give me an upload
destination, I can post the index somewhere and you can play with the
problem.
regards and thanks for any assistance,
Joe Berkovitz
Chief Architect
Ruckus Network, Inc.
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
Re: Read past EOF and negative bufferLength problem (1.4 rc2)
Posted by Joe Berkovitz <jb...@ruckusnetwork.com>.
Daniel,
Everything works fine with the latest CVS version of lucene. It looks
like the bug I hit was the one that you referenced in your email which
is now fixed.
Thanks for your help.
. . . . ...joe
Daniel Naber wrote:
>On Tuesday, 27 April 2004 21:00, Joe Berkovitz wrote:
>
>>Using Lucene 1.4 rc2 I've run into a fatal problem:
>
>Could you try with the latest version from CVS? Several severe problems have
>been fixed, but I'm not sure if yours was one of them. Also see
>http://issues.apache.org/bugzilla/show_bug.cgi?id=27587
>
>