Posted to java-user@lucene.apache.org by Joe Berkovitz <jb...@ruckusnetwork.com> on 2004/04/27 21:00:15 UTC

Read past EOF and negative bufferLength problem (1.4 rc2)

Using Lucene 1.4 rc2 I've run into a fatal problem: certain 
PhraseQueries cause a "Read Past EOF" exception (see below), while other 
PhraseQueries enter an infinite loop due to a negative bufferLength 
field in CSInputStream.  Environment is WinXP, JDK 1.4.2.  The index is 
large, incorporating 1,000,000 documents each of which has 3 stored, 
indexed fields of 10-100 chars.

The problem does not occur with Lucene 1.3 indexing the exact same set 
of Documents.  Nor does it occur with 1.4 rc2 using various smaller sets 
of documents.  Right now my workaround is to use Lucene 1.3.

For the PhraseQuery "a y" (that's right, two single-letter terms), the 
read-past-EOF exception is as follows:

java.io.IOException: read past EOF
    at org.apache.lucene.store.InputStream.refill(InputStream.java:154)
    at org.apache.lucene.store.InputStream.readByte(InputStream.java:43)
    at org.apache.lucene.store.InputStream.readVInt(InputStream.java:83)
    at org.apache.lucene.index.SegmentTermPositions.next(SegmentTermPositions.java:59)
    at org.apache.lucene.index.SegmentTermDocs.skipTo(SegmentTermDocs.java:187)
    at org.apache.lucene.search.PhrasePositions.skipTo(PhrasePositions.java:47)
    at org.apache.lucene.search.PhraseScorer.next(PhraseScorer.java:69)
    at org.apache.lucene.search.Scorer.score(Scorer.java:37)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:81)
    at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:64)
    at org.apache.lucene.search.Hits.<init>(Hits.java:43)
    at org.apache.lucene.search.Searcher.search(Searcher.java:33)
    at org.apache.lucene.search.Searcher.search(Searcher.java:27)
    at...

For the PhraseQuery "z y", the search enters an infinite loop.  The loop 
occurs due to a similar condition to read-past-EOF: at line 153 of 
org.apache.lucene.store.InputStream, the value of bufferLength goes 
negative due to the value of start exceeding the value of end.  This in 
turn seems to be a consequence of a seek to a position past the end of 
the stream.
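
To make the arithmetic concrete, here is a minimal sketch of how I understand the failure. It is not the Lucene source: the class name, the helper methods, and the BUFFER_SIZE value of 1024 are assumptions for illustration; only the names start, end and bufferLength come from what I saw in the debugger. If the guard only tests for bufferLength being exactly zero, a seek past the end of the stream yields a negative length that slips through unnoticed.

```java
import java.io.IOException;

// Illustrative sketch only; NOT the actual org.apache.lucene.store.InputStream code.
final class RefillSketch {
    // Assumed buffer size, chosen for illustration.
    static final int BUFFER_SIZE = 1024;

    // Bytes the next refill would read: the window [start, end) with end clamped
    // to the stream length. If start is already past the length, the result is negative.
    static int bufferLength(long start, long length) {
        long end = Math.min(start + BUFFER_SIZE, length);
        return (int) (end - start);
    }

    // A guard that only catches the exact-EOF case (bufferLength == 0).
    static boolean strictGuardThrows(int bufferLength) {
        try {
            if (bufferLength == 0) throw new IOException("read past EOF");
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    // A hypothetical more defensive guard that also catches negative lengths.
    static boolean defensiveGuardThrows(int bufferLength) {
        try {
            if (bufferLength <= 0) throw new IOException("read past EOF");
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        long length = 1500;                      // hypothetical stream length
        int atEof = bufferLength(1500, length);  // seek exactly to the end
        int past = bufferLength(2000, length);   // seek past the end
        System.out.println(atEof + " " + strictGuardThrows(atEof));
        System.out.println(past + " " + strictGuardThrows(past)
                + " " + defensiveGuardThrows(past));
    }
}
```

In this sketch a seek to exactly the end of the stream is reported as "read past EOF", while a seek beyond it produces a negative bufferLength that the equality test does not catch. That would be consistent with one query throwing and the other looping, though I have not confirmed that this is the actual mechanism.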

Something is clearly corrupt somewhere in the index structure.  I'd love 
to post the files that reproduce the problem, but it's about 100 MB of 
data.  If someone on the Lucene dev team wants to give me an upload 
destination, I can post the index somewhere and you can play with the 
problem.

regards and thanks for any assistance,

Joe Berkovitz
Chief Architect
Ruckus Network, Inc.


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Read past EOF and negative bufferLength problem (1.4 rc2)

Posted by Joe Berkovitz <jb...@ruckusnetwork.com>.
Daniel,

Everything works fine with the latest CVS version of lucene.  It looks 
like the bug I hit was the one that you referenced in your email which 
is now fixed.

Thanks for your help.

.       .    .  . ...joe


Daniel Naber wrote:

>On Tuesday, 27 April 2004 21:00, Joe Berkovitz wrote:
>
>>Using Lucene 1.4 rc2 I've run into a fatal problem:
>
>Could you try with the latest version from CVS? Several severe problems have 
>been fixed, but I'm not sure if yours was one of them. Also see
>http://issues.apache.org/bugzilla/show_bug.cgi?id=27587

