Posted to java-user@lucene.apache.org by petite_abeille <pe...@mac.com> on 2003/01/07 21:48:35 UTC

read past EOF?

Hello,

Here is a pretty fatal exception I get from time to time in Lucene...

java.io.IOException: read past EOF
         at org.apache.lucene.store.FSInputStream.readInternal(FSDirectory.java:277)
         at org.apache.lucene.store.InputStream.readBytes(Unknown Source)
         at org.apache.lucene.index.SegmentReader.norms(Unknown Source)
         at org.apache.lucene.index.SegmentReader.norms(Unknown Source)
         at org.apache.lucene.search.TermQuery.scorer(Unknown Source)
         at org.apache.lucene.search.BooleanQuery.scorer(Unknown Source)
         at org.apache.lucene.search.Query.scorer(Unknown Source)
         at org.apache.lucene.search.IndexSearcher.search(Unknown Source)
         at org.apache.lucene.search.Hits.getMoreDocs(Unknown Source)
         at org.apache.lucene.search.Hits.<init>(Unknown Source)
         at org.apache.lucene.search.Searcher.search(Unknown Source)
         at org.apache.lucene.search.Searcher.search(Unknown Source)

Any idea what could cause such, er, misbehavior?

PA.


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: read past EOF?

Posted by Doug Cutting <cu...@lucene.com>.
petite_abeille wrote:
> On Tuesday, Jan 7, 2003, at 22:46 Europe/Zurich, Doug Cutting wrote:
>> This could happen if Lucene's file locking is disabled or broken.
  [ ... ]
>>   File locking is known to be broken over NFS, and wasn't even present 
>> in early versions of Lucene. Are you using an ordinary FSDirectory to 
>> store your index?
> 
> Yes. Regular local HFS+ file system.

Have you tried using a UFS partition instead?  Lucene's file locking 
should work well on UFS, but it wouldn't surprise me if it has problems 
on HFS+.
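
A rough way to probe a file system for this, assuming the lock is a file created atomically with java.io.File.createNewFile() (that is an assumption about this Lucene build, not something stated in this thread). LockFileProbe and winners are made-up names, and the code is written for a current JDK rather than 1.3.1. It only races threads inside one JVM, so a clean result is a hint, not proof that locking works across processes.

```java
import java.io.File;
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of Lucene.
class LockFileProbe {

    // Races several threads to create one file and returns how many of them
    // were told that they created it. Exactly 1 is the healthy answer.
    static int winners(File dir, int threads) {
        final File lockFile = new File(dir, "probe-" + System.nanoTime() + ".lock");
        final AtomicInteger created = new AtomicInteger();
        final CountDownLatch go = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    go.await(); // line all threads up before the race
                    if (lockFile.createNewFile()) {
                        created.incrementAndGet();
                    }
                } catch (IOException | InterruptedException e) {
                    // An error counts as "lock not obtained".
                }
            });
            workers[i].start();
        }
        go.countDown();
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while probing", e);
        } finally {
            lockFile.delete(); // clean up the probe file
        }
        return created.get();
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        System.out.println("threads that obtained the lock: " + winners(dir, 8));
    }
}
```

Pointing it at the directory that holds the index and seeing anything other than 1 would suggest the file system is not giving exclusive creation.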

Doug





Re: read past EOF?

Posted by petite_abeille <pe...@mac.com>.
On Tuesday, Jan 7, 2003, at 22:46 Europe/Zurich, Doug Cutting wrote:

> It looks like the .fdx and one of the .f[0-9]* files are out of sync. 
> The .fdx file for each segment should be exactly eight times as long 
> as each of the .f[0-9]* files for that segment.
>
> This could happen if Lucene's file locking is disabled or broken.

I see.

>   What version of Lucene are you using?

lucene-20020628.jar

>  What JVM?

Java HotSpot™ Client VM 1.3.1
English (United States)
Apple Computer, Inc.

Mac OS X 10.2.3
PPC

>   File locking is known to be broken over NFS, and wasn't even present 
> in early versions of Lucene. Are you using an ordinary FSDirectory to 
> store your index?

Yes. Regular local HFS+ file system.

>
> The scenario where I can see this happening is if two processes 
> or threads are permitted to modify an index at once.  If one were 
> optimizing and one were just adding a single document, then the same 
> segment name would be allocated to both, but one would write a much 
> larger segment.  If their operations were interleaved then some of the 
> segment's files would be written by one of them, and some by the 
> other, resulting in the sort of inconsistency you're seeing.
>
> You might start logging the start and end time of document additions 
> and index optimizations, to see if this sort of thing is happening...

Thanks. I will dig deeper in it.

PA.




Re: read past EOF?

Posted by Doug Cutting <cu...@lucene.com>.
It looks like the .fdx and one of the .f[0-9]* files are out of sync. 
The .fdx file for each segment should be exactly eight times as long as 
each of the .f[0-9]* files for that segment.
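
A minimal sketch of how that ratio could be checked on disk, taking the eight-to-one figure above at face value. It uses only java.io.File, not any Lucene API. FdxNormsCheck and findMismatches are made-up names, and matching the norms files with the regex \.f[0-9]+ is my reading of the .f[0-9]* pattern.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene.
class FdxNormsCheck {

    // Ratio taken from the description above: .fdx is 8x each .f[0-9]* file.
    static final long FDX_TO_NORMS_RATIO = 8;

    // Returns the names of .f[0-9]* files whose length, times eight,
    // does not equal the length of the .fdx file of the same segment.
    static List<String> findMismatches(File indexDir) {
        List<String> mismatches = new ArrayList<>();
        File[] files = indexDir.listFiles();
        if (files == null) {
            return mismatches; // not a directory, or not readable
        }
        for (File fdx : files) {
            String name = fdx.getName();
            if (!name.endsWith(".fdx")) {
                continue;
            }
            String segment = name.substring(0, name.length() - ".fdx".length());
            Pattern norms = Pattern.compile(Pattern.quote(segment) + "\\.f[0-9]+");
            for (File f : files) {
                boolean isNorms = norms.matcher(f.getName()).matches();
                if (isNorms && f.length() * FDX_TO_NORMS_RATIO != fdx.length()) {
                    mismatches.add(f.getName());
                }
            }
        }
        Collections.sort(mismatches); // listFiles() order is unspecified
        return mismatches;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (String bad : findMismatches(dir)) {
            System.out.println("out of sync with its .fdx: " + bad);
        }
    }
}
```

Run it against a copy of the index directory while nothing is writing to it; a file caught halfway through being written would show up as a false alarm.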

This could happen if Lucene's file locking is disabled or broken.  What 
version of Lucene are you using?  What JVM?  File locking is known to be 
broken over NFS, and wasn't even present in early versions of Lucene. 
Are you using an ordinary FSDirectory to store your index?

The scenario where I can see this happening is if two processes or 
threads are permitted to modify an index at once.  If one were 
optimizing and one were just adding a single document, then the same 
segment name would be allocated to both, but one would write a much 
larger segment.  If their operations were interleaved then some of the 
segment's files would be written by one of them, and some by the other, 
resulting in the sort of inconsistency you're seeing.
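
If the writers are threads in the same JVM, one way to rule this scenario out is to send every index modification through a single lock. This is only a sketch: IndexWriteGate and modify are made-up names, the Runnable stands in for whatever code adds documents or optimizes, and it does nothing for two separate processes, which still depend on the lock files.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper, not part of Lucene.
class IndexWriteGate {

    private final ReentrantLock lock = new ReentrantLock();

    // Runs one index modification at a time; other callers wait their turn.
    void modify(Runnable indexModification) {
        lock.lock();
        try {
            indexModification.run();
        } finally {
            lock.unlock(); // released even if the modification throws
        }
    }

    public static void main(String[] args) throws InterruptedException {
        IndexWriteGate gate = new IndexWriteGate();
        // Placeholders for the real "add a document" and "optimize" code.
        Thread adder = new Thread(() -> gate.modify(() -> System.out.println("adding one document")));
        Thread optimizer = new Thread(() -> gate.modify(() -> System.out.println("optimizing")));
        adder.start();
        optimizer.start();
        adder.join();
        optimizer.join();
    }
}
```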

You might start logging the start and end time of document additions and 
index optimizations, to see if this sort of thing is happening...
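
The logging could be as simple as the sketch below. IndexOperationLog, timed, record and overlaps are made-up names; the Runnable stands in for the real add or optimize call, and timestamps come from System.currentTimeMillis(), so very short operations may not register as overlapping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene.
class IndexOperationLog {

    private static final class Entry {
        final String name;
        final long start;
        final long end;

        Entry(String name, long start, long end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    // Logs the start and end time of one operation and remembers the interval.
    void timed(String name, Runnable operation) {
        String thread = Thread.currentThread().getName();
        long start = System.currentTimeMillis();
        System.err.println(start + " START " + name + " [" + thread + "]");
        try {
            operation.run();
        } finally {
            long end = System.currentTimeMillis();
            System.err.println(end + " END   " + name + " [" + thread + "]");
            record(name, start, end);
        }
    }

    synchronized void record(String name, long start, long end) {
        entries.add(new Entry(name, start, end));
    }

    // Lists every pair of recorded operations whose time intervals intersect.
    synchronized List<String> overlaps() {
        List<String> found = new ArrayList<>();
        for (int i = 0; i < entries.size(); i++) {
            for (int j = i + 1; j < entries.size(); j++) {
                Entry a = entries.get(i);
                Entry b = entries.get(j);
                if (a.start < b.end && b.start < a.end) {
                    found.add(a.name + " overlaps " + b.name);
                }
            }
        }
        return found;
    }
}
```

Wrapping each addition and each optimization in timed(...) and printing overlaps() now and then would show whether two modifications ever ran at once.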

Doug

petite_abeille wrote:
> Hello,
> 
> Here is a pretty fatal exception I get from time to time in Lucene...
> 
> java.io.IOException: read past EOF
>         at org.apache.lucene.store.FSInputStream.readInternal(FSDirectory.java:277)
>         at org.apache.lucene.store.InputStream.readBytes(Unknown Source)
>         at org.apache.lucene.index.SegmentReader.norms(Unknown Source)
>         at org.apache.lucene.index.SegmentReader.norms(Unknown Source)
>         at org.apache.lucene.search.TermQuery.scorer(Unknown Source)
>         at org.apache.lucene.search.BooleanQuery.scorer(Unknown Source)
>         at org.apache.lucene.search.Query.scorer(Unknown Source)
>         at org.apache.lucene.search.IndexSearcher.search(Unknown Source)
>         at org.apache.lucene.search.Hits.getMoreDocs(Unknown Source)
>         at org.apache.lucene.search.Hits.<init>(Unknown Source)
>         at org.apache.lucene.search.Searcher.search(Unknown Source)
>         at org.apache.lucene.search.Searcher.search(Unknown Source)
> 
> Any idea what could cause such, er, misbehavior?
> 
> PA.
> 
> 


Re: RE : read past EOF?

Posted by petite_abeille <pe...@mac.com>.
On Sunday, Jan 12, 2003, at 15:43 Europe/Zurich, Rasik Pandey wrote:

> Are you using a MultiSearcher?

No.

PA.




RE : read past EOF?

Posted by Rasik Pandey <ra...@ajlsm.com>.
Are you using a MultiSearcher?

-----Original Message-----
From: petite_abeille [mailto:petite_abeille@mac.com] 
Sent: Tuesday, January 7, 2003 21:49
To: lucene-user@jakarta.apache.org
Subject: read past EOF?


Hello,

Here is a pretty fatal exception I get from time to time in Lucene...

java.io.IOException: read past EOF
         at org.apache.lucene.store.FSInputStream.readInternal(FSDirectory.java:277)
         at org.apache.lucene.store.InputStream.readBytes(Unknown Source)
         at org.apache.lucene.index.SegmentReader.norms(Unknown Source)
         at org.apache.lucene.index.SegmentReader.norms(Unknown Source)
         at org.apache.lucene.search.TermQuery.scorer(Unknown Source)
         at org.apache.lucene.search.BooleanQuery.scorer(Unknown Source)
         at org.apache.lucene.search.Query.scorer(Unknown Source)
         at org.apache.lucene.search.IndexSearcher.search(Unknown Source)
         at org.apache.lucene.search.Hits.getMoreDocs(Unknown Source)
         at org.apache.lucene.search.Hits.<init>(Unknown Source)
         at org.apache.lucene.search.Searcher.search(Unknown Source)
         at org.apache.lucene.search.Searcher.search(Unknown Source)

Any idea what could cause such, er, misbehavior?

PA.

