Posted to general@lucene.apache.org by Jean-Michel RAMSEYER <jm...@greenivory.com> on 2010/03/23 22:36:42 UTC

java.io.IOException: read past EOF

Hi there,

I'm new to Lucene and I'm currently having a problem with an index.
I'm running Lucene 2.4.1 on a Linux server with a Sun JVM, version
1.6.0.17b04, in which the issue
http://issues.apache.org/jira/browse/LUCENE-1282 is solved.
I tried to open the indexes on another computer with Luke, but it fails too.
The segments* files are empty, so is there a way to rebuild the index from
the cfs files? Is there a way to recover this index?
Thank you for your answers.

Exception trace:
java.io.IOException: read past EOF
	at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:151)
	at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
	at org.apache.lucene.store.ChecksumIndexInput.readByte(ChecksumIndexInput.java:36)
	at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:68)
	at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:221)
	at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:95)
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
	at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:115)
	at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
	at org.apache.lucene.index.IndexReader.open(IndexReader.java:206)
	at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:47)

ls -lah result:
total 18G
drwxr-xr-x   2 tomcat tomcat 4.0K 2010-03-22 16:29 .
drwxr-xr-x 121 tomcat tomcat  12K 2010-03-23 14:22 ..
-rw-r--r--   1 tomcat tomcat 1.9G 2010-03-20 13:57 _1gg2.cfs
-rw-r--r--   1 tomcat tomcat 2.0G 2010-03-20 21:45 _1yhj.cfs
-rw-r--r--   1 tomcat tomcat 1.9G 2010-03-21 04:16 _2gdz.cfs
-rw-r--r--   1 tomcat tomcat 2.0G 2010-03-21 15:00 _2y9u.cfs
-rw-r--r--   1 tomcat tomcat 2.0G 2010-03-22 03:21 _3ghg.cfs
-rw-r--r--   1 tomcat tomcat 2.0G 2010-03-22 07:09 _3xty.cfs
-rw-r--r--   1 tomcat tomcat 2.0G 2010-03-22 12:24 _4ekl.cfs
-rw-r--r--   1 tomcat tomcat 192M 2010-03-22 13:25 _4gn2.cfs
-rw-r--r--   1 tomcat tomcat 198M 2010-03-22 14:23 _4ief.cfs
-rw-r--r--   1 tomcat tomcat 195M 2010-03-22 15:14 _4kbm.cfs
-rw-r--r--   1 tomcat tomcat  21M 2010-03-22 15:18 _4kil.cfs
-rw-r--r--   1 tomcat tomcat  23M 2010-03-22 15:22 _4kop.cfs
-rw-r--r--   1 tomcat tomcat  22M 2010-03-22 15:27 _4ku0.cfs
-rw-r--r--   1 tomcat tomcat  25M 2010-03-22 15:31 _4kzb.cfs
-rw-r--r--   1 tomcat tomcat  21M 2010-03-22 15:36 _4l56.cfs
-rw-r--r--   1 tomcat tomcat 1.9M 2010-03-22 15:36 _4l5r.cfs
-rw-r--r--   1 tomcat tomcat 2.0M 2010-03-22 15:37 _4l6c.cfs
-rw-r--r--   1 tomcat tomcat 165K 2010-03-22 15:37 _4l6d.cfs
-rw-r--r--   1 tomcat tomcat  58K 2010-03-22 15:37 _4l6e.cfs
-rw-r--r--   1 tomcat tomcat  80K 2010-03-22 15:37 _4l6f.cfs
-rw-r--r--   1 tomcat tomcat 149K 2010-03-22 15:37 _4l6g.cfs
-rw-r--r--   1 tomcat tomcat 218K 2010-03-22 15:37 _4l6h.cfs
-rw-r--r--   1 tomcat tomcat 198K 2010-03-22 15:37 _4l6i.cfs
-rw-r--r--   1 tomcat tomcat  45K 2010-03-22 15:37 _4l6j.cfs
-rw-r--r--   1 tomcat tomcat  58K 2010-03-22 15:37 _4l6k.cfs
-rw-r--r--   1 tomcat tomcat 158K 2010-03-22 15:37 _4l6l.cfs
-rw-r--r--   1 tomcat tomcat 116K 2010-03-22 15:37 _4l6m.cfs
-rw-r--r--   1 tomcat tomcat 1.1M 2010-03-22 15:37 _4l6n.cfs
-rw-r--r--   1 tomcat tomcat 128K 2010-03-22 15:37 _4l6o.cfs
-rw-r--r--   1 tomcat tomcat 1.9G 2010-03-20 04:12 _hnt.cfs
-rw-r--r--   1 tomcat tomcat    0 2010-03-22 15:37 segments_44o3
-rw-r--r--   1 tomcat tomcat    0 2010-03-22 15:37 segments_44o4
-rw-r--r--   1 tomcat tomcat    0 2010-03-22 15:37 segments.gen
-rw-r--r--   1 tomcat tomcat 1.9G 2010-03-20 07:52 _ywu.cfs


Re: java.io.IOException: read past EOF

Posted by Michael McCandless <lu...@mikemccandless.com>.
It can be tricky... e.g., if segments share doc stores, I think you
can't always recover that.

But this index seems not to have separate doc stores (no *.cfx), so I
think in theory one could regenerate the segment metadata
(SegmentInfo) from the index files, but I don't know that anyone has
created this yet.

Also, it could in general result in re-attaching segments that had
been merged away (i.e., causing duplicates in the index).
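
A rough sketch of the first step of such a regeneration, taking an
inventory of what is on disk, might look like the Java below. It
assumes that segment file names are "_" followed by a base-36 counter,
which is how IndexWriter names segments as far as I know. The class and
method names are made up for illustration and are not part of Lucene.
It only lists candidate segments and checks for *.cfx files; it does
not write a segments_N file, and it cannot tell which segments had
already been merged away.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Lucene.
class SegmentInventory {

    // Decodes the base-36 counter from a name such as "_4l6o.cfs".
    static long segmentCounter(String fileName) {
        int dot = fileName.indexOf('.');
        String name = dot < 0 ? fileName : fileName.substring(0, dot);
        return Long.parseLong(name.substring(1), Character.MAX_RADIX);
    }

    // True if any shared doc store (*.cfx) file is present.
    static boolean hasSharedDocStores(List<String> fileNames) {
        for (String n : fileNames) {
            if (n.endsWith(".cfx")) return true;
        }
        return false;
    }

    // Compound segment files (_*.cfs) ordered by their counter, oldest first.
    static List<String> compoundSegmentsInOrder(List<String> fileNames) {
        List<String> out = new ArrayList<>();
        for (String n : fileNames) {
            if (n.startsWith("_") && n.endsWith(".cfs")) out.add(n);
        }
        out.sort(Comparator.comparingLong(SegmentInventory::segmentCounter));
        return out;
    }

    public static void main(String[] args) {
        // A few names taken from the directory listing in this thread.
        List<String> names = Arrays.asList(
            "_4l6o.cfs", "_hnt.cfs", "segments_44o3", "_1gg2.cfs", "_ywu.cfs");
        System.out.println("shared doc stores: " + hasSharedDocStores(names));
        System.out.println("candidates: " + compoundSegmentsInOrder(names));
    }
}
```

Sorting by the decoded counter rather than by file name matters because
a plain string sort would put _1gg2 before _hnt, although _hnt is the
older segment according to the timestamps in the listing.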

Mike

On Wed, Mar 24, 2010 at 2:39 AM, Ted Dunning <te...@gmail.com> wrote:
> The documentation (
> http://lucene.apache.org/java/2_4_0/fileformats.html#File%20Naming) makes it
> seem that the cfs files could be used to recover most of the information
> from the index.  Is that not so?
>
>
> On Tue, Mar 23, 2010 at 11:30 PM, Michael McCandless <
> lucene@mikemccandless.com> wrote:
>
>> Your index is in serious trouble -- you have 2 segments_N files, both
>> of which are 0 length.
>>
>> This won't be easy to recover (CheckIndex won't be able to).
>>
>

Re: java.io.IOException: read past EOF

Posted by Ted Dunning <te...@gmail.com>.
The documentation (
http://lucene.apache.org/java/2_4_0/fileformats.html#File%20Naming) makes it
seem that the cfs files could be used to recover most of the information
from the index.  Is that not so?


On Tue, Mar 23, 2010 at 11:30 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Your index is in serious trouble -- you have 2 segments_N files, both
> of which are 0 length.
>
> This won't be easy to recover (CheckIndex won't be able to).
>

Re: java.io.IOException: read past EOF

Posted by Michael McCandless <lu...@mikemccandless.com>.
Your index is in serious trouble -- you have 2 segments_N files, both
of which are 0 length.

This won't be easy to recover (CheckIndex won't be able to).

Any idea how this happened?  Was this index created using 2.4.x?
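
For anyone who wants to check their own index directory for the same
symptom, here is a minimal sketch using only the JDK. It assumes the
N in segments_N is a base-36 generation number, which matches my
understanding of the file naming. SegmentsFileCheck is a made-up name,
not a Lucene class, and the code only reports what it finds; it
repairs nothing.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic, not part of Lucene.
class SegmentsFileCheck {

    // Generation encoded in "segments_N" (N in base 36); -1 for other names.
    static long generation(String fileName) {
        String prefix = "segments_";
        if (!fileName.startsWith(prefix)) return -1;
        return Long.parseLong(fileName.substring(prefix.length()), Character.MAX_RADIX);
    }

    // Sorted names of segments_N / segments.gen files in dir that are 0 bytes long.
    static List<String> emptySegmentsFiles(File dir) {
        List<String> empty = new ArrayList<String>();
        File[] files = dir.listFiles();
        if (files == null) return empty; // not a directory, or unreadable
        for (File f : files) {
            String n = f.getName();
            boolean isSegmentsFile = n.equals("segments.gen") || n.startsWith("segments_");
            if (isSegmentsFile && f.isFile() && f.length() == 0) empty.add(n);
        }
        Collections.sort(empty);
        return empty;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (String n : emptySegmentsFiles(dir)) {
            long gen = generation(n);
            System.out.println("zero-length: " + n + (gen >= 0 ? " (generation " + gen + ")" : ""));
        }
    }
}
```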

Mike


Re: java.io.IOException: read past EOF

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Jean-Michel,

java-user@lucene is a better place to ask.

I'd do this:
* back up your index
* use the CheckIndex tool (if it exists in your version of Lucene)
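
The backup step could be sketched like this, assuming a flat index
directory (no subdirectories), Java 7 or later for java.nio.file, and
that nothing is writing to the index while the copy runs. IndexBackup
is just an illustrative name, not a Lucene tool.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Lucene.
class IndexBackup {

    // Copies every regular file in indexDir into backupDir; returns how many were copied.
    // Files.copy fails if a target file already exists, so an earlier backup is not overwritten.
    static int backup(Path indexDir, Path backupDir) throws IOException {
        Files.createDirectories(backupDir);
        int copied = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path f : files) {
                if (Files.isRegularFile(f)) {
                    Files.copy(f, backupDir.resolve(f.getFileName()),
                               StandardCopyOption.COPY_ATTRIBUTES);
                    copied++;
                }
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: IndexBackup <indexDir> <backupDir>");
            return;
        }
        System.out.println(backup(Paths.get(args[0]), Paths.get(args[1])) + " files copied");
    }
}
```

With an index of this size (about 18G according to the listing), check
that the target disk has enough free space before starting the copy.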

Maybe the version of Luke you are using was built against a different Lucene version?

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/


