You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by "Aleksander M. Stensby" <al...@integrasco.no> on 2006/11/23 12:04:42 UTC

Re: Reviving a dead index

Hey, saw this old thread, and was just wondering if any of you solved the  
problem? Same has happened to me now. Couldn't really trace back to the  
origin of the problem, but the segments file references a segment that is  
obviously corrupt/not complete...

I thought i might remove the uncomplete segment, but then i guess this  
would f* up my segments file. Any taker on how to remove the segment in  
question, and make the rest of the index work again?.. Since i guess its  
not as straightforward to just remove the segmentname from the segments  
file without changing some of the other crytic bytes aswell..?

- Aleksander


On Thu, 31 Aug 2006 03:42:28 +0200, Michael McCandless  
<lu...@mikemccandless.com> wrote:

> Stanislav Jordanov wrote:
>
>> For a moment I wondered what exactly do you mean by "compound file"?
>> Then I read http://lucene.apache.org/java/docs/fileformats.html and got  
>> the idea.
>> I do not have access to that specific machine that all this is  
>> happening at.
>> It is a 80x86 machine running Win 2003 server;
>> Sorry, but they neglected my question about the index is stored on a  
>> Local FS or on a NFS.
>> I was only able to obtain a directory listing of the index dir and  
>> guess what - there's no a /*_1j8s.cfs * /file at all!
>> Pitty, I can't have a look at segments file, but I guess it lists the  
>> _1j8s
>> Given these scarce resources, can you give me some further advise about  
>> what has happened and what can be done to prevent it from happening  
>> again?
>
> I'm assuming this is easily repeated (question from my last email) and
> not a transient error?  If it's transient, this could be explained by
> locking not working properly.
>
> If it's not transient (ie, happens every time you open this index),
> it sounds like indeed the segments file is referencing a segment that
> does not exist.
>
> But, how the index got into this state is a mystery.  I don't know of
> any existing Lucene bugs that can do this.  Furthermore, crashing
> an indexing process should not lead to this (it can lead to other things
> like only have a segments.new file and no segments file).
>
> Were there any earlier exceptions (before indexing hit an "improper
> shutdown") in your indexing process that could give a clue as to root
> cause?  Or for example was the machine rebooted and did windows to run
> a "filesystem check" on rebooting this box (which can remove corrupt
> files)?
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>



-- 
Aleksander M. Stensby
Software Developer
Integrasco A/S
aleksander.stensby@integrasco.no
Tlf.: +47 41 22 82 72

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Reviving a dead index

Posted by "Aleksander M. Stensby" <al...@integrasco.no>.
works like a charm Michael! (only thing is that SegmentInfos / SegmentInfo  
are final classe, (which I didnt know) so i was bugging around to really  
find the classes:) heh.

I was able to remove the broken segment. I must now get the MAX(id) from  
the clean remaining segments, then just regenerate from that point on.  
(ive indexed the ids, so thats a good pinpoint.)

Anyways, thanx! My logrotation was broken so was unable to trace back the  
root cause, but will report if it happens again!

On Thu, 23 Nov 2006 12:39:19 +0100, Michael McCandless  
<lu...@mikemccandless.com> wrote:

> Aleksander M. Stensby wrote:
>> Hey, saw this old thread, and was just wondering if any of you solved  
>> the problem? Same has happened to me now. Couldn't really trace back to  
>> the origin of the problem, but the segments file references a segment  
>> that is obviously corrupt/not complete...
>>  I thought i might remove the uncomplete segment, but then i guess this  
>> would f* up my segments file. Any taker on how to remove the segment in  
>> question, and make the rest of the index work again?.. Since i guess  
>> its not as straightforward to just remove the segmentname from the  
>> segments file without changing some of the other crytic bytes aswell..?
>
> I don't think the root cause was ever uncovered on this thread.
>
> Do you have any copying process to move your index from one machine to
> another, or across mount points on the same machine, or anything?  A
> copying step has been the cause of similar corruption in past issues.
> I would really like to get to the root cause of any and all
> corruption.
>
> To recover your index, it would be fairly simple to write a tool that
> reads the segments files, removes the known bad segments, and writes
> the segments file back out.  Something like this (I haven't tested!):
>
>    Directory dir = FSDirectory.getDirectory("/path/to/my/index", false);
>    SegmentInfos sis = new SegmentInfos();
>    sis.read(dir);
>    for(int i=0;i<sis.size();i++) {
>      SegmentInfo si = (SegmentInfo) sis.elementAt(i);
>      if (si.name.equals("_XXXX")) {
>        sis.removeElementAt(i);
>        break;
>      }
>    }
>    sis.write(dir);
>
> Please make a backup copy of your index before running this in case
> it messes anything up!!
>
> And note that by removing an entire segment, many documents are now
> gone from your index.  Determining which ones are gone and must be
> reindexed is not particularly easy unless you have a way to do so in
> your application...
>
> Also note that Lucene will not delete the now un-referenced (bad)
> segment file (ie, _XXXX.cfs, and other _XXXX files like _XXXX.del if
> it exists) so you will have to do that step manually.  (The current
> "trunk" version of Lucene does in fact remove unreferenced index
> files correctly but this hasn't been released yet).
>
> Mike
>
>>  - Aleksander
>>   On Thu, 31 Aug 2006 03:42:28 +0200, Michael McCandless  
>> <lu...@mikemccandless.com> wrote:
>>
>>> Stanislav Jordanov wrote:
>>>
>>>> For a moment I wondered what exactly do you mean by "compound file"?
>>>> Then I read http://lucene.apache.org/java/docs/fileformats.html and  
>>>> got the idea.
>>>> I do not have access to that specific machine that all this is  
>>>> happening at.
>>>> It is a 80x86 machine running Win 2003 server;
>>>> Sorry, but they neglected my question about the index is stored on a  
>>>> Local FS or on a NFS.
>>>> I was only able to obtain a directory listing of the index dir and  
>>>> guess what - there's no a /*_1j8s.cfs * /file at all!
>>>> Pitty, I can't have a look at segments file, but I guess it lists the  
>>>> _1j8s
>>>> Given these scarce resources, can you give me some further advise  
>>>> about what has happened and what can be done to prevent it from  
>>>> happening again?
>>>
>>> I'm assuming this is easily repeated (question from my last email) and
>>> not a transient error?  If it's transient, this could be explained by
>>> locking not working properly.
>>>
>>> If it's not transient (ie, happens every time you open this index),
>>> it sounds like indeed the segments file is referencing a segment that
>>> does not exist.
>>>
>>> But, how the index got into this state is a mystery.  I don't know of
>>> any existing Lucene bugs that can do this.  Furthermore, crashing
>>> an indexing process should not lead to this (it can lead to other  
>>> things
>>> like only have a segments.new file and no segments file).
>>>
>>> Were there any earlier exceptions (before indexing hit an "improper
>>> shutdown") in your indexing process that could give a clue as to root
>>> cause?  Or for example was the machine rebooted and did windows to run
>>> a "filesystem check" on rebooting this box (which can remove corrupt
>>> files)?
>>>
>>> Mike
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>    --Aleksander M. Stensby
>> Software Developer
>> Integrasco A/S
>> aleksander.stensby@integrasco.no
>> Tlf.: +47 41 22 82 72
>>  ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>



-- 
Aleksander M. Stensby
Software Developer
Integrasco A/S
aleksander.stensby@integrasco.no
Tlf.: +47 41 22 82 72

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Reviving a dead index

Posted by Michael McCandless <lu...@mikemccandless.com>.
Aleksander M. Stensby wrote:

 > works like a charm Michael! (only thing is that SegmentInfos /
 > SegmentInfo are final classe, (which I didnt know) so i was bugging
 > around to really find the classes:) heh.
 >
 > I was able to remove the broken segment. I must now get the MAX(id) from
 > the clean remaining segments, then just regenerate from that point on.
 > (ive indexed the ids, so thats a good pinpoint.)
 >
 > Anyways, thanx! My logrotation was broken so was unable to trace back
 > the root cause, but will report if it happens again!

Super!  Phew.  Please reply back if you make any progress on the root
cause.  If it comes down to a bug in Lucene we've got to ferret it out
and get it fixed.

> One last thing... can i be sure that the latest inserted documents in 
> fact was inserted into that broken segment? Or are they placed randomly 
> in the different segments?

Well, when documents are added they are always added into the "latest"
segment (when the writer flushes its RAMDirectory buffer).  Then, on
merging/optimizing, this segment will be merged with others before
it.  If you are certain it is the "newest" (ie, biggest name in base
36) segment(s) that you've lost, and you add your docs in increasing ID
order, then I believe getting the max(ID) that's in the index and
re-indexing the docs above that will make your index complete
again.  Best to do some serious testing to be sure though :)

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Reviving a dead index

Posted by "Aleksander M. Stensby" <al...@integrasco.no>.
One last thing... can i be sure that the latest inserted documents in fact  
was inserted into that broken segment? Or are they placed randomly in the  
different segments?

- Aleks

On Thu, 23 Nov 2006 12:39:19 +0100, Michael McCandless  
<lu...@mikemccandless.com> wrote:

> Aleksander M. Stensby wrote:
>> Hey, saw this old thread, and was just wondering if any of you solved  
>> the problem? Same has happened to me now. Couldn't really trace back to  
>> the origin of the problem, but the segments file references a segment  
>> that is obviously corrupt/not complete...
>>  I thought i might remove the uncomplete segment, but then i guess this  
>> would f* up my segments file. Any taker on how to remove the segment in  
>> question, and make the rest of the index work again?.. Since i guess  
>> its not as straightforward to just remove the segmentname from the  
>> segments file without changing some of the other crytic bytes aswell..?
>
> I don't think the root cause was ever uncovered on this thread.
>
> Do you have any copying process to move your index from one machine to
> another, or across mount points on the same machine, or anything?  A
> copying step has been the cause of similar corruption in past issues.
> I would really like to get to the root cause of any and all
> corruption.
>
> To recover your index, it would be fairly simple to write a tool that
> reads the segments files, removes the known bad segments, and writes
> the segments file back out.  Something like this (I haven't tested!):
>
>    Directory dir = FSDirectory.getDirectory("/path/to/my/index", false);
>    SegmentInfos sis = new SegmentInfos();
>    sis.read(dir);
>    for(int i=0;i<sis.size();i++) {
>      SegmentInfo si = (SegmentInfo) sis.elementAt(i);
>      if (si.name.equals("_XXXX")) {
>        sis.removeElementAt(i);
>        break;
>      }
>    }
>    sis.write(dir);
>
> Please make a backup copy of your index before running this in case
> it messes anything up!!
>
> And note that by removing an entire segment, many documents are now
> gone from your index.  Determining which ones are gone and must be
> reindexed is not particularly easy unless you have a way to do so in
> your application...
>
> Also note that Lucene will not delete the now un-referenced (bad)
> segment file (ie, _XXXX.cfs, and other _XXXX files like _XXXX.del if
> it exists) so you will have to do that step manually.  (The current
> "trunk" version of Lucene does in fact remove unreferenced index
> files correctly but this hasn't been released yet).
>
> Mike
>
>>  - Aleksander
>>   On Thu, 31 Aug 2006 03:42:28 +0200, Michael McCandless  
>> <lu...@mikemccandless.com> wrote:
>>
>>> Stanislav Jordanov wrote:
>>>
>>>> For a moment I wondered what exactly do you mean by "compound file"?
>>>> Then I read http://lucene.apache.org/java/docs/fileformats.html and  
>>>> got the idea.
>>>> I do not have access to that specific machine that all this is  
>>>> happening at.
>>>> It is a 80x86 machine running Win 2003 server;
>>>> Sorry, but they neglected my question about the index is stored on a  
>>>> Local FS or on a NFS.
>>>> I was only able to obtain a directory listing of the index dir and  
>>>> guess what - there's no a /*_1j8s.cfs * /file at all!
>>>> Pitty, I can't have a look at segments file, but I guess it lists the  
>>>> _1j8s
>>>> Given these scarce resources, can you give me some further advise  
>>>> about what has happened and what can be done to prevent it from  
>>>> happening again?
>>>
>>> I'm assuming this is easily repeated (question from my last email) and
>>> not a transient error?  If it's transient, this could be explained by
>>> locking not working properly.
>>>
>>> If it's not transient (ie, happens every time you open this index),
>>> it sounds like indeed the segments file is referencing a segment that
>>> does not exist.
>>>
>>> But, how the index got into this state is a mystery.  I don't know of
>>> any existing Lucene bugs that can do this.  Furthermore, crashing
>>> an indexing process should not lead to this (it can lead to other  
>>> things
>>> like only have a segments.new file and no segments file).
>>>
>>> Were there any earlier exceptions (before indexing hit an "improper
>>> shutdown") in your indexing process that could give a clue as to root
>>> cause?  Or for example was the machine rebooted and did windows to run
>>> a "filesystem check" on rebooting this box (which can remove corrupt
>>> files)?
>>>
>>> Mike
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>    --Aleksander M. Stensby
>> Software Developer
>> Integrasco A/S
>> aleksander.stensby@integrasco.no
>> Tlf.: +47 41 22 82 72
>>  ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>



-- 
Aleksander M. Stensby
Software Developer
Integrasco A/S
aleksander.stensby@integrasco.no
Tlf.: +47 41 22 82 72

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Reviving a dead index

Posted by Michael McCandless <lu...@mikemccandless.com>.
Aleksander M. Stensby wrote:
> Hey, saw this old thread, and was just wondering if any of you solved 
> the problem? Same has happened to me now. Couldn't really trace back to 
> the origin of the problem, but the segments file references a segment 
> that is obviously corrupt/not complete...
> 
> I thought i might remove the uncomplete segment, but then i guess this 
> would f* up my segments file. Any taker on how to remove the segment in 
> question, and make the rest of the index work again?.. Since i guess its 
> not as straightforward to just remove the segmentname from the segments 
> file without changing some of the other crytic bytes aswell..?

I don't think the root cause was ever uncovered on this thread.

Do you have any copying process to move your index from one machine to
another, or across mount points on the same machine, or anything?  A
copying step has been the cause of similar corruption in past issues.
I would really like to get to the root cause of any and all
corruption.

To recover your index, it would be fairly simple to write a tool that
reads the segments files, removes the known bad segments, and writes
the segments file back out.  Something like this (I haven't tested!):

   Directory dir = FSDirectory.getDirectory("/path/to/my/index", false);
   SegmentInfos sis = new SegmentInfos();
   sis.read(dir);
   for(int i=0;i<sis.size();i++) {
     SegmentInfo si = (SegmentInfo) sis.elementAt(i);
     if (si.name.equals("_XXXX")) {
       sis.removeElementAt(i);
       break;
     }
   }
   sis.write(dir);

Please make a backup copy of your index before running this in case
it messes anything up!!

And note that by removing an entire segment, many documents are now
gone from your index.  Determining which ones are gone and must be
reindexed is not particularly easy unless you have a way to do so in
your application...

Also note that Lucene will not delete the now un-referenced (bad)
segment file (ie, _XXXX.cfs, and other _XXXX files like _XXXX.del if
it exists) so you will have to do that step manually.  (The current
"trunk" version of Lucene does in fact remove unreferenced index
files correctly but this hasn't been released yet).

Mike

> 
> - Aleksander
> 
> 
> On Thu, 31 Aug 2006 03:42:28 +0200, Michael McCandless 
> <lu...@mikemccandless.com> wrote:
> 
>> Stanislav Jordanov wrote:
>>
>>> For a moment I wondered what exactly do you mean by "compound file"?
>>> Then I read http://lucene.apache.org/java/docs/fileformats.html and 
>>> got the idea.
>>> I do not have access to that specific machine that all this is 
>>> happening at.
>>> It is a 80x86 machine running Win 2003 server;
>>> Sorry, but they neglected my question about the index is stored on a 
>>> Local FS or on a NFS.
>>> I was only able to obtain a directory listing of the index dir and 
>>> guess what - there's no a /*_1j8s.cfs * /file at all!
>>> Pitty, I can't have a look at segments file, but I guess it lists the 
>>> _1j8s
>>> Given these scarce resources, can you give me some further advise 
>>> about what has happened and what can be done to prevent it from 
>>> happening again?
>>
>> I'm assuming this is easily repeated (question from my last email) and
>> not a transient error?  If it's transient, this could be explained by
>> locking not working properly.
>>
>> If it's not transient (ie, happens every time you open this index),
>> it sounds like indeed the segments file is referencing a segment that
>> does not exist.
>>
>> But, how the index got into this state is a mystery.  I don't know of
>> any existing Lucene bugs that can do this.  Furthermore, crashing
>> an indexing process should not lead to this (it can lead to other things
>> like only have a segments.new file and no segments file).
>>
>> Were there any earlier exceptions (before indexing hit an "improper
>> shutdown") in your indexing process that could give a clue as to root
>> cause?  Or for example was the machine rebooted and did windows to run
>> a "filesystem check" on rebooting this box (which can remove corrupt
>> files)?
>>
>> Mike
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
> 
> 
> 
> --Aleksander M. Stensby
> Software Developer
> Integrasco A/S
> aleksander.stensby@integrasco.no
> Tlf.: +47 41 22 82 72
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org