You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Edwin Tang <em...@yahoo.com> on 2004/09/29 18:23:46 UTC

"Orphan" segment files

Hello,

I'm seeing in my index directory some segment files that are not included in the segments or
deletable files. These segment files show their last modified date to be anywhere between a couple
of days ago to a few weeks ago. I'm wondering why these files are not contained inside segments or
deletable. Is it safe to delete them manually?

Thanks,
Ed


		
__________________________________
Do you Yahoo!?
Yahoo! Mail Address AutoComplete - You start. We finish.
http://promotions.yahoo.com/new_mail 

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: "Orphan" segment files

Posted by Edwin Tang <em...@yahoo.com>.
Dmitry,

Thanks for the help and pointers thus far. I know (or believe at the least)
that the files are not referenced by opening "segments" and "deletable" with a
hex editor.

I've explored the possibility of an exception that is not recorded to a log
file or written out to screen, so have double checked my code for exception
handling. I do not see any exceptions being logged, nor do I see them being
written out to the screen.

These segments about the size of my full index. Based on that, I have been
wondering it may be during optimization that something fails. But strangely, I
do not see exceptions as mentioned above. The only other thing I can think of
that may cause this is if someone shuts the indexer application down while
optimization is taking place.

Thanks again,
Ed

--- Dmitry Serebrennikov <dm...@earthlink.net> wrote:

> Edwin Tang wrote:
> 
> >Hello,
> >
> >I am running on Windows XP. Since our indices are constantly changing, we
> close
> >the searchers after each search.
> >
> >I was running Lucene 1.4 and 1.4.1, and saw this problem. I replaced 1.4.1
> with
> >1.4.2 on Friday, but haven't had a chance to observe its behavior since.
> >
> >Would you say it's safe to delete these files?
> >  
> >
> As far as I know, yes. The only segments that matter are those 
> referenced from the "segments" file or those that may still be open by 
> the application. By the way, how do you know they are not referenced? 
> Did you write some custom code to parse the segments file (or just using 
> a hex editor or something)?
> 
> Is it possible that your index update code sometimes fails by does not 
> record its exceptions? That could explain why you have abandoned 
> segments lying around.
> Also, are these segments about the size of your full index or about the 
> size of an incremental segment you might find when new documents are 
> added? If you are able to look into the "segments" file manually, 
> perhaps you can also put a new one together that references these 
> segments? This would allow you to use something like Luke 
> (http://www.getopt.org/luke/) to look into them and see what documents 
> they contain?
> 
> Good luck!
> Dmitry.
> 
> >Thanks,
> >Ed
> >
> >--- Dmitry Serebrennikov <dm...@earthlink.net> wrote:
> >
> >  
> >
> >>This is not a normal behavior, unless you are running on Windows and 
> >>have searchers open for that long that are still locking the segments 
> >>(but then they would be in deletable...).
> >>
> >>What version of Lucene are you running? At some point during the past 
> >>two months there were a few days when CVS snapshot would have had this 
> >>problem. If you are running from CVS, try the latest release and see if 
> >>this occurs again.
> >>
> >>Dmitry.
> >>
> >>
> >>Edwin Tang wrote:
> >>
> >>    
> >>
> >>>Hello,
> >>>
> >>>I'm seeing in my index directory some segment files that are not included
> in
> >>>      
> >>>
> >>the segments or
> >>    
> >>
> >>>deletable files. These segment files show their last modified date to be
> >>>      
> >>>
> >>anywhere between a couple
> >>    
> >>
> >>>of days ago to a few weeks ago. I'm wondering why these files are not
> >>>      
> >>>
> >>contained inside segments or
> >>    
> >>
> >>>deletable. Is it safe to delete them manually?
> >>>
> >>>Thanks,
> >>>Ed
> >>>
> >>>
> >>>		
> >>>__________________________________
> >>>Do you Yahoo!?
> >>>Yahoo! Mail Address AutoComplete - You start. We finish.
> >>>http://promotions.yahoo.com/new_mail 
> >>>
> >>>---------------------------------------------------------------------
> >>>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> >>>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >>>
> >>>
> >>> 
> >>>
> >>>      
> >>>
> >>---------------------------------------------------------------------
> >>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> >>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >>
> >>
> >>    
> >>
> >
> >
> >__________________________________________________
> >Do You Yahoo!?
> >Tired of spam?  Yahoo! Mail has the best spam protection around 
> >http://mail.yahoo.com 
> >
> >---------------------------------------------------------------------
> >To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> >For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >
> >
> >  
> >
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 


__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: "Orphan" segment files

Posted by Dmitry Serebrennikov <dm...@earthlink.net>.
Edwin Tang wrote:

>Hello,
>
>I am running on Windows XP. Since our indices are constantly changing, we close
>the searchers after each search.
>
>I was running Lucene 1.4 and 1.4.1, and saw this problem. I replaced 1.4.1 with
>1.4.2 on Friday, but haven't had a chance to observe its behavior since.
>
>Would you say it's safe to delete these files?
>  
>
As far as I know, yes. The only segments that matter are those 
referenced from the "segments" file or those that may still be open by 
the application. By the way, how do you know they are not referenced? 
Did you write some custom code to parse the segments file (or just using 
a hex editor or something)?

Is it possible that your index update code sometimes fails by does not 
record its exceptions? That could explain why you have abandoned 
segments lying around.
Also, are these segments about the size of your full index or about the 
size of an incremental segment you might find when new documents are 
added? If you are able to look into the "segments" file manually, 
perhaps you can also put a new one together that references these 
segments? This would allow you to use something like Luke 
(http://www.getopt.org/luke/) to look into them and see what documents 
they contain?

Good luck!
Dmitry.

>Thanks,
>Ed
>
>--- Dmitry Serebrennikov <dm...@earthlink.net> wrote:
>
>  
>
>>This is not a normal behavior, unless you are running on Windows and 
>>have searchers open for that long that are still locking the segments 
>>(but then they would be in deletable...).
>>
>>What version of Lucene are you running? At some point during the past 
>>two months there were a few days when CVS snapshot would have had this 
>>problem. If you are running from CVS, try the latest release and see if 
>>this occurs again.
>>
>>Dmitry.
>>
>>
>>Edwin Tang wrote:
>>
>>    
>>
>>>Hello,
>>>
>>>I'm seeing in my index directory some segment files that are not included in
>>>      
>>>
>>the segments or
>>    
>>
>>>deletable files. These segment files show their last modified date to be
>>>      
>>>
>>anywhere between a couple
>>    
>>
>>>of days ago to a few weeks ago. I'm wondering why these files are not
>>>      
>>>
>>contained inside segments or
>>    
>>
>>>deletable. Is it safe to delete them manually?
>>>
>>>Thanks,
>>>Ed
>>>
>>>
>>>		
>>>__________________________________
>>>Do you Yahoo!?
>>>Yahoo! Mail Address AutoComplete - You start. We finish.
>>>http://promotions.yahoo.com/new_mail 
>>>
>>>---------------------------------------------------------------------
>>>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>>
>>>
>>> 
>>>
>>>      
>>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>>
>>    
>>
>
>
>__________________________________________________
>Do You Yahoo!?
>Tired of spam?  Yahoo! Mail has the best spam protection around 
>http://mail.yahoo.com 
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>
>  
>


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: "Orphan" segment files

Posted by Edwin Tang <em...@yahoo.com>.
Hello,

I am running on Windows XP. Since our indices are constantly changing, we close
the searchers after each search.

I was running Lucene 1.4 and 1.4.1, and saw this problem. I replaced 1.4.1 with
1.4.2 on Friday, but haven't had a chance to observe its behavior since.

Would you say it's safe to delete these files?

Thanks,
Ed

--- Dmitry Serebrennikov <dm...@earthlink.net> wrote:

> This is not a normal behavior, unless you are running on Windows and 
> have searchers open for that long that are still locking the segments 
> (but then they would be in deletable...).
> 
> What version of Lucene are you running? At some point during the past 
> two months there were a few days when CVS snapshot would have had this 
> problem. If you are running from CVS, try the latest release and see if 
> this occurs again.
> 
> Dmitry.
> 
> 
> Edwin Tang wrote:
> 
> >Hello,
> >
> >I'm seeing in my index directory some segment files that are not included in
> the segments or
> >deletable files. These segment files show their last modified date to be
> anywhere between a couple
> >of days ago to a few weeks ago. I'm wondering why these files are not
> contained inside segments or
> >deletable. Is it safe to delete them manually?
> >
> >Thanks,
> >Ed
> >
> >
> >		
> >__________________________________
> >Do you Yahoo!?
> >Yahoo! Mail Address AutoComplete - You start. We finish.
> >http://promotions.yahoo.com/new_mail 
> >
> >---------------------------------------------------------------------
> >To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> >For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >
> >
> >  
> >
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 


__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: "Orphan" segment files

Posted by Dmitry Serebrennikov <dm...@earthlink.net>.
This is not a normal behavior, unless you are running on Windows and 
have searchers open for that long that are still locking the segments 
(but then they would be in deletable...).

What version of Lucene are you running? At some point during the past 
two months there were a few days when CVS snapshot would have had this 
problem. If you are running from CVS, try the latest release and see if 
this occurs again.

Dmitry.


Edwin Tang wrote:

>Hello,
>
>I'm seeing in my index directory some segment files that are not included in the segments or
>deletable files. These segment files show their last modified date to be anywhere between a couple
>of days ago to a few weeks ago. I'm wondering why these files are not contained inside segments or
>deletable. Is it safe to delete them manually?
>
>Thanks,
>Ed
>
>
>		
>__________________________________
>Do you Yahoo!?
>Yahoo! Mail Address AutoComplete - You start. We finish.
>http://promotions.yahoo.com/new_mail 
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>
>  
>


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org