Posted to dev@lucene.apache.org by Tony Schwartz <to...@simpleobjects.com> on 2005/07/20 14:04:57 UTC
Re: if delete all docs in segment - when is segment deleted
I added the following code to the close() method of IndexWriter to detect, on close, any
segment whose documents have all been deleted, and to remove it. Does anyone see a problem
with this?
=================================
public synchronized void close() throws IOException {
  flushRamSegments();
  ramDirectory.close();

  if ( directory instanceof FSDirectory && closeDir ) {
    ///////////////////////////
    // check for any segments that have all docs deleted and remove them.
    final Vector deletable = new Vector();
    int len = segmentInfos.size();
    SegmentReader reader;
    for ( int i = 0 ; i < len ; i++ ) {
      reader = SegmentReader.get( segmentInfos.info( i ) );
      if ( reader.numDocs() <= 0 ) { // numDocs excludes deleted docs
        deletable.add( reader );
      }
    }
    synchronized (directory) { // in- & inter-process sync
      new Lock.With(directory.makeLock(COMMIT_LOCK_NAME), COMMIT_LOCK_TIMEOUT) {
        public Object doBody() throws IOException {
          segmentInfos.write( directory ); // commit before deleting
          deleteSegments( deletable ); // delete now-unused segments
          return null;
        }
      }.run();
    }
  }

  if (writeLock != null) {
    writeLock.release(); // release write lock
    writeLock = null;
  }
  if (closeDir)
    directory.close();
}
=================================
Tony Schwartz
tony@simpleobjects.com
"What we need is more cowbell."
> If every doc in a segment is deleted, when does the segment go away?
> Without me having to dig too deep, I was hoping someone could help me prepare for this
> eventuality. I have an index that grows infinitely. Old docs are deleted each day just
> before new docs for that day are added. If I set MaxMergeDocs to some number, say 1
> million, and the segment has 1 million docs in it, and every doc in that segment is
> deleted, will the segment ever be deleted? If not, how difficult would it be to add
> some type of trigger to detect this "all deleted in segment" condition so Lucene could
> remove the huge segment to free disk space? I'm concerned the segment will never be
> deleted.
>
> Tony Schwartz
> tony@simpleobjects.com
> There are 10 types of people in this world. Ones that understand binary and ones that
> don't.
>
>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
Re: if delete all docs in segment - when is segment deleted
Posted by Tony Schwartz <to...@simpleobjects.com>.
Actually, there is no need for the closeDir check in the line:
if ( directory instanceof FSDirectory && closeDir ) {
I could also check whether deletable is empty before taking the commit lock, so the lock is
only acquired when there is actually something to delete.
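
For illustration only, here is a minimal standalone sketch of that flow: find the segments
with no live documents, skip the commit entirely when there are none, and otherwise commit
the reduced segment list before deleting any files. It is not IndexWriter code. Segment,
Committer and pruneEmptySegments are hypothetical names made up for this sketch, and the
real commit lock and file handling are reduced to two callbacks. The sketch also drops the
empty segments from the list before committing. Whether the patch above needs the same
step before segmentInfos.write() is an assumption worth checking against the IndexWriter
source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EmptySegmentPruner {

    /** Hypothetical stand-in for a segment: a name plus total and deleted doc counts. */
    static final class Segment {
        final String name;
        final int maxDoc;
        final int deletedDocs;

        Segment(String name, int maxDoc, int deletedDocs) {
            this.name = name;
            this.maxDoc = maxDoc;
            this.deletedDocs = deletedDocs;
        }

        /** Live documents only, mirroring the "numDocs excludes deleted docs" comment. */
        int numDocs() {
            return maxDoc - deletedDocs;
        }
    }

    /** Hypothetical callbacks standing in for "write the segment list" and "delete files". */
    interface Committer {
        void commit(List<Segment> liveSegments);
        void deleteFiles(List<Segment> deadSegments);
    }

    /**
     * Removes fully deleted segments from the given (mutable) list.
     * Returns the removed segments; the list is left untouched if none are empty.
     */
    static List<Segment> pruneEmptySegments(List<Segment> segments, Committer committer) {
        List<Segment> dead = new ArrayList<Segment>();
        for (Segment s : segments) {
            if (s.numDocs() <= 0) {
                dead.add(s);
            }
        }
        if (dead.isEmpty()) {
            return dead; // nothing to delete, so no commit (and no lock) is needed
        }
        segments.removeAll(dead);    // stop referencing the empty segments
        committer.commit(segments);  // commit before deleting
        committer.deleteFiles(dead); // delete now-unused segments
        return dead;
    }

    public static void main(String[] args) {
        List<Segment> segments = new ArrayList<Segment>(Arrays.asList(
                new Segment("_a", 1000000, 1000000), // every doc deleted
                new Segment("_b", 500, 20)));        // still has live docs

        List<Segment> removed = pruneEmptySegments(segments, new Committer() {
            public void commit(List<Segment> live) {
                System.out.println("commit " + live.size() + " live segment(s)");
            }
            public void deleteFiles(List<Segment> dead) {
                System.out.println("delete files of " + dead.size() + " segment(s)");
            }
        });
        System.out.println("removed: " + removed.size());
    }
}
```

Committing first and deleting second keeps the same ordering as the patch: if the process
dies between the two steps, the committed list never points at files that are gone.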
Tony Schwartz
tony@simpleobjects.com
"What we need is more cowbell."