Posted to solr-user@lucene.apache.org by Chris Hostetter <ho...@fucit.org> on 2009/03/01 18:21:52 UTC

Re: 2 strange behaviours with DIH full-import.

: 2.) I run a full-import and everything works fine... I run another
: full-import in the same core and everything seems to work fine. But I have
: noticed that the index in the /data/index dir is twice as big. I have seen
: that Solr uses this IndexWriter constructor when it executes a deleteAll at
: the beginning of the full import:
: http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/index/IndexWriter.html#IndexWriter(org.apache.lucene.store.Directory,%20org.apache.lucene.analysis.Analyzer,%20boolean,%20org.apache.lucene.index.IndexDeletionPolicy,%20org.apache.lucene.index.IndexWriter.MaxFieldLength)
: 
: Why is Lucene not deleting the data of the old index if the boolean var of
: the constructor is set to true? (the results are not duplicated but
: physically the directory /index is double the size). Does this have
: something to do with the deletionPolicy that is saving commits, or is it a
: Lucene 2.9-dev bug or something like that?

this is not unusual. the documents have logically been deleted, but the 
files containing them are still on disk because the "old searcher" is 
still referencing them. once the "new searcher" is swapped in for the old 
searcher, those files can be deleted.

on unix filesystems, the old files will actually get deleted immediately 
(even while the old searcher is still open) because unix filesystems let 
you do that.

windows filesystems won't let you delete files while they are open, so 
Lucene keeps track of the fact that the files *can* be deleted, and then 
the next time you do a commit, it cleans them up.
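
The filesystem difference described above can be seen outside of Lucene with a small, hedged sketch. This is not Solr or Lucene code: the function name `delete_while_open` and the temp file standing in for an old segment file are purely illustrative assumptions. It opens a file (playing the role of the old searcher), tries to delete it while the handle is still open, and reports what happened.

```python
import os
import tempfile


def delete_while_open():
    """Try to delete a file while a read handle on it is still open.

    Returns (deleted_while_open, still_listed, data_read_after_delete_attempt).
    Hypothetical helper for illustration only; not part of Solr or Lucene.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    with open(path, "w") as f:
        f.write("old segment data")

    # This open handle stands in for the "old searcher".
    reader = open(path, "r")
    try:
        try:
            os.remove(path)
            deleted_while_open = True   # typical on unix filesystems
        except PermissionError:
            deleted_while_open = False  # typical on windows filesystems
        still_listed = os.path.exists(path)
        # The open handle can still read the data in either case.
        data = reader.read()
    finally:
        reader.close()

    # Deferred cleanup: once the handle is closed, the delete can succeed.
    if os.path.exists(path):
        os.remove(path)
    return deleted_while_open, still_listed, data


if __name__ == "__main__":
    print(delete_while_open())
```

On a unix filesystem the delete attempt should succeed and the path should vanish from the directory while the open handle keeps reading; on windows the delete is expected to fail until the handle is closed, which is why the cleanup has to be retried later (in Lucene's case, on a subsequent commit).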



-Hoss