Posted to java-user@lucene.apache.org by Pascal Heraud <pa...@kelkoo.com> on 2004/01/30 17:15:46 UTC
docFreq and deleting documents
Hi all,
Should IndexReader#docFreq take deleted documents into account when the index has not been optimized since the deletion?
Suppose I have two documents that contain the same term T:
I call docFreq(T) and it returns 2.
I delete the first document.
I call docFreq(T) again and it returns 2.
In our case the indexes are very large, so optimizing them after every deletion is expensive.
Here is a code snippet that demonstrates the problem:
---------------------------------------
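import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class Test {
    public static void main(String[] args) {
        String tmp = System.getProperty("java.io.tmpdir") + File.separator + "tst";
        try {
            // Create a fresh index with two documents that share the term field1:value.
            IndexWriter wri = new IndexWriter(tmp, new WhitespaceAnalyzer(), true);
            Document doc = new Document();
            doc.add(Field.Text("field1", "value"));
            doc.add(Field.Text("field2", "value2"));
            wri.addDocument(doc);
            doc = new Document();
            doc.add(Field.Text("field1", "value"));
            doc.add(Field.Text("field2", "value3"));
            wri.addDocument(doc);
            wri.optimize();
            wri.close();

            // Before the deletion: prints 2.
            IndexReader reader = IndexReader.open(tmp);
            System.out.println(reader.docFreq(new Term("field1", "value")));
            // Delete document 0, then close the reader so the deletion is written out.
            reader.delete(0);
            reader.close();

            // After the deletion, without optimizing again: still prints 2.
            reader = IndexReader.open(tmp);
            System.out.println(reader.docFreq(new Term("field1", "value")));
            reader.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}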
---------------------------------------
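To make the observed behaviour easier to reason about, here is a toy model in plain Java. It is only a sketch and not Lucene code: ToyIndex, liveDocFreq and the other names are made up for illustration. It assumes, as the output above suggests, that a deletion only marks the document and that the stored per-term count is not rewritten until the index is optimized.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical class, not part of Lucene.
public class ToyIndex {
    // term -> ids of the documents that contain it (a posting list)
    private final Map<String, List<Integer>> postings = new HashMap<>();
    // Deletions are only marked here; the posting lists are left untouched.
    private final BitSet deleted = new BitSet();
    private int nextDocId = 0;

    public int addDocument(String... terms) {
        int id = nextDocId++;
        for (String term : terms) {
            List<Integer> list = postings.computeIfAbsent(term, k -> new ArrayList<>());
            if (!list.contains(id)) {
                list.add(id);
            }
        }
        return id;
    }

    public void delete(int docId) {
        deleted.set(docId);
    }

    // Cheap stored count: ignores the deletion marks.
    public int docFreq(String term) {
        List<Integer> list = postings.get(term);
        return list == null ? 0 : list.size();
    }

    // Walks the posting list and skips documents marked as deleted.
    public int liveDocFreq(String term) {
        List<Integer> list = postings.get(term);
        if (list == null) {
            return 0;
        }
        int count = 0;
        for (int id : list) {
            if (!deleted.get(id)) {
                count++;
            }
        }
        return count;
    }

    // Rewrites the posting lists without the deleted documents
    // (document ids are not renumbered in this sketch).
    public void optimize() {
        for (List<Integer> list : postings.values()) {
            list.removeIf(id -> deleted.get(id));
        }
        deleted.clear();
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument("value", "value2");
        index.addDocument("value", "value3");
        System.out.println(index.docFreq("value"));     // 2
        index.delete(0);
        System.out.println(index.docFreq("value"));     // 2, the mark is ignored
        System.out.println(index.liveDocFreq("value")); // 1
        index.optimize();
        System.out.println(index.docFreq("value"));     // 1
    }
}
```

In this model the accurate count costs a walk over the posting list, while the stored count stays cheap but stale until optimize() runs. In Lucene itself a comparable count could presumably be obtained by iterating over reader.termDocs(term), which I believe skips deleted documents, but I have not verified that here.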
Thanks.
Pascal.