Posted to java-user@lucene.apache.org by Pascal Heraud <pa...@kelkoo.com> on 2004/01/30 17:15:46 UTC

docFreq and deleting documents

Hi all,

Should IndexReader#docFreq be aware of deleted documents when the index has not been optimized?

Suppose I have two documents that contain the same term T:
I call docFreq(T) and it returns 2.

I delete the first document.

I call docFreq(T) again and it returns 2.

In our case, the indexes are very large and optimizing them is expensive.

Here is a code snippet that demonstrates the problem:

---------------------------------------
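import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

public class Test {

    public static void main(String[] args) {
        String tmp = System.getProperty("java.io.tmpdir") + File.separator + "tst";

        try {
            // Build a fresh index with two documents that share the term field1:value.
            IndexWriter wri = new IndexWriter(tmp, new WhitespaceAnalyzer(), true);
            Document doc = new Document();
            doc.add(Field.Text("field1", "value"));
            doc.add(Field.Text("field2", "value2"));
            wri.addDocument(doc);
            doc = new Document();
            doc.add(Field.Text("field1", "value"));
            doc.add(Field.Text("field2", "value3"));
            wri.addDocument(doc);
            wri.optimize();
            wri.close();

            // Before the deletion: prints 2.
            IndexReader reader = IndexReader.open(tmp);
            System.out.println(reader.docFreq(new Term("field1", "value")));
            // Delete the first document, then close the reader so the deletion is written out.
            reader.delete(0);
            reader.close();

            // After the deletion, without optimizing: still prints 2.
            reader = IndexReader.open(tmp);
            System.out.println(reader.docFreq(new Term("field1", "value")));
            reader.close();
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }
}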
---------------------------------------
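
To make the question concrete, here is a toy model of what I assume is happening. It uses no Lucene classes, and every name in it (ToyIndex, liveDocFreq, and so on) is made up for illustration. The assumption is that the per-term document count is stored when the documents are written, while a deletion is only a mark on the document id until the segments are merged, so docFreq never sees it.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these names are Lucene classes.
public class ToyIndex {
    // Per-term document count, fixed when the documents are written.
    private final Map<String, Integer> storedDocFreq = new HashMap<>();
    // Per-term posting lists (document ids).
    private final Map<String, int[]> postings = new HashMap<>();
    // Deletions are only marked here; postings and counts are left untouched.
    private final BitSet deleted = new BitSet();

    public void addTerm(String term, int... docIds) {
        postings.put(term, docIds);
        storedDocFreq.put(term, docIds.length);
    }

    public void delete(int docId) {
        deleted.set(docId);
    }

    // Mirrors the behaviour I observe: the stored count ignores deletion marks.
    public int docFreq(String term) {
        return storedDocFreq.getOrDefault(term, 0);
    }

    // What I would like to get: walk the postings and skip deleted ids.
    public int liveDocFreq(String term) {
        int count = 0;
        for (int id : postings.getOrDefault(term, new int[0])) {
            if (!deleted.get(id)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addTerm("field1:value", 0, 1);
        System.out.println(index.docFreq("field1:value"));     // 2
        index.delete(0);
        System.out.println(index.docFreq("field1:value"));     // still 2
        System.out.println(index.liveDocFreq("field1:value")); // 1
    }
}
```

If this model is right, I suppose that counting the entries returned by IndexReader#termDocs(T) would give the live number, at the cost of walking the postings, but I have not verified that.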

Thanks.
Pascal.

