Posted to java-commits@lucene.apache.org by mi...@apache.org on 2007/07/19 03:34:00 UTC

svn commit: r557445 - in /lucene/java/trunk: CHANGES.txt src/java/org/apache/lucene/document/Field.java src/test/org/apache/lucene/document/TestDocument.java

Author: mikemccand
Date: Wed Jul 18 18:33:59 2007
New Revision: 557445

URL: http://svn.apache.org/viewvc?view=rev&rev=557445
Log:
LUCENE-963: add setters to Field to allow re-using Field instances during indexing (for better performance)

Modified:
    lucene/java/trunk/CHANGES.txt
    lucene/java/trunk/src/java/org/apache/lucene/document/Field.java
    lucene/java/trunk/src/test/org/apache/lucene/document/TestDocument.java

Modified: lucene/java/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/java/trunk/CHANGES.txt?view=diff&rev=557445&r1=557444&r2=557445
==============================================================================
--- lucene/java/trunk/CHANGES.txt (original)
+++ lucene/java/trunk/CHANGES.txt Wed Jul 18 18:33:59 2007
@@ -65,6 +65,10 @@
 
  4. LUCENE-959: Remove synchronization in Document (yonik)
 
+ 5. LUCENE-963: Add setters to Field to allow for re-using a single
+    Field instance during indexing.  This is a sizable performance
+    gain, especially for small documents.  (Mike McCandless)
+
 Documentation
 
 Build

Modified: lucene/java/trunk/src/java/org/apache/lucene/document/Field.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/Field.java?view=diff&rev=557445&r1=557444&r2=557445
==============================================================================
--- lucene/java/trunk/src/java/org/apache/lucene/document/Field.java (original)
+++ lucene/java/trunk/src/java/org/apache/lucene/document/Field.java Wed Jul 18 18:33:59 2007
@@ -155,6 +155,35 @@
    * readerValue(), binaryValue(), and tokenStreamValue() must be set. */
   public TokenStream tokenStreamValue()   { return fieldsData instanceof TokenStream ? (TokenStream)fieldsData : null; }
   
+
+  /** Expert: change the value of this field.  This can be
+   *  used during indexing to re-use a single Field instance
+   *  to improve indexing speed. */
+  public void setValue(String value) {
+    fieldsData = value;
+  }
+
+  /** Expert: change the value of this field.  This can be
+   *  used during indexing to re-use a single Field instance
+   *  to improve indexing speed. */
+  public void setValue(Reader value) {
+    fieldsData = value;
+  }
+
+  /** Expert: change the value of this field.  This can be
+   *  used during indexing to re-use a single Field instance
+   *  to improve indexing speed. */
+  public void setValue(byte[] value) {
+    fieldsData = value;
+  }
+
+  /** Expert: change the value of this field.  This can be
+   *  used during indexing to re-use a single Field instance
+   *  to improve indexing speed. */
+  public void setValue(TokenStream value) {
+    fieldsData = value;
+  }
+
   /**
    * Create a field by specifying its name, value and how it will
    * be saved in the index. Term vectors will not be stored in the index.

Modified: lucene/java/trunk/src/test/org/apache/lucene/document/TestDocument.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/test/org/apache/lucene/document/TestDocument.java?view=diff&rev=557445&r1=557444&r2=557445
==============================================================================
--- lucene/java/trunk/src/test/org/apache/lucene/document/TestDocument.java (original)
+++ lucene/java/trunk/src/test/org/apache/lucene/document/TestDocument.java Wed Jul 18 18:33:59 2007
@@ -222,4 +222,45 @@
             assertTrue(unstoredFieldValues[1].equals("test2"));
         }
     }
+
+    public void testFieldSetValue() throws Exception {
+
+      Field field = new Field("id", "id1", Field.Store.YES, Field.Index.UN_TOKENIZED);
+      Document doc = new Document();
+      doc.add(field);
+      doc.add(new Field("keyword", "test", Field.Store.YES, Field.Index.UN_TOKENIZED));
+
+      RAMDirectory dir = new RAMDirectory();
+      IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
+      writer.addDocument(doc);
+      field.setValue("id2");
+      writer.addDocument(doc);
+      field.setValue("id3");
+      writer.addDocument(doc);
+      writer.close();
+
+      Searcher searcher = new IndexSearcher(dir);
+
+      Query query = new TermQuery(new Term("keyword", "test"));
+
+      // ensure that queries return expected results without DateFilter first
+      Hits hits = searcher.search(query);
+      assertEquals(3, hits.length());
+      int result = 0;
+      for(int i=0;i<3;i++) {
+        Document doc2 = hits.doc(i);
+        Field f = doc2.getField("id");
+        if (f.stringValue().equals("id1"))
+          result |= 1;
+        else if (f.stringValue().equals("id2"))
+          result |= 2;
+        else if (f.stringValue().equals("id3"))
+          result |= 4;
+        else
+          fail("unexpected id field");
+      }
+      searcher.close();
+      dir.close();
+      assertEquals("did not see all IDs", 7, result);
+    }
 }

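The reuse pattern exercised by testFieldSetValue above can be sketched without a Lucene dependency. This is only an illustration under stated assumptions: SketchField and SketchWriter are hypothetical stand-ins (not Lucene classes), and the stand-in writer is assumed to read each field's value at the moment addDocument is called, which is what the test relies on.

```java
import java.util.ArrayList;
import java.util.List;

class FieldReuseSketch {

    /** Hypothetical stand-in for Field: a name plus a mutable value. */
    static final class SketchField {
        private final String name;
        private Object fieldsData;

        SketchField(String name, String value) {
            this.name = name;
            this.fieldsData = value;
        }

        String name() { return name; }

        /** Mirrors the setter added in this commit: swap the value in place. */
        void setValue(String value) { fieldsData = value; }

        String stringValue() {
            return fieldsData instanceof String ? (String) fieldsData : null;
        }
    }

    /** Hypothetical stand-in for IndexWriter: copies values when a document is added. */
    static final class SketchWriter {
        private final List<String> storedIds = new ArrayList<>();

        void addDocument(List<SketchField> doc) {
            for (SketchField f : doc) {
                if (f.name().equals("id")) {
                    storedIds.add(f.stringValue()); // value is read at add time
                }
            }
        }

        List<String> storedIds() { return storedIds; }
    }

    /** Indexes one document per id while allocating a single field and a single document. */
    static List<String> indexIds(String... ids) {
        SketchField idField = new SketchField("id", "");
        List<SketchField> doc = new ArrayList<>();
        doc.add(idField);

        SketchWriter writer = new SketchWriter();
        for (String id : ids) {
            idField.setValue(id);    // change the value only between adds
            writer.addDocument(doc); // same document object, new value
        }
        return writer.storedIds();
    }

    public static void main(String[] args) {
        System.out.println(indexIds("id1", "id2", "id3")); // prints [id1, id2, id3]
    }
}
```

The point of the design is that the per-document cost drops to one setValue call per changing field; the field and document objects are allocated once, outside the loop.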


Re: svn commit: r557445 - in /lucene/java/trunk: CHANGES.txt src/java/org/apache/lucene/document/Field.java src/test/org/apache/lucene/document/TestDocument.java

Posted by Michael McCandless <lu...@mikemccandless.com>.
I agree.  I will add wording to that effect, and also link over to the Wiki page for details (and update the Wiki page with these details!).

Mike

"Doron Cohen" <DO...@il.ibm.com> wrote:
> mikemccand wrote:
> > +  /** Expert: change the value of this field.  This can be
> > +   *  used during indexing to re-use a single Field instance
> > +   *  to improve indexing speed. */
> > +  public void setValue(String value) {
> 
> Would it make sense to warn against modifying the field
> value before the doc has been added?
> Something like:
>   "Note that field reuse means adding the same field instance
>   to multiple documents. You cannot reuse a field instance
>   for adding multiple fields to the same document."
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
> 



Re: svn commit: r557445 - in /lucene/java/trunk: CHANGES.txt src/java/org/apache/lucene/document/Field.java src/test/org/apache/lucene/document/TestDocument.java

Posted by Doron Cohen <DO...@il.ibm.com>.
mikemccand wrote:
> +  /** Expert: change the value of this field.  This can be
> +   *  used during indexing to re-use a single Field instance
> +   *  to improve indexing speed. */
> +  public void setValue(String value) {

Would it make sense to warn against modifying the field
value before the doc has been added?
Something like:
  "Note that field reuse means adding the same field instance
  to multiple documents. You cannot reuse a field instance
  for adding multiple fields to the same document."
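
The restriction in the proposed note can be illustrated with a small plain-Java model. This is a sketch, not Lucene code: MiniField is a hypothetical stand-in, and it assumes a document holds references to its field instances rather than copies of their values, so adding one instance twice to the same document leaves both entries pointing at the same mutable object.

```java
import java.util.ArrayList;
import java.util.List;

class SameDocReuseSketch {

    /** Hypothetical stand-in for Field: holds one mutable value. */
    static final class MiniField {
        private final String name;
        private String value;

        MiniField(String name, String value) {
            this.name = name;
            this.value = value;
        }

        String name() { return name; }
        void setValue(String value) { this.value = value; }
        String stringValue() { return value; }
    }

    /** The values a document would hand over when it is finally added. */
    static List<String> valuesOf(List<MiniField> doc) {
        List<String> out = new ArrayList<>();
        for (MiniField f : doc) {
            out.add(f.stringValue());
        }
        return out;
    }

    /** Problematic: one instance added twice to the same document. */
    static List<String> reuseWithinOneDocument() {
        MiniField tag = new MiniField("tag", "red");
        List<MiniField> doc = new ArrayList<>();
        doc.add(tag);
        tag.setValue("blue"); // also changes the entry added above
        doc.add(tag);
        return valuesOf(doc);
    }

    /** Safe: a separate instance per field within one document. */
    static List<String> separateInstances() {
        List<MiniField> doc = new ArrayList<>();
        doc.add(new MiniField("tag", "red"));
        doc.add(new MiniField("tag", "blue"));
        return valuesOf(doc);
    }

    public static void main(String[] args) {
        System.out.println(reuseWithinOneDocument()); // prints [blue, blue]
        System.out.println(separateInstances());      // prints [red, blue]
    }
}
```

In this model the value "red" is lost because setValue ran before the document was handed over, which is the same hazard as modifying a field's value before its document has been added.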

