Posted to java-commits@lucene.apache.org by mi...@apache.org on 2007/07/19 03:34:00 UTC
svn commit: r557445 - in /lucene/java/trunk: CHANGES.txt
src/java/org/apache/lucene/document/Field.java
src/test/org/apache/lucene/document/TestDocument.java
Author: mikemccand
Date: Wed Jul 18 18:33:59 2007
New Revision: 557445
URL: http://svn.apache.org/viewvc?view=rev&rev=557445
Log:
LUCENE-963: add setters to Field to allow re-using Field instances during indexing (for better performance)
Modified:
lucene/java/trunk/CHANGES.txt
lucene/java/trunk/src/java/org/apache/lucene/document/Field.java
lucene/java/trunk/src/test/org/apache/lucene/document/TestDocument.java
Modified: lucene/java/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/java/trunk/CHANGES.txt?view=diff&rev=557445&r1=557444&r2=557445
==============================================================================
--- lucene/java/trunk/CHANGES.txt (original)
+++ lucene/java/trunk/CHANGES.txt Wed Jul 18 18:33:59 2007
@@ -65,6 +65,10 @@
4. LUCENE-959: Remove synchronization in Document (yonik)
+ 5. LUCENE-963: Add setters to Field to allow for re-using a single
+ Field instance during indexing. This is a sizable performance
+ gain, especially for small documents. (Mike McCandless)
+
Documentation
Build
Modified: lucene/java/trunk/src/java/org/apache/lucene/document/Field.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/Field.java?view=diff&rev=557445&r1=557444&r2=557445
==============================================================================
--- lucene/java/trunk/src/java/org/apache/lucene/document/Field.java (original)
+++ lucene/java/trunk/src/java/org/apache/lucene/document/Field.java Wed Jul 18 18:33:59 2007
@@ -155,6 +155,35 @@
* readerValue(), binaryValue(), and tokenStreamValue() must be set. */
public TokenStream tokenStreamValue() { return fieldsData instanceof TokenStream ? (TokenStream)fieldsData : null; }
+
+ /** Expert: change the value of this field. This can be
+ * used during indexing to re-use a single Field instance
+ * to improve indexing speed. */
+ public void setValue(String value) {
+ fieldsData = value;
+ }
+
+ /** Expert: change the value of this field. This can be
+ * used during indexing to re-use a single Field instance
+ * to improve indexing speed. */
+ public void setValue(Reader value) {
+ fieldsData = value;
+ }
+
+ /** Expert: change the value of this field. This can be
+ * used during indexing to re-use a single Field instance
+ * to improve indexing speed. */
+ public void setValue(byte[] value) {
+ fieldsData = value;
+ }
+
+ /** Expert: change the value of this field. This can be
+ * used during indexing to re-use a single Field instance
+ * to improve indexing speed. */
+ public void setValue(TokenStream value) {
+ fieldsData = value;
+ }
+
/**
* Create a field by specifying its name, value and how it will
* be saved in the index. Term vectors will not be stored in the index.
Modified: lucene/java/trunk/src/test/org/apache/lucene/document/TestDocument.java
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/test/org/apache/lucene/document/TestDocument.java?view=diff&rev=557445&r1=557444&r2=557445
==============================================================================
--- lucene/java/trunk/src/test/org/apache/lucene/document/TestDocument.java (original)
+++ lucene/java/trunk/src/test/org/apache/lucene/document/TestDocument.java Wed Jul 18 18:33:59 2007
@@ -222,4 +222,45 @@
assertTrue(unstoredFieldValues[1].equals("test2"));
}
}
+
+ public void testFieldSetValue() throws Exception {
+
+ Field field = new Field("id", "id1", Field.Store.YES, Field.Index.UN_TOKENIZED);
+ Document doc = new Document();
+ doc.add(field);
+ doc.add(new Field("keyword", "test", Field.Store.YES, Field.Index.UN_TOKENIZED));
+
+ RAMDirectory dir = new RAMDirectory();
+ IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
+ writer.addDocument(doc);
+ field.setValue("id2");
+ writer.addDocument(doc);
+ field.setValue("id3");
+ writer.addDocument(doc);
+ writer.close();
+
+ Searcher searcher = new IndexSearcher(dir);
+
+ Query query = new TermQuery(new Term("keyword", "test"));
+
+ // ensure that queries return expected results without DateFilter first
+ Hits hits = searcher.search(query);
+ assertEquals(3, hits.length());
+ int result = 0;
+ for(int i=0;i<3;i++) {
+ Document doc2 = hits.doc(i);
+ Field f = doc2.getField("id");
+ if (f.stringValue().equals("id1"))
+ result |= 1;
+ else if (f.stringValue().equals("id2"))
+ result |= 2;
+ else if (f.stringValue().equals("id3"))
+ result |= 4;
+ else
+ fail("unexpected id field");
+ }
+ searcher.close();
+ dir.close();
+ assertEquals("did not see all IDs", 7, result);
+ }
}
Re: svn commit: r557445 - in /lucene/java/trunk: CHANGES.txt
src/java/org/apache/lucene/document/Field.java
src/test/org/apache/lucene/document/TestDocument.java
Posted by Michael McCandless <lu...@mikemccandless.com>.
I agree. I will add wording to that effect, and also link over to the Wiki page for details (and update the Wiki page with these details!).
Mike
"Doron Cohen" <DO...@il.ibm.com> wrote:
> mikemccand wrote:
> > + /** Expert: change the value of this field. This can be
> > + * used during indexing to re-use a single Field instance
> > + * to improve indexing speed. */
> > + public void setValue(String value) {
>
> Would it make sense to warn against modifying the field
> value before the doc has been added?
> Something like:
> "Note that field reuse means adding the same field instance
> to multiple documents. You cannot reuse a field instance
> for adding multiple fields to the same document."
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
Re: svn commit: r557445 - in /lucene/java/trunk: CHANGES.txt src/java/org/apache/lucene/document/Field.java
src/test/org/apache/lucene/document/TestDocument.java
Posted by Doron Cohen <DO...@il.ibm.com>.
mikemccand wrote:
> + /** Expert: change the value of this field. This can be
> + * used during indexing to re-use a single Field instance
> + * to improve indexing speed. */
> + public void setValue(String value) {
Would it make sense to warn against modifying the field
value before the doc has been added?
Something like:
"Note that field reuse means adding the same field instance
to multiple documents. You cannot reuse a field instance
for adding multiple fields to the same document."
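
The distinction raised above can be illustrated with a small self-contained sketch. It does not use Lucene: SketchField, SketchDocument and SketchWriter are hypothetical stand-ins, written under the assumption that a document keeps references to its fields (no copy is made on add) and that the writer reads each field's value at the moment addDocument() is called. Under that assumption, changing the value between addDocument() calls works as intended, while adding the same instance twice to one document leaves both entries showing the last value set.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch only: the nested classes are hypothetical stand-ins, NOT the Lucene
 * API. Assumption modeled here: a document stores field references, and the
 * writer reads field values when addDocument() is called.
 */
public class FieldReuseSketch {

    static final class SketchField {
        final String name;
        private String value;
        SketchField(String name, String value) { this.name = name; this.value = value; }
        void setValue(String value) { this.value = value; } // mirrors the new setter
        String stringValue() { return value; }
    }

    static final class SketchDocument {
        final List<SketchField> fields = new ArrayList<>();
        void add(SketchField f) { fields.add(f); } // keeps the reference, no copy
    }

    static final class SketchWriter {
        final List<List<String>> indexed = new ArrayList<>();
        void addDocument(SketchDocument doc) {
            // Snapshot the field values as they are right now.
            List<String> snapshot = new ArrayList<>();
            for (SketchField f : doc.fields) {
                snapshot.add(f.name + "=" + f.stringValue());
            }
            indexed.add(snapshot);
        }
    }

    /** Intended use: one field instance, value changed between addDocument() calls. */
    public static List<List<String>> reuseAcrossDocuments() {
        SketchField id = new SketchField("id", "id1");
        SketchDocument doc = new SketchDocument();
        doc.add(id);
        SketchWriter writer = new SketchWriter();
        writer.addDocument(doc);
        id.setValue("id2");
        writer.addDocument(doc);
        id.setValue("id3");
        writer.addDocument(doc);
        return writer.indexed;
    }

    /** Misuse: the same instance added twice to a single document. */
    public static List<List<String>> reuseWithinOneDocument() {
        SketchField tag = new SketchField("tag", "red");
        SketchDocument doc = new SketchDocument();
        doc.add(tag);
        tag.setValue("blue"); // overwrites "red" before the document is added
        doc.add(tag);
        SketchWriter writer = new SketchWriter();
        writer.addDocument(doc);
        return writer.indexed;
    }

    public static void main(String[] args) {
        System.out.println(reuseAcrossDocuments());   // [[id=id1], [id=id2], [id=id3]]
        System.out.println(reuseWithinOneDocument()); // [[tag=blue, tag=blue]]
    }
}
```

In the second method the value "red" never reaches the writer, which is the situation the proposed Javadoc warning is meant to prevent.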