Posted to java-user@lucene.apache.org by Sudarshan Gaikaiwari <su...@acm.org> on 2012/03/01 08:20:40 UTC

How to add DocValues Field to a document in an optimal manner.

Hi

https://builds.apache.org/job/Lucene-trunk/javadoc/core/org/apache/lucene/document/DocValuesField.html

The documentation at the above link indicates that the optimal way to
add a DocValues field is to create it once and change the value as we
are indexing multiple documents.
It also mentions that the Document should be created only once and re-used.

Does this mean that, for now, the optimal way of adding non-DocValues
fields is the following?

doc.removeField(fieldName);
doc.add(new Field(fieldName, newValue, fieldType));

If this is the pattern that users should follow while creating
documents, would it be possible to augment the Document class to do
this in a single method?

regards
Sudarshan

-- 
Sudarshan Gaikaiwari
www.sudarshan.org
sudarshan@acm.org

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: How to add DocValues Field to a document in an optimal manner.

Posted by Trejkaz <tr...@trypticon.org>.
On Thu, Mar 1, 2012 at 6:20 PM, Sudarshan Gaikaiwari <su...@acm.org> wrote:
> Hi
>
> https://builds.apache.org/job/Lucene-trunk/javadoc/core/org/apache/lucene/document/DocValuesField.html
>
> The documentation at the above link indicates that the optimal way to
> add a DocValues field is to create it once and change the value as we
> are indexing multiple documents.
> It also mentions that the Document should be created only once and re-used.
>
> Does this mean that the optimal way of adding non DocValues fields for now is
>
> doc.removeField(fieldName);
> doc.add(new Field(fieldName, newValue, fieldType));

I'm pretty sure you're supposed to reuse *all* Field instances, for
optimum performance.

Though admittedly this is quite tricky to do right if you have
multiple fields with the same name in the document where the number of
fields might change for each document you add.

If you only have one of each, just add all the fields to the document
once and keep a reference to them, then just set the value on each
before doing your addDocument.
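
A rough sketch of that single-valued case, in case it helps. Note that
SimpleField, SimpleDocument and SimpleWriter are hypothetical stand-ins
written for this example so that it runs on its own; they are not the
Lucene classes, and the real setter and constructor signatures depend on
the Lucene version you are on. Only the shape of the loop is the point:
build the document once, hold on to the field references, set values,
add.

```java
import java.util.ArrayList;
import java.util.List;

public class FieldReuseSketch {
    // Hypothetical stand-in for a Lucene field: a name plus a mutable value.
    static final class SimpleField {
        final String name;
        String value;
        SimpleField(String name, String value) { this.name = name; this.value = value; }
        void setValue(String value) { this.value = value; }
    }

    // Hypothetical stand-in for a Lucene document: an ordered list of fields.
    static final class SimpleDocument {
        final List<SimpleField> fields = new ArrayList<>();
        void add(SimpleField f) { fields.add(f); }
    }

    // Hypothetical stand-in for IndexWriter: records a text snapshot of each document.
    static final class SimpleWriter {
        final List<String> indexed = new ArrayList<>();
        void addDocument(SimpleDocument doc) {
            StringBuilder sb = new StringBuilder();
            for (SimpleField f : doc.fields) {
                if (sb.length() > 0) sb.append(' ');
                sb.append(f.name).append('=').append(f.value);
            }
            indexed.add(sb.toString());
        }
    }

    // Indexes every row with one shared document and one instance per field.
    static List<String> indexAll(String[][] rows) {
        SimpleDocument doc = new SimpleDocument();
        SimpleField id = new SimpleField("id", "");
        SimpleField title = new SimpleField("title", "");
        doc.add(id);      // added once, never removed
        doc.add(title);

        SimpleWriter writer = new SimpleWriter();
        for (String[] row : rows) {
            id.setValue(row[0]);      // only the values change per document
            title.setValue(row[1]);
            writer.addDocument(doc);
        }
        return writer.indexed;
    }

    public static void main(String[] args) {
        String[][] rows = { {"1", "first"}, {"2", "second"} };
        for (String line : indexAll(rows)) {
            System.out.println(line);
        }
    }
}
```

The stand-in writer copies the values out during addDocument, which is
why reusing the same instances for the next row is safe in this sketch.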

TX



Re: How to add DocValues Field to a document in an optimal manner.

Posted by Michael McCandless <lu...@mikemccandless.com>.
You shouldn't use doc.removeField -- it's costly (the fields are a
list internally so we walk that list looking for which field(s) to
remove).

To reuse you can just use Field.setValue, and leave the Field instance
on the Document.
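
To make the cost difference concrete, here is a small self-contained
model. SimpleField and SimpleDocument are hypothetical stand-ins, not the
Lucene classes; the only assumption carried over is that fields sit in a
list and removal walks that list by name. The sketch counts name
comparisons when the last field of a document is replaced via
remove-then-add versus updated in place through a kept reference.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class RemoveFieldCostSketch {
    // Hypothetical stand-in for a field: a name plus a mutable value.
    static final class SimpleField {
        final String name;
        String value;
        SimpleField(String name, String value) { this.name = name; this.value = value; }
        void setValue(String value) { this.value = value; }
    }

    // Hypothetical stand-in for a document whose fields are kept in a list.
    static final class SimpleDocument {
        final List<SimpleField> fields = new ArrayList<>();
        long comparisons = 0;   // name comparisons performed by removeField

        void add(SimpleField f) { fields.add(f); }

        // Walks the list and removes the first field with the given name.
        void removeField(String name) {
            Iterator<SimpleField> it = fields.iterator();
            while (it.hasNext()) {
                comparisons++;
                if (it.next().name.equals(name)) {
                    it.remove();
                    return;
                }
            }
        }
    }

    // Builds a document with fields f0 .. f(fieldCount-1); fieldCount must be >= 1.
    static SimpleDocument build(int fieldCount) {
        SimpleDocument doc = new SimpleDocument();
        for (int i = 0; i < fieldCount; i++) {
            doc.add(new SimpleField("f" + i, ""));
        }
        return doc;
    }

    // Replaces the last field once per document using removeField + add.
    static long removeAndAdd(int fieldCount, int docs) {
        SimpleDocument doc = build(fieldCount);
        String target = "f" + (fieldCount - 1);
        for (int d = 0; d < docs; d++) {
            doc.removeField(target);                    // walks the whole list
            doc.add(new SimpleField(target, "v" + d));  // allocates a new field
        }
        return doc.comparisons;
    }

    // Updates the last field once per document through a kept reference.
    static long reuse(int fieldCount, int docs) {
        SimpleDocument doc = build(fieldCount);
        SimpleField target = doc.fields.get(fieldCount - 1);
        for (int d = 0; d < docs; d++) {
            target.setValue("v" + d);                   // no list walk, no allocation
        }
        return doc.comparisons;
    }

    public static void main(String[] args) {
        System.out.println("remove+add comparisons: " + removeAndAdd(10, 1000));
        System.out.println("reuse comparisons:      " + reuse(10, 1000));
    }
}
```

In this model the remove-then-add path does one comparison per field per
document, while the reuse path does none. How much that matters in a real
indexing run is a separate question.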

But: you should only do this if you really have a meaningful
performance problem during indexing...

Mike McCandless

http://blog.mikemccandless.com
