Posted to dev@lucene.apache.org by "Chuck Williams (JIRA)" <ji...@apache.org> on 2006/07/09 19:23:30 UTC

[jira] Commented: (LUCENE-509) Performance optimization when retrieving a single field from a document

    [ http://issues.apache.org/jira/browse/LUCENE-509?page=comments#action_12419926 ] 

Chuck Williams commented on LUCENE-509:
---------------------------------------

LUCENE-545 does resolve this in a more general way, although the code to get precisely one field value efficiently is slightly clunky, requiring something like this (for a single-valued field):

final String seekfield = retrievefield.intern();
String value = reader.document(doc, new FieldSelector() {
    public FieldSelectorResult accept(String field) {
        // Identity comparison: relies on the field name passed in being interned too.
        if (field == seekfield)
            return FieldSelectorResult.LOAD_AND_BREAK;
        else
            return FieldSelectorResult.NO_LOAD;
    }
}).get(seekfield);

Even with this, a Document, a Field and a FieldSelector are created unnecessarily.

There are cases where fast single-field access is important.  E.g., I have cases where it is necessary to obtain the id field for all results of a query, leading to (an obviously refactored version of) the above code in a HitCollector.
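
To make the shape of that refactoring concrete, here is a rough, self-contained sketch.  It does not use Lucene at all: SingleFieldSketch, Result, Selector, StoredDoc, loadFirst and collectField are hypothetical stand-ins I made up to model the "load this field and stop" behaviour, and the field names are compared with equals() rather than by identity so nothing depends on interning.  The only point is that the selector is built once and reused for every hit instead of being created per document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins, NOT the Lucene classes: they only model the
// "load one field and stop scanning" idea discussed in this comment.
public class SingleFieldSketch {

    enum Result { LOAD_AND_BREAK, NO_LOAD }

    interface Selector {
        Result accept(String field);
    }

    // Models one stored document as an ordered set of (name, value) pairs.
    static final class StoredDoc {
        final Map<String, String> fields = new LinkedHashMap<>();

        StoredDoc with(String name, String value) {
            fields.put(name, value);
            return this;
        }
    }

    // Scans the stored fields in order and returns as soon as the selector
    // asks to break; returns null if no field is selected.
    static String loadFirst(StoredDoc doc, Selector selector) {
        for (Map.Entry<String, String> e : doc.fields.entrySet()) {
            if (selector.accept(e.getKey()) == Result.LOAD_AND_BREAK) {
                return e.getValue();
            }
        }
        return null;
    }

    // Collects one field per hit; the selector is created once, outside the loop.
    static List<String> collectField(List<StoredDoc> hits, final String seekField) {
        Selector selector = new Selector() {
            public Result accept(String field) {
                return field.equals(seekField) ? Result.LOAD_AND_BREAK : Result.NO_LOAD;
            }
        };
        List<String> values = new ArrayList<>();
        for (StoredDoc hit : hits) {
            values.add(loadFirst(hit, selector));
        }
        return values;
    }

    public static void main(String[] args) {
        List<StoredDoc> hits = new ArrayList<>();
        hits.add(new StoredDoc().with("id", "doc-1").with("body", "first"));
        hits.add(new StoredDoc().with("body", "second").with("id", "doc-2"));
        System.out.println(collectField(hits, "id")); // prints [doc-1, doc-2]
    }
}
```

In the sketch only the selector allocation is hoisted out of the loop; it says nothing about how much the per-document object creation in the real code actually costs.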

I think some special optimization for the single-field access case makes sense if benchmarks show it is material, but that it should be integrated with the mechanism of LUCENE-545.

$0.02,

Chuck


> Performance optimization when retrieving a single field from a document
> -----------------------------------------------------------------------
>
>          Key: LUCENE-509
>          URL: http://issues.apache.org/jira/browse/LUCENE-509
>      Project: Lucene - Java
>         Type: Improvement

>   Components: Index
>     Versions: 1.9, 2.0.0
>     Reporter: Steven Tamm
>     Assignee: Otis Gospodnetic
>  Attachments: DocField.patch, DocField_2.patch, DocField_3.patch, DocField_4.patch, DocField_4b.patch
>
> If you just want to retrieve a single field from a Document, the only way to do it is to retrieve all the fields from the Document and then search it.  This patch is an optimization that allows you to retrieve a specific field from a document without instantiating a lot of field and string objects.  This reduces our per-query memory consumption by around 20% when a lot of documents are returned.
> I've added a lot of comments saying you should only call it if you only ever need one field.  There's also a unit test.
