Posted to dev@lucene.apache.org by "Chuck Williams (JIRA)" <ji...@apache.org> on 2006/07/09 19:23:30 UTC
[jira] Commented: (LUCENE-509) Performance optimization when
retrieving a single field from a document
[ http://issues.apache.org/jira/browse/LUCENE-509?page=comments#action_12419926 ]
Chuck Williams commented on LUCENE-509:
---------------------------------------
LUCENE-545 does resolve this in a more general way, although the code to get precisely one field value efficiently is slightly clunky, requiring something like this (for a single-valued field):
    final String seekfield = retrievefield.intern();
    String value = reader.document(doc, new FieldSelector() {
        public FieldSelectorResult accept(String field) {
            // Identity comparison relies on both names being interned.
            if (field == seekfield)
                return FieldSelectorResult.LOAD_AND_BREAK;
            else
                return FieldSelectorResult.NO_LOAD;
        }
    }).get(seekfield);
Even with this, a Document, a Field and a FieldSelector are created unnecessarily.
There are cases where fast single-field access really matters. E.g., I have cases where it is necessary to obtain the id field for every result of a query, which puts (an obviously refactored version of) the above code inside a HitCollector, where it runs once per hit.
I think some special optimization for the single-field access case makes sense if benchmarks show the gain is material, but it should be integrated with the mechanism of LUCENE-545 rather than added as a separate code path.
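As a rough illustration of the "refactored version" for the collect-all-ids case, the selector can be built once and reused for every hit, so only the Document and Field allocations remain per call. The sketch below is not Lucene code: Result, Selector, ToyReader, SingleFieldSelector and collectIds are hypothetical stand-ins so that it compiles without Lucene on the classpath, and it compares names with equals() instead of relying on interning.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SingleFieldDemo {

    // Stand-in for FieldSelectorResult (only the two values used above).
    enum Result { LOAD_AND_BREAK, NO_LOAD }

    // Stand-in for the FieldSelector interface.
    interface Selector {
        Result accept(String field);
    }

    // Hypothetical reusable selector: created once, shared by every lookup.
    static final class SingleFieldSelector implements Selector {
        private final String seekField;

        SingleFieldSelector(String seekField) {
            this.seekField = seekField;
        }

        public Result accept(String field) {
            // equals() keeps the sketch independent of string interning.
            return seekField.equals(field) ? Result.LOAD_AND_BREAK : Result.NO_LOAD;
        }
    }

    // Toy stand-in for a reader: each document is an ordered map of stored fields.
    static final class ToyReader {
        private final List<Map<String, String>> docs = new ArrayList<>();
        int fieldsVisited = 0; // counts how many stored fields were examined

        void add(Map<String, String> doc) {
            docs.add(doc);
        }

        // Returns the first field the selector asks to load, then stops scanning.
        String loadFirst(int doc, Selector selector) {
            for (Map.Entry<String, String> e : docs.get(doc).entrySet()) {
                fieldsVisited++;
                if (selector.accept(e.getKey()) == Result.LOAD_AND_BREAK) {
                    return e.getValue();
                }
            }
            return null; // the document has no such field
        }
    }

    // Gathers one field value per hit while reusing a single selector instance.
    static List<String> collectIds(ToyReader reader, int[] hits, String field) {
        Selector selector = new SingleFieldSelector(field);
        List<String> ids = new ArrayList<>();
        for (int doc : hits) {
            ids.add(reader.loadFirst(doc, selector));
        }
        return ids;
    }

    public static void main(String[] args) {
        ToyReader reader = new ToyReader();

        Map<String, String> d0 = new LinkedHashMap<>();
        d0.put("id", "a1");
        d0.put("body", "first document");
        reader.add(d0);

        Map<String, String> d1 = new LinkedHashMap<>();
        d1.put("title", "second");
        d1.put("id", "b2");
        reader.add(d1);

        System.out.println(collectIds(reader, new int[] {0, 1}, "id")); // prints [a1, b2]
    }
}
```

With the real API the same shape would presumably apply: hold one FieldSelector for the id field in the collector and pass it to reader.document(doc, selector) for each hit.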
$0.02,
Chuck
> Performance optimization when retrieving a single field from a document
> -----------------------------------------------------------------------
>
> Key: LUCENE-509
> URL: http://issues.apache.org/jira/browse/LUCENE-509
> Project: Lucene - Java
> Type: Improvement
> Components: Index
> Versions: 1.9, 2.0.0
> Reporter: Steven Tamm
> Assignee: Otis Gospodnetic
> Attachments: DocField.patch, DocField_2.patch, DocField_3.patch, DocField_4.patch, DocField_4b.patch
>
> If you just want to retrieve a single field from a Document, the only way to do it is to retrieve all the fields from the Document and then search it. This patch is an optimization that allows you to retrieve a specific field from a document without instantiating a lot of field and string objects. This reduces our memory consumption on a per-query basis by around 20% when a lot of documents are returned.
> I've added a lot of comments saying you should only call it if you only ever need one field. There's also a unit test.