Posted to solr-dev@lucene.apache.org by "Fuad Efendi (JIRA)" <ji...@apache.org> on 2008/07/31 03:06:31 UTC
[jira] Commented: (SOLR-669) SOLR currently does not support
caching for (Query, FacetFieldList)
[ https://issues.apache.org/jira/browse/SOLR-669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618574#action_12618574 ]
Fuad Efendi commented on SOLR-669:
----------------------------------
This piece of code in SimpleFacets:
{code}
if (sf.multiValued() || ft.isTokenized() || ft instanceof BoolField) {
  // Always use filters for booleans... we know the number of values is very small.
  counts = getFacetTermEnumCounts(searcher, docs, field, offset, limit, mincount, missing, sort, prefix);
} else {
  // TODO: future logic could use filters instead of the fieldcache if
  // the number of terms in the field is small enough.
  counts = getFieldCacheCounts(searcher, docs, field, offset, limit, mincount, missing, sort, prefix);
}
{code}
The else branch is an optimization for single-valued, non-tokenized fields: it uses the 'Lucene FieldCache to get counts for each unique field value in docs'.
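As a rough illustration of that kind of per-value counting (a sketch only, not the Solr implementation): the class `FieldValueCounter` and the `valueOrdByDoc` array are hypothetical stand-ins for what a field cache would provide for a single-valued field.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of per-value counting for a single-valued field; not Solr code. */
final class FieldValueCounter {
    /**
     * @param valueOrdByDoc for each document id, the index of its value in termValues
     *                      (assumed to come from a field cache), or -1 if it has none
     * @param termValues    the distinct values of the field
     * @param docs          the document ids in the result set
     */
    static Map<String, Integer> count(int[] valueOrdByDoc, String[] termValues, int[] docs) {
        int[] counts = new int[termValues.length];
        // One pass over the result set: look up each document's value and bump its counter.
        for (int doc : docs) {
            int ord = valueOrdByDoc[doc];
            if (ord >= 0) {
                counts[ord]++;
            }
        }
        // Report only the values that actually occur in the result set.
        Map<String, Integer> result = new TreeMap<>();
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] > 0) {
                result.put(termValues[i], counts[i]);
            }
        }
        return result;
    }
}
```

The cost is proportional to the size of the result set rather than to the number of distinct terms, which is why this path suits single-valued fields.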
We should implement *additional* caching for the path that uses _the FilterCache to get the intersection_: the FilterCache stores only a DocSet per filter and does not store the NamedList of intersection counts for the field:
{code}
/**
 * Returns a list of terms in the specified field along with the
 * corresponding count of documents in the set that match that constraint.
 * This method uses the FilterCache to get the intersection count between <code>docs</code>
 * and the DocSet for each term in the filter.
 *
 * @see FacetParams#FACET_LIMIT
 * @see FacetParams#FACET_ZEROS
 * @see FacetParams#FACET_MISSING
 */
public NamedList getFacetTermEnumCounts(SolrIndexSearcher searcher, DocSet docs, String field, int offset, int limit, int mincount, boolean missing, boolean sort, String prefix)
    throws IOException {
  ...
}
{code}
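One possible shape for such an additional cache, as a minimal sketch and not Solr code: the class `FacetCountCache`, its `key` helper, and the `compute` callback are all hypothetical names. The sketch assumes the key must cover every parameter that affects the result (query, field, offset, limit, mincount, missing, sort, prefix) and that the cache is cleared whenever a new searcher is opened.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch: a small LRU cache of computed facet counts; not part of Solr. */
final class FacetCountCache {
    private final Map<List<Object>, Map<String, Integer>> cache;
    private int computations = 0;

    FacetCountCache(final int maxEntries) {
        // Access-ordered LinkedHashMap that evicts the least recently used entry.
        this.cache = new LinkedHashMap<List<Object>, Map<String, Integer>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<List<Object>, Map<String, Integer>> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Builds a value-based key; Arrays.asList tolerates a null prefix. */
    static List<Object> key(String query, String field, int offset, int limit,
                            int mincount, boolean missing, boolean sort, String prefix) {
        return Arrays.<Object>asList(query, field, offset, limit, mincount, missing, sort, prefix);
    }

    /** Returns cached counts for the key, computing and storing them on a miss. */
    synchronized Map<String, Integer> get(List<Object> key, Supplier<Map<String, Integer>> compute) {
        Map<String, Integer> counts = cache.get(key);
        if (counts == null) {
            counts = compute.get();
            computations++;
            cache.put(key, counts);
        }
        return counts;
    }

    /** Number of cache misses so far, i.e. how often counts were actually computed. */
    synchronized int computations() {
        return computations;
    }

    /** Assumed to be called when the index changes and cached counts become stale. */
    synchronized void clear() {
        cache.clear();
    }
}
```

In this sketch a repeated request with the same query and facet parameters returns the stored counts without re-running the per-term intersections; the `compute` callback is where a call such as getFacetTermEnumCounts would go.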
> SOLR currently does not support caching for (Query, FacetFieldList)
> -------------------------------------------------------------------
>
> Key: SOLR-669
> URL: https://issues.apache.org/jira/browse/SOLR-669
> Project: Solr
> Issue Type: Improvement
> Affects Versions: 1.3
> Reporter: Fuad Efendi
> Original Estimate: 1680h
> Remaining Estimate: 1680h
>
> It is a huge performance bottleneck, and it explains the large difference between qtime and SolrJ's elapsedTime. I quickly browsed SolrIndexSearcher: it caches only (Key, DocSet/DocList <Lucene Ids>) key-value pairs and it does not have a cache for (Query, FacetFieldList).
> filterCache stores a DocSet for each 'filter' and is used for constant recalculations...
> Caching (Query, FacetFieldList) would be a significant performance improvement.