Posted to dev@lucene.apache.org by "Alessandro Benedetti (JIRA)" <ji...@apache.org> on 2016/05/24 16:43:15 UTC
[jira] [Commented] (SOLR-8096) Major faceting performance regressions
[ https://issues.apache.org/jira/browse/SOLR-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298469#comment-15298469 ]
Alessandro Benedetti commented on SOLR-8096:
--------------------------------------------
Just adding some additional information, as I just ran into this issue with Solr 6.0:
Static index, around 50 * 10^6 docs, 20 fields to facet on, 1 of them with high cardinality, on top of grouping.
Grouping had no effect at all.
All the symptoms are there: Solr 4.10.2 takes around 150 ms and Solr 6.0 around 550 ms.
The 'fieldValueCache' seems to be unused (neither inserts nor lookups) in Solr 6.0.
In Solr 4.10 the 'fieldValueCache' is in heavy use, with a cumulative_hitratio of 0.96.
Switching facet.method from enum to fc to fcs to uif did not change much.
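For anyone who wants to repeat the comparison, here is a minimal sketch that builds one request URL per facet.method value. The host, collection name and field name are placeholders I made up, not the ones from my setup; only the facet parameters themselves are standard Solr request parameters.

```python
from urllib.parse import urlencode

# Hypothetical values: adjust host, collection and field to your own setup.
BASE_URL = "http://localhost:8983/solr/mycollection/select"
FACET_FIELD = "category"


def facet_query_url(method):
    """Return a select URL that facets on FACET_FIELD using the given facet.method."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),            # only the facet counts matter here
        ("facet", "true"),
        ("facet.field", FACET_FIELD),
        ("facet.method", method),
    ]
    return BASE_URL + "?" + urlencode(params)


# One URL per method tried in the comment above.
urls = {m: facet_query_url(m) for m in ("enum", "fc", "fcs", "uif")}

for method, url in urls.items():
    print(method, url)
```

Firing each URL a few times against a warmed index and comparing QTime should be enough to see whether the method makes any difference.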
Moving to DocValues didn't improve the situation much (but I was on an optimized index, so I still need to try a multi-segment one, following [~mkhludnev]'s contribution in Solr 5.4.0).
Moving to field collapsing brought the query down to 110-120 ms (but this is expected, as we were faceting on 260 / 1 million original docs).
Adding facet.threads=NCores brought the query time down to 100 ms; in combination with field collapsing we reached 80-90 ms when warmed.
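A rough sketch of the parameter combination described above, in case it helps someone reproduce it. I am assuming here that "field collapsing" means the collapse query parser used as a filter query; the field names are hypothetical, and NCores is taken from the machine running the script, which may not be the Solr host.

```python
import os
from urllib.parse import urlencode, parse_qs


def threaded_collapsed_params(facet_fields, collapse_field, threads=None):
    """Build a query string that facets on several fields in parallel
    over a collapsed result set."""
    # Assumption: default the thread count to the local core count.
    threads = threads or os.cpu_count() or 1
    params = [
        ("q", "*:*"),
        ("rows", "0"),
        ("facet", "true"),
        ("facet.threads", str(threads)),
        # Collapse filter: facets are then computed on the collapsed set.
        ("fq", "{!collapse field=%s}" % collapse_field),
    ]
    # One facet.field entry per field to facet on.
    params += [("facet.field", f) for f in facet_fields]
    return urlencode(params)


query_string = threaded_collapsed_params(["brand", "color"], "group_id", threads=4)
print(query_string)
```

Note that facet.threads parallelizes across the facet fields, so it should only help when there are several fields to facet on, as in the 20-field case here.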
What is the plan for the future regarding this?
Do we want to deprecate the legacy facets implementation and move everything to JSON facets (as happened with UIF)?
So, backward compatible but with a different implementation?
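For comparison, this is roughly what the same kind of terms facet looks like as a JSON Facet API request body. It is only a sketch: the field name is hypothetical, and the optional "method" hint is included just to show where it would go.

```python
import json


def json_terms_facet(field, limit=10, method=None):
    """Return a JSON request body with a single terms facet on `field`."""
    facet = {"type": "terms", "field": field, "limit": limit}
    if method:
        # Optional execution hint, e.g. "uif".
        facet["method"] = method
    return {
        "query": "*:*",
        "limit": 0,               # no documents, only facet counts
        "facet": {field: facet},
    }


body = json_terms_facet("category", limit=20, method="uif")
print(json.dumps(body, indent=2))
```

The body would be POSTed as JSON to the collection's query handler.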
Cheers
> Major faceting performance regressions
> --------------------------------------
>
> Key: SOLR-8096
> URL: https://issues.apache.org/jira/browse/SOLR-8096
> Project: Solr
> Issue Type: Bug
> Affects Versions: 5.0, 5.1, 5.2, 5.3, 6.0
> Reporter: Yonik Seeley
> Priority: Critical
> Attachments: simple_facets.diff
>
>
> Use of the highly optimized faceting that Solr had for multi-valued fields over relatively static indexes was removed as part of LUCENE-5666, causing severe performance regressions.
> Here are some quick benchmarks to gauge the damage, on a 5M document index, with each field having between 0 and 5 values per document. *Higher numbers represent worse 5x performance*.
> Solr 5.4_dev faceting time as a percent of Solr 4.10.3 faceting time
> ||...................................|| Percent of index being faceted
> ||num_unique_values|| 10% || 50% || 90% ||
> |10 | 351.17% | 1587.08% | 3057.28% |
> |100 | 158.10% | 203.61% | 1421.93% |
> |1000 | 143.78% | 168.01% | 1325.87% |
> |10000 | 137.98% | 175.31% | 1233.97% |
> |100000 | 142.98% | 159.42% | 1252.45% |
> |1000000 | 255.15% | 165.17% | 1236.75% |
> For example, for a field with 1000 unique values in the whole index, faceting with 5x took 143% of the 4x time when ~10% of the docs in the index were faceted.
> One user who brought the performance problem to our attention: http://markmail.org/message/ekmqh4ocbkwxv3we
> "faceting is unusable slow since upgrade to 5.3.0" (from 4.10.3)
> The disabling of the UnInvertedField algorithm was previously discovered in SOLR-7190, but we didn't know just how bad the problem was at that time.
> edit: removed "secret" adverb by request
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org