You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Jamie Johnson (JIRA)" <ji...@apache.org> on 2015/12/19 03:44:47 UTC

[jira] [Commented] (SOLR-8096) Major faceting performance regressions

    [ https://issues.apache.org/jira/browse/SOLR-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065185#comment-15065185 ] 

Jamie Johnson commented on SOLR-8096:
-------------------------------------

While some (all?) of the performance issues are addressed, would it not still be useful to add an option to support either faceting approach?  I understand the benefits of DocValues but we have a case where the facets need to be calculated based on an access level the user has.  Simply storing in a separate field is not an option because the access controls are complex.  Given that the JSON Facet API allows developers to choose the faceting method it would seem reasonable to provide similar functionality here, no?

> Major faceting performance regressions
> --------------------------------------
>
>                 Key: SOLR-8096
>                 URL: https://issues.apache.org/jira/browse/SOLR-8096
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 5.0, 5.1, 5.2, 5.3, Trunk
>            Reporter: Yonik Seeley
>            Priority: Critical
>
> Use of the highly optimized faceting that Solr had for multi-valued fields over relatively static indexes was removed as part of LUCENE-5666, causing severe performance regressions.
> Here are some quick benchmarks to gauge the damage, on a 5M document index, with each field having between 0 and 5 values per document.  *Higher numbers represent worse 5x performance*.
> Solr 5.4_dev faceting time as a percent of Solr 4.10.3 faceting time		
> ||...................................|| Percent of index being faceted
> ||num_unique_values||	10%	|| 50% || 90% ||
> |10	        | 351.17%	| 1587.08%	| 3057.28% |
> |100   	| 158.10%	| 203.61%	| 1421.93% |
> |1000	| 143.78%	| 168.01%	| 1325.87% |
> |10000	| 137.98%	| 175.31%	| 1233.97% |
> |100000	| 142.98%	| 159.42%	| 1252.45% |
> |1000000	| 255.15%	| 165.17%	| 1236.75% |
> For example, a field with 1000 unique values in the whole index, faceting with 5x took 143% of the 4x time, when ~10% of the docs in the index were faceted.
> One user who brought the performance problem to our attention: http://markmail.org/message/ekmqh4ocbkwxv3we
> "faceting is unusable slow since upgrade to 5.3.0" (from 4.10.3)
> The disabling of the UnInvertedField algorithm was previously discovered in SOLR-7190, but we didn't know just how bad the problem was at that time.
> edit: removed "secret" adverb by request



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org