Posted to dev@lucene.apache.org by "Arcadius Ahouansou (JIRA)" <ji...@apache.org> on 2015/06/29 15:07:05 UTC
[jira] [Commented] (SOLR-5444) Slow response on facet search, lots of facets, asking for few facets in response

    [ https://issues.apache.org/jira/browse/SOLR-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605581#comment-14605581 ] 

Arcadius Ahouansou commented on SOLR-5444:
------------------------------------------

We are seeing a performance issue on Solr 5.2.1: facets are slow on a large data set.
A lot of time is being spent in
{code}
…g.apache.solr.handler.component.FacetComponent.process (FacetComponent.java:116)
…solr.handler.component.SearchHandler.handleRequestBody (SearchHandler.java:255)
…g.apache.solr.handler.RequestHandlerBase.handleRequest (RequestHandlerBase.java:143)
{code}

[~rcmuir], I am not sure whether these are related, though.


Arcadius

> Slow response on facet search, lots of facets, asking for few facets in response
> --------------------------------------------------------------------------------
>
>                 Key: SOLR-5444
>                 URL: https://issues.apache.org/jira/browse/SOLR-5444
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>    Affects Versions: 4.4
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: docvalue, faceted-search, performance
>             Fix For: 4.9, Trunk
>
>         Attachments: Profiiling_SimpleFacets_getListedTermCounts_path.png, Profiling_SimpleFacets_getTermCounts_path.png, Responsetime_func_of_facets_asked_for-Simple_DocSetCollector_fix.png, Responsetime_func_of_facets_asked_for.png, SOLR-5444_ExpandingIntArray_DocSetCollector_4_4_0.patch, SOLR-5444_simple_DocSetCollector_4_4_0.patch
>
>
> h5. Setup
> We have a setup of 6 Solr nodes (release 4.4.0) with 12 billion "small" documents loaded across 3 collections. The documents have the following fields:
> * a_dlng_doc_sto (docvalue long)
> * b_dlng_doc_sto (docvalue long)
> * c_dstr_doc_sto (docvalue string)
> * timestamp_lng_ind_sto  (indexed long)
> * d_lng_ind_sto (indexed long)
> From schema.xml:
> {code}
>     <dynamicField name="*_dstr_doc_sto" type="dstring" indexed="false" stored="true" required="true" docValues="true"/>
>     <dynamicField name="*_lng_ind_sto" type="long" indexed="true" stored="true"/>
>     <dynamicField name="*_dlng_doc_sto" type="dlng" indexed="false" stored="true" required="true" docValues="true"/>
> ...
>     <fieldType name="dstring" class="solr.StrField" sortMissingLast="true" docValuesFormat="Disk"/>
>     <fieldType name="dlng" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0" docValuesFormat="Disk"/>
> {code}
> timestamp_lng_ind_sto decides which collection a document goes into.
> We execute queries in the following format:
> * q=timestamp_lng_ind_sto:\[x TO y\] AND d_lng_ind_sto:(a OR b OR ... OR n)
> * facet=true&facet.field=a_dlng_doc_sto&facet.zeros=false&facet.mincount=1&facet.limit=<asked-for-facets>&rows=0&start=0
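A minimal Java sketch of how a request in this format could be assembled. The class and method names (FacetQuerySketch, buildQ, buildQueryString) are hypothetical and exist only for illustration; the field names and parameters are the ones listed above, and the range bounds, d values and facet limit passed in are placeholders.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Solr or SolrJ.
final class FacetQuerySketch {

    /** Builds timestamp_lng_ind_sto:[x TO y] AND d_lng_ind_sto:(a OR b OR ... OR n). */
    static String buildQ(long from, long to, List<Long> dValues) {
        String ors = dValues.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(" OR "));
        return "timestamp_lng_ind_sto:[" + from + " TO " + to + "]"
                + " AND d_lng_ind_sto:(" + ors + ")";
    }

    /** Builds the URL-encoded parameter string, in the order used in the report. */
    static String buildQueryString(long from, long to, List<Long> dValues, int facetLimit) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", buildQ(from, to, dValues));
        params.put("facet", "true");
        params.put("facet.field", "a_dlng_doc_sto");
        params.put("facet.zeros", "false");
        params.put("facet.mincount", "1");
        params.put("facet.limit", String.valueOf(facetLimit)); // <asked-for-facets>
        params.put("rows", "0");
        params.put("start", "0");
        return params.entrySet().stream()
                .map(e -> enc(e.getKey()) + "=" + enc(e.getValue()))
                .collect(Collectors.joining("&"));
    }

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }
}
```

Only facet.limit varies between the measurements described below; everything else in the request stays fixed.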
> h5. Problem 
> We see very slow response times when a query hits a large number of rows spanning lots of facets, but we only ask for "a few" of those facets.
> h5. Concrete example of query to get some concrete numbers to look at
> With x and y plus a, b ... n set to values so that
> * The timestamp_lng_ind_sto:\[x TO y\] part of the search-criteria alone hits about 1.7 billion documents (all of them in one of the three collections, which contains 4.5 billion docs, but that is not important)
> * The d_lng_ind_sto:(a OR b OR ... OR n) part of the search-criteria alone hits about 500000 documents
> * The combined search-criteria (timestamp_lng_ind_sto AND'ed with d_lng_ind_sto) hits about 200000 documents
> The following graph shows response time as a function of <asked-for-facets> (in the query)
> !Responsetime_func_of_facets_asked_for.png!
> Note that response time is high even for "low" <asked-for-facets>, and that it increases fast (but linearly) in <asked-for-facets> until <asked-for-facets> is somewhere between 5000 (where response time is close to 1000 secs) and 10000 (where response time is about 5 secs). For values of <asked-for-facets> above 10000, response time stays "low" at between 1 and 10 secs.
> From looking at the code and profiling, it is clear that the change to better response time occurs when SimpleFacets.getFacetFieldCounts changes from using getListedTermCounts to using getTermCounts.
> The following image shows profiling information during a request with <asked-for-facets> at about 2000.
> !Profiiling_SimpleFacets_getListedTermCounts_path.png!
> Note that
> * SimpleFacets.getListedTermCounts is used (green box)
> * 91% of the time spent performing the query is spent in the DocSetCollector constructor (red box). During this concrete query 125000 DocSetCollector objects are created, spending 710 secs all in all. Additional investigations show that the time is spent allocating huge int arrays for the "scratch" int array. Several thousand of those DocSetCollector constructors create int arrays of size above 1 million. That takes time, and also leaves a nice little job for the garbage collector afterwards.
> * The actual search-part of the query takes only 0.5% (4 secs) of the combined time executing the query (blue box)
> The following image shows profiling information during a request with <asked-for-facets> at about 10000
> !Profiling_SimpleFacets_getTermCounts_path.png!
> Note that
> * SimpleFacets.getTermCounts is used (green box)
> * The actual search-part of the query now takes 70% (11 secs) of the combined time executing the query (blue box)
> h5. What to do about this?
> * I am not sure why there are two paths that SimpleFacets.getFacetFieldCounts can take (getListedTermCounts or getTermCounts), but I am pretty sure there is a good reason. It seems like getListedTermCounts is used when <asked-for-facets> is noticeably lower than the total number of facets hit (I believe it is when <asked-for-facets> * 1.5 + 10 is below the actual number of facets hit)
> * *One solution* could be to just drop the getListedTermCounts path and always use getTermCounts, but that is probably not a good idea, because getListedTermCounts is probably there for a performance reason (in other scenarios)
> * The comment above DocSetCollector.scratch says
> {code}
>   // in case there aren't that many hits, we may not want a very sparse
>   // bit array.  Optimistically collect the first few docs in an array
>   // in case there are only a few.
>   final int[] scratch;
> {code}
> The comment seems reasonable. But when we look at what values are used as "smallSetSize" for the DocSetCollector constructor, it is always "maxDoc >> 6" (basically dividing by 64). This value depends on maxDoc and will be high if maxDoc is high. In my case maxDoc is 50+ million much of the time, resulting in "smallSetSize"s of 1+ million (that is not "a few"). I very much doubt that "smallSetSize" should increase as maxDoc increases. Why not always use a low (fixed or something) value for "smallSetSize"? Is it ever a good idea to have huge int arrays for the "scratch" array?
> * *Another solution* would be to never create "scratch"-arrays with size above e.g. 50
> * *There are probably several other potential solutions*
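The sizing arithmetic above can be checked with a small standalone Java sketch. It is not Solr code: the class and method names (ScratchSizing, smallSetSize, cappedSmallSetSize, scratchBytes) are hypothetical, the only formula taken from the description is "maxDoc >> 6", and the cap of 50 is just the example value from the proposal above. The byte count assumes 4 bytes per int and ignores the array header.

```java
// Hypothetical illustration of the sizing discussed above, not Solr's implementation.
final class ScratchSizing {

    /** Sizing described in the report: maxDoc >> 6, i.e. maxDoc / 64 for non-negative maxDoc. */
    static int smallSetSize(int maxDoc) {
        return maxDoc >> 6;
    }

    /** Assumed capped variant, in the spirit of "never create scratch arrays above e.g. 50". */
    static int cappedSmallSetSize(int maxDoc, int cap) {
        return Math.min(smallSetSize(maxDoc), cap);
    }

    /** Payload bytes of an int[] of the given length (4 bytes per int, header ignored). */
    static long scratchBytes(int length) {
        return 4L * length;
    }

    public static void main(String[] args) {
        int[] maxDocs = {1_000, 50_000_000, 64_000_000};
        for (int maxDoc : maxDocs) {
            int size = smallSetSize(maxDoc);
            System.out.printf("maxDoc=%d smallSetSize=%d scratchBytes=%d capped=%d%n",
                    maxDoc, size, scratchBytes(size), cappedSmallSetSize(maxDoc, 50));
        }
    }
}
```

With this formula a maxDoc of 50 million gives a scratch length of 781250, and the length reaches 1 million at a maxDoc of 64 million, which is about 4 MB of ints per collector before any document has been collected. Whether a fixed cap hurts the cases the scratch array was meant to speed up is exactly the open question raised above.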
> I would really like your opinion on which solution to pursue, so that I do not unintentionally break good performance optimizations just because I missed some points explaining why the code is as it is today.
> *Note* I have filed this as a 4.4 issue, because that is the platform I use for my tests etc. But I am sure the problem also exists on 4.5.1 (or whatever the latest 4.x release is)


