Posted to dev@lucene.apache.org by "Erick Erickson (JIRA)" <ji...@apache.org> on 2016/11/23 22:18:58 UTC
[jira] [Commented] (SOLR-7036) Faster method for group.facet

    [ https://issues.apache.org/jira/browse/SOLR-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15691507#comment-15691507 ] 

Erick Erickson commented on SOLR-7036:
--------------------------------------

I'm thoroughly confused about the state of these two JIRAs, this one and SOLR-4763.

1> Do JSON facets supersede this? Should we just be moving to JSON facets? If yes, has the refinement step been added to JSON facets? Or is it even necessary/relevant?

2> Does enabling docValues sidestep this problem? We're recommending docValues for grouping and faceting after all. In some tests I ran, having docValues on these fields made the timings roughly equal, but that may just mean I'm not testing correctly.

3> Last time I was in here on 23-Oct, there were some problems with the patch. Any progress on that front? I just ran the test that was failing so it looks like maybe the changes for SOLR-9654 [~yonik@apache.org] might have addressed point <1>. Not sure there's really anything to be done for <2>.



> Faster method for group.facet
> -----------------------------
>
>                 Key: SOLR-7036
>                 URL: https://issues.apache.org/jira/browse/SOLR-7036
>             Project: Solr
>          Issue Type: Improvement
>          Components: faceting
>    Affects Versions: 4.10.3
>            Reporter: Jim Musil
>            Assignee: Erick Erickson
>             Fix For: 5.5, 6.0
>
>         Attachments: SOLR-7036.patch, SOLR-7036.patch, SOLR-7036.patch, SOLR-7036.patch, SOLR-7036.patch, SOLR-7036.patch, SOLR-7036_zipped.zip, jstack-output.txt, performance.txt, source_for_patch.zip
>
>
> This is a patch that speeds up the performance of requests made with group.facet=true. The original code that collects and counts unique facet values for each group does not use the same improved field cache methods that have been added for normal faceting in recent versions.
> Specifically, this approach leverages the UnInvertedField class, which provides a much faster way to look up docs that contain a term. I've also added a simple grouping map so that when a term is found for a doc, it can quickly look up the group to which it belongs.
> Group faceting was very slow for our data set, and when the number of docs or terms was high, latency spiked to multi-second requests. This solution provides better overall performance -- from an average of 54ms to 32ms. It also dropped our slowest performing queries way down -- from 6012ms to 991ms.
> I also added a few tests.
> I added an additional parameter so that you can choose between this method and the original. Set group.facet.method=fc to use the improved method, or group.facet.method=original for the existing one; original is the default if the parameter is not specified.
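
The counting idea in the description above (find the terms for each matching doc, then use a grouping map to find that doc's group) can be sketched roughly as follows. This is not the patch's code and does not use any Solr or Lucene API. Plain Java maps stand in for the per-document term lookup and the grouping map, and the names GroupFacetSketch, countGroupsPerTerm, docToGroup and docToTerms are made up for illustration. It assumes the goal is, for each facet term, the number of distinct groups with at least one matching doc carrying that term.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative sketch only: plain Java collections stand in for Solr's
// per-document term lookup and for the grouping map described in the issue.
final class GroupFacetSketch {

    // For each facet term, count the distinct groups that have at least one
    // matching document carrying that term.
    static Map<String, Integer> countGroupsPerTerm(
            int[] matchingDocs,
            Map<Integer, String> docToGroup,        // hypothetical grouping map: doc id -> group value
            Map<Integer, List<String>> docToTerms)  // hypothetical stand-in: doc id -> facet terms
    {
        Map<String, Set<String>> groupsPerTerm = new TreeMap<>();
        for (int doc : matchingDocs) {
            String group = docToGroup.get(doc);
            if (group == null) {
                continue; // simplification: this sketch skips docs with no group value
            }
            for (String term : docToTerms.getOrDefault(doc, Collections.emptyList())) {
                // A set per term means each group is counted once,
                // no matter how many of its docs carry the term.
                groupsPerTerm.computeIfAbsent(term, t -> new HashSet<>()).add(group);
            }
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : groupsPerTerm.entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Integer, String> docToGroup = Map.of(0, "g1", 1, "g1", 2, "g2", 3, "g2");
        Map<Integer, List<String>> docToTerms = Map.of(
                0, List.of("red"),
                1, List.of("red", "blue"),
                2, List.of("red"),
                3, List.of("green"));
        System.out.println(countGroupsPerTerm(new int[] {0, 1, 2, 3}, docToGroup, docToTerms));
    }
}
```

In the toy data, "red" appears in three docs but only two groups, so its count is 2, while "blue" and "green" each count 1. The per-doc group lookup is a single map access, which is the role the description gives to the grouping map.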


