Posted to solr-user@lucene.apache.org by Solr User <so...@gmail.com> on 2017/02/06 22:56:10 UTC

Re: Faceting and Grouping Performance Degradation in Solr 5

I am pleased to report that we are in Production on Solr 5.5.3 with
comparable performance to Solr 4.8.1 through leveraging facet.method=uif as
well as https://issues.apache.org/jira/browse/SOLR-9176.  Thanks to
everyone who worked on these!
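
For readers landing on this thread, here is a minimal sketch of what a faceting request with facet.method=uif looks like on the wire. The host, the collection name "products", and the field name "category" are placeholders for illustration and do not come from this thread.

```python
from urllib.parse import urlencode

# Hypothetical values for illustration only; none of these come from the thread.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "products"
FACET_FIELD = "category"


def build_facet_query(q, facet_field, method="uif"):
    """Return a /select URL that facets on one field with the given facet.method."""
    params = [
        ("q", q),
        ("rows", 0),                 # only facet counts are wanted, no documents
        ("facet", "true"),
        ("facet.field", facet_field),
        ("facet.method", method),    # "uif" selects the UnInvertedField approach
        ("wt", "json"),
    ]
    return f"{SOLR_BASE}/{COLLECTION}/select?{urlencode(params)}"


url = build_facet_query("*:*", FACET_FIELD)
print(url)
```

Comparing the same request with and without the facet.method parameter is one simple way to check whether uif helps on a given index.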

On Mon, Oct 3, 2016 at 3:55 PM, Solr User <so...@gmail.com> wrote:

> Below is some further testing.  This was done in an environment that had
> no other queries or updates during testing.  We ran through several
> scenarios, so I pasted the results below with HTML formatting so that you
> can view them as a table.  Sorry if you have to pull this out into a
> different file for viewing, but I did not want the formatting to be messed
> up.  The times are average times in milliseconds.  The test methodology is
> the same as above, except that there was a 5-minute warmup and a 15-minute
> test.
>
> Note that the segment and deletion counts were recorded from only 1 of the
> 2 shards, so we cannot try to extrapolate a function between them and the
> outcome.  In other words, just view them as "non-optimized" versus
> "optimized" and "has deletions" versus "no deletions".  The only exceptions
> are that the 0-delete cases and the 1-segment and 8-segment cases were true
> for both shards.  A few of the tests were repeated as well.
>
> The only conclusion that I could draw is that the number of segments and
> the number of deletes appear to greatly influence the response times, at
> least more than any difference in Solr version.  There also appears to be
> some external contributor to variance, maybe network, etc.
>
> Thoughts?
>
>
> <table><tbody>
> <tr><td>Date</td><td>9/29/2016</td><td>9/29/2016</td><td>9/29/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>10/3/2016</td><td>10/3/2016</td><td>10/3/2016</td><td>10/3/2016</td></tr>
> <tr><td>Solr Version</td><td>5.5.2</td><td>5.5.2</td><td>4.8.1</td><td>4.8.1</td><td>4.8.1</td><td>5.5.2</td><td>5.5.2</td><td>5.5.2</td><td>5.5.2</td><td>5.5.2</td><td>5.5.2</td><td>5.5.2</td><td>5.5.2</td><td>4.8.1</td><td>4.8.1</td><td>4.8.1</td><td>4.8.1</td></tr>
> <tr><td>Deleted Docs</td><td>57873</td><td>57873</td><td>176958</td><td>593694</td><td>593694</td><td>57873</td><td>57873</td><td>57873</td><td>57873</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>
> <tr><td>Segment Count</td><td>34</td><td>34</td><td>18</td><td>27</td><td>27</td><td>34</td><td>34</td><td>34</td><td>34</td><td>8</td><td>8</td><td>1</td><td>1</td><td>8</td><td>8</td><td>1</td><td>1</td></tr>
> <tr><td>facet.method=uif</td><td>YES</td><td>YES</td><td>N/A</td><td>N/A</td><td>N/A</td><td>YES</td><td>YES</td><td>NO</td><td>NO</td><td>NO</td><td>YES</td><td>YES</td><td>NO</td><td>N/A</td><td>N/A</td><td>N/A</td><td>N/A</td></tr>
> <tr><td>Scenario #1 (avg ms)</td><td>198</td><td>210</td><td>145</td><td>186</td><td>190</td><td>208</td><td>209</td><td>210</td><td>206</td><td>109</td><td>142</td><td>73</td><td>70</td><td>160</td><td>109</td><td>83</td><td>85</td></tr>
> <tr><td>Scenario #2 (avg ms)</td><td>92</td><td>88</td><td>59</td><td>62</td><td>58</td><td>72</td><td>70</td><td>77</td><td>74</td><td>68</td><td>73</td><td>63</td><td>61</td><td>66</td><td>54</td><td>52</td><td>51</td></tr>
> </tbody></table>
>
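
One way to read the table above is to average the Scenario #1 times per (Solr version, segment count) pair. The sketch below does that with the 17 runs copied from the table, one tuple per table column. Since the segment and deletion counts came from only one of the two shards for most runs, the grouping is a rough aid for reading the numbers and not a model of them.

```python
from collections import defaultdict

# One tuple per table column:
# (solr_version, deleted_docs, segment_count, uif, scenario1_ms, scenario2_ms)
RUNS = [
    ("5.5.2", 57873, 34, "YES", 198, 92),
    ("5.5.2", 57873, 34, "YES", 210, 88),
    ("4.8.1", 176958, 18, "N/A", 145, 59),
    ("4.8.1", 593694, 27, "N/A", 186, 62),
    ("4.8.1", 593694, 27, "N/A", 190, 58),
    ("5.5.2", 57873, 34, "YES", 208, 72),
    ("5.5.2", 57873, 34, "YES", 209, 70),
    ("5.5.2", 57873, 34, "NO", 210, 77),
    ("5.5.2", 57873, 34, "NO", 206, 74),
    ("5.5.2", 0, 8, "NO", 109, 68),
    ("5.5.2", 0, 8, "YES", 142, 73),
    ("5.5.2", 0, 1, "YES", 73, 63),
    ("5.5.2", 0, 1, "NO", 70, 61),
    ("4.8.1", 0, 8, "N/A", 160, 66),
    ("4.8.1", 0, 8, "N/A", 109, 54),
    ("4.8.1", 0, 1, "N/A", 83, 52),
    ("4.8.1", 0, 1, "N/A", 85, 51),
]


def mean_scenario1_by_version_and_segments(runs):
    """Average the Scenario #1 time for each (version, segment_count) pair."""
    groups = defaultdict(list)
    for version, _deleted, segments, _uif, s1_ms, _s2_ms in runs:
        groups[(version, segments)].append(s1_ms)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


means = mean_scenario1_by_version_and_segments(RUNS)
for (version, segments), avg in sorted(means.items()):
    print(f"Solr {version}, {segments:>2} segments: {avg:.1f} ms")
```

Grouped this way, the spread between segment counts within one version is larger than the spread between versions at the same segment count, which is the pattern described in the message above.
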
>
>
>
> On Wed, Sep 28, 2016 at 4:44 PM, Solr User <so...@gmail.com> wrote:
>
>> I plan to re-test this in a separate environment that I have more control
>> over and will share the results when I can.
>>
>> On Wed, Sep 28, 2016 at 3:37 PM, Solr User <so...@gmail.com> wrote:
>>
>>> Certainly.  And I would of course welcome anyone else to test this for
>>> themselves, especially with facet.method=uif, to see if that has indeed
>>> bridged the gap between Solr 4 and Solr 5.  I would be very happy if my
>>> testing is invalid due to variance, a problem in the process, etc.  One
>>> thing I was pondering is whether I should force merge the index to a
>>> fixed number of segments, because indexing yields a random number of
>>> segments and deletions.  The only thing stopping me short of doing that
>>> was the observation of longer Solr 4 times even with more deletions and a
>>> similar number of segments.
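
For reference, a force merge to a fixed segment count can be requested through the update handler with optimize=true and maxSegments. The sketch below only builds the URLs; the host and the collection name "products" are placeholders and not taken from this thread. Note that expungeDeletes asks Solr to merge away segments containing deletions, and whether every deletion is removed depends on the merge policy settings.

```python
from urllib.parse import urlencode

# Hypothetical values for illustration only; none of these come from the thread.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "products"


def force_merge_url(max_segments):
    """URL asking the update handler to merge the index down to max_segments."""
    params = {
        "optimize": "true",
        "maxSegments": max_segments,
        "waitSearcher": "true",  # block until the new searcher is open
    }
    return f"{SOLR_BASE}/{COLLECTION}/update?{urlencode(params)}"


def expunge_deletes_url():
    """URL for a commit that also asks Solr to merge away deleted documents."""
    params = {"commit": "true", "expungeDeletes": "true"}
    return f"{SOLR_BASE}/{COLLECTION}/update?{urlencode(params)}"


print(force_merge_url(8))
print(expunge_deletes_url())
```

Merging both versions to the same segment count before each run would take one source of run-to-run variation out of the comparison.
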
>>>
>>> We use Soasta as our testing tool.  Before testing, load is sent for
>>> 10-15 minutes to make sure any Solr caches have stabilized.  Then the test
>>> is run for 30 minutes of steady volume with Scenario #1 tested at 15
>>> req/sec and Scenario #2 tested at 100 req/sec.  Each request is different,
>>> with input being pulled from data files.  The requests are repeatable from
>>> test to test.
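
The warm-up-then-measure approach described above can be sketched as follows. This is not how Soasta computes its numbers; it is only an illustration of discarding warm-up samples before averaging, and the sample data is made up.

```python
def steady_state_mean(samples, warmup_seconds):
    """Average response time over samples taken after the warm-up window.

    samples: iterable of (seconds_since_start, response_ms) pairs.
    """
    kept = [ms for t, ms in samples if t >= warmup_seconds]
    if not kept:
        raise ValueError("no samples after the warm-up window")
    return sum(kept) / len(kept)


# Made-up samples: the first three fall inside a 10-minute (600 s) warm-up
# window and are ignored; the last three are averaged.
samples = [(0, 900), (60, 400), (599, 250), (600, 200), (900, 210), (1200, 190)]
print(steady_state_mean(samples, warmup_seconds=600))
```

Dropping the warm-up samples keeps cold-cache requests from inflating the average.
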
>>>
>>> The numbers posted above are average response times as reported by
>>> Soasta.  However, the relative time differences are corroborated by
>>> Splunk, which indexes the Solr logs, and Dynatrace, which is instrumented
>>> on one of the JVMs.
>>>
>>> The versions are deployed to the same machines thereby overlaying the
>>> previous installation.  Going Solr 4 to Solr 5, full indexing is run with
>>> the same input data.  Being in SolrCloud mode, the full indexing consists
>>> of indexing all documents and then deleting any that were not touched.
>>> Going Solr 5 back to Solr 4, the snapshot is restored since Solr 4 will not
>>> load with a Solr 5 index.  Testing Solr 4 after reverting yields the same
>>> results as the previous Solr 4 test.
>>>
>>>
>>> On Wed, Sep 28, 2016 at 4:02 AM, Toke Eskildsen <te...@statsbiblioteket.dk>
>>> wrote:
>>>
>>>> On Tue, 2016-09-27 at 15:08 -0500, Solr User wrote:
>>>> > Further testing indicates that any performance difference is not due
>>>> > to deletes.  Both Solr 4.8.1 and Solr 5.5.2 benefited from removing
>>>> > deletes.
>>>>
>>>> Sanity check: Could you describe how you test?
>>>>
>>>> * How many queries do you issue for each test?
>>>> * Is each query a new one, or do you re-use the same query?
>>>> * Do you discard the first X calls?
>>>> * Are the numbers averages, medians or something else?
>>>> * What do you do about disk cache?
>>>> * Are both Solr versions on the same machine?
>>>> * Do they use the same index?
>>>> * Do you alternate between testing 4.8.1 and 5.5.2 first?
>>>>
>>>> - Toke Eskildsen, State and University Library, Denmark
>>>>
>>>
>>>
>>
>