Posted to solr-user@lucene.apache.org by Solr User <so...@gmail.com> on 2016/10/03 20:55:44 UTC

Re: Faceting and Grouping Performance Degradation in Solr 5

Below is some further testing.  This was done in an environment that had no
other queries or updates during testing.  We ran through several scenarios,
so I pasted the results with HTML formatting below so that you can view them
as a table.  Sorry if you have to pull this out into a different file for
viewing, but I did not want the formatting to be messed up.  The times are
average times in milliseconds.  The test methodology was the same as before,
except that there was a 5 minute warmup and a 15 minute test.

Note that the segment and deletion counts were recorded from only 1 of the 2
shards, so we cannot extrapolate a function between them and the outcome.
In other words, just view them as "non-optimized" versus "optimized" and
"has deletions" versus "no deletions".  The only exceptions are that the 0
deletes held for both shards, as did the 1 segment and 8 segment cases.  A
few of the tests were repeated as well.

The only conclusion that I could draw is that the number of segments and
the number of deletes appear to greatly influence the response times, at
least more than any difference in Solr version.  There also appears to be
some external contributor to variance, maybe the network, etc.

Thoughts?


<table><tbody>
<tr><td>Date</td><td>9/29/2016</td><td>9/29/2016</td><td>9/29/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>9/30/2016</td><td>10/3/2016</td><td>10/3/2016</td><td>10/3/2016</td><td>10/3/2016</td></tr>
<tr><td>Solr Version</td><td>5.5.2</td><td>5.5.2</td><td>4.8.1</td><td>4.8.1</td><td>4.8.1</td><td>5.5.2</td><td>5.5.2</td><td>5.5.2</td><td>5.5.2</td><td>5.5.2</td><td>5.5.2</td><td>5.5.2</td><td>5.5.2</td><td>4.8.1</td><td>4.8.1</td><td>4.8.1</td><td>4.8.1</td></tr>
<tr><td>Deleted Docs</td><td>57873</td><td>57873</td><td>176958</td><td>593694</td><td>593694</td><td>57873</td><td>57873</td><td>57873</td><td>57873</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>
<tr><td>Segment Count</td><td>34</td><td>34</td><td>18</td><td>27</td><td>27</td><td>34</td><td>34</td><td>34</td><td>34</td><td>8</td><td>8</td><td>1</td><td>1</td><td>8</td><td>8</td><td>1</td><td>1</td></tr>
<tr><td>facet.method=uif</td><td>YES</td><td>YES</td><td>N/A</td><td>N/A</td><td>N/A</td><td>YES</td><td>YES</td><td>NO</td><td>NO</td><td>NO</td><td>YES</td><td>YES</td><td>NO</td><td>N/A</td><td>N/A</td><td>N/A</td><td>N/A</td></tr>
<tr><td>Scenario #1 (avg ms)</td><td>198</td><td>210</td><td>145</td><td>186</td><td>190</td><td>208</td><td>209</td><td>210</td><td>206</td><td>109</td><td>142</td><td>73</td><td>70</td><td>160</td><td>109</td><td>83</td><td>85</td></tr>
<tr><td>Scenario #2 (avg ms)</td><td>92</td><td>88</td><td>59</td><td>62</td><td>58</td><td>72</td><td>70</td><td>77</td><td>74</td><td>68</td><td>73</td><td>63</td><td>61</td><td>66</td><td>54</td><td>52</td><td>51</td></tr>
</tbody></table>
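
One rough way to read the table is to group the runs by Solr version and
segment count and average the Scenario #1 times.  The following is only a
minimal sketch in Python: the values are copied from the table, the names
RUNS and mean_by_version_and_segments are made up for this illustration, and
with only 17 runs the averages are indicative at best.

```python
from collections import defaultdict
from statistics import mean

# (solr_version, deleted_docs, segment_count, scenario_1_avg_ms),
# one tuple per column of the table, in the same left-to-right order.
RUNS = [
    ("5.5.2", 57873, 34, 198),
    ("5.5.2", 57873, 34, 210),
    ("4.8.1", 176958, 18, 145),
    ("4.8.1", 593694, 27, 186),
    ("4.8.1", 593694, 27, 190),
    ("5.5.2", 57873, 34, 208),
    ("5.5.2", 57873, 34, 209),
    ("5.5.2", 57873, 34, 210),
    ("5.5.2", 57873, 34, 206),
    ("5.5.2", 0, 8, 109),
    ("5.5.2", 0, 8, 142),
    ("5.5.2", 0, 1, 73),
    ("5.5.2", 0, 1, 70),
    ("4.8.1", 0, 8, 160),
    ("4.8.1", 0, 8, 109),
    ("4.8.1", 0, 1, 83),
    ("4.8.1", 0, 1, 85),
]


def mean_by_version_and_segments(runs):
    """Average the Scenario #1 time for each (version, segment count) pair."""
    groups = defaultdict(list)
    for version, _deleted, segments, ms in runs:
        groups[(version, segments)].append(ms)
    return {key: mean(values) for key, values in groups.items()}


if __name__ == "__main__":
    averages = mean_by_version_and_segments(RUNS)
    for (version, segments), avg in sorted(averages.items()):
        print(f"Solr {version}, {segments:>2} segments: {avg:.1f} ms")
```

Grouping on segment count rather than on the exact deletion count follows the
"non-optimized" versus "optimized" reading suggested above, since the counts
were only recorded from one shard.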




On Wed, Sep 28, 2016 at 4:44 PM, Solr User <so...@gmail.com> wrote:

> I plan to re-test this in a separate environment that I have more control
> over and will share the results when I can.
>
> On Wed, Sep 28, 2016 at 3:37 PM, Solr User <so...@gmail.com> wrote:
>
>> Certainly.  And I would of course welcome anyone else to test this for
>> themselves especially with facet.method=uif to see if that has indeed
>> bridged the gap between Solr 4 and Solr 5.  I would be very happy if my
>> testing is invalid due to variance, problem in process, etc.  One thing I
>> was pondering is whether I should force merge the index to a certain
>> number of segments, because indexing yields a random number of segments
>> and deletions.  The only thing stopping me short of doing that was the
>> observation of longer Solr 4 times even with more deletions and a similar
>> number of segments.
>>
>> We use Soasta as our testing tool.  Before testing, load is sent for
>> 10-15 minutes to make sure any Solr caches have stabilized.  Then the test
>> is run for 30 minutes of steady volume with Scenario #1 tested at 15
>> req/sec and Scenario #2 tested at 100 req/sec.  Each request is different
>> with input being pulled from data files.  The requests are repeatable
>> from test to test.
>>
>> The numbers posted above are average response times as reported by
>> Soasta.  However, respective time differences are supported by Splunk which
>> indexes the Solr logs and Dynatrace which is instrumented on one of the
>> JVMs.
>>
>> The versions are deployed to the same machines thereby overlaying the
>> previous installation.  Going Solr 4 to Solr 5, full indexing is run with
>> the same input data.  Being in SolrCloud mode, the full indexing consists
>> of indexing all documents and then deleting any that were not touched.
>> Going Solr 5 back to Solr 4, the snapshot is restored since Solr 4 will not
>> load with a Solr 5 index.  Testing Solr 4 after reverting yields the same
>> results as the previous Solr 4 test.
>>
>>
>> On Wed, Sep 28, 2016 at 4:02 AM, Toke Eskildsen <te...@statsbiblioteket.dk>
>> wrote:
>>
>>> On Tue, 2016-09-27 at 15:08 -0500, Solr User wrote:
>>> > Further testing indicates that any performance difference is not due
>>> > to deletes.  Both Solr 4.8.1 and Solr 5.5.2 benefited from removing
>>> > deletes.
>>>
>>> Sanity check: Could you describe how you test?
>>>
>>> * How many queries do you issue for each test?
>>> * Is each query a new one or do you re-use the same query?
>>> * Do you discard the first X calls?
>>> * Are the numbers averages, medians or something else?
>>> * What do you do about disk cache?
>>> * Are both Solr instances on the same machine?
>>> * Do they use the same index?
>>> * Do you alternate between testing 4.8.1 and 5.5.2 first?
>>>
>>> - Toke Eskildsen, State and University Library, Denmark
>>>
>>
>>
>

Re: Faceting and Grouping Performance Degradation in Solr 5

Posted by Solr User <so...@gmail.com>.
I am pleased to report that we are in Production on Solr 5.5.3 with
performance comparable to Solr 4.8.1, by leveraging facet.method=uif as
well as https://issues.apache.org/jira/browse/SOLR-9176.  Thanks to
everyone who worked on these!
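
For anyone who wants to try the same setting, facet.method is passed as an
ordinary request parameter.  The following is a minimal sketch in Python
that only builds the request URL and does not send it.  The host, the
collection name "products" and the field name "category" are placeholders
for illustration, not values from this thread.

```python
from urllib.parse import urlencode


def build_facet_url(base_url, collection, field, facet_method="uif"):
    """Build a Solr select URL that facets on one field with a given facet.method."""
    params = {
        "q": "*:*",          # match all documents
        "rows": 0,           # only the facet counts are of interest here
        "facet": "true",
        "facet.field": field,
        "facet.method": facet_method,
        "wt": "json",
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder host, collection and field; substitute your own.
    print(build_facet_url("http://localhost:8983/solr", "products", "category"))
```

Running the same request with and without facet.method=uif against a
warmed-up index is one simple way to compare the two on your own data.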
