You are viewing a plain text version of this content. The canonical link for it is here.

Posted to solr-user@lucene.apache.org by Lorenzo Fundaró <lo...@dawandamail.com> on 2015/10/14 12:03:03 UTC

slow queries

Hello,

I have following conf for filters and commits :

Concurrent LFU Cache(maxSize=64, initialSize=64, minSize=57,
acceptableSize=60, cleanupThread=false, timeDecay=true, autowarmCount=8,
regenerator=org.apache.solr.search.SolrIndexSearcher$2@169ee0fd)

     <autoCommit>
       <!-- Every 15 seconds -->
       <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
       <openSearcher>false</openSearcher>
     </autoCommit>

     <autoSoftCommit>
       <!-- Every 10 minutes -->
       <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
     </autoSoftCommit>

and the following stats for filters:

lookups = 3602
hits  =  3148
hit ratio = 0.87
inserts = 455
evictions = 400
size = 63
warmupTime = 770

*Problem: *a lot of slow queries, for example:

{q=*:*&tie=1.0&defType=edismax&qt=standard&json.nl=map&qf=&fl=pk_i,score&start=0&sort=view_counter_i
desc&fq={!cost=1 cache=true}type_s:Product AND is_valid_b:true&fq={!cost=50
cache=true}in_languages_t:de&fq={!cost=99
cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL)) AND
(cents_ri: [* 3000])&rows=36&wt=json} hits=3768003 status=0 QTime=1378

I could increase the size of the filter so I would decrease the amount of
evictions, but it seems to me this would not be solving the root problem.

Some ideas on where/how to start for optimisation ? Is it actually normal
that this query takes this time ?

We have an index of ~14 million docs. 4 replicas with two cores and 1 shard
each.

thank you.


-- 

-- 
Lorenzo Fundaro
Backend Engineer
E-Mail: lorenzo.fundaro@dawandamail.com

Fax       + 49 - (0)30 - 25 76 08 52
Tel        + 49 - (0)179 - 51 10 982

DaWanda GmbH
Windscheidstraße 18
10627 Berlin

Geschäftsführer: Claudia Helming, Michael Pütz
Amtsgericht Charlottenburg HRB 104695 B

Re: slow queries

Posted by Erick Erickson <er...@gmail.com>.

bq: They are definetly cached. The second time runs in no time.

That's not what I was referring to. Submitting the same query
over will certainly hit the queryResultCache and return in
almost no time.

What I meant was do things like vary the fq clause you have
where you've set cache=false. Or vary the parameters in the fq clauses.
The point is to only take measurements after enough queries have gone
through so you're sure the low-level caches are initialized. But the
queries all have to be different or you hit the queryResultCache.

 Best,
Erick

On Wed, Oct 14, 2015 at 9:50 AM, Lorenzo Fundaró
<lo...@dawandamail.com> wrote:
> On 14 October 2015 at 18:18, Pushkar Raste <pu...@gmail.com> wrote:
>
>> Consider
>> 1. Turning on docValues for fields you are sorting, faceting on. This will
>> require to reindex your data
>>
>
> Yes. I am considering doing this.
>
>
>> 2. Try using TrieInt type field you are trying to do range search on (you
>> may have to fiddle with precisoinStep) to balance index size vs
>> performance.
>>
>
> Ok.
>
>
>> 3. If slowness is intermittent - turn on GC logging and see if there are
>> any long and tune GC strategy accordingly.
>>
>
> The Gc strategy is the default that comes when starting solr with bin/solr
> start script. And I was looking at the GC logs, and saw no Full GC at all.
>
> Thank you !
>
>
>
>>
>> -- Pushkar Raste
>>
>> On Wed, Oct 14, 2015 at 5:03 AM, Lorenzo Fundaró <
>> lorenzo.fundaro@dawandamail.com> wrote:
>>
>> > Hello,
>> >
>> > I have following conf for filters and commits :
>> >
>> > Concurrent LFU Cache(maxSize=64, initialSize=64, minSize=57,
>> > acceptableSize=60, cleanupThread=false, timeDecay=true, autowarmCount=8,
>> > regenerator=org.apache.solr.search.SolrIndexSearcher$2@169ee0fd)
>> >
>> >      <autoCommit>
>> >        <!-- Every 15 seconds -->
>> >        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>> >        <openSearcher>false</openSearcher>
>> >      </autoCommit>
>> >
>> >      <autoSoftCommit>
>> >        <!-- Every 10 minutes -->
>> >        <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
>> >      </autoSoftCommit>
>> >
>> > and the following stats for filters:
>> >
>> > lookups = 3602
>> > hits  =  3148
>> > hit ratio = 0.87
>> > inserts = 455
>> > evictions = 400
>> > size = 63
>> > warmupTime = 770
>> >
>> > *Problem: *a lot of slow queries, for example:
>> >
>> > {q=*:*&tie=1.0&defType=edismax&qt=standard&json.nl
>> > =map&qf=&fl=pk_i,score&start=0&sort=view_counter_i
>> > desc&fq={!cost=1 cache=true}type_s:Product AND
>> is_valid_b:true&fq={!cost=50
>> > cache=true}in_languages_t:de&fq={!cost=99
>> > cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL)) AND
>> > (cents_ri: [* 3000])&rows=36&wt=json} hits=3768003 status=0 QTime=1378
>> >
>> > I could increase the size of the filter so I would decrease the amount of
>> > evictions, but it seems to me this would not be solving the root problem.
>> >
>> > Some ideas on where/how to start for optimisation ? Is it actually normal
>> > that this query takes this time ?
>> >
>> > We have an index of ~14 million docs. 4 replicas with two cores and 1
>> shard
>> > each.
>> >
>> > thank you.
>> >
>> >
>> > --
>> >
>> > --
>> > Lorenzo Fundaro
>> > Backend Engineer
>> > E-Mail: lorenzo.fundaro@dawandamail.com
>> >
>> > Fax       + 49 - (0)30 - 25 76 08 52
>> > Tel        + 49 - (0)179 - 51 10 982
>> >
>> > DaWanda GmbH
>> > Windscheidstraße 18
>> > 10627 Berlin
>> >
>> > Geschäftsführer: Claudia Helming, Michael Pütz
>> > Amtsgericht Charlottenburg HRB 104695 B
>> >
>>
>
>
>
> --
>
> --
> Lorenzo Fundaro
> Backend Engineer
> E-Mail: lorenzo.fundaro@dawandamail.com
>
> Fax       + 49 - (0)30 - 25 76 08 52
> Tel        + 49 - (0)179 - 51 10 982
>
> DaWanda GmbH
> Windscheidstraße 18
> 10627 Berlin
>
> Geschäftsführer: Claudia Helming, Michael Pütz
> Amtsgericht Charlottenburg HRB 104695 B

Re: slow queries

Posted by Lorenzo Fundaró <lo...@dawandamail.com>.

On 14 October 2015 at 18:18, Pushkar Raste <pu...@gmail.com> wrote:

> Consider
> 1. Turning on docValues for fields you are sorting, faceting on. This will
> require to reindex your data
>

Yes. I am considering doing this.


> 2. Try using TrieInt type field you are trying to do range search on (you
> may have to fiddle with precisoinStep) to balance index size vs
> performance.
>

Ok.


> 3. If slowness is intermittent - turn on GC logging and see if there are
> any long and tune GC strategy accordingly.
>

The Gc strategy is the default that comes when starting solr with bin/solr
start script. And I was looking at the GC logs, and saw no Full GC at all.

Thank you !



>
> -- Pushkar Raste
>
> On Wed, Oct 14, 2015 at 5:03 AM, Lorenzo Fundaró <
> lorenzo.fundaro@dawandamail.com> wrote:
>
> > Hello,
> >
> > I have following conf for filters and commits :
> >
> > Concurrent LFU Cache(maxSize=64, initialSize=64, minSize=57,
> > acceptableSize=60, cleanupThread=false, timeDecay=true, autowarmCount=8,
> > regenerator=org.apache.solr.search.SolrIndexSearcher$2@169ee0fd)
> >
> >      <autoCommit>
> >        <!-- Every 15 seconds -->
> >        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
> >        <openSearcher>false</openSearcher>
> >      </autoCommit>
> >
> >      <autoSoftCommit>
> >        <!-- Every 10 minutes -->
> >        <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
> >      </autoSoftCommit>
> >
> > and the following stats for filters:
> >
> > lookups = 3602
> > hits  =  3148
> > hit ratio = 0.87
> > inserts = 455
> > evictions = 400
> > size = 63
> > warmupTime = 770
> >
> > *Problem: *a lot of slow queries, for example:
> >
> > {q=*:*&tie=1.0&defType=edismax&qt=standard&json.nl
> > =map&qf=&fl=pk_i,score&start=0&sort=view_counter_i
> > desc&fq={!cost=1 cache=true}type_s:Product AND
> is_valid_b:true&fq={!cost=50
> > cache=true}in_languages_t:de&fq={!cost=99
> > cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL)) AND
> > (cents_ri: [* 3000])&rows=36&wt=json} hits=3768003 status=0 QTime=1378
> >
> > I could increase the size of the filter so I would decrease the amount of
> > evictions, but it seems to me this would not be solving the root problem.
> >
> > Some ideas on where/how to start for optimisation ? Is it actually normal
> > that this query takes this time ?
> >
> > We have an index of ~14 million docs. 4 replicas with two cores and 1
> shard
> > each.
> >
> > thank you.
> >
> >
> > --
> >
> > --
> > Lorenzo Fundaro
> > Backend Engineer
> > E-Mail: lorenzo.fundaro@dawandamail.com
> >
> > Fax       + 49 - (0)30 - 25 76 08 52
> > Tel        + 49 - (0)179 - 51 10 982
> >
> > DaWanda GmbH
> > Windscheidstraße 18
> > 10627 Berlin
> >
> > Geschäftsführer: Claudia Helming, Michael Pütz
> > Amtsgericht Charlottenburg HRB 104695 B
> >
>



-- 

-- 
Lorenzo Fundaro
Backend Engineer
E-Mail: lorenzo.fundaro@dawandamail.com

Fax       + 49 - (0)30 - 25 76 08 52
Tel        + 49 - (0)179 - 51 10 982

DaWanda GmbH
Windscheidstraße 18
10627 Berlin

Geschäftsführer: Claudia Helming, Michael Pütz
Amtsgericht Charlottenburg HRB 104695 B

Re: slow queries

Posted by Pushkar Raste <pu...@gmail.com>.

Consider
1. Turning on docValues for fields you are sorting, faceting on. This will
require to reindex your data
2. Try using TrieInt type field you are trying to do range search on (you
may have to fiddle with precisoinStep) to balance index size vs performance.
3. If slowness is intermittent - turn on GC logging and see if there are
any long and tune GC strategy accordingly.

-- Pushkar Raste

On Wed, Oct 14, 2015 at 5:03 AM, Lorenzo Fundaró <
lorenzo.fundaro@dawandamail.com> wrote:

> Hello,
>
> I have following conf for filters and commits :
>
> Concurrent LFU Cache(maxSize=64, initialSize=64, minSize=57,
> acceptableSize=60, cleanupThread=false, timeDecay=true, autowarmCount=8,
> regenerator=org.apache.solr.search.SolrIndexSearcher$2@169ee0fd)
>
>      <autoCommit>
>        <!-- Every 15 seconds -->
>        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>        <openSearcher>false</openSearcher>
>      </autoCommit>
>
>      <autoSoftCommit>
>        <!-- Every 10 minutes -->
>        <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
>      </autoSoftCommit>
>
> and the following stats for filters:
>
> lookups = 3602
> hits  =  3148
> hit ratio = 0.87
> inserts = 455
> evictions = 400
> size = 63
> warmupTime = 770
>
> *Problem: *a lot of slow queries, for example:
>
> {q=*:*&tie=1.0&defType=edismax&qt=standard&json.nl
> =map&qf=&fl=pk_i,score&start=0&sort=view_counter_i
> desc&fq={!cost=1 cache=true}type_s:Product AND is_valid_b:true&fq={!cost=50
> cache=true}in_languages_t:de&fq={!cost=99
> cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL)) AND
> (cents_ri: [* 3000])&rows=36&wt=json} hits=3768003 status=0 QTime=1378
>
> I could increase the size of the filter so I would decrease the amount of
> evictions, but it seems to me this would not be solving the root problem.
>
> Some ideas on where/how to start for optimisation ? Is it actually normal
> that this query takes this time ?
>
> We have an index of ~14 million docs. 4 replicas with two cores and 1 shard
> each.
>
> thank you.
>
>
> --
>
> --
> Lorenzo Fundaro
> Backend Engineer
> E-Mail: lorenzo.fundaro@dawandamail.com
>
> Fax       + 49 - (0)30 - 25 76 08 52
> Tel        + 49 - (0)179 - 51 10 982
>
> DaWanda GmbH
> Windscheidstraße 18
> 10627 Berlin
>
> Geschäftsführer: Claudia Helming, Michael Pütz
> Amtsgericht Charlottenburg HRB 104695 B
>

Re: slow queries

Posted by Susheel Kumar <su...@gmail.com>.

Hi Lorenzo,

Can you provide which solr version you are using, index size on disks &
hardware config (memory/processor on each machine.

Thanks,
Susheel

On Wed, Oct 14, 2015 at 6:03 AM, Lorenzo Fundaró <
lorenzo.fundaro@dawandamail.com> wrote:

> Hello,
>
> I have following conf for filters and commits :
>
> Concurrent LFU Cache(maxSize=64, initialSize=64, minSize=57,
> acceptableSize=60, cleanupThread=false, timeDecay=true, autowarmCount=8,
> regenerator=org.apache.solr.search.SolrIndexSearcher$2@169ee0fd)
>
>      <autoCommit>
>        <!-- Every 15 seconds -->
>        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>        <openSearcher>false</openSearcher>
>      </autoCommit>
>
>      <autoSoftCommit>
>        <!-- Every 10 minutes -->
>        <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
>      </autoSoftCommit>
>
> and the following stats for filters:
>
> lookups = 3602
> hits  =  3148
> hit ratio = 0.87
> inserts = 455
> evictions = 400
> size = 63
> warmupTime = 770
>
> *Problem: *a lot of slow queries, for example:
>
> {q=*:*&tie=1.0&defType=edismax&qt=standard&json.nl
> =map&qf=&fl=pk_i,score&start=0&sort=view_counter_i
> desc&fq={!cost=1 cache=true}type_s:Product AND is_valid_b:true&fq={!cost=50
> cache=true}in_languages_t:de&fq={!cost=99
> cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL)) AND
> (cents_ri: [* 3000])&rows=36&wt=json} hits=3768003 status=0 QTime=1378
>
> I could increase the size of the filter so I would decrease the amount of
> evictions, but it seems to me this would not be solving the root problem.
>
> Some ideas on where/how to start for optimisation ? Is it actually normal
> that this query takes this time ?
>
> We have an index of ~14 million docs. 4 replicas with two cores and 1 shard
> each.
>
> thank you.
>
>
> --
>
> --
> Lorenzo Fundaro
> Backend Engineer
> E-Mail: lorenzo.fundaro@dawandamail.com
>
> Fax       + 49 - (0)30 - 25 76 08 52
> Tel        + 49 - (0)179 - 51 10 982
>
> DaWanda GmbH
> Windscheidstraße 18
> 10627 Berlin
>
> Geschäftsführer: Claudia Helming, Michael Pütz
> Amtsgericht Charlottenburg HRB 104695 B
>

Re: slow queries

Posted by Erick Erickson <er...@gmail.com>.

Also check the Zookeeper config file settings for Zookeeper's keep-alive,
in some versions that was set to 15 seconds. Still shouldn't
be tripped by a 4 second GC pause.

Best,
Erick

On Thu, Oct 15, 2015 at 7:33 AM, Lorenzo Fundaró
<lo...@dawandamail.com> wrote:
> On 15 October 2015 at 15:59, Eric Torti <er...@gmail.com> wrote:
>
>> Hi, Lorenzo,
>>
>> I don't think this has a direct relation to your problem but it looks
>> like you're setting -DzkClientTimeout twice. From what I know about
>>
>
> Thanks, I am aware of this double setting, but I also think the jvm is
> taking the last argument which is the intended
> one.
>
>
>> setting VM arguments twice, you're
>> probably ending up with the last one being enforced.
>>
>> Just something to be aware of I guess.
>>
>> I don't think this relates to your problem because the GC pauses are
>> not superior to 30s which seems to be the time zookeeper would let a
>> node be irresponsive before considering it in recovery.
>>
>
> Ok, good to consider. Thanks :)
>
>>
>> Best,
>>
>> Eric Torti
>>
>> On Thu, Oct 15, 2015 at 6:51 AM, Lorenzo Fundaró
>> <lo...@dawandamail.com> wrote:
>> > On 14 October 2015 at 20:35, Pushkar Raste <pu...@gmail.com>
>> wrote:
>> >
>> >> You may want to start solr with following settings to enable logging GC
>> >> details. Here are some flags you might want to enable.
>> >>
>> >> -Xloggc:<path_to_log_dir>/gc.log
>> >> -XX:+PrintGCDetails
>> >> -XX:+PrintGCDateStamps
>> >> -XX:+PrintGCTimeStamps
>> >> -XX:+PrintTenuringDistribution
>> >> -XX:+PrintGCApplicationStoppedTime
>> >> -XX:+PrintHeapAtGC
>> >>
>> >> Once you have GC logs, look for string "Total time for which application
>> >> threads were stopped" to check if you have long pauses (you may get long
>> >> pauses even with young generation GC).
>> >>
>> >
>> >
>> > Yes, there are several lines indicating that threads are being stopped.
>> > There is this one particularly that draw my attention because right
>> after a
>> > second it happened 2 of my replicas went into
>> > recovery mode, including the one who suffered the thread stop.
>> >
>> > solr_gc.log.1.current:2015-10-15T07:47:03.263+0000: 251173.653: Total
>> time
>> > for which application threads were stopped: 1.4936161 seconds, Stopping
>> > threads took: 0.0000502 seconds
>> > (is a second of stopped threads enough to have a node in recovery node ?)
>> > When this happened, the leader had a couple of connection resets while
>> > trying to communicate with this replica.
>> >
>> > and this server the highest stop takes 4s.
>> >
>> > solr_gc.log.1.current:2015-10-14T20:24:01.353+0000: 210191.743: Total
>> time
>> > for which application threads were stopped: 4.0111066 seconds, Stopping
>> > threads took: 0.0000776 seconds
>> >
>> > These are the jvm flags
>> >
>> > -XX:NewSize=256m -XX:MaxNewSize=256m
>> > /usr/lib/jvm/java-8-oracle/bin/java -server -Xss256k -Xms16g -Xmx16g
>> > -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90
>> > -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>> > -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark
>> > -XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly
>> > -XX:CMSInitiatingOccupancyFraction=50
>> -XX:CMSMaxAbortablePrecleanTime=6000
>> > -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -verbose:gc
>> > -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps
>> > -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution
>> > -XX:+PrintGCApplicationStoppedTime -Xloggc:/solr_gc.log
>> > -DzkClientTimeout=15000   -Duser.timezone=UTC
>> > -Djava.net.preferIPv4Stack=true -DzkClientTimeout=30000  -XX:NewSize=256m
>> > -XX:MaxNewSize=256m
>> >
>> > All of the options are the default that come with the solr startup
>> script,
>> > only the ones specified in the first line are being put by us.
>> >
>> >
>> >
>> >
>> >>
>> >> -- Pushkar Raste
>> >>
>> >> On Wed, Oct 14, 2015 at 11:47 AM, Lorenzo Fundaró <
>> >> lorenzo.fundaro@dawandamail.com> wrote:
>> >>
>> >> > <<What do you see if you attach
>> >> > &debug=true to the query?>>
>> >> >
>> >> > "debug": { "rawquerystring": "*:*", "querystring": "*:*",
>> "parsedquery":
>> >> > "(+MatchAllDocsQuery(*:*))/no_coord", "parsedquery_toString": "+*:*",
>> "
>> >> > explain": { "Product:47047358": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> >> product
>> >> > of:\n 1.0 = queryNorm\n", "Product:32211113": "\n1.0 = (MATCH)
>> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
>> "Product:30852121":
>> >> > "\n1.0
>> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> >> > Product:35018929": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
>> 1.0
>> >> =
>> >> > queryNorm\n", "Product:31682082": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> >> > product of:\n 1.0 = queryNorm\n", "Product:31077677": "\n1.0 = (MATCH)
>> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
>> "Product:22298365":
>> >> > "\n1.0
>> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> >> > Product:41094514": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
>> 1.0
>> >> =
>> >> > queryNorm\n", "Product:13106166": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> >> > product of:\n 1.0 = queryNorm\n", "Product:19142249": "\n1.0 = (MATCH)
>> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
>> "Product:38243373":
>> >> > "\n1.0
>> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> >> > Product:20434065": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
>> 1.0
>> >> =
>> >> > queryNorm\n", "Product:25194801": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> >> > product of:\n 1.0 = queryNorm\n", "Product:885482": "\n1.0 = (MATCH)
>> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
>> "Product:45356790":
>> >> > "\n1.0
>> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> >> > Product:67719831": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
>> 1.0
>> >> =
>> >> > queryNorm\n", "Product:12843394": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> >> > product of:\n 1.0 = queryNorm\n", "Product:38126213": "\n1.0 = (MATCH)
>> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
>> "Product:38798130":
>> >> > "\n1.0
>> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> >> > Product:30292169": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
>> 1.0
>> >> =
>> >> > queryNorm\n", "Product:11535854": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> >> > product of:\n 1.0 = queryNorm\n", "Product:8443674": "\n1.0 = (MATCH)
>> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
>> "Product:51012182":
>> >> > "\n1.0
>> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> >> > Product:75780871": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
>> 1.0
>> >> =
>> >> > queryNorm\n", "Product:20227881": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> >> > product of:\n 1.0 = queryNorm\n", "Product:38093629": "\n1.0 = (MATCH)
>> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
>> "Product:3142218":
>> >> > "\n1.0
>> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> >> > Product:15295602": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
>> 1.0
>> >> =
>> >> > queryNorm\n", "Product:3375982": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> >> > product of:\n 1.0 = queryNorm\n", "Product:38276777": "\n1.0 = (MATCH)
>> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
>> "Product:10726118":
>> >> > "\n1.0
>> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> >> > Product:50827742": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
>> 1.0
>> >> =
>> >> > queryNorm\n", "Product:5771722": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> >> > product of:\n 1.0 = queryNorm\n", "Product:3245678": "\n1.0 = (MATCH)
>> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
>> "Product:13702130":
>> >> > "\n1.0
>> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> >> > Product:25679953": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
>> 1.0
>> >> =
>> >> > queryNorm\n" }, "QParser": "ExtendedDismaxQParser", "altquerystring":
>> >> null,
>> >> > "boost_queries": null, "parsed_boost_queries": [], "boostfuncs":
>> null, "
>> >> > filter_queries": [ "{!cost=1 cache=true}type_s:Product AND
>> >> > is_valid_b:true",
>> >> > "{!cost=50 cache=true}in_languages_t:de", "{!cost=99
>> >> > cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL))
>> AND
>> >> > (cents_ri: [* TO 3000])" ], "parsed_filter_queries": [
>> "+type_s:Product
>> >> > +is_valid_b:true", "in_languages_t:de", "{!cache=false
>> >> > cost=99}+(shipping_country_codes_mt:de shipping_country_codes_mt:euro
>> >> > shipping_country_codes_mt:eur shipping_country_codes_mt:all)
>> +cents_ri:[*
>> >> > TO 3000]" ], "timing": { "time": 18, "prepare": { "time": 0, "query":
>> { "
>> >> > time": 0 }, "facet": { "time": 0 }, "mlt": { "time": 0 },
>> "highlight": {
>> >> "
>> >> > time": 0 }, "stats": { "time": 0 }, "expand": { "time": 0 },
>> >> "spellcheck":
>> >> > {
>> >> > "time": 0 }, "debug": { "time": 0 } }, "process": { "time": 18,
>> "query":
>> >> {
>> >> > "
>> >> > time": 0 }, "facet": { "time": 0 }, "mlt": { "time": 0 },
>> "highlight": {
>> >> "
>> >> > time": 0 }, "stats": { "time": 0 }, "expand": { "time": 0 },
>> >> "spellcheck":
>> >> > {
>> >> > "time": 0 }, "debug": { "time": 18 } }
>> >> >
>> >> > I think this clause is wrong:
>> >> > (cents_ri: [* 3000])
>> >> >
>> >> > I think you mean
>> >> > (cents_ri: [* TO 3000])
>> >> >
>> >> > I think I made no difference. I tried both and they both worked.
>> >> >
>> >> > But are these slow queries constant or intermittent?
>> >> >
>> >> > They are definetly cached. The second time runs in no time.
>> >> >
>> >> > I gonna try adding them in the pre warmcache too. And see the results.
>> >> >
>> >> > The field that I used for sorting is indexed but not stored and it's
>> not
>> >> a
>> >> > DocValue. I tried the query without the sort and the performance didnt
>> >> > change significantly.
>> >> >
>> >> > By the time I saw that log, the server was getting around 7k updates
>> per
>> >> > minute. The query information I don't have it now, but I think it will
>> >> > qualify as heavy load query.
>> >> >
>> >> > Thank you !
>> >> >
>> >> >
>> >> > On 14 October 2015 at 17:14, Erick Erickson <er...@gmail.com>
>> >> > wrote:
>> >> >
>> >> > > A couple of things don't particularly make sense here:
>> >> > >
>> >> > > You specify edismax, q=*:* yet you specify qf=
>> >> > > You're searching across whatever you defined as the default
>> >> > > field in the request handler. What do you see if you attach
>> >> > > &debug=true to the query?
>> >> > >
>> >> > > I think this clause is wrong:
>> >> > > (cents_ri: [* 3000])
>> >> > >
>> >> > > I think you mean
>> >> > > (cents_ri: [* TO 3000])
>> >> > >
>> >> > > I'm not sure either of those is the problem, but are places I'd
>> start.
>> >> > >
>> >> > > As far as the size of your filter cache goes, a hit ratio of .87
>> >> actually
>> >> > > isn't bad. Upping the size would add some marginal benefit, but it's
>> >> > > unlikely to be a magic bullet.
>> >> > >
>> >> > > But are these slow queries constant or intermittent? In other words,
>> >> > > are all queries of this general form slow or just the first few? In
>> >> > > particular
>> >> > > is the first query that mentions sorting on this field slow but
>> >> > subsequent
>> >> > > ones faster? In that case consider adding a query to the newSearcher
>> >> > > event in solrconfig.xml that mentions this sort, that would pre-warm
>> >> > > the sort values. Also, defining all fields that you sort on as
>> >> > > docValues="true"
>> >> > > is recommended at this point.
>> >> > >
>> >> > > What I'd try is removing clauses to see which one is the problem. On
>> >> > > the surface this is surprisingly slow. And how heavily loaded is the
>> >> > > server?
>> >> > > Your autocommit settings look fine, my question is more how much
>> >> indexing
>> >> > > and querying is going on when you take these measurements.
>> >> > >
>> >> > > Best,
>> >> > > Erick
>> >> > >
>> >> > > On Wed, Oct 14, 2015 at 3:03 AM, Lorenzo Fundaró
>> >> > > <lo...@dawandamail.com> wrote:
>> >> > > > Hello,
>> >> > > >
>> >> > > > I have following conf for filters and commits :
>> >> > > >
>> >> > > > Concurrent LFU Cache(maxSize=64, initialSize=64, minSize=57,
>> >> > > > acceptableSize=60, cleanupThread=false, timeDecay=true,
>> >> > autowarmCount=8,
>> >> > > > regenerator=org.apache.solr.search.SolrIndexSearcher$2@169ee0fd)
>> >> > > >
>> >> > > >      <autoCommit>
>> >> > > >        <!-- Every 15 seconds -->
>> >> > > >        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>> >> > > >        <openSearcher>false</openSearcher>
>> >> > > >      </autoCommit>
>> >> > > >
>> >> > > >      <autoSoftCommit>
>> >> > > >        <!-- Every 10 minutes -->
>> >> > > >        <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
>> >> > > >      </autoSoftCommit>
>> >> > > >
>> >> > > > and the following stats for filters:
>> >> > > >
>> >> > > > lookups = 3602
>> >> > > > hits  =  3148
>> >> > > > hit ratio = 0.87
>> >> > > > inserts = 455
>> >> > > > evictions = 400
>> >> > > > size = 63
>> >> > > > warmupTime = 770
>> >> > > >
>> >> > > > *Problem: *a lot of slow queries, for example:
>> >> > > >
>> >> > > > {q=*:*&tie=1.0&defType=edismax&qt=standard&json.nl
>> >> > > =map&qf=&fl=pk_i,score&start=0&sort=view_counter_i
>> >> > > > desc&fq={!cost=1 cache=true}type_s:Product AND
>> >> > > is_valid_b:true&fq={!cost=50
>> >> > > > cache=true}in_languages_t:de&fq={!cost=99
>> >> > > > cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR
>> ALL))
>> >> AND
>> >> > > > (cents_ri: [* 3000])&rows=36&wt=json} hits=3768003 status=0
>> >> QTime=1378
>> >> > > >
>> >> > > > I could increase the size of the filter so I would decrease the
>> >> amount
>> >> > of
>> >> > > > evictions, but it seems to me this would not be solving the root
>> >> > problem.
>> >> > > >
>> >> > > > Some ideas on where/how to start for optimisation ? Is it actually
>> >> > normal
>> >> > > > that this query takes this time ?
>> >> > > >
>> >> > > > We have an index of ~14 million docs. 4 replicas with two cores
>> and 1
>> >> > > shard
>> >> > > > each.
>> >> > > >
>> >> > > > thank you.
>> >> > > >
>> >> > > >
>> >> > > > --
>> >> > > >
>> >> > > > --
>> >> > > > Lorenzo Fundaro
>> >> > > > Backend Engineer
>> >> > > > E-Mail: lorenzo.fundaro@dawandamail.com
>> >> > > >
>> >> > > > Fax       + 49 - (0)30 - 25 76 08 52
>> >> > > > Tel        + 49 - (0)179 - 51 10 982
>> >> > > >
>> >> > > > DaWanda GmbH
>> >> > > > Windscheidstraße 18
>> >> > > > 10627 Berlin
>> >> > > >
>> >> > > > Geschäftsführer: Claudia Helming, Michael Pütz
>> >> > > > Amtsgericht Charlottenburg HRB 104695 B
>> >> > >
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> >
>> >> > --
>> >> > Lorenzo Fundaro
>> >> > Backend Engineer
>> >> > E-Mail: lorenzo.fundaro@dawandamail.com
>> >> >
>> >> > Fax       + 49 - (0)30 - 25 76 08 52
>> >> > Tel        + 49 - (0)179 - 51 10 982
>> >> >
>> >> > DaWanda GmbH
>> >> > Windscheidstraße 18
>> >> > 10627 Berlin
>> >> >
>> >> > Geschäftsführer: Claudia Helming, Michael Pütz
>> >> > Amtsgericht Charlottenburg HRB 104695 B
>> >> >
>> >>
>> >
>> >
>> >
>> > --
>> >
>> > --
>> > Lorenzo Fundaro
>> > Backend Engineer
>> > E-Mail: lorenzo.fundaro@dawandamail.com
>> >
>> > Fax       + 49 - (0)30 - 25 76 08 52
>> > Tel        + 49 - (0)179 - 51 10 982
>> >
>> > DaWanda GmbH
>> > Windscheidstraße 18
>> > 10627 Berlin
>> >
>> > Geschäftsführer: Claudia Helming, Michael Pütz
>> > Amtsgericht Charlottenburg HRB 104695 B
>>
>
>
>
> --
>
> --
> Lorenzo Fundaro
> Backend Engineer
> E-Mail: lorenzo.fundaro@dawandamail.com
>
> Fax       + 49 - (0)30 - 25 76 08 52
> Tel        + 49 - (0)179 - 51 10 982
>
> DaWanda GmbH
> Windscheidstraße 18
> 10627 Berlin
>
> Geschäftsführer: Claudia Helming, Michael Pütz
> Amtsgericht Charlottenburg HRB 104695 B

Re: slow queries

Posted by Lorenzo Fundaró <lo...@dawandamail.com>.

On 15 October 2015 at 15:59, Eric Torti <er...@gmail.com> wrote:

> Hi, Lorenzo,
>
> I don't think this has a direct relation to your problem but it looks
> like you're setting -DzkClientTimeout twice. From what I know about
>

Thanks, I am aware of this double setting, but I also think the jvm is
taking the last argument which is the intended
one.


> setting VM arguments twice, you're
> probably ending up with the last one being enforced.
>
> Just something to be aware of I guess.
>
> I don't think this relates to your problem because the GC pauses are
> not superior to 30s which seems to be the time zookeeper would let a
> node be irresponsive before considering it in recovery.
>

Ok, good to consider. Thanks :)

>
> Best,
>
> Eric Torti
>
> On Thu, Oct 15, 2015 at 6:51 AM, Lorenzo Fundaró
> <lo...@dawandamail.com> wrote:
> > On 14 October 2015 at 20:35, Pushkar Raste <pu...@gmail.com>
> wrote:
> >
> >> You may want to start solr with following settings to enable logging GC
> >> details. Here are some flags you might want to enable.
> >>
> >> -Xloggc:<path_to_log_dir>/gc.log
> >> -XX:+PrintGCDetails
> >> -XX:+PrintGCDateStamps
> >> -XX:+PrintGCTimeStamps
> >> -XX:+PrintTenuringDistribution
> >> -XX:+PrintGCApplicationStoppedTime
> >> -XX:+PrintHeapAtGC
> >>
> >> Once you have GC logs, look for string "Total time for which application
> >> threads were stopped" to check if you have long pauses (you may get long
> >> pauses even with young generation GC).
> >>
> >
> >
> > Yes, there are several lines indicating that threads are being stopped.
> > There is this one particularly that draw my attention because right
> after a
> > second it happened 2 of my replicas went into
> > recovery mode, including the one who suffered the thread stop.
> >
> > solr_gc.log.1.current:2015-10-15T07:47:03.263+0000: 251173.653: Total
> time
> > for which application threads were stopped: 1.4936161 seconds, Stopping
> > threads took: 0.0000502 seconds
> > (is a second of stopped threads enough to have a node in recovery node ?)
> > When this happened, the leader had a couple of connection resets while
> > trying to communicate with this replica.
> >
> > and this server the highest stop takes 4s.
> >
> > solr_gc.log.1.current:2015-10-14T20:24:01.353+0000: 210191.743: Total
> time
> > for which application threads were stopped: 4.0111066 seconds, Stopping
> > threads took: 0.0000776 seconds
> >
> > These are the jvm flags
> >
> > -XX:NewSize=256m -XX:MaxNewSize=256m
> > /usr/lib/jvm/java-8-oracle/bin/java -server -Xss256k -Xms16g -Xmx16g
> > -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90
> > -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
> > -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark
> > -XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly
> > -XX:CMSInitiatingOccupancyFraction=50
> -XX:CMSMaxAbortablePrecleanTime=6000
> > -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -verbose:gc
> > -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps
> > -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution
> > -XX:+PrintGCApplicationStoppedTime -Xloggc:/solr_gc.log
> > -DzkClientTimeout=15000   -Duser.timezone=UTC
> > -Djava.net.preferIPv4Stack=true -DzkClientTimeout=30000  -XX:NewSize=256m
> > -XX:MaxNewSize=256m
> >
> > All of the options are the default that come with the solr startup
> script,
> > only the ones specified in the first line are being put by us.
> >
> >
> >
> >
> >>
> >> -- Pushkar Raste
> >>
> >> On Wed, Oct 14, 2015 at 11:47 AM, Lorenzo Fundaró <
> >> lorenzo.fundaro@dawandamail.com> wrote:
> >>
> >> > <<What do you see if you attach
> >> > &debug=true to the query?>>
> >> >
> >> > "debug": { "rawquerystring": "*:*", "querystring": "*:*",
> "parsedquery":
> >> > "(+MatchAllDocsQuery(*:*))/no_coord", "parsedquery_toString": "+*:*",
> "
> >> > explain": { "Product:47047358": "\n1.0 = (MATCH) MatchAllDocsQuery,
> >> product
> >> > of:\n 1.0 = queryNorm\n", "Product:32211113": "\n1.0 = (MATCH)
> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
> "Product:30852121":
> >> > "\n1.0
> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> >> > Product:35018929": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
> 1.0
> >> =
> >> > queryNorm\n", "Product:31682082": "\n1.0 = (MATCH) MatchAllDocsQuery,
> >> > product of:\n 1.0 = queryNorm\n", "Product:31077677": "\n1.0 = (MATCH)
> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
> "Product:22298365":
> >> > "\n1.0
> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> >> > Product:41094514": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
> 1.0
> >> =
> >> > queryNorm\n", "Product:13106166": "\n1.0 = (MATCH) MatchAllDocsQuery,
> >> > product of:\n 1.0 = queryNorm\n", "Product:19142249": "\n1.0 = (MATCH)
> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
> "Product:38243373":
> >> > "\n1.0
> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> >> > Product:20434065": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
> 1.0
> >> =
> >> > queryNorm\n", "Product:25194801": "\n1.0 = (MATCH) MatchAllDocsQuery,
> >> > product of:\n 1.0 = queryNorm\n", "Product:885482": "\n1.0 = (MATCH)
> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
> "Product:45356790":
> >> > "\n1.0
> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> >> > Product:67719831": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
> 1.0
> >> =
> >> > queryNorm\n", "Product:12843394": "\n1.0 = (MATCH) MatchAllDocsQuery,
> >> > product of:\n 1.0 = queryNorm\n", "Product:38126213": "\n1.0 = (MATCH)
> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
> "Product:38798130":
> >> > "\n1.0
> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> >> > Product:30292169": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
> 1.0
> >> =
> >> > queryNorm\n", "Product:11535854": "\n1.0 = (MATCH) MatchAllDocsQuery,
> >> > product of:\n 1.0 = queryNorm\n", "Product:8443674": "\n1.0 = (MATCH)
> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
> "Product:51012182":
> >> > "\n1.0
> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> >> > Product:75780871": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
> 1.0
> >> =
> >> > queryNorm\n", "Product:20227881": "\n1.0 = (MATCH) MatchAllDocsQuery,
> >> > product of:\n 1.0 = queryNorm\n", "Product:38093629": "\n1.0 = (MATCH)
> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
> "Product:3142218":
> >> > "\n1.0
> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> >> > Product:15295602": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
> 1.0
> >> =
> >> > queryNorm\n", "Product:3375982": "\n1.0 = (MATCH) MatchAllDocsQuery,
> >> > product of:\n 1.0 = queryNorm\n", "Product:38276777": "\n1.0 = (MATCH)
> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
> "Product:10726118":
> >> > "\n1.0
> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> >> > Product:50827742": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
> 1.0
> >> =
> >> > queryNorm\n", "Product:5771722": "\n1.0 = (MATCH) MatchAllDocsQuery,
> >> > product of:\n 1.0 = queryNorm\n", "Product:3245678": "\n1.0 = (MATCH)
> >> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n",
> "Product:13702130":
> >> > "\n1.0
> >> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> >> > Product:25679953": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n
> 1.0
> >> =
> >> > queryNorm\n" }, "QParser": "ExtendedDismaxQParser", "altquerystring":
> >> null,
> >> > "boost_queries": null, "parsed_boost_queries": [], "boostfuncs":
> null, "
> >> > filter_queries": [ "{!cost=1 cache=true}type_s:Product AND
> >> > is_valid_b:true",
> >> > "{!cost=50 cache=true}in_languages_t:de", "{!cost=99
> >> > cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL))
> AND
> >> > (cents_ri: [* TO 3000])" ], "parsed_filter_queries": [
> "+type_s:Product
> >> > +is_valid_b:true", "in_languages_t:de", "{!cache=false
> >> > cost=99}+(shipping_country_codes_mt:de shipping_country_codes_mt:euro
> >> > shipping_country_codes_mt:eur shipping_country_codes_mt:all)
> +cents_ri:[*
> >> > TO 3000]" ], "timing": { "time": 18, "prepare": { "time": 0, "query":
> { "
> >> > time": 0 }, "facet": { "time": 0 }, "mlt": { "time": 0 },
> "highlight": {
> >> "
> >> > time": 0 }, "stats": { "time": 0 }, "expand": { "time": 0 },
> >> "spellcheck":
> >> > {
> >> > "time": 0 }, "debug": { "time": 0 } }, "process": { "time": 18,
> "query":
> >> {
> >> > "
> >> > time": 0 }, "facet": { "time": 0 }, "mlt": { "time": 0 },
> "highlight": {
> >> "
> >> > time": 0 }, "stats": { "time": 0 }, "expand": { "time": 0 },
> >> "spellcheck":
> >> > {
> >> > "time": 0 }, "debug": { "time": 18 } }
> >> >
> >> > I think this clause is wrong:
> >> > (cents_ri: [* 3000])
> >> >
> >> > I think you mean
> >> > (cents_ri: [* TO 3000])
> >> >
> >> > I think I made no difference. I tried both and they both worked.
> >> >
> >> > But are these slow queries constant or intermittent?
> >> >
> >> > They are definetly cached. The second time runs in no time.
> >> >
> >> > I gonna try adding them in the pre warmcache too. And see the results.
> >> >
> >> > The field that I used for sorting is indexed but not stored and it's
> not
> >> a
> >> > DocValue. I tried the query without the sort and the performance didnt
> >> > change significantly.
> >> >
> >> > By the time I saw that log, the server was getting around 7k updates
> per
> >> > minute. The query information I don't have it now, but I think it will
> >> > qualify as heavy load query.
> >> >
> >> > Thank you !
> >> >
> >> >
> >> > On 14 October 2015 at 17:14, Erick Erickson <er...@gmail.com>
> >> > wrote:
> >> >
> >> > > A couple of things don't particularly make sense here:
> >> > >
> >> > > You specify edismax, q=*:* yet you specify qf=
> >> > > You're searching across whatever you defined as the default
> >> > > field in the request handler. What do you see if you attach
> >> > > &debug=true to the query?
> >> > >
> >> > > I think this clause is wrong:
> >> > > (cents_ri: [* 3000])
> >> > >
> >> > > I think you mean
> >> > > (cents_ri: [* TO 3000])
> >> > >
> >> > > I'm not sure either of those is the problem, but are places I'd
> start.
> >> > >
> >> > > As far as the size of your filter cache goes, a hit ratio of .87
> >> actually
> >> > > isn't bad. Upping the size would add some marginal benefit, but it's
> >> > > unlikely to be a magic bullet.
> >> > >
> >> > > But are these slow queries constant or intermittent? In other words,
> >> > > are all queries of this general form slow or just the first few? In
> >> > > particular
> >> > > is the first query that mentions sorting on this field slow but
> >> > subsequent
> >> > > ones faster? In that case consider adding a query to the newSearcher
> >> > > event in solrconfig.xml that mentions this sort, that would pre-warm
> >> > > the sort values. Also, defining all fields that you sort on as
> >> > > docValues="true"
> >> > > is recommended at this point.
> >> > >
> >> > > What I'd try is removing clauses to see which one is the problem. On
> >> > > the surface this is surprisingly slow. And how heavily loaded is the
> >> > > server?
> >> > > Your autocommit settings look fine, my question is more how much
> >> indexing
> >> > > and querying is going on when you take these measurements.
> >> > >
> >> > > Best,
> >> > > Erick
> >> > >
> >> > > On Wed, Oct 14, 2015 at 3:03 AM, Lorenzo Fundaró
> >> > > <lo...@dawandamail.com> wrote:
> >> > > > Hello,
> >> > > >
> >> > > > I have following conf for filters and commits :
> >> > > >
> >> > > > Concurrent LFU Cache(maxSize=64, initialSize=64, minSize=57,
> >> > > > acceptableSize=60, cleanupThread=false, timeDecay=true,
> >> > autowarmCount=8,
> >> > > > regenerator=org.apache.solr.search.SolrIndexSearcher$2@169ee0fd)
> >> > > >
> >> > > >      <autoCommit>
> >> > > >        <!-- Every 15 seconds -->
> >> > > >        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
> >> > > >        <openSearcher>false</openSearcher>
> >> > > >      </autoCommit>
> >> > > >
> >> > > >      <autoSoftCommit>
> >> > > >        <!-- Every 10 minutes -->
> >> > > >        <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
> >> > > >      </autoSoftCommit>
> >> > > >
> >> > > > and the following stats for filters:
> >> > > >
> >> > > > lookups = 3602
> >> > > > hits  =  3148
> >> > > > hit ratio = 0.87
> >> > > > inserts = 455
> >> > > > evictions = 400
> >> > > > size = 63
> >> > > > warmupTime = 770
> >> > > >
> >> > > > *Problem: *a lot of slow queries, for example:
> >> > > >
> >> > > > {q=*:*&tie=1.0&defType=edismax&qt=standard&json.nl
> >> > > =map&qf=&fl=pk_i,score&start=0&sort=view_counter_i
> >> > > > desc&fq={!cost=1 cache=true}type_s:Product AND
> >> > > is_valid_b:true&fq={!cost=50
> >> > > > cache=true}in_languages_t:de&fq={!cost=99
> >> > > > cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR
> ALL))
> >> AND
> >> > > > (cents_ri: [* 3000])&rows=36&wt=json} hits=3768003 status=0
> >> QTime=1378
> >> > > >
> >> > > > I could increase the size of the filter so I would decrease the
> >> amount
> >> > of
> >> > > > evictions, but it seems to me this would not be solving the root
> >> > problem.
> >> > > >
> >> > > > Some ideas on where/how to start for optimisation ? Is it actually
> >> > normal
> >> > > > that this query takes this time ?
> >> > > >
> >> > > > We have an index of ~14 million docs. 4 replicas with two cores
> and 1
> >> > > shard
> >> > > > each.
> >> > > >
> >> > > > thank you.
> >> > > >
> >> > > >
> >> > > > --
> >> > > >
> >> > > > --
> >> > > > Lorenzo Fundaro
> >> > > > Backend Engineer
> >> > > > E-Mail: lorenzo.fundaro@dawandamail.com
> >> > > >
> >> > > > Fax       + 49 - (0)30 - 25 76 08 52
> >> > > > Tel        + 49 - (0)179 - 51 10 982
> >> > > >
> >> > > > DaWanda GmbH
> >> > > > Windscheidstraße 18
> >> > > > 10627 Berlin
> >> > > >
> >> > > > Geschäftsführer: Claudia Helming, Michael Pütz
> >> > > > Amtsgericht Charlottenburg HRB 104695 B
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> >
> >> > --
> >> > Lorenzo Fundaro
> >> > Backend Engineer
> >> > E-Mail: lorenzo.fundaro@dawandamail.com
> >> >
> >> > Fax       + 49 - (0)30 - 25 76 08 52
> >> > Tel        + 49 - (0)179 - 51 10 982
> >> >
> >> > DaWanda GmbH
> >> > Windscheidstraße 18
> >> > 10627 Berlin
> >> >
> >> > Geschäftsführer: Claudia Helming, Michael Pütz
> >> > Amtsgericht Charlottenburg HRB 104695 B
> >> >
> >>
> >
> >
> >
> > --
> >
> > --
> > Lorenzo Fundaro
> > Backend Engineer
> > E-Mail: lorenzo.fundaro@dawandamail.com
> >
> > Fax       + 49 - (0)30 - 25 76 08 52
> > Tel        + 49 - (0)179 - 51 10 982
> >
> > DaWanda GmbH
> > Windscheidstraße 18
> > 10627 Berlin
> >
> > Geschäftsführer: Claudia Helming, Michael Pütz
> > Amtsgericht Charlottenburg HRB 104695 B
>



-- 

-- 
Lorenzo Fundaro
Backend Engineer
E-Mail: lorenzo.fundaro@dawandamail.com

Fax       + 49 - (0)30 - 25 76 08 52
Tel        + 49 - (0)179 - 51 10 982

DaWanda GmbH
Windscheidstraße 18
10627 Berlin

Geschäftsführer: Claudia Helming, Michael Pütz
Amtsgericht Charlottenburg HRB 104695 B

Re: slow queries

Posted by Eric Torti <er...@gmail.com>.

Hi, Lorenzo,

I don't think this has a direct relation to your problem but it looks
like you're setting -DzkClientTimeout twice. From what I know about
setting VM arguments twice, you're
probably ending up with the last one being enforced.

Just something to be aware of I guess.

I don't think this relates to your problem because the GC pauses are
not superior to 30s which seems to be the time zookeeper would let a
node be irresponsive before considering it in recovery.

Best,

Eric Torti

On Thu, Oct 15, 2015 at 6:51 AM, Lorenzo Fundaró
<lo...@dawandamail.com> wrote:
> On 14 October 2015 at 20:35, Pushkar Raste <pu...@gmail.com> wrote:
>
>> You may want to start solr with following settings to enable logging GC
>> details. Here are some flags you might want to enable.
>>
>> -Xloggc:<path_to_log_dir>/gc.log
>> -XX:+PrintGCDetails
>> -XX:+PrintGCDateStamps
>> -XX:+PrintGCTimeStamps
>> -XX:+PrintTenuringDistribution
>> -XX:+PrintGCApplicationStoppedTime
>> -XX:+PrintHeapAtGC
>>
>> Once you have GC logs, look for string "Total time for which application
>> threads were stopped" to check if you have long pauses (you may get long
>> pauses even with young generation GC).
>>
>
>
> Yes, there are several lines indicating that threads are being stopped.
> There is this one particularly that draw my attention because right after a
> second it happened 2 of my replicas went into
> recovery mode, including the one who suffered the thread stop.
>
> solr_gc.log.1.current:2015-10-15T07:47:03.263+0000: 251173.653: Total time
> for which application threads were stopped: 1.4936161 seconds, Stopping
> threads took: 0.0000502 seconds
> (is a second of stopped threads enough to have a node in recovery node ?)
> When this happened, the leader had a couple of connection resets while
> trying to communicate with this replica.
>
> and this server the highest stop takes 4s.
>
> solr_gc.log.1.current:2015-10-14T20:24:01.353+0000: 210191.743: Total time
> for which application threads were stopped: 4.0111066 seconds, Stopping
> threads took: 0.0000776 seconds
>
> These are the jvm flags
>
> -XX:NewSize=256m -XX:MaxNewSize=256m
> /usr/lib/jvm/java-8-oracle/bin/java -server -Xss256k -Xms16g -Xmx16g
> -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90
> -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
> -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark
> -XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000
> -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -verbose:gc
> -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps
> -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution
> -XX:+PrintGCApplicationStoppedTime -Xloggc:/solr_gc.log
> -DzkClientTimeout=15000   -Duser.timezone=UTC
> -Djava.net.preferIPv4Stack=true -DzkClientTimeout=30000  -XX:NewSize=256m
> -XX:MaxNewSize=256m
>
> All of the options are the default that come with the solr startup script,
> only the ones specified in the first line are being put by us.
>
>
>
>
>>
>> -- Pushkar Raste
>>
>> On Wed, Oct 14, 2015 at 11:47 AM, Lorenzo Fundaró <
>> lorenzo.fundaro@dawandamail.com> wrote:
>>
>> > <<What do you see if you attach
>> > &debug=true to the query?>>
>> >
>> > "debug": { "rawquerystring": "*:*", "querystring": "*:*", "parsedquery":
>> > "(+MatchAllDocsQuery(*:*))/no_coord", "parsedquery_toString": "+*:*", "
>> > explain": { "Product:47047358": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> product
>> > of:\n 1.0 = queryNorm\n", "Product:32211113": "\n1.0 = (MATCH)
>> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:30852121":
>> > "\n1.0
>> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> > Product:35018929": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
>> =
>> > queryNorm\n", "Product:31682082": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> > product of:\n 1.0 = queryNorm\n", "Product:31077677": "\n1.0 = (MATCH)
>> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:22298365":
>> > "\n1.0
>> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> > Product:41094514": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
>> =
>> > queryNorm\n", "Product:13106166": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> > product of:\n 1.0 = queryNorm\n", "Product:19142249": "\n1.0 = (MATCH)
>> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:38243373":
>> > "\n1.0
>> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> > Product:20434065": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
>> =
>> > queryNorm\n", "Product:25194801": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> > product of:\n 1.0 = queryNorm\n", "Product:885482": "\n1.0 = (MATCH)
>> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:45356790":
>> > "\n1.0
>> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> > Product:67719831": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
>> =
>> > queryNorm\n", "Product:12843394": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> > product of:\n 1.0 = queryNorm\n", "Product:38126213": "\n1.0 = (MATCH)
>> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:38798130":
>> > "\n1.0
>> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> > Product:30292169": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
>> =
>> > queryNorm\n", "Product:11535854": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> > product of:\n 1.0 = queryNorm\n", "Product:8443674": "\n1.0 = (MATCH)
>> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:51012182":
>> > "\n1.0
>> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> > Product:75780871": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
>> =
>> > queryNorm\n", "Product:20227881": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> > product of:\n 1.0 = queryNorm\n", "Product:38093629": "\n1.0 = (MATCH)
>> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:3142218":
>> > "\n1.0
>> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> > Product:15295602": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
>> =
>> > queryNorm\n", "Product:3375982": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> > product of:\n 1.0 = queryNorm\n", "Product:38276777": "\n1.0 = (MATCH)
>> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:10726118":
>> > "\n1.0
>> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> > Product:50827742": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
>> =
>> > queryNorm\n", "Product:5771722": "\n1.0 = (MATCH) MatchAllDocsQuery,
>> > product of:\n 1.0 = queryNorm\n", "Product:3245678": "\n1.0 = (MATCH)
>> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:13702130":
>> > "\n1.0
>> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
>> > Product:25679953": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
>> =
>> > queryNorm\n" }, "QParser": "ExtendedDismaxQParser", "altquerystring":
>> null,
>> > "boost_queries": null, "parsed_boost_queries": [], "boostfuncs": null, "
>> > filter_queries": [ "{!cost=1 cache=true}type_s:Product AND
>> > is_valid_b:true",
>> > "{!cost=50 cache=true}in_languages_t:de", "{!cost=99
>> > cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL)) AND
>> > (cents_ri: [* TO 3000])" ], "parsed_filter_queries": [ "+type_s:Product
>> > +is_valid_b:true", "in_languages_t:de", "{!cache=false
>> > cost=99}+(shipping_country_codes_mt:de shipping_country_codes_mt:euro
>> > shipping_country_codes_mt:eur shipping_country_codes_mt:all) +cents_ri:[*
>> > TO 3000]" ], "timing": { "time": 18, "prepare": { "time": 0, "query": { "
>> > time": 0 }, "facet": { "time": 0 }, "mlt": { "time": 0 }, "highlight": {
>> "
>> > time": 0 }, "stats": { "time": 0 }, "expand": { "time": 0 },
>> "spellcheck":
>> > {
>> > "time": 0 }, "debug": { "time": 0 } }, "process": { "time": 18, "query":
>> {
>> > "
>> > time": 0 }, "facet": { "time": 0 }, "mlt": { "time": 0 }, "highlight": {
>> "
>> > time": 0 }, "stats": { "time": 0 }, "expand": { "time": 0 },
>> "spellcheck":
>> > {
>> > "time": 0 }, "debug": { "time": 18 } }
>> >
>> > I think this clause is wrong:
>> > (cents_ri: [* 3000])
>> >
>> > I think you mean
>> > (cents_ri: [* TO 3000])
>> >
>> > I think I made no difference. I tried both and they both worked.
>> >
>> > But are these slow queries constant or intermittent?
>> >
>> > They are definetly cached. The second time runs in no time.
>> >
>> > I gonna try adding them in the pre warmcache too. And see the results.
>> >
>> > The field that I used for sorting is indexed but not stored and it's not
>> a
>> > DocValue. I tried the query without the sort and the performance didnt
>> > change significantly.
>> >
>> > By the time I saw that log, the server was getting around 7k updates per
>> > minute. The query information I don't have it now, but I think it will
>> > qualify as heavy load query.
>> >
>> > Thank you !
>> >
>> >
>> > On 14 October 2015 at 17:14, Erick Erickson <er...@gmail.com>
>> > wrote:
>> >
>> > > A couple of things don't particularly make sense here:
>> > >
>> > > You specify edismax, q=*:* yet you specify qf=
>> > > You're searching across whatever you defined as the default
>> > > field in the request handler. What do you see if you attach
>> > > &debug=true to the query?
>> > >
>> > > I think this clause is wrong:
>> > > (cents_ri: [* 3000])
>> > >
>> > > I think you mean
>> > > (cents_ri: [* TO 3000])
>> > >
>> > > I'm not sure either of those is the problem, but are places I'd start.
>> > >
>> > > As far as the size of your filter cache goes, a hit ratio of .87
>> actually
>> > > isn't bad. Upping the size would add some marginal benefit, but it's
>> > > unlikely to be a magic bullet.
>> > >
>> > > But are these slow queries constant or intermittent? In other words,
>> > > are all queries of this general form slow or just the first few? In
>> > > particular
>> > > is the first query that mentions sorting on this field slow but
>> > subsequent
>> > > ones faster? In that case consider adding a query to the newSearcher
>> > > event in solrconfig.xml that mentions this sort, that would pre-warm
>> > > the sort values. Also, defining all fields that you sort on as
>> > > docValues="true"
>> > > is recommended at this point.
>> > >
>> > > What I'd try is removing clauses to see which one is the problem. On
>> > > the surface this is surprisingly slow. And how heavily loaded is the
>> > > server?
>> > > Your autocommit settings look fine, my question is more how much
>> indexing
>> > > and querying is going on when you take these measurements.
>> > >
>> > > Best,
>> > > Erick
>> > >
>> > > On Wed, Oct 14, 2015 at 3:03 AM, Lorenzo Fundaró
>> > > <lo...@dawandamail.com> wrote:
>> > > > Hello,
>> > > >
>> > > > I have following conf for filters and commits :
>> > > >
>> > > > Concurrent LFU Cache(maxSize=64, initialSize=64, minSize=57,
>> > > > acceptableSize=60, cleanupThread=false, timeDecay=true,
>> > autowarmCount=8,
>> > > > regenerator=org.apache.solr.search.SolrIndexSearcher$2@169ee0fd)
>> > > >
>> > > >      <autoCommit>
>> > > >        <!-- Every 15 seconds -->
>> > > >        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>> > > >        <openSearcher>false</openSearcher>
>> > > >      </autoCommit>
>> > > >
>> > > >      <autoSoftCommit>
>> > > >        <!-- Every 10 minutes -->
>> > > >        <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
>> > > >      </autoSoftCommit>
>> > > >
>> > > > and the following stats for filters:
>> > > >
>> > > > lookups = 3602
>> > > > hits  =  3148
>> > > > hit ratio = 0.87
>> > > > inserts = 455
>> > > > evictions = 400
>> > > > size = 63
>> > > > warmupTime = 770
>> > > >
>> > > > *Problem: *a lot of slow queries, for example:
>> > > >
>> > > > {q=*:*&tie=1.0&defType=edismax&qt=standard&json.nl
>> > > =map&qf=&fl=pk_i,score&start=0&sort=view_counter_i
>> > > > desc&fq={!cost=1 cache=true}type_s:Product AND
>> > > is_valid_b:true&fq={!cost=50
>> > > > cache=true}in_languages_t:de&fq={!cost=99
>> > > > cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL))
>> AND
>> > > > (cents_ri: [* 3000])&rows=36&wt=json} hits=3768003 status=0
>> QTime=1378
>> > > >
>> > > > I could increase the size of the filter so I would decrease the
>> amount
>> > of
>> > > > evictions, but it seems to me this would not be solving the root
>> > problem.
>> > > >
>> > > > Some ideas on where/how to start for optimisation ? Is it actually
>> > normal
>> > > > that this query takes this time ?
>> > > >
>> > > > We have an index of ~14 million docs. 4 replicas with two cores and 1
>> > > shard
>> > > > each.
>> > > >
>> > > > thank you.
>> > > >
>> > > >
>> > > > --
>> > > >
>> > > > --
>> > > > Lorenzo Fundaro
>> > > > Backend Engineer
>> > > > E-Mail: lorenzo.fundaro@dawandamail.com
>> > > >
>> > > > Fax       + 49 - (0)30 - 25 76 08 52
>> > > > Tel        + 49 - (0)179 - 51 10 982
>> > > >
>> > > > DaWanda GmbH
>> > > > Windscheidstraße 18
>> > > > 10627 Berlin
>> > > >
>> > > > Geschäftsführer: Claudia Helming, Michael Pütz
>> > > > Amtsgericht Charlottenburg HRB 104695 B
>> > >
>> >
>> >
>> >
>> > --
>> >
>> > --
>> > Lorenzo Fundaro
>> > Backend Engineer
>> > E-Mail: lorenzo.fundaro@dawandamail.com
>> >
>> > Fax       + 49 - (0)30 - 25 76 08 52
>> > Tel        + 49 - (0)179 - 51 10 982
>> >
>> > DaWanda GmbH
>> > Windscheidstraße 18
>> > 10627 Berlin
>> >
>> > Geschäftsführer: Claudia Helming, Michael Pütz
>> > Amtsgericht Charlottenburg HRB 104695 B
>> >
>>
>
>
>
> --
>
> --
> Lorenzo Fundaro
> Backend Engineer
> E-Mail: lorenzo.fundaro@dawandamail.com
>
> Fax       + 49 - (0)30 - 25 76 08 52
> Tel        + 49 - (0)179 - 51 10 982
>
> DaWanda GmbH
> Windscheidstraße 18
> 10627 Berlin
>
> Geschäftsführer: Claudia Helming, Michael Pütz
> Amtsgericht Charlottenburg HRB 104695 B

Re: slow queries

Posted by Lorenzo Fundaró <lo...@dawandamail.com>.

On 14 October 2015 at 20:35, Pushkar Raste <pu...@gmail.com> wrote:

> You may want to start solr with following settings to enable logging GC
> details. Here are some flags you might want to enable.
>
> -Xloggc:<path_to_log_dir>/gc.log
> -XX:+PrintGCDetails
> -XX:+PrintGCDateStamps
> -XX:+PrintGCTimeStamps
> -XX:+PrintTenuringDistribution
> -XX:+PrintGCApplicationStoppedTime
> -XX:+PrintHeapAtGC
>
> Once you have GC logs, look for string "Total time for which application
> threads were stopped" to check if you have long pauses (you may get long
> pauses even with young generation GC).
>


Yes, there are several lines indicating that threads are being stopped.
There is this one particularly that draw my attention because right after a
second it happened 2 of my replicas went into
recovery mode, including the one who suffered the thread stop.

solr_gc.log.1.current:2015-10-15T07:47:03.263+0000: 251173.653: Total time
for which application threads were stopped: 1.4936161 seconds, Stopping
threads took: 0.0000502 seconds
(is a second of stopped threads enough to have a node in recovery node ?)
When this happened, the leader had a couple of connection resets while
trying to communicate with this replica.

and this server the highest stop takes 4s.

solr_gc.log.1.current:2015-10-14T20:24:01.353+0000: 210191.743: Total time
for which application threads were stopped: 4.0111066 seconds, Stopping
threads took: 0.0000776 seconds

These are the jvm flags

-XX:NewSize=256m -XX:MaxNewSize=256m
/usr/lib/jvm/java-8-oracle/bin/java -server -Xss256k -Xms16g -Xmx16g
-XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90
-XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark
-XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000
-XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -verbose:gc
-XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps
-XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime -Xloggc:/solr_gc.log
-DzkClientTimeout=15000   -Duser.timezone=UTC
-Djava.net.preferIPv4Stack=true -DzkClientTimeout=30000  -XX:NewSize=256m
-XX:MaxNewSize=256m

All of the options are the default that come with the solr startup script,
only the ones specified in the first line are being put by us.




>
> -- Pushkar Raste
>
> On Wed, Oct 14, 2015 at 11:47 AM, Lorenzo Fundaró <
> lorenzo.fundaro@dawandamail.com> wrote:
>
> > <<What do you see if you attach
> > &debug=true to the query?>>
> >
> > "debug": { "rawquerystring": "*:*", "querystring": "*:*", "parsedquery":
> > "(+MatchAllDocsQuery(*:*))/no_coord", "parsedquery_toString": "+*:*", "
> > explain": { "Product:47047358": "\n1.0 = (MATCH) MatchAllDocsQuery,
> product
> > of:\n 1.0 = queryNorm\n", "Product:32211113": "\n1.0 = (MATCH)
> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:30852121":
> > "\n1.0
> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> > Product:35018929": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
> =
> > queryNorm\n", "Product:31682082": "\n1.0 = (MATCH) MatchAllDocsQuery,
> > product of:\n 1.0 = queryNorm\n", "Product:31077677": "\n1.0 = (MATCH)
> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:22298365":
> > "\n1.0
> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> > Product:41094514": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
> =
> > queryNorm\n", "Product:13106166": "\n1.0 = (MATCH) MatchAllDocsQuery,
> > product of:\n 1.0 = queryNorm\n", "Product:19142249": "\n1.0 = (MATCH)
> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:38243373":
> > "\n1.0
> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> > Product:20434065": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
> =
> > queryNorm\n", "Product:25194801": "\n1.0 = (MATCH) MatchAllDocsQuery,
> > product of:\n 1.0 = queryNorm\n", "Product:885482": "\n1.0 = (MATCH)
> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:45356790":
> > "\n1.0
> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> > Product:67719831": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
> =
> > queryNorm\n", "Product:12843394": "\n1.0 = (MATCH) MatchAllDocsQuery,
> > product of:\n 1.0 = queryNorm\n", "Product:38126213": "\n1.0 = (MATCH)
> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:38798130":
> > "\n1.0
> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> > Product:30292169": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
> =
> > queryNorm\n", "Product:11535854": "\n1.0 = (MATCH) MatchAllDocsQuery,
> > product of:\n 1.0 = queryNorm\n", "Product:8443674": "\n1.0 = (MATCH)
> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:51012182":
> > "\n1.0
> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> > Product:75780871": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
> =
> > queryNorm\n", "Product:20227881": "\n1.0 = (MATCH) MatchAllDocsQuery,
> > product of:\n 1.0 = queryNorm\n", "Product:38093629": "\n1.0 = (MATCH)
> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:3142218":
> > "\n1.0
> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> > Product:15295602": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
> =
> > queryNorm\n", "Product:3375982": "\n1.0 = (MATCH) MatchAllDocsQuery,
> > product of:\n 1.0 = queryNorm\n", "Product:38276777": "\n1.0 = (MATCH)
> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:10726118":
> > "\n1.0
> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> > Product:50827742": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
> =
> > queryNorm\n", "Product:5771722": "\n1.0 = (MATCH) MatchAllDocsQuery,
> > product of:\n 1.0 = queryNorm\n", "Product:3245678": "\n1.0 = (MATCH)
> > MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:13702130":
> > "\n1.0
> > = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> > Product:25679953": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0
> =
> > queryNorm\n" }, "QParser": "ExtendedDismaxQParser", "altquerystring":
> null,
> > "boost_queries": null, "parsed_boost_queries": [], "boostfuncs": null, "
> > filter_queries": [ "{!cost=1 cache=true}type_s:Product AND
> > is_valid_b:true",
> > "{!cost=50 cache=true}in_languages_t:de", "{!cost=99
> > cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL)) AND
> > (cents_ri: [* TO 3000])" ], "parsed_filter_queries": [ "+type_s:Product
> > +is_valid_b:true", "in_languages_t:de", "{!cache=false
> > cost=99}+(shipping_country_codes_mt:de shipping_country_codes_mt:euro
> > shipping_country_codes_mt:eur shipping_country_codes_mt:all) +cents_ri:[*
> > TO 3000]" ], "timing": { "time": 18, "prepare": { "time": 0, "query": { "
> > time": 0 }, "facet": { "time": 0 }, "mlt": { "time": 0 }, "highlight": {
> "
> > time": 0 }, "stats": { "time": 0 }, "expand": { "time": 0 },
> "spellcheck":
> > {
> > "time": 0 }, "debug": { "time": 0 } }, "process": { "time": 18, "query":
> {
> > "
> > time": 0 }, "facet": { "time": 0 }, "mlt": { "time": 0 }, "highlight": {
> "
> > time": 0 }, "stats": { "time": 0 }, "expand": { "time": 0 },
> "spellcheck":
> > {
> > "time": 0 }, "debug": { "time": 18 } }
> >
> > I think this clause is wrong:
> > (cents_ri: [* 3000])
> >
> > I think you mean
> > (cents_ri: [* TO 3000])
> >
> > I think I made no difference. I tried both and they both worked.
> >
> > But are these slow queries constant or intermittent?
> >
> > They are definetly cached. The second time runs in no time.
> >
> > I gonna try adding them in the pre warmcache too. And see the results.
> >
> > The field that I used for sorting is indexed but not stored and it's not
> a
> > DocValue. I tried the query without the sort and the performance didnt
> > change significantly.
> >
> > By the time I saw that log, the server was getting around 7k updates per
> > minute. The query information I don't have it now, but I think it will
> > qualify as heavy load query.
> >
> > Thank you !
> >
> >
> > On 14 October 2015 at 17:14, Erick Erickson <er...@gmail.com>
> > wrote:
> >
> > > A couple of things don't particularly make sense here:
> > >
> > > You specify edismax, q=*:* yet you specify qf=
> > > You're searching across whatever you defined as the default
> > > field in the request handler. What do you see if you attach
> > > &debug=true to the query?
> > >
> > > I think this clause is wrong:
> > > (cents_ri: [* 3000])
> > >
> > > I think you mean
> > > (cents_ri: [* TO 3000])
> > >
> > > I'm not sure either of those is the problem, but are places I'd start.
> > >
> > > As far as the size of your filter cache goes, a hit ratio of .87
> actually
> > > isn't bad. Upping the size would add some marginal benefit, but it's
> > > unlikely to be a magic bullet.
> > >
> > > But are these slow queries constant or intermittent? In other words,
> > > are all queries of this general form slow or just the first few? In
> > > particular
> > > is the first query that mentions sorting on this field slow but
> > subsequent
> > > ones faster? In that case consider adding a query to the newSearcher
> > > event in solrconfig.xml that mentions this sort, that would pre-warm
> > > the sort values. Also, defining all fields that you sort on as
> > > docValues="true"
> > > is recommended at this point.
> > >
> > > What I'd try is removing clauses to see which one is the problem. On
> > > the surface this is surprisingly slow. And how heavily loaded is the
> > > server?
> > > Your autocommit settings look fine, my question is more how much
> indexing
> > > and querying is going on when you take these measurements.
> > >
> > > Best,
> > > Erick
> > >
> > > On Wed, Oct 14, 2015 at 3:03 AM, Lorenzo Fundaró
> > > <lo...@dawandamail.com> wrote:
> > > > Hello,
> > > >
> > > > I have following conf for filters and commits :
> > > >
> > > > Concurrent LFU Cache(maxSize=64, initialSize=64, minSize=57,
> > > > acceptableSize=60, cleanupThread=false, timeDecay=true,
> > autowarmCount=8,
> > > > regenerator=org.apache.solr.search.SolrIndexSearcher$2@169ee0fd)
> > > >
> > > >      <autoCommit>
> > > >        <!-- Every 15 seconds -->
> > > >        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
> > > >        <openSearcher>false</openSearcher>
> > > >      </autoCommit>
> > > >
> > > >      <autoSoftCommit>
> > > >        <!-- Every 10 minutes -->
> > > >        <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
> > > >      </autoSoftCommit>
> > > >
> > > > and the following stats for filters:
> > > >
> > > > lookups = 3602
> > > > hits  =  3148
> > > > hit ratio = 0.87
> > > > inserts = 455
> > > > evictions = 400
> > > > size = 63
> > > > warmupTime = 770
> > > >
> > > > *Problem: *a lot of slow queries, for example:
> > > >
> > > > {q=*:*&tie=1.0&defType=edismax&qt=standard&json.nl
> > > =map&qf=&fl=pk_i,score&start=0&sort=view_counter_i
> > > > desc&fq={!cost=1 cache=true}type_s:Product AND
> > > is_valid_b:true&fq={!cost=50
> > > > cache=true}in_languages_t:de&fq={!cost=99
> > > > cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL))
> AND
> > > > (cents_ri: [* 3000])&rows=36&wt=json} hits=3768003 status=0
> QTime=1378
> > > >
> > > > I could increase the size of the filter so I would decrease the
> amount
> > of
> > > > evictions, but it seems to me this would not be solving the root
> > problem.
> > > >
> > > > Some ideas on where/how to start for optimisation ? Is it actually
> > normal
> > > > that this query takes this time ?
> > > >
> > > > We have an index of ~14 million docs. 4 replicas with two cores and 1
> > > shard
> > > > each.
> > > >
> > > > thank you.
> > > >
> > > >
> > > > --
> > > >
> > > > --
> > > > Lorenzo Fundaro
> > > > Backend Engineer
> > > > E-Mail: lorenzo.fundaro@dawandamail.com
> > > >
> > > > Fax       + 49 - (0)30 - 25 76 08 52
> > > > Tel        + 49 - (0)179 - 51 10 982
> > > >
> > > > DaWanda GmbH
> > > > Windscheidstraße 18
> > > > 10627 Berlin
> > > >
> > > > Geschäftsführer: Claudia Helming, Michael Pütz
> > > > Amtsgericht Charlottenburg HRB 104695 B
> > >
> >
> >
> >
> > --
> >
> > --
> > Lorenzo Fundaro
> > Backend Engineer
> > E-Mail: lorenzo.fundaro@dawandamail.com
> >
> > Fax       + 49 - (0)30 - 25 76 08 52
> > Tel        + 49 - (0)179 - 51 10 982
> >
> > DaWanda GmbH
> > Windscheidstraße 18
> > 10627 Berlin
> >
> > Geschäftsführer: Claudia Helming, Michael Pütz
> > Amtsgericht Charlottenburg HRB 104695 B
> >
>



-- 

-- 
Lorenzo Fundaro
Backend Engineer
E-Mail: lorenzo.fundaro@dawandamail.com

Fax       + 49 - (0)30 - 25 76 08 52
Tel        + 49 - (0)179 - 51 10 982

DaWanda GmbH
Windscheidstraße 18
10627 Berlin

Geschäftsführer: Claudia Helming, Michael Pütz
Amtsgericht Charlottenburg HRB 104695 B

Re: slow queries

Posted by Pushkar Raste <pu...@gmail.com>.

You may want to start solr with following settings to enable logging GC
details. Here are some flags you might want to enable.

-Xloggc:<path_to_log_dir>/gc.log
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintGCTimeStamps
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintHeapAtGC

Once you have GC logs, look for string "Total time for which application
threads were stopped" to check if you have long pauses (you may get long
pauses even with young generation GC).

-- Pushkar Raste

On Wed, Oct 14, 2015 at 11:47 AM, Lorenzo Fundaró <
lorenzo.fundaro@dawandamail.com> wrote:

> <<What do you see if you attach
> &debug=true to the query?>>
>
> "debug": { "rawquerystring": "*:*", "querystring": "*:*", "parsedquery":
> "(+MatchAllDocsQuery(*:*))/no_coord", "parsedquery_toString": "+*:*", "
> explain": { "Product:47047358": "\n1.0 = (MATCH) MatchAllDocsQuery, product
> of:\n 1.0 = queryNorm\n", "Product:32211113": "\n1.0 = (MATCH)
> MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:30852121":
> "\n1.0
> = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> Product:35018929": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
> queryNorm\n", "Product:31682082": "\n1.0 = (MATCH) MatchAllDocsQuery,
> product of:\n 1.0 = queryNorm\n", "Product:31077677": "\n1.0 = (MATCH)
> MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:22298365":
> "\n1.0
> = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> Product:41094514": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
> queryNorm\n", "Product:13106166": "\n1.0 = (MATCH) MatchAllDocsQuery,
> product of:\n 1.0 = queryNorm\n", "Product:19142249": "\n1.0 = (MATCH)
> MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:38243373":
> "\n1.0
> = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> Product:20434065": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
> queryNorm\n", "Product:25194801": "\n1.0 = (MATCH) MatchAllDocsQuery,
> product of:\n 1.0 = queryNorm\n", "Product:885482": "\n1.0 = (MATCH)
> MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:45356790":
> "\n1.0
> = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> Product:67719831": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
> queryNorm\n", "Product:12843394": "\n1.0 = (MATCH) MatchAllDocsQuery,
> product of:\n 1.0 = queryNorm\n", "Product:38126213": "\n1.0 = (MATCH)
> MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:38798130":
> "\n1.0
> = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> Product:30292169": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
> queryNorm\n", "Product:11535854": "\n1.0 = (MATCH) MatchAllDocsQuery,
> product of:\n 1.0 = queryNorm\n", "Product:8443674": "\n1.0 = (MATCH)
> MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:51012182":
> "\n1.0
> = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> Product:75780871": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
> queryNorm\n", "Product:20227881": "\n1.0 = (MATCH) MatchAllDocsQuery,
> product of:\n 1.0 = queryNorm\n", "Product:38093629": "\n1.0 = (MATCH)
> MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:3142218":
> "\n1.0
> = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> Product:15295602": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
> queryNorm\n", "Product:3375982": "\n1.0 = (MATCH) MatchAllDocsQuery,
> product of:\n 1.0 = queryNorm\n", "Product:38276777": "\n1.0 = (MATCH)
> MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:10726118":
> "\n1.0
> = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> Product:50827742": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
> queryNorm\n", "Product:5771722": "\n1.0 = (MATCH) MatchAllDocsQuery,
> product of:\n 1.0 = queryNorm\n", "Product:3245678": "\n1.0 = (MATCH)
> MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:13702130":
> "\n1.0
> = (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
> Product:25679953": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
> queryNorm\n" }, "QParser": "ExtendedDismaxQParser", "altquerystring": null,
> "boost_queries": null, "parsed_boost_queries": [], "boostfuncs": null, "
> filter_queries": [ "{!cost=1 cache=true}type_s:Product AND
> is_valid_b:true",
> "{!cost=50 cache=true}in_languages_t:de", "{!cost=99
> cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL)) AND
> (cents_ri: [* TO 3000])" ], "parsed_filter_queries": [ "+type_s:Product
> +is_valid_b:true", "in_languages_t:de", "{!cache=false
> cost=99}+(shipping_country_codes_mt:de shipping_country_codes_mt:euro
> shipping_country_codes_mt:eur shipping_country_codes_mt:all) +cents_ri:[*
> TO 3000]" ], "timing": { "time": 18, "prepare": { "time": 0, "query": { "
> time": 0 }, "facet": { "time": 0 }, "mlt": { "time": 0 }, "highlight": { "
> time": 0 }, "stats": { "time": 0 }, "expand": { "time": 0 }, "spellcheck":
> {
> "time": 0 }, "debug": { "time": 0 } }, "process": { "time": 18, "query": {
> "
> time": 0 }, "facet": { "time": 0 }, "mlt": { "time": 0 }, "highlight": { "
> time": 0 }, "stats": { "time": 0 }, "expand": { "time": 0 }, "spellcheck":
> {
> "time": 0 }, "debug": { "time": 18 } }
>
> I think this clause is wrong:
> (cents_ri: [* 3000])
>
> I think you mean
> (cents_ri: [* TO 3000])
>
> I think I made no difference. I tried both and they both worked.
>
> But are these slow queries constant or intermittent?
>
> They are definetly cached. The second time runs in no time.
>
> I gonna try adding them in the pre warmcache too. And see the results.
>
> The field that I used for sorting is indexed but not stored and it's not a
> DocValue. I tried the query without the sort and the performance didnt
> change significantly.
>
> By the time I saw that log, the server was getting around 7k updates per
> minute. The query information I don't have it now, but I think it will
> qualify as heavy load query.
>
> Thank you !
>
>
> On 14 October 2015 at 17:14, Erick Erickson <er...@gmail.com>
> wrote:
>
> > A couple of things don't particularly make sense here:
> >
> > You specify edismax, q=*:* yet you specify qf=
> > You're searching across whatever you defined as the default
> > field in the request handler. What do you see if you attach
> > &debug=true to the query?
> >
> > I think this clause is wrong:
> > (cents_ri: [* 3000])
> >
> > I think you mean
> > (cents_ri: [* TO 3000])
> >
> > I'm not sure either of those is the problem, but are places I'd start.
> >
> > As far as the size of your filter cache goes, a hit ratio of .87 actually
> > isn't bad. Upping the size would add some marginal benefit, but it's
> > unlikely to be a magic bullet.
> >
> > But are these slow queries constant or intermittent? In other words,
> > are all queries of this general form slow or just the first few? In
> > particular
> > is the first query that mentions sorting on this field slow but
> subsequent
> > ones faster? In that case consider adding a query to the newSearcher
> > event in solrconfig.xml that mentions this sort, that would pre-warm
> > the sort values. Also, defining all fields that you sort on as
> > docValues="true"
> > is recommended at this point.
> >
> > What I'd try is removing clauses to see which one is the problem. On
> > the surface this is surprisingly slow. And how heavily loaded is the
> > server?
> > Your autocommit settings look fine, my question is more how much indexing
> > and querying is going on when you take these measurements.
> >
> > Best,
> > Erick
> >
> > On Wed, Oct 14, 2015 at 3:03 AM, Lorenzo Fundaró
> > <lo...@dawandamail.com> wrote:
> > > Hello,
> > >
> > > I have following conf for filters and commits :
> > >
> > > Concurrent LFU Cache(maxSize=64, initialSize=64, minSize=57,
> > > acceptableSize=60, cleanupThread=false, timeDecay=true,
> autowarmCount=8,
> > > regenerator=org.apache.solr.search.SolrIndexSearcher$2@169ee0fd)
> > >
> > >      <autoCommit>
> > >        <!-- Every 15 seconds -->
> > >        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
> > >        <openSearcher>false</openSearcher>
> > >      </autoCommit>
> > >
> > >      <autoSoftCommit>
> > >        <!-- Every 10 minutes -->
> > >        <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
> > >      </autoSoftCommit>
> > >
> > > and the following stats for filters:
> > >
> > > lookups = 3602
> > > hits  =  3148
> > > hit ratio = 0.87
> > > inserts = 455
> > > evictions = 400
> > > size = 63
> > > warmupTime = 770
> > >
> > > *Problem: *a lot of slow queries, for example:
> > >
> > > {q=*:*&tie=1.0&defType=edismax&qt=standard&json.nl
> > =map&qf=&fl=pk_i,score&start=0&sort=view_counter_i
> > > desc&fq={!cost=1 cache=true}type_s:Product AND
> > is_valid_b:true&fq={!cost=50
> > > cache=true}in_languages_t:de&fq={!cost=99
> > > cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL)) AND
> > > (cents_ri: [* 3000])&rows=36&wt=json} hits=3768003 status=0 QTime=1378
> > >
> > > I could increase the size of the filter so I would decrease the amount
> of
> > > evictions, but it seems to me this would not be solving the root
> problem.
> > >
> > > Some ideas on where/how to start for optimisation ? Is it actually
> normal
> > > that this query takes this time ?
> > >
> > > We have an index of ~14 million docs. 4 replicas with two cores and 1
> > shard
> > > each.
> > >
> > > thank you.
> > >
> > >
> > > --
> > >
> > > --
> > > Lorenzo Fundaro
> > > Backend Engineer
> > > E-Mail: lorenzo.fundaro@dawandamail.com
> > >
> > > Fax       + 49 - (0)30 - 25 76 08 52
> > > Tel        + 49 - (0)179 - 51 10 982
> > >
> > > DaWanda GmbH
> > > Windscheidstraße 18
> > > 10627 Berlin
> > >
> > > Geschäftsführer: Claudia Helming, Michael Pütz
> > > Amtsgericht Charlottenburg HRB 104695 B
> >
>
>
>
> --
>
> --
> Lorenzo Fundaro
> Backend Engineer
> E-Mail: lorenzo.fundaro@dawandamail.com
>
> Fax       + 49 - (0)30 - 25 76 08 52
> Tel        + 49 - (0)179 - 51 10 982
>
> DaWanda GmbH
> Windscheidstraße 18
> 10627 Berlin
>
> Geschäftsführer: Claudia Helming, Michael Pütz
> Amtsgericht Charlottenburg HRB 104695 B
>

Re: slow queries

Posted by Lorenzo Fundaró <lo...@dawandamail.com>.

<<What do you see if you attach
&debug=true to the query?>>

"debug": { "rawquerystring": "*:*", "querystring": "*:*", "parsedquery":
"(+MatchAllDocsQuery(*:*))/no_coord", "parsedquery_toString": "+*:*", "
explain": { "Product:47047358": "\n1.0 = (MATCH) MatchAllDocsQuery, product
of:\n 1.0 = queryNorm\n", "Product:32211113": "\n1.0 = (MATCH)
MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:30852121": "\n1.0
= (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
Product:35018929": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
queryNorm\n", "Product:31682082": "\n1.0 = (MATCH) MatchAllDocsQuery,
product of:\n 1.0 = queryNorm\n", "Product:31077677": "\n1.0 = (MATCH)
MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:22298365": "\n1.0
= (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
Product:41094514": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
queryNorm\n", "Product:13106166": "\n1.0 = (MATCH) MatchAllDocsQuery,
product of:\n 1.0 = queryNorm\n", "Product:19142249": "\n1.0 = (MATCH)
MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:38243373": "\n1.0
= (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
Product:20434065": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
queryNorm\n", "Product:25194801": "\n1.0 = (MATCH) MatchAllDocsQuery,
product of:\n 1.0 = queryNorm\n", "Product:885482": "\n1.0 = (MATCH)
MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:45356790": "\n1.0
= (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
Product:67719831": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
queryNorm\n", "Product:12843394": "\n1.0 = (MATCH) MatchAllDocsQuery,
product of:\n 1.0 = queryNorm\n", "Product:38126213": "\n1.0 = (MATCH)
MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:38798130": "\n1.0
= (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
Product:30292169": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
queryNorm\n", "Product:11535854": "\n1.0 = (MATCH) MatchAllDocsQuery,
product of:\n 1.0 = queryNorm\n", "Product:8443674": "\n1.0 = (MATCH)
MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:51012182": "\n1.0
= (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
Product:75780871": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
queryNorm\n", "Product:20227881": "\n1.0 = (MATCH) MatchAllDocsQuery,
product of:\n 1.0 = queryNorm\n", "Product:38093629": "\n1.0 = (MATCH)
MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:3142218": "\n1.0
= (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
Product:15295602": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
queryNorm\n", "Product:3375982": "\n1.0 = (MATCH) MatchAllDocsQuery,
product of:\n 1.0 = queryNorm\n", "Product:38276777": "\n1.0 = (MATCH)
MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:10726118": "\n1.0
= (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
Product:50827742": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
queryNorm\n", "Product:5771722": "\n1.0 = (MATCH) MatchAllDocsQuery,
product of:\n 1.0 = queryNorm\n", "Product:3245678": "\n1.0 = (MATCH)
MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "Product:13702130": "\n1.0
= (MATCH) MatchAllDocsQuery, product of:\n 1.0 = queryNorm\n", "
Product:25679953": "\n1.0 = (MATCH) MatchAllDocsQuery, product of:\n 1.0 =
queryNorm\n" }, "QParser": "ExtendedDismaxQParser", "altquerystring": null,
"boost_queries": null, "parsed_boost_queries": [], "boostfuncs": null, "
filter_queries": [ "{!cost=1 cache=true}type_s:Product AND is_valid_b:true",
"{!cost=50 cache=true}in_languages_t:de", "{!cost=99
cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL)) AND
(cents_ri: [* TO 3000])" ], "parsed_filter_queries": [ "+type_s:Product
+is_valid_b:true", "in_languages_t:de", "{!cache=false
cost=99}+(shipping_country_codes_mt:de shipping_country_codes_mt:euro
shipping_country_codes_mt:eur shipping_country_codes_mt:all) +cents_ri:[*
TO 3000]" ], "timing": { "time": 18, "prepare": { "time": 0, "query": { "
time": 0 }, "facet": { "time": 0 }, "mlt": { "time": 0 }, "highlight": { "
time": 0 }, "stats": { "time": 0 }, "expand": { "time": 0 }, "spellcheck": {
"time": 0 }, "debug": { "time": 0 } }, "process": { "time": 18, "query": { "
time": 0 }, "facet": { "time": 0 }, "mlt": { "time": 0 }, "highlight": { "
time": 0 }, "stats": { "time": 0 }, "expand": { "time": 0 }, "spellcheck": {
"time": 0 }, "debug": { "time": 18 } }

I think this clause is wrong:
(cents_ri: [* 3000])

I think you mean
(cents_ri: [* TO 3000])

I think I made no difference. I tried both and they both worked.

But are these slow queries constant or intermittent?

They are definetly cached. The second time runs in no time.

I gonna try adding them in the pre warmcache too. And see the results.

The field that I used for sorting is indexed but not stored and it's not a
DocValue. I tried the query without the sort and the performance didnt
change significantly.

By the time I saw that log, the server was getting around 7k updates per
minute. The query information I don't have it now, but I think it will
qualify as heavy load query.

Thank you !


On 14 October 2015 at 17:14, Erick Erickson <er...@gmail.com> wrote:

> A couple of things don't particularly make sense here:
>
> You specify edismax, q=*:* yet you specify qf=
> You're searching across whatever you defined as the default
> field in the request handler. What do you see if you attach
> &debug=true to the query?
>
> I think this clause is wrong:
> (cents_ri: [* 3000])
>
> I think you mean
> (cents_ri: [* TO 3000])
>
> I'm not sure either of those is the problem, but are places I'd start.
>
> As far as the size of your filter cache goes, a hit ratio of .87 actually
> isn't bad. Upping the size would add some marginal benefit, but it's
> unlikely to be a magic bullet.
>
> But are these slow queries constant or intermittent? In other words,
> are all queries of this general form slow or just the first few? In
> particular
> is the first query that mentions sorting on this field slow but subsequent
> ones faster? In that case consider adding a query to the newSearcher
> event in solrconfig.xml that mentions this sort, that would pre-warm
> the sort values. Also, defining all fields that you sort on as
> docValues="true"
> is recommended at this point.
>
> What I'd try is removing clauses to see which one is the problem. On
> the surface this is surprisingly slow. And how heavily loaded is the
> server?
> Your autocommit settings look fine, my question is more how much indexing
> and querying is going on when you take these measurements.
>
> Best,
> Erick
>
> On Wed, Oct 14, 2015 at 3:03 AM, Lorenzo Fundaró
> <lo...@dawandamail.com> wrote:
> > Hello,
> >
> > I have following conf for filters and commits :
> >
> > Concurrent LFU Cache(maxSize=64, initialSize=64, minSize=57,
> > acceptableSize=60, cleanupThread=false, timeDecay=true, autowarmCount=8,
> > regenerator=org.apache.solr.search.SolrIndexSearcher$2@169ee0fd)
> >
> >      <autoCommit>
> >        <!-- Every 15 seconds -->
> >        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
> >        <openSearcher>false</openSearcher>
> >      </autoCommit>
> >
> >      <autoSoftCommit>
> >        <!-- Every 10 minutes -->
> >        <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
> >      </autoSoftCommit>
> >
> > and the following stats for filters:
> >
> > lookups = 3602
> > hits  =  3148
> > hit ratio = 0.87
> > inserts = 455
> > evictions = 400
> > size = 63
> > warmupTime = 770
> >
> > *Problem: *a lot of slow queries, for example:
> >
> > {q=*:*&tie=1.0&defType=edismax&qt=standard&json.nl
> =map&qf=&fl=pk_i,score&start=0&sort=view_counter_i
> > desc&fq={!cost=1 cache=true}type_s:Product AND
> is_valid_b:true&fq={!cost=50
> > cache=true}in_languages_t:de&fq={!cost=99
> > cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL)) AND
> > (cents_ri: [* 3000])&rows=36&wt=json} hits=3768003 status=0 QTime=1378
> >
> > I could increase the size of the filter so I would decrease the amount of
> > evictions, but it seems to me this would not be solving the root problem.
> >
> > Some ideas on where/how to start for optimisation ? Is it actually normal
> > that this query takes this time ?
> >
> > We have an index of ~14 million docs. 4 replicas with two cores and 1
> shard
> > each.
> >
> > thank you.
> >
> >
> > --
> >
> > --
> > Lorenzo Fundaro
> > Backend Engineer
> > E-Mail: lorenzo.fundaro@dawandamail.com
> >
> > Fax       + 49 - (0)30 - 25 76 08 52
> > Tel        + 49 - (0)179 - 51 10 982
> >
> > DaWanda GmbH
> > Windscheidstraße 18
> > 10627 Berlin
> >
> > Geschäftsführer: Claudia Helming, Michael Pütz
> > Amtsgericht Charlottenburg HRB 104695 B
>



-- 

-- 
Lorenzo Fundaro
Backend Engineer
E-Mail: lorenzo.fundaro@dawandamail.com

Fax       + 49 - (0)30 - 25 76 08 52
Tel        + 49 - (0)179 - 51 10 982

DaWanda GmbH
Windscheidstraße 18
10627 Berlin

Geschäftsführer: Claudia Helming, Michael Pütz
Amtsgericht Charlottenburg HRB 104695 B

Re: slow queries

Posted by Erick Erickson <er...@gmail.com>.

A couple of things don't particularly make sense here:

You specify edismax, q=*:* yet you specify qf=
You're searching across whatever you defined as the default
field in the request handler. What do you see if you attach
&debug=true to the query?

I think this clause is wrong:
(cents_ri: [* 3000])

I think you mean
(cents_ri: [* TO 3000])

I'm not sure either of those is the problem, but are places I'd start.

As far as the size of your filter cache goes, a hit ratio of .87 actually
isn't bad. Upping the size would add some marginal benefit, but it's
unlikely to be a magic bullet.

But are these slow queries constant or intermittent? In other words,
are all queries of this general form slow or just the first few? In particular
is the first query that mentions sorting on this field slow but subsequent
ones faster? In that case consider adding a query to the newSearcher
event in solrconfig.xml that mentions this sort, that would pre-warm
the sort values. Also, defining all fields that you sort on as docValues="true"
is recommended at this point.

What I'd try is removing clauses to see which one is the problem. On
the surface this is surprisingly slow. And how heavily loaded is the server?
Your autocommit settings look fine, my question is more how much indexing
and querying is going on when you take these measurements.

Best,
Erick

On Wed, Oct 14, 2015 at 3:03 AM, Lorenzo Fundaró
<lo...@dawandamail.com> wrote:
> Hello,
>
> I have following conf for filters and commits :
>
> Concurrent LFU Cache(maxSize=64, initialSize=64, minSize=57,
> acceptableSize=60, cleanupThread=false, timeDecay=true, autowarmCount=8,
> regenerator=org.apache.solr.search.SolrIndexSearcher$2@169ee0fd)
>
>      <autoCommit>
>        <!-- Every 15 seconds -->
>        <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>        <openSearcher>false</openSearcher>
>      </autoCommit>
>
>      <autoSoftCommit>
>        <!-- Every 10 minutes -->
>        <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
>      </autoSoftCommit>
>
> and the following stats for filters:
>
> lookups = 3602
> hits  =  3148
> hit ratio = 0.87
> inserts = 455
> evictions = 400
> size = 63
> warmupTime = 770
>
> *Problem: *a lot of slow queries, for example:
>
> {q=*:*&tie=1.0&defType=edismax&qt=standard&json.nl=map&qf=&fl=pk_i,score&start=0&sort=view_counter_i
> desc&fq={!cost=1 cache=true}type_s:Product AND is_valid_b:true&fq={!cost=50
> cache=true}in_languages_t:de&fq={!cost=99
> cache=false}(shipping_country_codes_mt: (DE OR EURO OR EUR OR ALL)) AND
> (cents_ri: [* 3000])&rows=36&wt=json} hits=3768003 status=0 QTime=1378
>
> I could increase the size of the filter so I would decrease the amount of
> evictions, but it seems to me this would not be solving the root problem.
>
> Some ideas on where/how to start for optimisation ? Is it actually normal
> that this query takes this time ?
>
> We have an index of ~14 million docs. 4 replicas with two cores and 1 shard
> each.
>
> thank you.
>
>
> --
>
> --
> Lorenzo Fundaro
> Backend Engineer
> E-Mail: lorenzo.fundaro@dawandamail.com
>
> Fax       + 49 - (0)30 - 25 76 08 52
> Tel        + 49 - (0)179 - 51 10 982
>
> DaWanda GmbH
> Windscheidstraße 18
> 10627 Berlin
>
> Geschäftsführer: Claudia Helming, Michael Pütz
> Amtsgericht Charlottenburg HRB 104695 B