Posted to solr-user@lucene.apache.org by Markus Jelsma <ma...@openindex.io> on 2017/07/07 11:01:10 UTC

Slowly running OOM due to Query instances?!

Hello,

This morning I spotted our QTime suddenly going up. This has been going on for a few hours now and coincides with a serious increase in heap consumption. No node has run out of memory so far, but either that is going to happen soon or the nodes will become unusable in some other way.

I restarted one of the Solr instances and pointed VisualVM at it and at some other nodes that use too much heap. After starting the memory sampler, something was obvious straight away.

The nodes consuming too much heap all hold a large number of *Query and BooleanClause instances: PayloadScoreQuery, TermQuery, BoostQuery, BooleanQuery, SpanTermQuery and so forth. There are lots of Builder and Term instances too, in stark contrast to the freshly restarted node.

Another peculiarity: some nodes have exactly 65536 instances of TermQuery and/or BoostQuery. This is probably unrelated, but not something I would have expected to see anyway.

So, what's up? We do have a custom query parser extending EdismaxQParser; it transliterates dates and creates payload and span queries. I may be doing something wrong, but I don't know what. I have written and used a variety of QParsers for many years, but this is new. Any hints on where to look or what to watch out for?

Many thanks!
Markus

Xmx 800m, 8 GB RAM, SSD
2 shards, three replicas
replica size ~17 GB, 2.2 million docs/replica

Re: Slowly running OOM due to Query instances?!

Posted by Erik Hatcher <er...@gmail.com>.
With generated queries, one has to be really careful with the .equals and .hashCode implementations. That may not be applicable here, but it is something that has bitten me with caching. Note that fixes were made to PayloadScoreQuery in this regard in Solr 6.6; see LUCENE-7808 and LUCENE-7481.
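
To illustrate the .equals/.hashCode point, here is a minimal plain-Java sketch. The classes NaiveTermQuery, ValueTermQuery and QueryCacheKeyDemo are hypothetical stand-ins, not Lucene or Solr classes, and the HashMap is only a simplified model of a query-keyed cache. It shows that when a generated query object lacks value-based equality, every parse of the same logical query becomes a new cache key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for a generated query; NOT a Lucene class. */
class NaiveTermQuery {
    final String field;
    final String text;

    NaiveTermQuery(String field, String text) {
        this.field = field;
        this.text = text;
    }
    // No equals/hashCode overrides: identity semantics from Object apply.
}

/** Same data, but with value-based equals and hashCode. */
final class ValueTermQuery {
    final String field;
    final String text;

    ValueTermQuery(String field, String text) {
        this.field = field;
        this.text = text;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ValueTermQuery)) return false;
        ValueTermQuery that = (ValueTermQuery) other;
        return field.equals(that.field) && text.equals(that.text);
    }

    @Override
    public int hashCode() {
        return Objects.hash(field, text);
    }
}

final class QueryCacheKeyDemo {
    /** Builds the same logical query `requests` times and caches a dummy result per key. */
    static int cachedEntries(boolean valueEquality, int requests) {
        Map<Object, String> cache = new HashMap<>();
        for (int i = 0; i < requests; i++) {
            Object key = valueEquality
                    ? new ValueTermQuery("title", "solr")
                    : new NaiveTermQuery("title", "solr");
            cache.putIfAbsent(key, "cached result");
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("identity keys: " + cachedEntries(false, 1000)); // 1000 entries
        System.out.println("value keys:    " + cachedEntries(true, 1000));  // 1 entry
    }
}
```

In a size-bounded cache the same defect would show up as a hit ratio near zero and constant eviction rather than unlimited growth. Whether it explains the heap usage in this thread is not established.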

	Erik




Re: Slowly running OOM due to Query instances?!

Posted by Susheel Kumar <su...@gmail.com>.
Xmx 800m sounds low regardless. Do you know how much heap your caches may
consume in total, based on your current solrconfig.xml settings? Also, I
assume the 2 shards and 3 replicas are spread over 6 such machines.
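
As a rough way to put a number on cache consumption, here is a back-of-the-envelope sketch in Java. It assumes the worst case, in which every cached filter is held as a bitset with one bit per document, and it uses a hypothetical cache size of 512 entries. The class name CacheHeapEstimate and the entry count are assumptions for illustration; the real figures come from solrconfig.xml, and maxDoc also counts deleted documents.

```java
final class CacheHeapEstimate {
    /** Bytes for one cached filter stored as a bitset: one bit per document, rounded up. */
    static long bitsetBytes(long maxDoc) {
        return (maxDoc + 7) / 8;
    }

    /** Worst case: every cache entry is a full bitset. */
    static long worstCaseBytes(long maxDoc, int cacheEntries) {
        return bitsetBytes(maxDoc) * cacheEntries;
    }

    public static void main(String[] args) {
        long maxDoc = 2_200_000L; // docs per replica, from the original post
        int entries = 512;        // hypothetical cache size, an assumption
        long bytes = worstCaseBytes(maxDoc, entries);
        System.out.printf("worst case: %.1f MiB%n", bytes / (1024.0 * 1024.0));
    }
}
```

Under these assumptions a single filter takes 275,000 bytes, and 512 of them take about 134 MiB, which is a noticeable share of an 800 MB heap before any other cache or the queries themselves are counted.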

Thanks,
Susheel
