Posted to solr-user@lucene.apache.org by solrnoobie <ra...@yahoo.com> on 2019/05/03 08:32:43 UTC

Solr long q values

Whenever we have long q values (from a sentence to a small paragraph), we
encounter heap problems (OOM). I guess this is normal?

So my question is: how should we handle this type of problem? Of
course, we could always limit the size of the search queries on the
application side, but is there anything we could do in our configuration that
would prevent the OOM issues even if some random user intentionally bombards
us with long search queries from the front end?
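
For the application-side limit mentioned above, a minimal sketch in Python
might look like the following. The function name clamp_query and the limits
of 25 terms and 512 characters are hypothetical values chosen for
illustration; they are not Solr settings.

```python
def clamp_query(q, max_terms=25, max_chars=512):
    """Trim a user-supplied q value before it is sent to Solr.

    max_terms and max_chars are illustrative limits, not Solr settings.
    """
    # Keep only the first max_terms whitespace-separated terms.
    clamped = " ".join(q.split()[:max_terms])
    if len(clamped) > max_chars:
        # Cut at the character limit, then drop the last (possibly
        # partial) term so the query never ends in a fragment.
        clamped = clamped[:max_chars].rsplit(" ", 1)[0]
    return clamped
```

Calling clamp_query on the raw user input before building the request keeps
the worst case bounded no matter what the front end submits.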



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr long q values

Posted by solrnoobie <ra...@yahoo.com>.
Thank you for the replies!

Thanks to everyone's insight, I was able to deduce that the problem was in
our configuration.

Our heap size was 10 GB, so I don't think that was the problem, since we only
have 900k data. When we took a closer look at our schema, we found that two of
the relevant fields have ShingleFilterFactory at query time, and this caused
the OOM for long q values!
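
To illustrate why shingling inflates long queries, here is a rough Python
sketch of what a shingle filter emits. It is a simplified model, not Lucene's
implementation: the function name shingles is hypothetical, and the parameter
defaults (minimum and maximum shingle size of 2, unigrams included) are
assumed to mirror the filter's defaults. How much heap the resulting query
needs also depends on the query parser, which this sketch does not model.

```python
def shingles(tokens, min_size=2, max_size=2, output_unigrams=True):
    """Return the unigrams and word n-grams a shingle filter would emit.

    Simplified model for illustration only; not Lucene's ShingleFilter.
    """
    out = []
    for start in range(len(tokens)):
        if output_unigrams:
            out.append(tokens[start])
        for size in range(min_size, max_size + 1):
            # Only emit shingles that fit inside the token list.
            if start + size <= len(tokens):
                out.append(" ".join(tokens[start:start + size]))
    return out
```

With the assumed defaults, n query terms become 2n - 1 tokens, and raising
max_size adds roughly another n tokens per extra shingle size, so a
paragraph-length q value produces a much larger query than its word count
suggests.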




Re: Solr long q values

Posted by Walter Underwood <wu...@wunderwood.org>.
512M was the default heap for Java 1.1. We never changed the default. So no size was “chosen”.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On May 3, 2019, at 10:11 PM, Shawn Heisey <ap...@elyograg.org> wrote:
> 
> On 5/3/2019 1:37 PM, Erick Erickson wrote:
>> We already do warnings for ulimits, so memory seems reasonable. In the same vein, does starting with 512M make sense either?
>> Feel free to raise a JIRA, but I won’t have any time to work on it….
> 
> Done.
> 
> https://issues.apache.org/jira/browse/SOLR-13446
> 
> I think that for typical server systems, starting with a 512MB heap is a little bit nuts.
> 
> I think I know why such a low number was chosen.  Without a much smarter startup, a super low default is the only way to ensure that Solr will start on virtually any system that somebody tries it on, like the small AWS servers.
> 
> Thanks,
> Shawn


Re: Solr long q values

Posted by Shawn Heisey <ap...@elyograg.org>.
On 5/3/2019 1:37 PM, Erick Erickson wrote:
> We already do warnings for ulimits, so memory seems reasonable. In the same vein, does starting with 512M make sense either?
> 
> Feel free to raise a JIRA, but I won’t have any time to work on it….

Done.

https://issues.apache.org/jira/browse/SOLR-13446

I think that for typical server systems, starting with a 512MB heap is a 
little bit nuts.

I think I know why such a low number was chosen.  Without a much smarter 
startup, a super low default is the only way to ensure that Solr will 
start on virtually any system that somebody tries it on, like the small 
AWS servers.

Thanks,
Shawn

Re: Solr long q values

Posted by Erick Erickson <er...@gmail.com>.
Shawn:

We already do warnings for ulimits, so memory seems reasonable. In the same vein, does starting with 512M make sense either?

Feel free to raise a JIRA, but I won’t have any time to work on it….

> On May 3, 2019, at 3:27 PM, Walter Underwood <wu...@wunderwood.org> wrote:
> 
> We run very long queries with an 8 GB heap. 30 million documents in 8 shards with an average query length of 25 terms.
> 
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On May 3, 2019, at 6:49 PM, Shawn Heisey <ap...@elyograg.org> wrote:
>> 
>> On 5/3/2019 2:32 AM, solrnoobie wrote:
>>> Whenever we have long q values (from a sentence to a small paragraph), we
>>> encounter heap problems (OOM). I guess this is normal?
>>> So my question is: how should we handle this type of problem? Of
>>> course, we could always limit the size of the search queries on the
>>> application side, but is there anything we could do in our configuration that
>>> would prevent the OOM issues even if some random user intentionally bombards
>>> us with long search queries from the front end?
>> 
>> If you're running out of memory, then Solr will need a larger heap, or you'll need to change something so it requires less heap.
>> 
>> A large query string is one of those things that might require a larger heap.
>> 
>> The default heap size that Solr has shipped with since 5.0 is 512MB ... which is VERY small.  Virtually all Solr users will need to increase this or they will run into OOME, or find that their server is running extremely slow.  It does not take very much index data to require more than 512MB heap.
>> 
>> A thought for Erick and other committers:  I know we are trying to reduce log verbosity.  But along the same lines as the log entries about file and process limits, I was thinking it might be a good idea to have a one-line WARN entry if the max heap size is 1GB or less.  And a config option to disable the logging.
>> 
>> Thanks,
>> Shawn
> 


Re: Solr long q values

Posted by Walter Underwood <wu...@wunderwood.org>.
We run very long queries with an 8 GB heap. 30 million documents in 8 shards with an average query length of 25 terms.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On May 3, 2019, at 6:49 PM, Shawn Heisey <ap...@elyograg.org> wrote:
> 
> On 5/3/2019 2:32 AM, solrnoobie wrote:
>> Whenever we have long q values (from a sentence to a small paragraph), we
>> encounter heap problems (OOM). I guess this is normal?
>> So my question is: how should we handle this type of problem? Of
>> course, we could always limit the size of the search queries on the
>> application side, but is there anything we could do in our configuration that
>> would prevent the OOM issues even if some random user intentionally bombards
>> us with long search queries from the front end?
> 
> If you're running out of memory, then Solr will need a larger heap, or you'll need to change something so it requires less heap.
> 
> A large query string is one of those things that might require a larger heap.
> 
> The default heap size that Solr has shipped with since 5.0 is 512MB ... which is VERY small.  Virtually all Solr users will need to increase this or they will run into OOME, or find that their server is running extremely slow.  It does not take very much index data to require more than 512MB heap.
> 
> A thought for Erick and other committers:  I know we are trying to reduce log verbosity.  But along the same lines as the log entries about file and process limits, I was thinking it might be a good idea to have a one-line WARN entry if the max heap size is 1GB or less.  And a config option to disable the logging.
> 
> Thanks,
> Shawn


Re: Solr long q values

Posted by Shawn Heisey <ap...@elyograg.org>.
On 5/3/2019 2:32 AM, solrnoobie wrote:
> Whenever we have long q values (from a sentence to a small paragraph), we
> encounter heap problems (OOM). I guess this is normal?
> 
> So my question is: how should we handle this type of problem? Of
> course, we could always limit the size of the search queries on the
> application side, but is there anything we could do in our configuration that
> would prevent the OOM issues even if some random user intentionally bombards
> us with long search queries from the front end?

If you're running out of memory, then Solr will need a larger heap, or 
you'll need to change something so it requires less heap.

A large query string is one of those things that might require a larger 
heap.

The default heap size that Solr has shipped with since 5.0 is 512MB ... 
which is VERY small.  Virtually all Solr users will need to increase 
this or they will run into OOME, or find that their server is running 
extremely slow.  It does not take very much index data to require more 
than 512MB heap.

A thought for Erick and other committers:  I know we are trying to 
reduce log verbosity.  But along the same lines as the log entries about 
file and process limits, I was thinking it might be a good idea to have 
a one-line WARN entry if the max heap size is 1GB or less.  And a config 
option to disable the logging.
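
The check described above could be sketched roughly as follows. This is
Python for illustration only, not Solr code: the names parse_heap and
heap_warning, the message wording, and the enabled flag standing in for the
proposed config option are all hypothetical.

```python
import re

ONE_GB = 1024 ** 3
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_heap(value):
    """Parse a JVM-style size such as '512m' or '8g' into bytes."""
    match = re.fullmatch(r"(\d+)([kmg]?)", value.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised heap size: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def heap_warning(max_heap, enabled=True):
    """Return a one-line warning if the max heap is 1GB or less, else None.

    `enabled` stands in for the proposed option to disable the logging.
    """
    if not enabled:
        return None
    if parse_heap(max_heap) <= ONE_GB:
        return f"WARN: max heap is {max_heap} (1GB or less); consider increasing it"
    return None
```

For example, heap_warning("512m") returns a warning line, while
heap_warning("8g") returns None.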

Thanks,
Shawn