Posted to issues@solr.apache.org by "Walter Underwood (Jira)" <ji...@apache.org> on 2021/06/25 20:36:00 UTC

[jira] [Commented] (SOLR-15252) Solr should log WARN log when a query requests huge rows number

    [ https://issues.apache.org/jira/browse/SOLR-15252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17369704#comment-17369704 ] 

Walter Underwood commented on SOLR-15252:
-----------------------------------------

I see that this has been closed, but it is a feature we would use.

I've had deep paging (by bots) cause a Solr outage multiple times, both at Netflix and at Chegg. Yes, we put in defenses in the middle tier, but Solr really should defend against this. It was a denial of service vulnerability in 1.3 and it is still a vulnerability in 8.7.

I would be fine with a max_row or deepest_row parameter that would return a 400 Bad Request for requests where start+rows exceeds the threshold.
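
For illustration, the check could be as small as the sketch below. This is only a sketch of the idea; "maxStartPlusRows" and the class name are illustrative, not an existing Solr setting, and the same check could just as well log a WARN instead of throwing.

    import org.apache.solr.common.SolrException;
    import org.apache.solr.common.params.CommonParams;
    import org.apache.solr.common.params.SolrParams;

    // Sketch of the proposed guard. "maxStartPlusRows" is an illustrative
    // limit, not an existing Solr configuration option.
    public class DeepPagingGuard {
        private final int maxStartPlusRows;

        public DeepPagingGuard(int maxStartPlusRows) {
            this.maxStartPlusRows = maxStartPlusRows;
        }

        // Reject any request whose result window (start + rows) exceeds the limit.
        public void check(SolrParams params) {
            int start = params.getInt(CommonParams.START, 0);
            int rows = params.getInt(CommonParams.ROWS, 10);
            if (start + rows > maxStartPlusRows) {
                throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
                    "start+rows=" + (start + rows)
                    + " exceeds the configured limit of " + maxStartPlusRows);
            }
        }
    }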

Every time this has happened, it has been a huge pain to debug. A small number of queries, as few as 100, can take down a large cluster. Because it is only a few queries, the traffic is not caught by normal bot defenses and is hard to spot in the logs.

Our middle-tier limit is 500. That is plenty of results for our use cases. I can imagine different limits for different uses (web, mobile), but we don't need that now.

> Solr should log WARN log when a query requests huge rows number
> ---------------------------------------------------------------
>
>                 Key: SOLR-15252
>                 URL: https://issues.apache.org/jira/browse/SOLR-15252
>             Project: Solr
>          Issue Type: Improvement
>          Components: query
>            Reporter: Jan Høydahl
>            Assignee: Jan Høydahl
>            Priority: Major
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We have all seen it - clients that use Integer.MAX_VALUE or 10000000 as the rows parameter, just to make sure they get all possible results. This of course leads to long GC pauses, since Lucene allocates an array up front to hold the results.
> Solr should either log a WARN when it encounters a value above a certain threshold, such as 100k (at that point you should use cursorMark instead), or simply respond with a 400 error, with a system property or query parameter folks can use to override if they know what they are doing.
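
For reference, the cursorMark alternative mentioned above looks roughly like the sketch below from a SolrJ client. The URL, collection name, page size, and uniqueKey field "id" are placeholders.

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.params.CursorMarkParams;

    // Page through the full result set in bounded chunks instead of asking
    // for a huge rows value in one request.
    public class CursorMarkPager {
        public static void main(String[] args) throws Exception {
            try (SolrClient client = new HttpSolrClient.Builder(
                    "http://localhost:8983/solr/mycollection").build()) {
                SolrQuery q = new SolrQuery("*:*");
                q.setRows(500);                        // small, bounded page size
                q.setSort("id", SolrQuery.ORDER.asc);  // cursorMark requires a sort on the uniqueKey
                String cursor = CursorMarkParams.CURSOR_MARK_START;  // "*"
                while (true) {
                    q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursor);
                    QueryResponse rsp = client.query(q);
                    // ... process rsp.getResults() ...
                    String next = rsp.getNextCursorMark();
                    if (cursor.equals(next)) {
                        break;                         // cursor did not advance: done
                    }
                    cursor = next;
                }
            }
        }
    }

Because each page is small, memory use on the Solr side stays bounded no matter how deep the client pages.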



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org