Posted to solr-user@lucene.apache.org by S G <sg...@gmail.com> on 2020/02/25 22:01:58 UTC

Why does Solr sort on _docid_ with rows=0 ?

Hi,

I see a lot of such queries in my Solr 7.6.0 logs:


*path=/select
params={q=*:*&distrib=false&sort=_docid_+asc&rows=0&wt=javabin&version=2}
hits=287128180 status=0 QTime=7173*
After some searching, this seems to be the code that fires the above:
https://github.com/apache/lucene-solr/blob/f80e8e11672d31c6e12069d2bd12a28b92e5a336/solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java#L89-L101

Can someone explain why Solr is doing this?
Note that "hits" is a very large value. Could that be what is
impacting performance?

If you want to check a zombie server, shouldn't there be a much less
expensive way to do a health-check instead?

Thanks
SG

Re: Why does Solr sort on _docid_ with rows=0 ?

Posted by S G <sg...@gmail.com>.
Thanks Hoss. Yes, that jira seems like a good one to fix.
And the variable name definitely does not explain why it will not cause any
sort operation.

-SG

On Mon, Mar 2, 2020 at 10:06 AM Chris Hostetter <ho...@fucit.org>
wrote:

> : docid is the natural order of the posting lists, so there is no sorting effort.
> : I expect that means “don’t sort”.
>
> basically yes, as documented in the comment right above the lines of code
> linked to.
>
> : > So no one knows this then?
> : > It seems like a good opportunity to get some performance!
>
> The variable name is really stupid, but the 'solrQuery' variable you see
> in the code is *only* ever used for 'checkAZombieServer()' ... which
> should only be called when a server hasn't been responding to other
> (user-initiated) requests.
>
> : >> I see a lot of such queries in my Solr 7.6.0 logs:
>
> If you are seeing a lot of those queries, then there are other problems in
> your cluster you should investigate -- that's when/why LBSolrClient does
> this query -- to see if the server is responding.
>
> : >> *path=/select
> : >> params={q=*:*&distrib=false&sort=_docid_+asc&rows=0&wt=javabin&version=2}
> : >> hits=287128180 status=0 QTime=7173*
>
> that is an abnormally large number of documents to have in a single shard.
>
> : >> If you want to check a zombie server, shouldn't there be a much less
> : >> expensive way to do a health-check instead?
>
> Probably yes -- i've opened SOLR-14298...
>
> https://issues.apache.org/jira/browse/SOLR-14298
>
>
>
> -Hoss
> http://www.lucidworks.com/

Re: Why does Solr sort on _docid_ with rows=0 ?

Posted by Chris Hostetter <ho...@fucit.org>.
: docid is the natural order of the posting lists, so there is no sorting effort.
: I expect that means “don’t sort”.

basically yes, as documented in the comment right above the lines of code
linked to.
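
To make the request in the log line concrete, here is a minimal sketch in
plain Java (no SolrJ dependency) that assembles the same four parameters and
notes what each one is for. The class and method names are made up for
illustration and are not taken from LBSolrClient. wt=javabin and version=2 are
left out on the assumption that the client's response parser adds them, not
the query itself.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: rebuilds the parameters seen in the log line.
public class ZombieCheckParams {

    static Map<String, String> zombieCheckParams() {
        Map<String, String> p = new LinkedHashMap<>(); // keeps insertion order
        p.put("q", "*:*");            // match every document
        p.put("distrib", "false");    // only ask this one server, no fan-out
        p.put("sort", "_docid_ asc"); // index order, i.e. no scoring or sorting work
        p.put("rows", "0");           // return no documents, only the hit count
        return p;
    }

    // URL-encode each key and value and join them into a query string.
    static String toQueryString(Map<String, String> params) {
        return params.entrySet().stream()
            .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        // prints q=*%3A*&distrib=false&sort=_docid_+asc&rows=0
        System.out.println(toQueryString(zombieCheckParams()));
    }
}
```

The log shows the decoded form (q=*:*), which is why the colon appears
unescaped there.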

: > So no one knows this then?
: > It seems like a good opportunity to get some performance!

The variable name is really stupid, but the 'solrQuery' variable you see
in the code is *only* ever used for 'checkAZombieServer()' ... which
should only be called when a server hasn't been responding to other
(user-initiated) requests.

: >> I see a lot of such queries in my Solr 7.6.0 logs:

If you are seeing a lot of those queries, then there are other problems in 
your cluster you should investigate -- that's when/why LBSolrClient does 
this query -- to see if the server is responding.
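
The general pattern described here (stop sending traffic to a server after a
failed user request, then probe it until it answers again) can be sketched
roughly as follows. This is an illustration of the idea only, not
LBSolrClient's actual code: ZombieTracker, markFailed, checkZombies and the
probe predicate are all hypothetical names, and the probe stands in for
whatever check the client really sends.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical sketch of alive/zombie bookkeeping in a client-side load balancer.
public class ZombieTracker {
    private final Set<String> alive = ConcurrentHashMap.newKeySet();
    private final Set<String> zombies = ConcurrentHashMap.newKeySet();
    private final Predicate<String> probe; // returns true if the server responds

    public ZombieTracker(List<String> servers, Predicate<String> probe) {
        this.alive.addAll(servers);
        this.probe = probe;
    }

    // Called when a user-initiated request to this server fails.
    public void markFailed(String server) {
        if (alive.remove(server)) {
            zombies.add(server);
        }
    }

    // Probe every zombie once; move the ones that respond back to the alive set.
    public List<String> checkZombies() {
        List<String> revived = new ArrayList<>();
        for (String server : zombies) {
            if (probe.test(server)) {
                zombies.remove(server); // safe: the set's iterator is weakly consistent
                alive.add(server);
                revived.add(server);
            }
        }
        return revived;
    }

    public boolean isAlive(String server) {
        return alive.contains(server);
    }

    public int zombieCount() {
        return zombies.size();
    }
}
```

With this shape, probes are only sent for servers that already failed a real
request, so a log full of probe queries points at servers that keep dropping
out.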

: >> *path=/select
: >> params={q=*:*&distrib=false&sort=_docid_+asc&rows=0&wt=javabin&version=2}
: >> hits=287128180 status=0 QTime=7173*

that is an abnormally large number of documents to have in a single shard.

: >> If you want to check a zombie server, shouldn't there be a much less
: >> expensive way to do a health-check instead?

Probably yes -- i've opened SOLR-14298...

https://issues.apache.org/jira/browse/SOLR-14298



-Hoss
http://www.lucidworks.com/

Re: Why does Solr sort on _docid_ with rows=0 ?

Posted by Walter Underwood <wu...@wunderwood.org>.
docid is the natural order of the posting lists, so there is no sorting effort.
I expect that means “don’t sort”.

Also, cross-posting is probably not good. I’m replying only to solr-user.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Feb 28, 2020, at 5:42 PM, S G <sg...@gmail.com> wrote:
> 
> So no one knows this then?
> It seems like a good opportunity to get some performance!
> 
> On Tue, Feb 25, 2020 at 2:01 PM S G <sg...@gmail.com> wrote:
> 
>> Hi,
>> 
>> I see a lot of such queries in my Solr 7.6.0 logs:
>> 
>> 
>> *path=/select
>> params={q=*:*&distrib=false&sort=_docid_+asc&rows=0&wt=javabin&version=2}
>> hits=287128180 status=0 QTime=7173*
>> After some searching, this seems to be the code that fires the above:
>> 
>> https://github.com/apache/lucene-solr/blob/f80e8e11672d31c6e12069d2bd12a28b92e5a336/solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java#L89-L101
>> 
>> Can someone explain why Solr is doing this?
>> Note that "hits" is a very large value. Could that be what is
>> impacting performance?
>> 
>> If you want to check a zombie server, shouldn't there be a much less
>> expensive way to do a health-check instead?
>> 
>> Thanks
>> SG
>> 
>> 
>> 
>> 


Re: Why does Solr sort on _docid_ with rows=0 ?

Posted by S G <sg...@gmail.com>.
So no one knows this then?
It seems like a good opportunity to get some performance!

On Tue, Feb 25, 2020 at 2:01 PM S G <sg...@gmail.com> wrote:

> Hi,
>
> I see a lot of such queries in my Solr 7.6.0 logs:
>
>
> *path=/select
> params={q=*:*&distrib=false&sort=_docid_+asc&rows=0&wt=javabin&version=2}
> hits=287128180 status=0 QTime=7173*
> After some searching, this seems to be the code that fires the above:
>
> https://github.com/apache/lucene-solr/blob/f80e8e11672d31c6e12069d2bd12a28b92e5a336/solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java#L89-L101
>
> Can someone explain why Solr is doing this?
> Note that "hits" is a very large value. Could that be what is
> impacting performance?
>
> If you want to check a zombie server, shouldn't there be a much less
> expensive way to do a health-check instead?
>
> Thanks
> SG
>
>
>
>
