Posted to users@solr.apache.org by Hakan karaoğlu <hk...@gmail.com> on 2021/10/08 13:15:27 UTC

How to improve stats query performance?

Hello to everyone,

I have a few questions about the stats query. I have a very simple
query, shown in the picture, and I want to see the unique fromid values over
long date ranges. The value here is 10s. I have two questions.

First, as the time interval increases, for example to a period of 1 year,
the query result does not come back at all. The screen remains white.
We are talking about roughly 12m of data. Also, the distinctValues list is
too long in queries spanning months. I don't want to deal with this field.
The countDistinct value is what matters to me.

My second question: is there a different, faster method for fromid?

image : https://ibb.co/TgjmvDC

Re: How to improve stats query performance?

Posted by dinesh naik <di...@gmail.com>.
Hi Hakan,
Have you set docValues to true for the fromid field in the managed schema?
If you run a stats query on a field without docValues, Solr cannot make
use of the OS cache and has to load the whole index for that field
into the JVM (Java virtual machine). This can slow down your query.
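
If it helps, here is a rough Python sketch of the request body you could
send to the Schema API (a POST to /solr/<your collection>/schema) to turn
docValues on for fromid. The "string" type and the indexed/stored flags
are only my assumptions, since I have not seen your schema, and
replace_field_payload is just a helper name I made up. Documents indexed
before the change would need to be reindexed.

```python
import json


def replace_field_payload(name, field_type, doc_values=True):
    """Build a Schema API "replace-field" command for one field.

    replace-field expects the complete field definition, so the
    indexed/stored values below are assumptions to adapt to your schema.
    """
    return {
        "replace-field": {
            "name": name,
            "type": field_type,      # assumption: use the field's real type
            "indexed": True,         # assumption
            "stored": True,          # assumption
            "docValues": doc_values,
        }
    }


if __name__ == "__main__":
    # Print the JSON body; send it with any HTTP client you prefer.
    print(json.dumps(replace_field_payload("fromid", "string"), indent=2))
```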

Coming to the first part of the question: check the Solr logs. You will
probably see that the queries with a longer date range are timing out,
which is why you don't see anything in the Admin UI.

Second part: is there any faster method for stats?
Yes, there are a couple of options, but you will have to test and find out
which works best for you based on the cardinality and data set in your
cluster.
Here is a link you can refer to:
https://yonik.com/solr-count-distinct/
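
To give an idea of what that looks like, below is a minimal Python sketch
that builds a JSON Facet request using the unique() and hll() aggregations
described on that page. It returns only the count, not the list of
distinct values. The date field name (created_at), the collection name and
the base URL are placeholders I invented, so replace them with your own,
and please treat this as a starting point rather than a tested solution.

```python
import json
from urllib.parse import urlencode


def distinct_count_params(field, date_field, start, end, approximate=False):
    """Build select parameters that count distinct values of `field`.

    unique() gives an exact count; hll() gives a HyperLogLog estimate
    that is usually cheaper on high-cardinality fields.
    """
    func = "hll" if approximate else "unique"
    return {
        "q": "*:*",
        "fq": f"{date_field}:[{start} TO {end}]",  # restrict to the date range
        "rows": 0,                                  # no documents, only the aggregate
        "json.facet": json.dumps({"distinct_count": f"{func}({field})"}),
    }


def build_url(base_url, collection, params):
    """Join the pieces into a /select URL (base_url and collection are placeholders)."""
    return f"{base_url}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    params = distinct_count_params(
        "fromid",
        "created_at",  # hypothetical date field name
        "2021-01-01T00:00:00Z",
        "2021-12-31T23:59:59Z",
        approximate=True,
    )
    print(build_url("http://localhost:8983/solr", "mycollection", params))
```

Trying both approximate=False and approximate=True on the one-year range
should show whether the estimate is accurate enough for your use case.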

-- 
Best Regards,
Dinesh Naik