Posted to solr-user@lucene.apache.org by Rahul Goswami <ra...@gmail.com> on 2019/06/01 19:47:16 UTC

Re: Graph query extremely slow

Hi Toke,

Thanks for sharing the sanity check results. I am setting rows=100. The
graph fq in my case gives a numFound of a little over 1 million. The total
number of docs is ~4 million.
I am using the graph query in an fq. Could the performance differ between
having it in an fq vs. q? Also, since the parameters of this fq don't
change, shouldn't I expect to gain some advantage from the filterCache?
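
One way to check both questions empirically is to send the same graph query
once as q and once as fq, and to repeat the fq request to see whether the
second run gets faster. The following is only a rough sketch under stated
assumptions: the host and collection are the ones from Toke's sanity check,
the seed query from_field:* is borrowed from his example (the seed of the
original fq is not shown in this thread), and the function names graph_params
and run_graph are made up for illustration.

```shell
#!/usr/bin/env bash
# Assumption: host and collection are copied from the sanity check in this thread.
SOLR_BASE="${SOLR_BASE:-http://localhost:8983/solr/gettingstarted}"
# Assumption: the seed query from_field:* comes from Toke's example,
# not from the original fq, whose seed is not shown.
GRAPH='{!graph from=from_field to=to_field returnRoot=false}from_field:*'

# Hypothetical helper: prints one request parameter per line for the
# chosen placement of the graph query (q or fq).
graph_params() {
  local placement="$1" rows="$2"
  case "$placement" in
    q)  printf '%s\n' "q=$GRAPH" ;;
    fq) printf '%s\n' 'q=*:*' "fq=$GRAPH" ;;
    *)  echo "placement must be q or fq" >&2; return 1 ;;
  esac
  printf '%s\n' "rows=$rows"
}

# Hypothetical helper: sends the request and prints the server-side QTime
# (milliseconds) together with numFound. curl -G with --data-urlencode
# handles the URL encoding of braces and spaces.
run_graph() {
  local args=() p
  while IFS= read -r p; do
    args+=(--data-urlencode "$p")
  done < <(graph_params "$@")
  curl -s -G "$SOLR_BASE/select" "${args[@]}" |
    jq -c '{qtime: .responseHeader.QTime, numFound: .response.numFound}'
}
```

Running run_graph q 100, then run_graph fq 100 twice in a row, gives three
QTime values to compare. If the second fq run is not noticeably faster than
the first, the filter is probably not being served from the filterCache (for
example because the cache is too small, disabled, or was invalidated by a
commit between the two requests).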

Thanks,
Rahul

On Wed, May 22, 2019 at 7:40 AM Toke Eskildsen <to...@kb.dk> wrote:

> On Wed, 2019-05-15 at 21:37 -0400, Rahul Goswami wrote:
> > fq={!graph from=from_field to=to_field returnRoot=false}
> >
> > Executing _only_ the graph filter query takes about 64.5 seconds. The
> > total number of documents from this filter query is a little over 1
> > million.
>
> I tried building an index in Solr 7.6 with 4M simple records with every
> 4th record having a from_field and a to_field, each containing a random
> number from 0-65535 as a String.
>
>
> Asking for the first 10 results:
>
> time curl -s 'http://localhost:8983/solr/gettingstarted/select?rows=10&q={!graph+from=from_field+to=to_field+returnRoot=true}+from_field:*' | jq .response.numFound
> 1000000
>
> real    0m0.018s
> user    0m0.011s
> sys     0m0.005s
>
>
> Asking for 1M results (ignoring that export or streaming should be used
> for exports of that size):
>
> time curl -s 'http://localhost:8983/solr/gettingstarted/select?rows=1000000&q={!graph+from=from_field+to=to_field+returnRoot=true}+from_field:*' | jq .response.numFound
> 1000000
>
> real    0m10.101s
> user    0m3.344s
> sys     0m0.419s
>
> > Is this performance expected out of graph query ?
>
> As the sanity check above shows, there is a huge difference between
> evaluating a graph query (any query really) and asking for 1M results
> to be returned. With that in mind, what do you set rows to?
>
>
> - Toke Eskildsen, Royal Danish Library