Posted to solr-user@lucene.apache.org by XJ <ol...@gmail.com> on 2012/02/04 07:20:31 UTC

Performance degradation with distributed search

Hello,

I am experimenting with Solr distributed search with random sharding (we
currently use geo sharding), hoping to gain some performance and also
scalability in the future. (The index size keeps growing and geo sharding is
hard to scale.)

However, I'm seeing worse performance with distributed search on a test
server with 6 shards, a 15 core CPU and 24G of memory; the index size is about
8G on each shard. With geo sharding it can easily take a 150 QPS load with
good response times. Now, with distributed search, there are timeouts and the
average response time also increases. This is probably no big surprise, since
I'm using the same number of shards plus the overhead of distributed search
(merge, HTTP, network, etc.).

When I looked into the details (slow queries), I found some real issues that I
need help with. For example, a query which takes 200ms with geo sharding
now times out (>2000ms) with distributed search, and each shard query
(isShard=true) takes about 1200ms. But if I run the query against the shard
only (without distributed search), it takes <200ms. So I compared the
two query URLs; the only difference is that the shard query issued by
distributed search has "fsv=true". I understand field sort values are needed
during the merge process, but I didn't expect that to make this much
difference in performance, although we do have a lot of sort orders (about 20
different sort orders).

Any suggestions/comments on the performance problem I'm having with
distributed search? Is distributed search the right choice for me? What
other setups/ideas can I try?

thanks,
XJ
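
The point about field sort values being needed during the merge can be
illustrated with a small sketch. This is not Solr's merge code; it is a toy
model that assumes each shard returns its hits already sorted ascending by a
single sort value, as (sort_value, doc_id) pairs. The function name
merge_shard_results and the data layout are hypothetical.

```python
import heapq
from itertools import islice


def merge_shard_results(shard_results, rows):
    """Merge per-shard hit lists into one globally sorted page.

    shard_results: list of lists of (sort_value, doc_id) tuples, each
    inner list already sorted ascending by sort_value (assumption).
    rows: number of hits to keep after merging.
    """
    # The coordinator can only interleave the shards correctly if every
    # hit carries its sort value; that is the role "fsv" plays here.
    merged = heapq.merge(*shard_results, key=lambda hit: hit[0])
    return [doc_id for _, doc_id in islice(merged, rows)]
```

Without the sort values the coordinator would have no way to decide which
shard's next hit comes first, so each shard has to compute them for the
documents it returns.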

Re: Performance degradation with distributed search

Posted by XJ <ol...@gmail.com>.
BTW we just upgraded to Solr 3.5 from Solr 1.4. That's why we want to
explore the improvements/new features of distributed search.

On Mon, Feb 6, 2012 at 12:30 PM, oleole <ol...@gmail.com> wrote:

> Yonik,
>
> Thanks for your reply. Yeah, that's the first thing I tried (adding fsv=true
> to the query) and it surprised me too. Could it be due to our using many
> complex sorts (20 sorts with dismax, and, or...)? Is there anything that can
> be optimized? It looks like it's calculated twice in Solr?
>
> XJ
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Performance-degradation-with-distributed-search-tp3715060p3720739.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: Performance degradation with distributed search

Posted by XJ <ol...@gmail.com>.
Yonik, thanks for your explanation. I've created a ticket here
https://issues.apache.org/jira/browse/SOLR-3104

On Mon, Feb 6, 2012 at 4:28 PM, Yonik Seeley <yo...@lucidimagination.com> wrote:

> On Mon, Feb 6, 2012 at 6:16 PM, XJ <ol...@gmail.com> wrote:
> > Sorry I didn't make this clear. Yeah, we use dismax in the main query as
> > well as in the sort orders (different from the main queries). Because of
> > our complicated business logic, we need many different relevancy queries
> > in different sort orders (other than sort by score, we also have around 20
> > other different sort orders, some of which are dismax queries). However,
> > this is something we cannot get away from right now. What kind of
> > optimization can I try there?
>
> OK, so basically it's slow because functions with embedded relevancy
> queries are "forward only" - if you request the value for a docid
> previous to the last, we need to reboot the query (re-weight, ask for
> the scorer, etc).  This means that for your 30 documents, that will
> require rebooting the query about 15 times (assuming that roughly half
> of the time the next docid will be less than the previous one).
>
> Unfortunately there's not much you can do externally... we need to
> implement optimizations at the Solr level for this.
> Can you open a JIRA issue for this?
>
> -Yonik
> lucidimagination.com
>
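
Yonik's estimate of about 15 reboots for 30 documents can be checked with a
toy simulation. The sketch below only models the rule stated above (a reboot
happens whenever the requested docid is lower than the previous one) and
assumes the docids arrive in uniformly random order; it does not use any Solr
or Lucene API, and the function names are hypothetical.

```python
import random


def count_reboots(docids):
    """Count restarts of a forward-only scorer for a docid visit order."""
    reboots = 0
    last = -1
    for docid in docids:
        if docid < last:
            # Going backwards: the query would have to be re-weighted
            # and a fresh scorer requested.
            reboots += 1
        last = docid
    return reboots


def average_reboots(num_docs=30, trials=10000, seed=0):
    """Average reboot count over random visit orders of num_docs docids."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        order = list(range(num_docs))
        rng.shuffle(order)
        total += count_reboots(order)
    return total / trials
```

For a random order of n docids the expected number of backward steps is
(n - 1) / 2, i.e. 14.5 for n = 30, which agrees with the "about 15" figure.
In this toy model, visiting the docids in ascending order gives zero reboots;
whether Solr can take advantage of that is a question for the JIRA issue.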

Re: Performance degradation with distributed search

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Mon, Feb 6, 2012 at 5:53 PM, XJ <ol...@gmail.com> wrote:
> Yes, as I mentioned in a previous email, we do dismax queries (with different
> mm values), Solr function queries (map, etc.) and math calculations (sum,
> product, log). I understand those are expensive. But in the worst case it
> should only double the time, not go from 200ms to 1200ms, right?

You mention dismax... but I assume that's as the main query and you
sort by score (which is fine).
The only issue with relevancy queries is if you sorted by one that was
not the main query - this is not yet optimized.

But for straight function queries that don't contain embedded
relevancy queries, I would definitely not expect the degradation you
are seeing - hence we should try to get to the bottom of this.

-Yonik
lucidimagination.com
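
For readers unfamiliar with the distinction, here is a hedged sketch of what
"sorting by a relevancy query that is not the main query" can look like as
request parameters. It assumes Solr's query() function and $param
dereferencing in a sort-by-function clause; the field names (title,
description, category), the qq parameter name and the mm value are made up
for illustration and are not taken from XJ's setup.

```python
from urllib.parse import urlencode


def build_sort_by_query_params(user_text):
    """Build a query string whose sort embeds a second relevancy query."""
    params = {
        # Main query: dismax, scored as usual.
        "q": user_text,
        "defType": "dismax",
        "qf": "title description",  # hypothetical fields
        # Second, different relevancy query used only for sorting.
        "qq": "{!dismax qf=category mm=1}" + user_text,  # hypothetical
        # Sort by the score of the embedded query, 0 if it does not match.
        "sort": "query($qq,0) desc",
        "start": 0,
        "rows": 30,
    }
    return urlencode(params)
```

Sorting by score of the main query would simply be sort=score desc; the
embedded query() in the sort clause is the case described above as not yet
optimized.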




Re: Performance degradation with distributed search

Posted by XJ <ol...@gmail.com>.
Yes, as I mentioned in a previous email, we do dismax queries (with different
mm values), Solr function queries (map, etc.) and math calculations (sum,
product, log). I understand those are expensive. But in the worst case it
should only double the time, not go from 200ms to 1200ms, right?

XJ

On Mon, Feb 6, 2012 at 2:37 PM, Yonik Seeley <yo...@lucidimagination.com> wrote:

> On Mon, Feb 6, 2012 at 5:35 PM, XJ <ol...@gmail.com> wrote:
> > Hm, I just looked at the log: only 112 documents matched, with start=0
> > and rows=30.
>
> Are any of the sort criteria sort-by-function with anything complex
> (like an embedded relevance query)?
>
> -Yonik
> lucidimagination.com

Re: Performance degradation with distributed search

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Mon, Feb 6, 2012 at 5:35 PM, XJ <ol...@gmail.com> wrote:
> Hm, I just looked at the log: only 112 documents matched, with start=0
> and rows=30.

Are any of the sort criteria sort-by-function with anything complex
(like an embedded relevance query)?

-Yonik
lucidimagination.com



Re: Performance degradation with distributed search

Posted by XJ <ol...@gmail.com>.
Hm, I just looked at the log: only 112 documents matched, with start=0 and
rows=30.

On Mon, Feb 6, 2012 at 1:33 PM, Yonik Seeley <yo...@lucidimagination.com> wrote:

> On Mon, Feb 6, 2012 at 3:30 PM, oleole <ol...@gmail.com> wrote:
> > Thanks for your reply. Yeah, that's the first thing I tried (adding
> > fsv=true to the query) and it surprised me too. Could it be due to our
> > using many complex sorts (20 sorts with dismax, and, or...)? Is there
> > anything that can be optimized? It looks like it's calculated twice in
> > Solr?
>
> It currently does calculate it twice... but only for those documents
> being returned (which should not be significant).
> What is "rows" set to?
>
> -Yonik
> lucidimagination.com
>

Re: Performance degradation with distributed search

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Mon, Feb 6, 2012 at 3:30 PM, oleole <ol...@gmail.com> wrote:
> Thanks for your reply. Yeah, that's the first thing I tried (adding fsv=true
> to the query) and it surprised me too. Could it be due to our using many
> complex sorts (20 sorts with dismax, and, or...)? Is there anything that can
> be optimized? It looks like it's calculated twice in Solr?

It currently does calculate it twice... but only for those documents
being returned (which should not be significant).
What is "rows" set to?

-Yonik
lucidimagination.com

Re: Performance degradation with distributed search

Posted by oleole <ol...@gmail.com>.
Yonik,

Thanks for your reply. Yeah, that's the first thing I tried (adding fsv=true
to the query) and it surprised me too. Could it be due to our using many
complex sorts (20 sorts with dismax, and, or...)? Is there anything that can
be optimized? It looks like it's calculated twice in Solr?

XJ

--
View this message in context: http://lucene.472066.n3.nabble.com/Performance-degradation-with-distributed-search-tp3715060p3720739.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Performance degradation with distributed search

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Sat, Feb 4, 2012 at 1:20 AM, XJ <ol...@gmail.com> wrote:
> When I looked into the details (slow queries), I found some real issues that
> I need help with. For example, a query which takes 200ms with geo sharding
> now times out (>2000ms) with distributed search, and each shard query
> (isShard=true) takes about 1200ms. But if I run the query against the shard
> only (without distributed search), it takes <200ms. So I compared the
> two query URLs; the only difference is that the shard query issued by
> distributed search has "fsv=true".

That's odd... I wouldn't expect fsv to make much of a difference.
Can you try running the query on the shard only and adding fsv=true to
verify that it's the culprit?

Also, what version of Solr are you using?

-Yonik
lucidimagination.com
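
The check suggested above (run the same query directly against one shard,
once without and once with fsv=true) could be scripted roughly as follows.
This is a minimal sketch: the base URL, the /select path and the example
parameters are assumptions for illustration, and only fsv=true comes from
the thread itself.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen


def build_comparison_urls(shard_base_url, params):
    """Return (plain_url, fsv_url) for the same query on one shard."""
    plain = dict(params)
    with_fsv = dict(params, fsv="true")  # the only difference under test
    return (
        f"{shard_base_url}/select?{urlencode(plain)}",
        f"{shard_base_url}/select?{urlencode(with_fsv)}",
    )


def time_request(url, timeout=10.0):
    """Fetch the URL once and return the elapsed wall time in ms."""
    start = time.perf_counter()
    with urlopen(url, timeout=timeout) as response:
        response.read()
    return (time.perf_counter() - start) * 1000.0
```

A possible use, with a hypothetical shard address, is to call
build_comparison_urls("http://localhost:8983/solr", {...}) with the slow
query's own parameters and then time each URL several times, discarding the
first run so that cache warm-up does not skew the comparison.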