Posted to solr-user@lucene.apache.org by Guangwei Yuan <gu...@gmail.com> on 2008/07/01 21:25:12 UTC

Slow performance using MatchAllDocsQuery with filter query

Hi,

I've noticed poor performance in faceted browsing when the query is
empty (so MatchAllDocsQuery is used) and there are only filter queries.
An example of the search URL is:

http://hostname:8080/solr/select/?q=&qt=dismax&fq=color:%23000000

One idea is to switch to the StandardRequestHandler in such cases, by
putting the filters into the query field, like "q=color:%23000000". The
downside is that the dismax boost functions are lost, and there is extra
code to maintain. Any ideas?
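
To make the two request shapes concrete, here is a minimal Python sketch
that builds both URLs. The host and path are the placeholders from the
example above, the function names are made up for illustration, and
leaving out qt in the second URL assumes the default handler is the
standard one.

```python
from urllib.parse import urlencode

# Placeholder endpoint copied from the example URL; adjust for a real install.
BASE = "http://hostname:8080/solr/select/"


def dismax_filter_only(fq):
    """Empty q sent to dismax, with the restriction passed only as a filter query."""
    params = {"q": "", "qt": "dismax", "fq": fq}
    # safe=":" leaves the field separator readable; "#" is still escaped as %23.
    return BASE + "?" + urlencode(params, safe=":")


def standard_query(clause):
    """Workaround: the same clause sent as the main query, with no qt parameter."""
    return BASE + "?" + urlencode({"q": clause}, safe=":")


slow_url = dismax_filter_only("color:#000000")
workaround_url = standard_query("color:#000000")
```

The second form drops the dismax parameters entirely, which is why the
boost functions no longer apply to it.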

Thanks,
Guangwei

Re: Slow performance using MatchAllDocsQuery with filter query

Posted by Mike Klaas <mi...@gmail.com>.
On 1-Jul-08, at 12:25 PM, Guangwei Yuan wrote:

> I've noticed poor performance in faceted browsing when the query is
> empty (so MatchAllDocsQuery is used) and there are only filter queries.
> An example of the search URL is:
>
> http://hostname:8080/solr/select/?q=&qt=dismax&fq=color:%23000000
>
> One idea is to switch to the StandardRequestHandler in such cases, by
> putting the filters into the query field, like "q=color:%23000000". The
> downside is that the dismax boost functions are lost, and there is extra
> code to maintain. Any ideas?

I too have noticed surprisingly poor performance in similar
situations. I've never quite had enough time to track it down, and
at times I have done things like what you suggest (though implemented
using a custom param I can pass to dismax; see
http://issues.apache.org/jira/browse/SOLR-407, though that patch will
not apply to current trunk). We've also discussed giving Solr the
ability to detect these situations itself and apply the appropriate
optimizations. It is difficult to come up with heuristics that always
result in faster queries.

It is important to think about how the caches are interacting with
this. Are these cached queries? Cached filters? What do the cache
statistics look like after re-executing the query multiple times?
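
One way to read those statistics is to note a cache's counters before
and after re-running the query and look at the difference. The sketch
below is only an illustration: it assumes you have copied the lookups
and hits counters by hand from the statistics page into dicts, and
cache_delta is a made-up helper name, not part of Solr.

```python
def cache_delta(before, after):
    """Return the lookups, hits, and hit ratio accrued between two snapshots.

    Each snapshot is a dict with "lookups" and "hits" keys, copied by hand
    from the statistics of one cache (an assumption of this sketch).
    """
    lookups = after["lookups"] - before["lookups"]
    hits = after["hits"] - before["hits"]
    # Guard against division by zero when the cache was never consulted.
    ratio = hits / lookups if lookups else 0.0
    return {"lookups": lookups, "hits": hits, "hitratio": ratio}


# Made-up numbers, purely to show the call shape.
example = cache_delta({"lookups": 10, "hits": 4}, {"lookups": 20, "hits": 13})
```

If repeated runs of the same query add lookups but few hits, the entry
is probably not being served from that cache.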

cheers,
-Mike