Posted to solr-user@lucene.apache.org by DHast <ha...@gmail.com> on 2009/10/20 20:18:06 UTC

Slow Phrase Queries

Hello,
I have recently installed Solr as an alternative to our homemade Lucene
search servers, and while in most respects the performance is better, I
notice that phrase searches are incredibly slow compared to plain Lucene,
primarily when using facets.

Example:
"City of New York, Matter of" takes 11 seconds
City of New York, Matter of takes 1 second

The same searches using raw Lucene take 5 seconds and 3 seconds
respectively.

I tried cutting out as much as I could from solrconfig without breaking it.
Is there anything else I could try to make Solr perform similarly to
raw Lucene as far as phrase queries are concerned?
Thanks
-- 
View this message in context: http://www.nabble.com/Slow-Phrase-Queries-tp25979999p25979999.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Slow Phrase Queries

Posted by Tom Burton-West <tb...@gmail.com>.
You might try a couple of tests in the Solr admin interface to make sure the
query is being processed the same way in both Solr and raw Lucene.
1) Use the analysis panel to determine whether the Solr filter chain is doing
something unexpected compared to your Lucene filter chain.
2) Try running a debug query from the Admin tool interface in Solr and then
in Lucene to see whether the query is being parsed or otherwise interpreted
differently.

Tom 
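
One way to run the second check is to request debug output for both forms of the query and compare how each is parsed. The sketch below only builds the two request URLs; the endpoint http://localhost:8983/solr/select and the helper name debug_url are assumptions, not details from this thread.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: a default single-core Solr endpoint; adjust host, port, and path.
SOLR_SELECT = "http://localhost:8983/solr/select"


def debug_url(query, base=SOLR_SELECT):
    """Build a select URL that asks Solr to include debug output (hypothetical helper)."""
    params = {
        "q": query,
        "rows": 0,             # no documents needed; only the debug section matters here
        "debugQuery": "true",  # asks Solr to include the parsed query in the response
    }
    return base + "?" + urlencode(params)


# The two queries from the original post: quoted phrase versus plain terms.
phrase_url = debug_url('"City of New York, Matter of"')
terms_url = debug_url("City of New York, Matter of")

print(phrase_url)
print(terms_url)
```

Fetching both URLs and comparing the parsed query in each debug section should show whether the quoted form becomes a phrase query over the expected field and tokens.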





Re: Slow Phrase Queries

Posted by DHast <ha...@gmail.com>.
That's what my use case has shown, but I haven't done enough experimenting to
know for sure.
The reason the field is untokenized is that I need the full value of an
author's name, for example "smith, jones". If it were tokenized and faceted,
there would be one entry for jones and another entry for smith.

I am running a lot of facets, 19 of them, not including the author facet;
some are facet queries and some are fields. The performance is great
with all of them running. The catch is that each of those other facets has
only a limited number of possible values.

But with authors, there are over 800,000 separate authors in my data. The
next route is to modify the index so that each author is indexed as an integer
taken from a database table of all authors, and then re-attempt the author
facets on a tokenized field of integers.
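
A minimal sketch of the integer-ID idea described above, assuming an in-memory mapping stands in for the database table of authors. The function name build_author_ids, the author_id field, and the sample documents are all hypothetical and are not taken from the actual index.

```python
def build_author_ids(authors):
    """Assign an integer ID to each distinct full author name (hypothetical helper)."""
    ids = {}
    for name in authors:
        key = name.strip()
        if key not in ids:
            ids[key] = len(ids) + 1  # IDs start at 1, in first-seen order
    return ids


# Hypothetical sample data; in the thread the names live in a database table.
docs = [
    {"id": "doc1", "author": "smith, jones"},
    {"id": "doc2", "author": "doe, jane"},
    {"id": "doc3", "author": "smith, jones"},
]

author_ids = build_author_ids(d["author"] for d in docs)

# What would be indexed: the integer ID instead of the full name string.
indexed = [{"id": d["id"], "author_id": author_ids[d["author"].strip()]} for d in docs]

# Reverse mapping, used to turn facet values back into names for display.
id_to_author = {v: k for k, v in author_ids.items()}

print(indexed)
```

The full name "smith, jones" stays a single facet value because it maps to one ID; whether faceting on the integer field is actually faster would still need to be measured.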

Unfortunately it takes 4-9 days for the index to be built, as it is a little
over 22 GB.


Lance Norskog-2 wrote:
> 
> Are you saying that faceting is faster on a tokenized field? Is this true?
> 
> On Tue, Oct 20, 2009 at 2:02 PM, DHast <ha...@gmail.com>
> wrote:
> ...
>> , removing
>> that facet worked since the field was untokenized and slow considering how
>> many values there were.
> ...
> 



Re: Slow Phrase Queries

Posted by Lance Norskog <go...@gmail.com>.
Are you saying that faceting is faster on a tokenized field? Is this true?

On Tue, Oct 20, 2009 at 2:02 PM, DHast <ha...@gmail.com> wrote:
...
> , removing
> that facet worked since the field was untokenized and slow considering how
> many values there were.
...




-- 
Lance Norskog
goksron@gmail.com

Re: Slow Phrase Queries

Posted by DHast <ha...@gmail.com>.
Ah, it turns out it was one of my 6 facets, the author. In the data pool
there are over 1.9 million documents and about 800,000 authors. Removing
that facet worked, since the field was untokenized and slow considering how
many values there were. Solr is definitely faster, and as fast or
faster with facets.
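
To isolate a slow facet like this one, the same query can be issued with and without the suspect field and the reported query times compared. The sketch below only builds the two request URLs; the endpoint, the helper name select_url, and the field names court and author are assumptions for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: a default single-core Solr endpoint; adjust host, port, and path.
SOLR_SELECT = "http://localhost:8983/solr/select"


def select_url(query, facet_fields=(), base=SOLR_SELECT):
    """Build a select URL, optionally faceting on the given fields (hypothetical helper)."""
    params = [("q", query), ("rows", 10)]
    if facet_fields:
        params.append(("facet", "true"))
        # One facet.field parameter per field, repeated in the query string.
        params.extend(("facet.field", name) for name in facet_fields)
    return base + "?" + urlencode(params)


query = '"City of New York, Matter of"'

# Hypothetical field names: "court" stands in for a low-cardinality facet,
# "author" for the field with roughly 800,000 distinct values.
with_author = select_url(query, ["court", "author"])
without_author = select_url(query, ["court"])

print(with_author)
print(without_author)
```

Comparing the QTime value in the two response headers would show how much of the total time the author facet accounts for.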


Re: Slow Phrase Queries

Posted by Yonik Seeley <ys...@gmail.com>.
Solr just uses a stock Lucene phrase query.
What version of Lucene and Solr are you comparing?
Do the queries match the same number of documents?

-Yonik
http://www.lucidimagination.com
