Posted to solr-user@lucene.apache.org by Naomi Dushay <nd...@stanford.edu> on 2010/11/05 20:04:39 UTC

a Solr search recall problem you probably don't even know you're having

(sorry for cross postings - I think this is important information to  
disseminate)

Executive Summary:  you probably need to increase your query slop.  A  
lot.


We recently had a feedback ticket that a title search with a hyphen  
wasn't working properly.  This is especially curious because we solved  
a bunch of problems with hyphen searching AND WROTE TESTS in the  
process, and all the existing hyphen tests pass.  Tests like "hyphens  
with no spaces before or after, 3 significant terms, 2 stopwords" pass.

Our metadata contains:
record A with title:   Red-rose chain.
record B with title:   Prisoner in a red-rose chain.

A title search:  prisoner in a red-rose chain  returns no results

Further exploration (the following are all title searches):
red-rose chain  ==>  record A only
"red rose" chain ==>  record A only
"red rose chain" ==> record A only
"red-rose chain" ==> record A only
red rose chain ==>  records A and B
red "rose chain" ==>  records A and B  (!!)

For more details and more about the solution, see  http://discovery-grindstone.blogspot.com/2010/11/solr-and-hyphenated-words.html

- Naomi Dushay
Senior Developer
Stanford University Libraries
  

Re: a Solr search recall problem you probably don't even know you're having

Posted by Naomi Dushay <nd...@stanford.edu>.
Robert,

Thanks!  I've been running a Solr 1.5 build taken from trunk back in
March, so it's time to upgrade!  I also like the fix of putting the
stopword filter after the WordDelimiterFilter (WDF).

- Naomi

On Nov 5, 2010, at 12:36 PM, Robert Muir wrote:

> On Fri, Nov 5, 2010 at 3:04 PM, Naomi Dushay <nd...@stanford.edu>  
> wrote:
>> (sorry for cross postings - I think this is important information to
>> disseminate)
>>
>> Executive Summary:  you probably need to increase your query slop.   
>> A lot.
>>
>
> I looked at your example, and it really looks a lot like
> https://issues.apache.org/jira/browse/SOLR-1852
>
> This was fixed, and released in Solr 1.4.1... and of course from the
> upgrading notes:
> "However, a reindex is needed for some of the analysis fixes to take  
> effect."
>
> Your example "Prisoner in a red-rose chain" in Solr 1.4.1 no longer
> has the positions "1,4,7,8", but instead "1,4,5,6".
>
> I recommend upgrading to this bugfix release and re-indexing if you
> are having problems like this!!!!


Re: a Solr search recall problem you probably don't even know you're having

Posted by Robert Muir <rc...@gmail.com>.
On Fri, Nov 5, 2010 at 3:04 PM, Naomi Dushay <nd...@stanford.edu> wrote:
> (sorry for cross postings - I think this is important information to
> disseminate)
>
> Executive Summary:  you probably need to increase your query slop.  A lot.
>

I looked at your example, and it really looks a lot like
https://issues.apache.org/jira/browse/SOLR-1852

This was fixed, and released in Solr 1.4.1... and of course from the
upgrading notes:
"However, a reindex is needed for some of the analysis fixes to take effect."

Your example "Prisoner in a red-rose chain" in Solr 1.4.1 no longer
has the positions "1,4,7,8", but instead "1,4,5,6".
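
To make the effect of those position gaps concrete, here is a rough
Python sketch. It is a simplified model of in-order sloppy phrase
matching, not Lucene's actual implementation. The mapping of terms to
positions (stopwords "in" and "a" dropped; prisoner, red, rose, chain
kept) and the positions for record A are assumptions; only the
sequences 1,4,7,8 and 1,4,5,6 come from the message above. The helper
name required_slop is hypothetical.

```python
def required_slop(query_terms, doc_positions):
    """Smallest slop at which the phrase matches, in a simplified model.

    Each query term's document position is shifted back by the term's
    offset within the phrase. If all shifted values are equal, the phrase
    is contiguous (slop 0); otherwise the spread is the slop needed.
    Assumes every term occurs exactly once in the document.
    """
    shifted = [doc_positions[term] - offset
               for offset, term in enumerate(query_terms)]
    return max(shifted) - min(shifted)


# Record B, "Prisoner in a red-rose chain" (term-to-position mapping assumed).
record_b_before_fix = {"prisoner": 1, "red": 4, "rose": 7, "chain": 8}
record_b_after_fix = {"prisoner": 1, "red": 4, "rose": 5, "chain": 6}

# Record A, "Red-rose chain" (positions assumed: no stopwords, no gaps).
record_a = {"red": 1, "rose": 2, "chain": 3}

phrase = ["red", "rose", "chain"]

print(required_slop(phrase, record_a))             # 0
print(required_slop(phrase, record_b_before_fix))  # 2
print(required_slop(phrase, record_b_after_fix))   # 0
```

Under this model a phrase query for "red rose chain" with slop 0 finds
record A but misses record B on an index built before the fix, which is
consistent with the search results reported earlier in the thread.
Raising the slop hides the gap; re-indexing on 1.4.1 removes it.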

I recommend upgrading to this bugfix release and re-indexing if you
are having problems like this!!!!