Posted to solr-user@lucene.apache.org by marship <ma...@126.com> on 2010/07/14 06:06:11 UTC

Strange "the" when search with dismax

Hi all,
   I am using Solr's dismax handler to search the books in my database, all of which I have indexed with Solr.
   The problem I noticed today started when I searched for the book "The Girl Who Kicked the Hornet's Nest",
but nothing was returned. I'm sure this book is in the database. So I removed keywords one at a time and finally found that when I search for "the girl who kicked hornet's nest", I get the book.
Then I tested some more.
When I search for "the first world war", Solr returns the book successfully.
But when I search for "the first world war the", Solr returns NOTHING!


So strange!
So the issue seems to be: if "the" appears twice in the query keywords, Solr/dismax simply returns nothing!


Why is this happening?


Please help.
Thanks.
Regards.
Scott




Re: Strange "the" when search with dismax

Posted by kenf_nc <ke...@realestate.com>.
Sounds like you want the 'text' fieldType (or equivalent) and are using
'string' or 'lowercase'. Those must match the entire value exactly (well,
case insensitively in the case of 'lowercase'). The TextField field types
(like 'text') tokenize, so matches will occur under many more conditions.
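
To make the difference concrete, here is a rough Python sketch of the three matching behaviours. It is only a simplification under stated assumptions: the function names are made up for illustration, whitespace splitting stands in for Solr's real analyzers, and none of this is Solr's actual code.

```python
# Simplified illustration only; these helpers are hypothetical, not Solr APIs.

def string_field_matches(query: str, stored: str) -> bool:
    """'string'-like field: the whole value must match exactly."""
    return query == stored


def lowercase_field_matches(query: str, stored: str) -> bool:
    """'lowercase'-like field: whole value must match, ignoring case."""
    return query.lower() == stored.lower()


def text_field_matches(query: str, stored: str) -> bool:
    """'text'-like field: value is tokenized, each query token must appear."""
    stored_tokens = set(stored.lower().split())
    return all(token in stored_tokens for token in query.lower().split())


TITLE = "The First World War"

if __name__ == "__main__":
    for q in ("The First World War", "the first world war", "first world war"):
        print(q, string_field_matches(q, TITLE),
              lowercase_field_matches(q, TITLE),
              text_field_matches(q, TITLE))
```

Only the tokenized variant matches a partial query such as "first world war"; the other two need the whole title.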

Re: Strange "the" when search with dismax

Posted by Erick Erickson <er...@gmail.com>.
If the other suggestions don't work, you need to show us the relevant
portions of your schema.xml, and probably the query output with
&debugQuery=on tacked on...
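
For example, a debug request URL could be put together like this. This is just a sketch, assuming the debug flag meant here is Solr's `debugQuery` parameter; the host, port, handler path and the `qf` field names (`title`, `author`) are placeholders I made up, so substitute your own.

```python
from urllib.parse import urlencode

# Assumed placeholder: a default local Solr install; adjust to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_debug_url(user_query: str, query_fields: str) -> str:
    """Build a dismax query URL with debug output switched on."""
    params = {
        "q": user_query,
        "defType": "dismax",   # use the dismax query parser
        "qf": query_fields,    # fields searched by dismax (hypothetical names)
        "debugQuery": "on",    # ask Solr to include the parsed query and scoring
    }
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_debug_url("the first world war the", "title author"))
```

The parsed query section of the debug output is the part to paste into a reply, since it shows which clauses each field actually produced.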

Here are some pointers for getting help...

http://wiki.apache.org/solr/UsingMailingLists

Best
Erick

2010/7/14 Jonathan Rochkind <ro...@jhu.edu>

> "the" sounds like it might be a stopword. Are you using stopwords in any
> of your fields covered by the dismax search, but not in some of the
> other fields covered by dismax? The combination of dismax and stopwords
> can result in unexpected behavior if you aren't careful.
>
> I wrote about this a bit here, you might find it helpful:
> http://bibwild.wordpress.com/2010/04/14/solr-stop-wordsdismax-gotcha/

Re: Strange "the" when search with dismax

Posted by Jonathan Rochkind <ro...@jhu.edu>.
"the" sounds like it might be a stopword. Are you using stopwords in any
of your fields covered by the dismax search, but not in some of the
other fields covered by dismax? The combination of dismax and stopwords
can result in unexpected behavior if you aren't careful.

I wrote about this a bit here, you might find it helpful:
http://bibwild.wordpress.com/2010/04/14/solr-stop-wordsdismax-gotcha/
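
To illustrate the kind of interaction I mean, here is a toy Python model. It is a sketch, not Solr's implementation, and it rests on assumptions: every query term is required (as with mm=100%), whitespace splitting stands in for real analysis, the stopword list is invented, and the field names `title` and `author` are hypothetical. It does not claim to reproduce the exact behaviour reported above.

```python
# Toy model of dismax term handling; all names here are illustrative only.
STOPWORDS = {"the", "a", "an", "of"}  # assumed stopword list


def analyze(text: str, remove_stopwords: bool) -> list[str]:
    """Lowercase, split on whitespace, optionally drop stopwords."""
    tokens = text.lower().split()
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens


def dismax_matches(query: str, doc: dict[str, str],
                   fields: dict[str, bool]) -> bool:
    """fields maps field name -> whether that field's analyzer removes stopwords.

    Each query term becomes one required clause made of the per-field
    variants that survive analysis; a document matches only if every
    clause matches in at least one field.
    """
    for term in query.lower().split():
        # Fields whose analyzer keeps this term contribute to the clause.
        surviving = [f for f, strip in fields.items() if analyze(term, strip)]
        if not surviving:
            continue  # every field dropped the term, so no clause is required
        if not any(term in analyze(doc.get(f, ""), fields[f]) for f in surviving):
            return False
    return True


DOC = {"title": "The Girl Who Kicked the Hornet's Nest",
       "author": "Stieg Larsson"}
MIXED = {"title": True, "author": False}      # stopwords removed in title only
CONSISTENT = {"title": True, "author": True}  # stopwords removed everywhere

if __name__ == "__main__":
    q = "the girl who kicked the hornet's nest"
    print(dismax_matches(q, DOC, MIXED))       # "the" required, only in author
    print(dismax_matches(q, DOC, CONSISTENT))  # "the" dropped everywhere
```

In the mixed setup, "the" is removed from the title clause but kept for the author field, so the only way to satisfy that required clause is an author value containing "the". Making stopword handling consistent across the dismax fields avoids this in the toy model.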
