Posted to solr-user@lucene.apache.org by Steve Fatula <co...@yahoo.com> on 2012/07/02 08:55:38 UTC

Dismax Question

Let's say a user types in:

DualHead2Go


The way solr is working, it splits this into:

Dual Head 2 Go

It then searches several fields in the index and finds records where any ONE of those terms matches.

Now, if I simply type the search terms Dual Head 2 Go, it finds records where ALL of them match. This is because we set q.op to AND.

Recently, we went from Solr 3.4 to 3.6. 3.4 used to work OK, but 3.6 seems to behave differently, or perhaps we mucked something up.

So, my question is how do we get Solr search to work with AND when it is splitting words? The splitting part is good, the bad part is that it is searching for any one of those split words.
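
To make the reported behavior concrete, here is a minimal Python sketch. It is not Solr code: `split_word` is a hypothetical stand-in for whatever analysis step splits the query term, and the two sample documents are invented for illustration. It only shows the difference between "any ONE term matches" and "ALL terms match" for the split form of DualHead2Go.

```python
import re

def split_word(text):
    """Toy splitter (assumption, not Solr's analyzer): break on case
    changes, letter/digit boundaries, and punctuation."""
    return re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", text)

def matches(doc, terms, require_all):
    """True if the document contains all terms (AND) or any term (OR)."""
    doc_terms = set(doc.lower().split())
    hits = [t.lower() in doc_terms for t in terms]
    return all(hits) if require_all else any(hits)

# Invented sample documents, for illustration only.
docs = [
    "dual head 2 go adapter",  # contains every split term
    "go kart racing",          # contains only "go"
]

terms = split_word("DualHead2Go")
any_hits = [d for d in docs if matches(d, terms, require_all=False)]
all_hits = [d for d in docs if matches(d, terms, require_all=True)]

print(terms)     # the split terms
print(any_hits)  # behavior seen for DualHead2Go on 3.6
print(all_hits)  # behavior wanted (and seen when typing the words separately)
```

The second document is the kind of unwanted extra result described above: it shares only one of the split terms with the query.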

Steve

Re: Dismax Question

Posted by Steve Fatula <co...@yahoo.com>.
>From: Vadim Kisselmann <v....@gmail.com>
>To: solr-user@lucene.apache.org; Steve Fatula <co...@yahoo.com> 
>Sent: Monday, July 2, 2012 4:31 AM
>Subject: Re: Dismax Question
> 
>in your schema.xml you can set the default query parser operator, in
>your case <solrQueryParser defaultOperator="AND"/>, but it's
>deprecated.
>
>
>I do set the default query operator. As the separate words in my example show, it correctly ANDs them. The difference is that when I use a single word, Solr splits it into four words and does not AND them. I don't understand why it does not AND the words when Solr splits them, but does when Solr does not split them.
>
>
>I've specified mm as 100% as well with no impact.
>
>This used to work on Solr 3.4.

Re: Dismax Question

Posted by Vadim Kisselmann <v....@gmail.com>.
in your schema.xml you can set the default query parser operator, in
your case <solrQueryParser defaultOperator="AND"/>, but it's
deprecated.
If you use edismax, read this: http://drupal.org/node/1559394
The mm parameter is the answer here.

Best regards
Vadim

2012/7/2 Steve Fatula <co...@yahoo.com>:
> Let's say a user types in:
>
> DualHead2Go
>
>
> The way solr is working, it splits this into:
>
> Dual Head 2 Go
>
> And searches the index for various fields, and finds records where any ONE of them matches.
>
> Now, if I simply type the search terms Dual Head 2 Go, it finds records where ALL of them match. This is because we set q.op to AND.
>
> Recently, we went from Solr 3.4 to 3.6. 3.4 used to work OK, but 3.6 seems to behave differently, or perhaps we mucked something up.
>
> So, my question is how do we get Solr search to work with AND when it is splitting words? The splitting part is good, the bad part is that it is searching for any one of those split words.
>
> Steve

Re: Dismax Question

Posted by Steve Fatula <co...@yahoo.com>.
It turns out that Solr 3.5.0 does not have the dismax issue, so we have reverted to it. Hopefully, the bug will be fixed.

Re: Dismax Question

Posted by Steve Fatula <co...@yahoo.com>.

>From: Joel Rosen <jo...@gmail.com>
>To: solr-user@lucene.apache.org; Steve Fatula <co...@yahoo.com> 
>Cc: Ahmet Arslan <io...@yahoo.com>; Tom Burton-West <tb...@umich.edu> 
>Sent: Monday, July 2, 2012 10:31 AM
>Subject: Re: Dismax Question
> 
>I and another user recently posted about this exact same issue.  It sounds
>like maybe this is a new bug introduced in 3.6:
>
>http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3CCAMKKMTx_ybPqsbgU5NtQ19t%2B0kWdAHtq-CZTZxfYxdu6rS1u1g%40mail.gmail.com%3E
>
>http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3CCAMySt%2BE6Hr6%3DgOkkDeZU9PCTpgJ4Mb1i8YrzfAndfqUzdot8xw%40mail.gmail.com%3E
>
>Does anyone happen to know whether 3.5 has the problem (we went from 3.4 to 3.6)? If not, we'd probably revert, since we can't deal with the millions of extra search results that should not be there.

Re: Dismax Question

Posted by Steve Fatula <co...@yahoo.com>.
>From: Joel Rosen <jo...@gmail.com>
>To: solr-user@lucene.apache.org; Steve Fatula <co...@yahoo.com> 
>Cc: Ahmet Arslan <io...@yahoo.com>; Tom Burton-West <tb...@umich.edu> 
>Sent: Monday, July 2, 2012 10:31 AM
>Subject: Re: Dismax Question
> 
>
>I and another user recently posted about this exact same issue.  It sounds like maybe this is a new bug introduced in 3.6:
>
>
>http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3CCAMKKMTx_ybPqsbgU5NtQ19t%2B0kWdAHtq-CZTZxfYxdu6rS1u1g%40mail.gmail.com%3E
>
>
>http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3CCAMySt%2BE6Hr6%3DgOkkDeZU9PCTpgJ4Mb1i8YrzfAndfqUzdot8xw%40mail.gmail.com%3E
>
>That sounds like the same thing. I've noticed that searching for SEP-100 actually does SEP OR 100, even though it's supposed to be AND.

Has a bug report been filed? 

I've managed to figure out a fix that is working well enough for my own application right now.  I set autoGeneratePhraseQueries to "true" on my field, and also set qs=20000.  The high query slop value simulates the AND behavior that I want since my documents are relatively short, but this is obviously not the correct solution, and I don't know if there are any performance issues with using really high query slop values.
>
>OK, so I guess I need to figure out autoGeneratePhraseQueries. The way I understand it, if I search for WORD1 WORD2, it will only find the phrase "WORD1 WORD2"; is that correct? That would be bad if so, since I really want WORD1 AND WORD2, not the phrase.

Trying to find some good doc for this feature!
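
As a rough illustration of the question above, here is a Python sketch of how phrase generation is commonly understood to work. It is an assumption-laden model, not Solr's parser: `analyze` and `build_clauses` are hypothetical names, and the model assumes the query text is split on whitespace first, so that a phrase is generated only when a single typed chunk analyzes into several tokens. The `any_of` case mirrors the behavior reported in this thread for 3.6.

```python
import re

def analyze(chunk):
    """Toy analyzer (assumption): split on case changes and digits, lowercase."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", chunk)
    return [p.lower() for p in parts]

def build_clauses(query, auto_phrase):
    """Model one clause per whitespace-separated chunk of the query."""
    clauses = []
    for chunk in query.split():  # whitespace split happens before analysis
        tokens = analyze(chunk)
        if not tokens:
            continue  # chunk was pure punctuation
        if len(tokens) == 1:
            clauses.append(("term", tokens))
        elif auto_phrase:
            clauses.append(("phrase", tokens))  # tokens must appear together
        else:
            clauses.append(("any_of", tokens))  # behavior reported for 3.6
    return clauses

print(build_clauses("Dual Head 2 Go", auto_phrase=True))
print(build_clauses("DualHead2Go", auto_phrase=True))
print(build_clauses("DualHead2Go", auto_phrase=False))
```

Under this model, words typed separately stay separate clauses even with phrase generation enabled, so they would still be combined by the operator and mm settings rather than turned into a phrase.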

Re: Dismax Question

Posted by Joel Rosen <jo...@gmail.com>.
I and another user recently posted about this exact same issue.  It sounds
like maybe this is a new bug introduced in 3.6:

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3CCAMKKMTx_ybPqsbgU5NtQ19t%2B0kWdAHtq-CZTZxfYxdu6rS1u1g%40mail.gmail.com%3E

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201206.mbox/%3CCAMySt%2BE6Hr6%3DgOkkDeZU9PCTpgJ4Mb1i8YrzfAndfqUzdot8xw%40mail.gmail.com%3E

I've managed to figure out a fix that is working well enough for my own
application right now.  I set autoGeneratePhraseQueries to "true" on my
field, and also set qs=20000.  The high query slop value simulates the AND
behavior that I want since my documents are relatively short, but this is
obviously not the correct solution, and I don't know if there are any
performance issues with using really high query slop values.
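
Why a very large slop can stand in for AND on short documents can be sketched in Python. This is a deliberately simplified, order-insensitive model and not Lucene's actual sloppy phrase matching: it assumes a match whenever every term falls inside a window of `len(terms) + slop` positions. The function names and the sample document are invented for illustration.

```python
def sloppy_match(doc_tokens, terms, slop):
    """Simplified stand-in for a sloppy phrase match (ignores term order)."""
    wanted = set(terms)
    window = len(terms) + slop
    # Slide a window over the document; match if any window holds every term.
    for start in range(max(1, len(doc_tokens) - window + 1)):
        if wanted <= set(doc_tokens[start:start + window]):
            return True
    return False

def and_match(doc_tokens, terms):
    """Plain AND: every term occurs somewhere in the document."""
    return set(terms) <= set(doc_tokens)

# Invented short document in which the terms are spread far apart.
doc = "adapter dual output with a second head connector 2 go".split()
terms = ["dual", "head", "2", "go"]

print(sloppy_match(doc, terms, slop=0))      # tight window
print(sloppy_match(doc, terms, slop=20000))  # window covers the whole document
print(and_match(doc, terms))
```

Once the window is longer than the document, the check reduces to "all terms are present", which is the AND behavior the workaround is after. The sketch says nothing about the performance cost of large slop values.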

On Mon, Jul 2, 2012 at 9:16 AM, Steve Fatula <co...@yahoo.com> wrote:

>
> >From: Ahmet Arslan <io...@yahoo.com>
> >To: solr-user@lucene.apache.org; Steve Fatula <co...@yahoo.com>
> >Sent: Monday, July 2, 2012 6:22 AM
> >Subject: Re: Dismax Question
> >
> >> So, my question is how do we get Solr search to work with
> >> AND when it is splitting words? The splitting part is good,
> >> the bad part is that it is searching for any one of those
> >> split words.
> >
> >Setting autoGeneratePhraseQueries="true" and &mm=100% might help you.
> >
> ><fieldType name="text" class="solr.TextField" autoGeneratePhraseQueries="true">
> >
> >I set mm to 100%, with no effect at all. It works only for words that are
> already separated when typed in. Remember, the example here is:
> >
> >
> >DualHead2Go finds all kinds of matches (it splits into dual head 2 go)
> >
> >
> >Dual Head 2 Go finds the correct matches, indicating it is ANDing them
> based on q.op, defaultOperator, and mm.

Re: Dismax Question

Posted by Steve Fatula <co...@yahoo.com>.
>From: Ahmet Arslan <io...@yahoo.com>
>To: solr-user@lucene.apache.org; Steve Fatula <co...@yahoo.com> 
>Sent: Monday, July 2, 2012 6:22 AM
>Subject: Re: Dismax Question
> 
>> So, my question is how do we get Solr search to work with
>> AND when it is splitting words? The splitting part is good,
>> the bad part is that it is searching for any one of those
>> split words.
>
>Setting autoGeneratePhraseQueries="true" and &mm=100% might help you.
>
><fieldType name="text" class="solr.TextField" autoGeneratePhraseQueries="true">
>
>I set mm to 100%, with no effect at all. It works only for words that are already separated when typed in. Remember, the example here is:
>
>
>DualHead2Go finds all kinds of matches (it splits into dual head 2 go)
>
>
>Dual Head 2 Go finds the correct matches, indicating it is ANDing them based on q.op, defaultOperator, and mm.

Re: Dismax Question

Posted by Ahmet Arslan <io...@yahoo.com>.
> So, my question is how do we get Solr search to work with
> AND when it is splitting words? The splitting part is good,
> the bad part is that it is searching for any one of those
> split words.

Setting autoGeneratePhraseQueries="true" and &mm=100% might help you.

<fieldType name="text" class="solr.TextField" autoGeneratePhraseQueries="true">

http://wiki.apache.org/solr/DisMaxQParserPlugin#mm_.28Minimum_.27Should.27_Match.29
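
For anyone trying the mm suggestion, a request could be assembled along these lines in Python. This is only a sketch under assumptions: the host, port, and `/solr/select` path describe a hypothetical local install, and the `qf` field names are made up. The parameter names `defType`, `q`, `qf`, `mm`, and `q.op` are the ones discussed in this thread.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical local Solr endpoint; adjust to your own install.
base = "http://localhost:8983/solr/select"

params = {
    "defType": "dismax",
    "q": "DualHead2Go",
    "qf": "name description",  # made-up field names
    "mm": "100%",              # require every optional clause to match
    "q.op": "AND",
}

# urlencode escapes the percent sign, so mm is sent as 100%25 on the wire.
url = base + "?" + urlencode(params)
print(url)

# Round-trip check: the server side would decode mm back to "100%".
decoded = parse_qs(url.split("?", 1)[1])
print(decoded["mm"])
```

The escaping matters when mm is typed into a URL by hand: an unescaped percent sign can be misread as the start of an escape sequence.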