Posted to solr-user@lucene.apache.org by leostro <le...@gmail.com> on 2015/01/10 10:05:07 UTC

edismax and mm: strange behaviour

Hi all

I'm studying Solr in order to implement it on my website.
I've imported the db and I'm running some tests with edismax and mm.
I'm searching for documents containing "xbox 360".

- If I specify mm=100% (I get the same result when setting the default operator to
"AND"), Solr gives me 5 documents:
[http://localhost:8983/solr/Collection1/select?wt=json&indent=true&q=xbox 360&rows=10&defType=edismax&mm=100%&fq=countryid:53]

Ps3 xbox 360 ps4 xbox one
Ps4, xbox one, ps3, xbox 360
Xbox one - scambio con 360 e supplemento
Playstation 3, 2, xbox 360, one
Xbox 360 ps3 xbox one ps4 notebook netbook i-phone

OK, that's right: they all contain both "xbox" and "360".

- BUT the same URL with mm=0% gives me a lot of matching documents
(559, the same number I get with the default operator set to "OR"):
[http://localhost:8983/solr/Collection1/select?wt=json&indent=true&q=xbox 360&rows=10&defType=edismax&mm=0%&fq=countryid:53]

But surprisingly, the results include many documents containing both
"xbox" and "360" that aren't returned by the first query.
Here are the first 20 rows:

Ps3 xbox 360 ps4 xbox one
Ps4, xbox one, ps3, xbox 360
Xbox one - scambio con 360 e supplemento
Playstation 3, 2, xbox 360, one
Xbox 360 ps3 xbox one ps4 notebook netbook i-phone
Xbox 360
cerco xbox 360
Xbox 360
Xbox 360
xbox 360
Xbox 360
Xbox 360
Xbox 360
Xbox 360
Xbox 360
Xbox 360
Xbox 360
Xbox 360
Xbox 360

How can this happen?
Hope someone can help me.

Regards,
Leo
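
One thing worth ruling out with the two requests above: a raw "%" in a URL (as in mm=100%) is not a valid percent-escape, and the space in q is not encoded either, so how the value arrives at Solr can depend on the client and server. Below is a minimal sketch, in Python, that builds the same two requests with the parameters properly encoded. The base URL and parameter values are copied from the requests above; the helper name build_select_url is hypothetical.

```python
from urllib.parse import urlencode

# Base URL copied from the requests quoted in this message.
SOLR_SELECT = "http://localhost:8983/solr/Collection1/select"


def build_select_url(mm: str) -> str:
    """Return the select URL for q="xbox 360" with the given mm value.

    urlencode() escapes the space in q as "+" and the "%" in mm as "%25",
    so the server receives the literal values "xbox 360" and e.g. "100%".
    """
    params = {
        "wt": "json",
        "indent": "true",
        "q": "xbox 360",
        "rows": 10,
        "defType": "edismax",
        "mm": mm,
        "fq": "countryid:53",
    }
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_select_url("100%"))
    print(build_select_url("0%"))
```

If the encoded requests return the same counts as the hand-written ones, encoding is not the cause and the field configuration is the next thing to look at.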



--
View this message in context: http://lucene.472066.n3.nabble.com/edismax-and-mm-strange-behaviour-tp4178532.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: edismax and mm: strange behaviour

Posted by leostro <le...@gmail.com>.
Hi Jack,

I read the documentation here:
http://wiki.apache.org/solr/ExtendedDisMax#mm_.28Minimum_.27Should.27_Match.29

My question is quite simple; maybe it wasn't clear because of my poor English.
As explained in my response to Ahmet, my goal is to get ALL and ONLY the
documents that contain the two words I specified in "q" ("xbox" and "360").

I can't understand why, if I specify "q=xbox 360" and "mm=100%", some documents
containing both words are not returned...

regards
leo





--
View this message in context: http://lucene.472066.n3.nabble.com/edismax-and-mm-strange-behaviour-tp4178532p4178603.html

Re: edismax and mm: strange behaviour

Posted by Jack Krupansky <ja...@gmail.com>.
Why are you using the mm parameter at all? In my experience, anyone setting
mm to 0 or 100% is misusing the mm feature. mm stands for "minimum should
match" and is designed to give expert users fine control over recall when
terms are optional ("should" occur but are not "required".) So, please
explain in plain English what effect you are trying to achieve. mm is not
for newbies!

Also, please point us to whatever doc or other material you were reading
that gave you the impression that mm was appropriate for your use case, so
that we can correct any bad documentation.
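
For readers following along, here is a rough sketch of how a simple mm value turns into "how many of the optional clauses must match". It follows the rules as I understand them from the Solr documentation: a positive integer is the minimum count, a negative integer is how many clauses may be missing, and percentages are rounded down. It is not Solr's actual code, the function name required_clauses is made up for illustration, and conditional specs such as "2<75%" are deliberately not handled.

```python
def required_clauses(mm: str, optional_count: int) -> int:
    """Number of optional clauses that must match for a simple mm spec.

    Handles only the simple forms: "2", "-1", "75%", "-25%".
    Conditional specs such as "2<75%" are NOT supported in this sketch.
    """
    spec = mm.strip()
    negative = spec.startswith("-")
    if spec.endswith("%"):
        # Percentages are rounded down (int() truncates the positive value).
        amount = int(optional_count * abs(float(spec[:-1])) / 100)
    else:
        amount = abs(int(spec))
    # A negative spec says how many clauses may be missing.
    result = optional_count - amount if negative else amount
    # Clamp to the valid range [0, optional_count].
    return max(0, min(result, optional_count))


if __name__ == "__main__":
    # The two-term query "xbox 360" from this thread:
    print(required_clauses("100%", 2))  # both terms required
    print(required_clauses("0%", 2))    # no term required
```

With two query terms, mm=100% requires both clauses and mm=0% requires none, which is why those two settings behave like the "AND" and "OR" default operators.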

-- Jack Krupansky

Re: edismax and mm: strange behaviour

Posted by Ahmet Arslan <io...@yahoo.com.INVALID>.
Hi,

Basically, (e)dismax is designed to search over multiple fields.
It can also be used to search over a single field.
For more about it, please see: https://lucidworks.com/blog/whats-a-dismax/

You mention a title field, but I don't see title in the search URLs you provided.

So my question remains: what fields are you searching on? What are their types/analyzers?
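
One way to answer the question above is to name the field explicitly with qf and ask Solr to show how it parsed the query. The sketch below (Python) builds such a request. It assumes the field is called "title" (the schema is not shown in this thread) and uses Solr's debugQuery parameter; the helper name build_debug_url is hypothetical.

```python
from urllib.parse import urlencode


def build_debug_url(base: str, query: str, qf: str, mm: str) -> str:
    """Build an edismax request with an explicit qf and debug output on."""
    params = {
        "q": query,
        "defType": "edismax",
        "qf": qf,              # field(s) to search; "title" is an assumption
        "mm": mm,
        "debugQuery": "true",  # adds a debug section to the response
        "wt": "json",
        "rows": 10,
    }
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_debug_url(
        "http://localhost:8983/solr/Collection1/select",
        "xbox 360", "title", "100%",
    ))
```

The debug section of the response (for example its parsedquery entry) shows which fields and analyzed terms the query was actually expanded into, which is usually enough to see why a document did or did not match.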

Ahmet




Re: edismax and mm: strange behaviour

Posted by leostro <le...@gmail.com>.
Hi Ahmet,

I don't specify any qf in this query.
Reading here
(http://wiki.apache.org/solr/ExtendedDisMax#mm_.28Minimum_.27Should.27_Match.29)
it seems that mm applies to the text provided as "q" in the query string. Am I
wrong?
Reading the doc above, my expectation is that if I specify a q value with 2
words ("xbox" and "one") with mm=100%, Solr should return EVERY document that
contains BOTH words in the title (the only field I told it to search in).

Maybe I'm missing something...
Thanks
Leo



--
View this message in context: http://lucene.472066.n3.nabble.com/edismax-and-mm-strange-behaviour-tp4178532p4178595.html

Re: edismax and mm: strange behaviour

Posted by Ahmet Arslan <io...@yahoo.com.INVALID>.
Hi,

What are the query fields (qf) and their field types?

Ahmet


