Posted to solr-user@lucene.apache.org by Eustache Felenc <eu...@idilia.com> on 2013/03/19 15:16:28 UTC
Boolean Query Scorer Over-weighting Query Terms With Synonyms
Hi,
I don't understand why the scorer sums the weights of the OR clauses.
It seems to me that this skews the query scoring toward the term that
has more alternatives. To me it would make more sense to take the max
of the weights of a query term's alternatives.
Here is an example:
I ran this query in the Solr admin interface: gucci (handbag OR purse OR pocketbook)
By clicking debug I can see that the parsed query is as expected:
"parsedquery":"text:gucci (text:handbag text:purse text:pocketbook)"
The explain field shows that the scorer computes (I simplify a bit
here):
weight(gucci) + sum(weight(handbag), weight(purse), weight(pocketbook))
The consequence is that a result containing handbag, purse and
pocketbook (but not gucci) can get a higher score than a result
containing gucci and handbag. I think this is counter-intuitive. To me
the OR means those terms are equivalent, not that they are more
important. Besides, if I wanted them to count for more, I could use
query term boosting to do that independently.
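
To make the arithmetic concrete, here is a small Python sketch of the
two ways of combining the OR group. The weights are made up (every term
gets 1.0); the real values come from the explain output and depend on
the index, and this sketch ignores any other factors the real scorer
applies. The score() helper and the two sample documents are
hypothetical, for illustration only.

```python
# Hypothetical per-term weights; real values come from Solr's explain output.
weights = {"gucci": 1.0, "handbag": 1.0, "purse": 1.0, "pocketbook": 1.0}
synonyms = ["handbag", "purse", "pocketbook"]


def score(doc_terms, combine):
    """Score a document as weight(gucci) + combine(weights of matched synonyms)."""
    matched = [weights[t] for t in synonyms if t in doc_terms]
    group = combine(matched) if matched else 0.0
    brand = weights["gucci"] if "gucci" in doc_terms else 0.0
    return brand + group


doc_a = {"handbag", "purse", "pocketbook"}  # all three synonyms, no gucci
doc_b = {"gucci", "handbag"}                # the brand plus one synonym

# Current behaviour: the OR group contributes the sum of its matches.
sum_a = score(doc_a, sum)
sum_b = score(doc_b, sum)

# Behaviour I would expect: the OR group contributes only its best match.
max_a = score(doc_a, max)
max_b = score(doc_b, max)

print("sum:", sum_a, sum_b)
print("max:", max_a, max_b)
```

With the sum, doc_a outranks doc_b even though it never mentions gucci;
with the max, doc_b comes first.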
I experimented with Edismax and it has similar behaviour.
The questions are: am I missing something? Is there a way to have an OR
clause which preserves the relative "importance" of the query terms
(note that playing with mm in edismax does not solve the issue)?
Thanks!