Posted to solr-user@lucene.apache.org by hschillig <mo...@live.com> on 2014/10/29 18:30:26 UTC

Score phrases higher than the records containing the words?

So I have a few titles like so:

1. When a dog bites fight back : what you need to know, what to do, what not
to do / [prepared by the law firm] Slater & Zurz LLP.
2. First things first [book on cd] : [the rules of being a Warner-- what
works, what doesn't and what really matters most] / Kurt & Brenda Warner,
with Jennifer Schuchmann.
3. What if? : serious scientific answers to absurd hypothetical questions /
Randall Munroe.

Now when I put this in my query field:
title:what if

It returns the first two BEFORE the book that contains the actual
"what if" phrase, when that one should be listed first.
If I do title:"what if", none of them are returned.

Here is my schema.xml file:
http://apaste.info/7r5

I want the titles that contain the phrase "what if" to be returned first,
followed by the titles that contain only "what" or "if". The double quotes
don't seem to match the phrase. I removed the stop words because "if" was in
the list and I didn't want indexing to ignore it.
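
A note on the syntax, assuming the standard Lucene query parser: in
title:what if, only "what" is bound to the title field, while "if" is
searched against the default field. Grouping or quoting the terms keeps both
in title. A minimal Python sketch of the two forms (the helper names are
hypothetical and not part of Solr):

```python
def grouped_query(field, terms):
    """Keep every term in the given field, e.g. title:(what if)."""
    joined = " ".join(terms)
    return f"{field}:({joined})"


def phrase_query(field, terms, boost=None):
    """Build a quoted phrase query, optionally boosted, e.g. title:"what if"^10."""
    joined = " ".join(terms)
    query = f'{field}:"{joined}"'
    if boost is not None:
        query = f"{query}^{boost}"
    return query


terms = ["what", "if"]
print(grouped_query("title", terms))
print(phrase_query("title", terms, boost=10))
```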

Thank you for any help!



--
View this message in context: http://lucene.472066.n3.nabble.com/Score-phrases-higher-than-the-records-containing-the-words-tp4166488.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Score phrases higher than the records containing the words?

Posted by Erick Erickson <er...@gmail.com>.
So what happens if you increase the boost to 100? or 20?

The problem is that boosting will always be "more art than science".

What about the other 3 possibilities I mentioned?

Basically, you have to tweak things to fit your corpus, and the right
values are usually determined empirically.

Best,
Erick

On Thu, Oct 30, 2014 at 9:14 AM, hschillig <mo...@live.com> wrote:
> The other ones are still rating higher. I think it's because the other two
> titles contain "what" 3 times.. the more it says what, the higher it scores.
> I'm not sure what else can be done. Does anybody else have any ideas?

Re: Score phrases higher than the records containing the words?

Posted by hschillig <mo...@live.com>.
The other ones are still ranking higher. I think it's because the other two
titles contain "what" 3 times; the more often "what" appears, the higher the
score. I'm not sure what else can be done. Does anybody else have any ideas?
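
That guess is consistent with Lucene's classic TF-IDF similarity (the default
in Solr 4.x), where the term-frequency factor is the square root of the
number of occurrences in the field. A rough Python sketch of just that factor
and the field-length norm, ignoring idf, query norm, coord, and the lossy
one-byte encoding of stored norms:

```python
import math


def classic_tf(freq):
    """Term-frequency factor of the classic similarity: sqrt(freq)."""
    return math.sqrt(freq)


def classic_length_norm(num_terms):
    """Field-length norm of the classic similarity: 1 / sqrt(number of terms)."""
    return 1.0 / math.sqrt(num_terms)


# A title with "what" three times versus a title with "what" once.
for freq in (1, 3):
    print(freq, round(classic_tf(freq), 3))
```

This is only one factor. The phrase boost, idf, and length normalization
also enter the final score, so the explain section of the debug output is
the place to check the actual numbers.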




Re: Score phrases higher than the records containing the words?

Posted by hschillig <mo...@live.com>.
Edit:
I filtered my query to author:randall so I could see the score that it's
getting from the query. This is the score of the record that contains "what
if":
"score": 0.004032644

The other two books are getting this score:
"score": 0.0069850935

So... the boost is obviously not hitting that record. I wonder why?




Re: Score phrases higher than the records containing the words?

Posted by hschillig <mo...@live.com>.
How can I tell whether the stop-words issue is resolved? This is what I get
when I turn debugging on:

http://apaste.info/0Uz

When I put:
q=title:(what if) OR title:"what if"^10
I get this:

"rawquerystring": "title:(what if) OR title:\"what if\"^10",
"querystring": "title:(what if) OR title:\"what if\"^10",
"parsedquery": "(+((title:what title:if) PhraseQuery(title:\"what if\"^10.0)))/no_coord",
"parsedquery_toString": "+((title:what title:if) title:\"what if\"^10.0)"

The other two titles still appear above the one that contains the "what
if" phrase.
I also tried enabling edismax and putting "title" in the pf parameter, and
the same results appear.
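
One way to read the debug output above: title:if shows up as its own clause
in the parsed query, so "if" is no longer being dropped as a stop word at
query time. That check can be scripted; a small Python sketch, where the
helper name is hypothetical and the dictionary just repeats the debug values
shown above:

```python
import re

# Debug values copied from the output above.
debug = {
    "rawquerystring": 'title:(what if) OR title:"what if"^10',
    "parsedquery_toString": '+((title:what title:if) title:"what if"^10.0)',
}


def bare_terms(parsed, field):
    """Collect single-term clauses such as title:if from a parsed query string."""
    return set(re.findall(rf"{re.escape(field)}:(\w+)", parsed))


parsed = debug["parsedquery_toString"]
print(bare_terms(parsed, "title"))       # terms that survived query analysis
print('title:"what if"' in parsed)       # whether the phrase clause is present
```

This only covers the query side. Documents indexed while "if" was still a
stop word would need to be reindexed before the phrase can match them.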

Thanks for any help,
Haley




Re: Score phrases higher than the records containing the words?

Posted by Erick Erickson <er...@gmail.com>.
The first thing to do is add &debug=query to the URL and look at the parsed
form of the query, to be sure the stop words issue is resolved.

Once that's determined, add the phrase with a high boost, something like
q=title:(what if) OR title:"what if"^10

where the boost factor is TBD.

Or add the title field to the "pf" parameter if you're using edismax,
possibly with a boost.

Or add a "bq" clause to the edismax.

Or add a "boost" to the main query, similar to:
https://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents
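
As a rough illustration of the first two options, here is a Python sketch
that builds the request URLs. The host, core name (collection1), and boost
values are assumptions for the example, and the function names are
hypothetical; the request parameters (q, defType, qf, pf, debug, wt) are
standard Solr ones.

```python
from urllib.parse import urlencode

# Assumed endpoint: adjust host and core name for your installation.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def lucene_boosted_phrase(terms, field="title", boost=10):
    """Standard parser: match any term, and boost documents matching the phrase."""
    joined = " ".join(terms)
    q = f'{field}:({joined}) OR {field}:"{joined}"^{boost}'
    return {"q": q, "debug": "query", "wt": "json"}


def edismax_with_pf(terms, field="title", boost=10):
    """edismax: search the terms in qf and boost phrase matches via pf."""
    return {
        "defType": "edismax",
        "q": " ".join(terms),
        "qf": field,
        "pf": f"{field}^{boost}",
        "debug": "query",
        "wt": "json",
    }


for params in (lucene_boosted_phrase(["what", "if"]),
               edismax_with_pf(["what", "if"])):
    print(f"{SOLR_SELECT}?{urlencode(params)}")
```

The boost of 10 is only a starting point; as noted above, the right value
is TBD and depends on the corpus.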

Best,
Erick

