Posted to solr-user@lucene.apache.org by hschillig <mo...@live.com> on 2014/10/29 18:30:26 UTC
Score phrases higher than the records containing the words?
So I have a few titles like so:
1. When a dog bites fight back : what you need to know, what to do, what not
to do / [prepared by the law firm] Slater & Zurz LLP.
2. First things first [book on cd] : [the rules of being a Warner-- what
works, what doesn't and what really matters most] / Kurt & Brenda Warner,
with Jennifer Schuchmann.
3. What if? : serious scientific answers to absurd hypothetical questions /
Randall Munroe.
Now when I put this in my query field:
title:what if
It returns the first two BEFORE the book that actually contains the phrase
"what if", when that one should be listed first.
If I do title:"what if", none of them are returned.
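A side note on the first query, assuming the standard Lucene query parser is in use: a field prefix binds only to the term right after it, so title:what if searches title for "what" and the default field for "if". Grouping the terms keeps both in title. The helper names below are hypothetical, and the sketch does not escape query-syntax characters or handle empty input:

```python
def scoped(field: str, text: str) -> str:
    """Apply the field prefix to every term by grouping them in parentheses."""
    terms = text.split()
    if len(terms) == 1:
        return f"{field}:{terms[0]}"
    return f"{field}:({' '.join(terms)})"


def scoped_with_phrase_boost(field: str, text: str, boost: int) -> str:
    """Match any of the terms, and additionally boost the exact phrase."""
    return f'{scoped(field, text)} OR {field}:"{text}"^{boost}'


print(scoped("title", "what if"))
print(scoped_with_phrase_boost("title", "what if", 10))
```

The boost value of 10 is only a starting point and would need tuning against the real index.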
Here is my schema.xml file:
http://apaste.info/7r5
I want the titles that contain the phrase "what if" to be returned first,
followed by the titles that match "what" or "if" individually. The double
quotes don't seem to match the phrase. I removed the stop words because "if"
was in the list and I didn't want the indexing to ignore it.
Thank you for any help!
--
View this message in context: http://lucene.472066.n3.nabble.com/Score-phrases-higher-than-the-records-containing-the-words-tp4166488.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Score phrases higher than the records containing the words?
Posted by Erick Erickson <er...@gmail.com>.
So what happens if you increase the boost to 100? or 20?
The problem is that boosting will always be "more art than science".
What about the other 3 possibilities I mentioned?
Basically, you have to tweak things to fit your corpus, and it's often
an empirically determined thing.
Best,
Erick
On Thu, Oct 30, 2014 at 9:14 AM, hschillig <mo...@live.com> wrote:
> The other ones are still rating higher. I think it's because the other two
> titles contain "what" 3 times.. the more it says what, the higher it scores.
> I'm not sure what else can be done. Does anybody else have any ideas?
Re: Score phrases higher than the records containing the words?
Posted by hschillig <mo...@live.com>.
The other ones are still ranking higher. I think it's because the other two
titles contain "what" 3 times: the more often a title says "what", the higher
it scores. I'm not sure what else can be done. Does anybody else have any ideas?
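A rough way to sanity-check that hunch, assuming the index uses Lucene's classic TF-IDF similarity (the thread does not say which similarity is configured): term frequency contributes sqrt(freq) and the field length norm contributes 1/sqrt(number of terms). The sketch below ignores idf, query normalization, coord, and the lossy norm encoding, and the word counts are hand-counted from the titles rather than taken from the analyzer:

```python
import math


def tf(freq: int) -> float:
    """Term-frequency factor of classic Lucene TF-IDF: sqrt(freq)."""
    return math.sqrt(freq)


def length_norm(num_terms: int) -> float:
    """Field length norm of classic Lucene TF-IDF: 1 / sqrt(num_terms)."""
    return 1.0 / math.sqrt(num_terms)


def rough_term_weight(freq: int, num_terms: int) -> float:
    """Only the tf * lengthNorm part of the score for a single term."""
    return tf(freq) * length_norm(num_terms)


# Approximate, hand-counted word counts (assumptions, not analyzer output).
dog_bites = rough_term_weight(freq=3, num_terms=26)  # "what" appears 3 times
what_if = rough_term_weight(freq=1, num_terms=11)    # "what" appears once

print(f"three 'what's in a long title: {dog_bites:.3f}")
print(f"one 'what' in a short title:   {what_if:.3f}")
```

Under these assumptions the repeated term outweighs the shorter field, which is consistent with the ranking described, though the real scores depend on every factor the sketch leaves out.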
--
View this message in context: http://lucene.472066.n3.nabble.com/Score-phrases-higher-than-the-records-containing-the-words-tp4166488p4166656.html
Re: Score phrases higher than the records containing the words?
Posted by hschillig <mo...@live.com>.
Edit:
I filtered my query to author:randall so I could see the score that it's
getting from the query. This is the score of the record that contains "what
if":
"score": 0.004032644
The other two books are getting this score:
"score": 0.0069850935
So... the boost is obviously not hitting that record. I wonder why?
--
View this message in context: http://lucene.472066.n3.nabble.com/Score-phrases-higher-than-the-records-containing-the-words-tp4166488p4166615.html
Re: Score phrases higher than the records containing the words?
Posted by hschillig <mo...@live.com>.
How can I tell whether the stop words issue is resolved? This is what I get
when I turn debugging on:
http://apaste.info/0Uz
When I put:
q=title:(what if) OR title:"what if"^10
I get this:
"rawquerystring": "title:(what if) OR title:\"what if\"^10",
"querystring": "title:(what if) OR title:\"what if\"^10",
"parsedquery": "(+((title:what title:if) PhraseQuery(title:\"what if\"^10.0)))/no_coord",
"parsedquery_toString": "+((title:what title:if) title:\"what if\"^10.0)"
The other two titles still appear on top of the one that contains the "what
if" phrase.
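To check mechanically that the phrase clause and both terms survive parsing, the debug section can be read from the JSON response. This is a minimal sketch: the sample is trimmed to two keys with values copied from the debug output quoted above, the top-level "debug" wrapper is assumed from Solr's JSON response format, and the helper names are hypothetical:

```python
import json

# Trimmed sample shaped like the "debug" section of a Solr JSON response.
sample_response = r'''
{
  "debug": {
    "rawquerystring": "title:(what if) OR title:\"what if\"^10",
    "parsedquery": "(+((title:what title:if) PhraseQuery(title:\"what if\"^10.0)))/no_coord"
  }
}
'''


def parsed_query(response_text: str) -> str:
    """Return the parsed form of the query from the debug section."""
    return json.loads(response_text)["debug"]["parsedquery"]


def has_boosted_phrase(parsed: str, field: str, phrase: str) -> bool:
    """True if the parsed query contains a boosted phrase clause on the field."""
    return f'{field}:"{phrase}"^' in parsed


def has_term(parsed: str, field: str, term: str) -> bool:
    """True if the term was kept as a clause, i.e. not dropped as a stop word."""
    return f"{field}:{term} " in parsed or f"{field}:{term})" in parsed


parsed = parsed_query(sample_response)
print(has_boosted_phrase(parsed, "title", "what if"))
print(has_term(parsed, "title", "if"))
```

Both checks pass for the output above, which suggests the query side is parsed as intended. It says nothing about how the titles were analyzed at index time.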
I tried turning edismax on and placing "title" in the pf field and the same
results appear.
Thanks for any help,
Haley
--
View this message in context: http://lucene.472066.n3.nabble.com/Score-phrases-higher-than-the-records-containing-the-words-tp4166488p4166608.html
Re: Score phrases higher than the records containing the words?
Posted by Erick Erickson <er...@gmail.com>.
The first thing is to add &debug=query to the URL and look at the parsed
form of the query to be sure the stop words issue is resolved.
Once that's determined, add the phrase with a high boost, something like
q=title:(what if) OR title:"what if"^10
where the boost factor is TBD.
Or add the title field to the "pf" parameter if you're using edismax,
possibly with a boost.
Or add a "bq" clause to the edismax.
Or add a "boost" to the main query, similar to:
https://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents
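As a sketch of the "pf" and "bq" options, the request parameters could be assembled like this. The host, core name, and boost value are placeholders, the function name is hypothetical, and the right boost has to be found by experiment:

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace host and core name with your own.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def edismax_params(user_query: str, field: str = "title",
                   phrase_boost: int = 10, use_bq: bool = False) -> dict:
    """Build edismax parameters that boost documents matching the whole phrase."""
    params = {
        "defType": "edismax",
        "q": user_query,                  # plain user input, no field prefix
        "qf": field,                      # field(s) the individual terms search
        "pf": f"{field}^{phrase_boost}",  # boost docs where the terms form a phrase
        "debug": "query",                 # show the parsed query in the response
        "fl": f"{field},score",
        "wt": "json",
    }
    if use_bq:
        # Alternative: an explicit boost query on the exact phrase.
        params["bq"] = f'{field}:"{user_query}"^{phrase_boost}'
    return params


url = SOLR_SELECT + "?" + urlencode(edismax_params("what if"))
print(url)
```

Going through urlencode avoids hand-escaping the space, quotes, and caret in the boost syntax.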
Best,
Erick
On Wed, Oct 29, 2014 at 10:30 AM, hschillig <mo...@live.com> wrote:
> So I have a few titles like so:
>
> 1. When a dog bites fight back : what you need to know, what to do, what not
> to do / [prepared by the law firm] Slater & Zurz LLP.
> 2. First things first [book on cd] : [the rules of being a Warner-- what
> works, what doesn't and what really matters most] / Kurt & Brenda Warner,
> with Jennifer Schuchmann.
> 3. What if? : serious scientific answers to absurd hypothetical questions /
> Randall Munroe.
>
> Now when I put this in my query field:
> title:what if
>
> It returns the first two BEFORE it returns the book that has the actual
> "what if" phrase.. when that one should be listed first..
> If I do title:"what if", none of them get returned.
>
> Here is my schema.xml file:
> http://apaste.info/7r5 <http://apaste.info/7r5>
>
> I want the titles that contain the phrase "what if" to be returned first.
> And then index by "what", "if".. The double quotes doesn't seem to contain
> the phrase. I removed the stop words because "if" was in the list and I
> didn't want the indexing to ignore that.
>
> Thank you for any help!