Posted to solr-user@lucene.apache.org by Benson Margulies <bi...@gmail.com> on 2012/04/13 23:40:55 UTC

Can I discover what part of a score is attributable to a subquery?

Given a query including a subquery, is there any way for me to learn
that subquery's contribution to the overall document score?

I can provide 'why on earth would anyone ...' if someone wants to know.

Re: Can I discover what part of a score is attributable to a subquery?

Posted by Paul Libbrecht <pa...@hoplahup.net>.
Benson,

If I remember correctly, the big problem is that the scores are recalibrated in all sorts of ways depending on the query. Getting it all in one go is therefore really nice.

I am not sure the different similarities can be combined well here, though...

paul



Re: Can I discover what part of a score is attributable to a subquery?

Posted by Benson Margulies <bi...@gmail.com>.
On Sat, Apr 14, 2012 at 12:37 PM, Paul Libbrecht <pa...@hoplahup.net> wrote:
> Benson,
>
> it was in the Lucene world in May 2010:
>        http://mail-archives.apache.org/mod_mbox/lucene-java-user/201005.mbox/%3C469705.48901.qm@web29016.mail.ird.yahoo.com%3E
> Mark Harwood pointed me to a "FlagQuery" which was exactly what I needed.
> His contribution does not seem to have been taken up, but it worked for me in Lucene 2.4.1.
> We used this to create an auto-completion popup which selected the right language by flagging the sub-query that matched best.

Paul, it seems to me that the criticism in the JIRA (do you really
want this calculation for every single document that matches?) applies
to me. In our stuff, we run a query, and we look at the top 200 items,
rearranging their order based on a name similarity metric that is too
expensive to run in bulk. If the overall query is 'just us', we can
discard the Lucene scores and reorder based on our own. If our query
is combined with other terms, then we need to subtract out the
contribution of our part of the initial query. However, sending in a
second query with (I suppose) ids=id1,id2,... and just our query, to
retrieve the scores, should be pretty speedy for a mere 200 items.
Maybe I'm missing some even easier way, given a DocList and a query,
to obtain scores for those docs for that query?
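
A rough sketch of the second-query idea above, in Python for brevity. It only builds the request parameters and does not contact a server. The field name `id`, the helper names, and the example subquery are assumptions for illustration; `q`, `fq`, `fl`, and `rows` are the standard Solr request parameters.

```python
from urllib.parse import urlencode


def quote_id(doc_id):
    """Quote an id as a phrase, escaping backslashes and double quotes."""
    escaped = doc_id.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def subquery_score_params(subquery, doc_ids, id_field="id"):
    """Parameters that score only `subquery`, restricted to known documents.

    `id_field` is an assumed unique-key field name; adjust to the schema.
    """
    if not doc_ids:
        raise ValueError("need at least one document id")
    id_filter = "{}:({})".format(
        id_field, " OR ".join(quote_id(d) for d in doc_ids)
    )
    return {
        "q": subquery,                       # only "our" part of the query
        "fq": id_filter,                     # filter queries do not affect the score
        "fl": "{},score".format(id_field),   # return ids and scores only
        "rows": str(len(doc_ids)),           # one row per requested document
    }


# Hypothetical subquery and ids, for illustration only.
params = subquery_score_params("name:margulies", ["doc1", "doc2"])
query_string = urlencode(params)
```

A score obtained this way may not equal the subquery's share of the combined score, because normalization depends on the whole query, so any subtraction should be treated as approximate.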


Re: Can I discover what part of a score is attributable to a subquery?

Posted by Paul Libbrecht <pa...@hoplahup.net>.
Benson,

it was in the Lucene world in May 2010:
	http://mail-archives.apache.org/mod_mbox/lucene-java-user/201005.mbox/%3C469705.48901.qm@web29016.mail.ird.yahoo.com%3E
Mark Harwood pointed me to a "FlagQuery" which was exactly what I needed.
His contribution does not seem to have been taken up, but it worked for me in Lucene 2.4.1.
We used this to create an auto-completion popup which selected the right language by flagging the sub-query that matched best.

paul



Re: Can I discover what part of a score is attributable to a subquery?

Posted by Benson Margulies <bi...@gmail.com>.
yes please


Re: Can I discover what part of a score is attributable to a subquery?

Posted by Paul Libbrecht <pa...@hoplahup.net>.
Benson,
In mid-2009, I had such a question answered with a nifty bitwise manipulation of the score, at the cost of a little precision. For each result I could pick the language of a multilingual match.
If interested, I can dig.
Paul
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.
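
To illustrate the general idea of trading a little score precision for a flag, here is a minimal Python sketch. It is not the 2009 code or the "FlagQuery" mentioned elsewhere in this thread, neither of which is shown here. The choice of 3 flag bits and all function names are assumptions, and it assumes finite, positive, single-precision scores.

```python
import struct

FLAG_BITS = 3                      # assumption: at most 8 distinct sub-queries
FLAG_MASK = (1 << FLAG_BITS) - 1   # low mantissa bits reserved for the flag


def float_to_bits(score):
    """Return the 32-bit IEEE 754 pattern of a single-precision float."""
    return struct.unpack(">I", struct.pack(">f", score))[0]


def bits_to_float(bits):
    """Return the single-precision float with the given 32-bit pattern."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def embed_flag(score, flag):
    """Overwrite the lowest mantissa bits of `score` with `flag`."""
    if not 0 <= flag <= FLAG_MASK:
        raise ValueError("flag does not fit in the reserved bits")
    bits = float_to_bits(score)
    return bits_to_float((bits & ~FLAG_MASK) | flag)


def extract_flag(score):
    """Read the flag back out of the lowest mantissa bits."""
    return float_to_bits(score) & FLAG_MASK


# Tag a score with sub-query number 5, then recover the tag.
tagged = embed_flag(1.5, 5)
flag = extract_flag(tagged)
```

Only the least significant mantissa bits change, so the ordering of results is disturbed only for scores that were already nearly equal.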




Re: Can I discover what part of a score is attributable to a subquery?

Posted by Benson Margulies <bi...@gmail.com>.
On Fri, Apr 13, 2012 at 6:43 PM, John Chee <Jo...@mylife.com> wrote:
> On Fri, Apr 13, 2012 at 2:40 PM, Benson Margulies <bi...@gmail.com> wrote:
>> Given a query including a subquery, is there any way for me to learn
>> that subquery's contribution to the overall document score?

I need this number to be available in a SearchComponent that runs
after QueryComponent.


>>
>> I can provide 'why on earth would anyone ...' if someone wants to know.
>
> Have you tried debugQuery=true?
> http://wiki.apache.org/solr/CommonQueryParameters#debugQuery The
> 'explain' field of the result explains the scoring of each document.

Re: Can I discover what part of a score is attributable to a subquery?

Posted by John Chee <Jo...@mylife.com>.
On Fri, Apr 13, 2012 at 2:40 PM, Benson Margulies <bi...@gmail.com> wrote:
> Given a query including a subquery, is there any way for me to learn
> that subquery's contribution to the overall document score?
>
> I can provide 'why on earth would anyone ...' if someone wants to know.

Have you tried debugQuery=true?
http://wiki.apache.org/solr/CommonQueryParameters#debugQuery The
'explain' field of the result explains the scoring of each document.
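
A small sketch of that suggestion, in Python for brevity. It builds a request with `debugQuery=true` and pulls the per-document explanations out of a JSON response that has already been parsed. The helper names are made up for illustration, and the sample response is a hand-written, abbreviated stand-in rather than real Solr output.

```python
from urllib.parse import urlencode


def debug_query_string(query, rows=10):
    """Query string asking Solr for scores plus per-document explanations."""
    return urlencode({
        "q": query,
        "rows": str(rows),
        "fl": "id,score",
        "wt": "json",
        "debugQuery": "true",
    })


def explanations(response):
    """Return the mapping of document key to explanation, or an empty dict."""
    return response.get("debug", {}).get("explain", {})


# Hand-written placeholder standing in for a parsed JSON response.
sample_response = {
    "response": {"docs": [{"id": "doc1", "score": 1.5}]},
    "debug": {"explain": {"doc1": "<explain text for doc1>"}},
}
doc1_explanation = explanations(sample_response)["doc1"]
```

The explanation is meant for human reading, so extracting one clause's contribution from it would involve parsing that text.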

Re: Can I discover what part of a score is attributable to a subquery?

Posted by Benson Margulies <bi...@gmail.com>.
On Fri, Apr 13, 2012 at 7:07 PM, Chris Hostetter
<ho...@fucit.org> wrote:
>
> : Given a query including a subquery, is there any way for me to learn
> : that subquery's contribution to the overall document score?
>
> You have to just execute the subquery itself ... doc collection
> and score calculation don't keep track of the subscores.
>
> You could do this using functions in the "fl", but since you mentioned
> wanting this in a SearchComponent, just pass the "subquery" to
> SolrIndexSearcher using a DocSet filter of the current page (i.e., make your
> own DocSet based on the current DocList).

I get it. Some fairly intricate dancing then can ensue with SolrCloud. Thanks.

>
>
> -Hoss

Re: Can I discover what part of a score is attributable to a subquery?

Posted by Chris Hostetter <ho...@fucit.org>.
: Given a query including a subquery, is there any way for me to learn
: that subquery's contribution to the overall document score?

You have to just execute the subquery itself ... doc collection
and score calculation don't keep track of the subscores.

You could do this using functions in the "fl", but since you mentioned
wanting this in a SearchComponent, just pass the "subquery" to
SolrIndexSearcher using a DocSet filter of the current page (i.e., make your
own DocSet based on the current DocList).


-Hoss
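
One way the "functions in the fl" route might look, sketched in Python for brevity. It assumes a Solr version that accepts function pseudo-fields in `fl`. The parameter name `subq`, the alias `subscore`, and the example queries are hypothetical; `query(...)` is Solr's function for the score of a nested query, and `$subq` dereferences another request parameter.

```python
from urllib.parse import urlencode


def params_with_subscore(main_query, subquery, alias="subscore"):
    """Request parameters returning each document's score for `subquery`
    alongside its score for `main_query`."""
    return {
        "q": main_query,
        "subq": subquery,  # hypothetical parameter name, referenced as $subq
        "fl": "id,score,{}:query($subq)".format(alias),
    }


# Hypothetical queries, for illustration only.
params = params_with_subscore(
    "title:smith AND name:margulies", "name:margulies"
)
query_string = urlencode(params)
```

This keeps everything in a single request, while the DocSet-filter approach described above is the one suited to code running inside a SearchComponent.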