Posted to solr-user@lucene.apache.org by Darren Govoni <da...@ontrenet.com> on 2012/03/29 16:08:34 UTC

Custom scoring question

Hi,
 I have a situation where I want to re-score document relevance.

Let's say I have two fields:

text: The quick brown fox jumped over the white fence.
terms: fox fence

Now my queries come in as:

terms:[* TO *]

and Solr scores them on that field. 

What I want is to rank them according to the distribution of the values in
field "terms" within field "text", which is a per-document calculation.

Can this be done with any kind of dismax? I'm not searching for known
terms at query time.

If not, what is the best way to implement a custom scoring handler to
perform this calculation and re-score/sort the results?
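
To make the per-document calculation concrete, here is a minimal Python sketch of one possible reading of "distribution": the share of tokens in "text" that match each entry in "terms". The tokenizer, the function names, and the choice to sum the shares into a single number are all assumptions for illustration; none of this is something Solr computes for you.

```python
import re
from collections import Counter


def term_distribution(text, terms):
    """Return, for each entry in `terms`, its share of the tokens in `text`."""
    # Naive lowercase word tokenizer (an assumption; a real setup would
    # mirror the analyzer configured for the "text" field).
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return {t: counts[t] / total for t in terms.lower().split()}


def density_score(text, terms):
    """Collapse the distribution to one number per document."""
    return sum(term_distribution(text, terms).values())


text = "The quick brown fox jumped over the white fence."
terms = "fox fence"
print(term_distribution(text, terms))
print(density_score(text, terms))
```

For the example document, "fox" and "fence" each account for 1 of the 9 tokens, so the combined value is 2/9.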

thanks for any tips!!!

Re: Custom scoring question

Posted by Tomás Fernández Löbbe <to...@gmail.com>.

But if you have that "score" in a field, you could use that field as part
of a function query instead of sorting on it directly. That would combine
this "score" with the score calculated from the other fields.

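As a rough sketch of the function-query idea, the request below uses the dismax "bf" (boost function) parameter so the stored value contributes to the relevance score rather than replacing it. The field name term_density, the query "fox", and the "qf" setting are hypothetical; only the parameter names come from Solr.

```python
from urllib.parse import urlencode


def boosted_query_params(user_query, score_field="term_density"):
    """Build dismax parameters that add a stored per-document value to the score."""
    return {
        "defType": "dismax",
        "q": user_query,
        "qf": "text",        # field(s) searched for the user query (assumed)
        "bf": score_field,   # function query: the field's value is added as a boost
        "fl": "id,score",
    }


params = boosted_query_params("fox")
query_string = urlencode(params)
print(query_string)
```

If a multiplicative combination fits better than an additive one, the boost query parser, for example {!boost b=term_density}text:fox, is another option to try.
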
On Thu, Mar 29, 2012 at 5:49 PM, Darren Govoni <da...@ontrenet.com> wrote:

> Yeah, I guess that would work. I wasn't sure whether the value would
> change relative to other documents. But if it had to be combined with
> other fields, that approach might not work, because the precomputed value
> wouldn't include the scoring for other parts of the query. Then you still
> have the dynamic score and the question of what to do with it.

Re: Custom scoring question

Posted by Darren Govoni <da...@ontrenet.com>.

Yeah, I guess that would work. I wasn't sure whether the value would
change relative to other documents. But if it had to be combined with
other fields, that approach might not work, because the precomputed value
wouldn't include the scoring for other parts of the query. Then you still
have the dynamic score and the question of what to do with it.

On Thu, 2012-03-29 at 16:29 -0300, Tomás Fernández Löbbe wrote:
> Can't you simply calculate that at index time, assign the result to a
> field, and then sort by that field?

Re: Custom scoring question

Posted by Tomás Fernández Löbbe <to...@gmail.com>.

Can't you simply calculate that at index time, assign the result to a
field, and then sort by that field?
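
A minimal sketch of what that could look like from the client side: the value is computed before indexing, stored in its own field, and the query sorts on it. The field name term_density and the hard-coded value are assumptions; the field would need to be declared as a sortable numeric type in the schema.

```python
import json
from urllib.parse import urlencode


def build_doc(doc_id, text, terms, density):
    """Build one document carrying a precomputed per-document value."""
    # term_density is a hypothetical field name, not part of any default schema.
    return {"id": doc_id, "text": text, "terms": terms, "term_density": density}


doc = build_doc(
    "1",
    "The quick brown fox jumped over the white fence.",
    "fox fence",
    0.2222,  # e.g. 2 matching tokens out of 9, computed by the indexing client
)
update_body = json.dumps([doc])  # body for a JSON update request

# Same query as in the original post, but ordered by the stored value.
select_params = urlencode({
    "q": "terms:[* TO *]",
    "sort": "term_density desc",
    "fl": "id,term_density",
})
print(update_body)
print(select_params)
```

Because the value is fixed per document, it does not change relative to other documents between queries.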

On Thu, Mar 29, 2012 at 12:07 PM, Darren Govoni <da...@ontrenet.com> wrote:

> I'm going to try index-time per-field boosting, doing the boost
> computation at index time, and see if that helps.

Re: Custom scoring question

Posted by Darren Govoni <da...@ontrenet.com>.

I'm going to try index-time per-field boosting, doing the boost
computation at index time, and see if that helps.
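
For reference, a sketch of an XML update message with a per-field boost attribute, built in Python. The boost value of 2.0 and the helper name are made up for illustration; how the distribution maps to a boost is left open. As far as I know, index-time boosts are folded into the field norm (so the field must not omit norms) and lose precision there, and later Solr releases dropped them, so check what your version supports.

```python
import xml.etree.ElementTree as ET


def add_message(doc_id, text, terms, terms_boost):
    """Build an <add> update message with an index-time boost on the terms field."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in (("id", doc_id), ("text", text)):
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    # The boost attribute carries the per-document, per-field boost.
    boosted = ET.SubElement(doc, "field", name="terms", boost=str(terms_boost))
    boosted.text = terms
    return ET.tostring(add, encoding="unicode")


xml_body = add_message(
    "1",
    "The quick brown fox jumped over the white fence.",
    "fox fence",
    2.0,  # arbitrary example value
)
print(xml_body)
```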
