Posted to java-user@lucene.apache.org by Mike Tinnes <ti...@ecliptictech.com> on 2002/06/21 20:39:59 UTC

Weighted index

Hey all, is there any method that allows for the weighting of indexed fields? I'd like to implement a web search in which keywords occurring in certain elements (title, heading, metatags) score higher than others (body, links, etc).

Thanks, Mike


Re: Weighted index

Posted by Doug Cutting <cu...@lucene.com>.
Peter Carlson wrote:
> I don't know the actual algorithm, but the search
> 
> title:hello^3 AND heading:dolly^4
> 
> will produce different document scores than
> 
> title:hello AND heading:dolly^4
> 
> Lucene computes a score for each document, not for each field, so it does
> combine the results of the two fields. Again, I don't know how it
> combines them.

Scores from different terms in the same document (and hence different 
fields) are summed.

Doug
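
As a rough illustration of the summing described above, here is a toy model in plain Java. It is not Lucene's scoring formula, which has more factors than this; the class name, the method name, and the assumption that each boost simply multiplies its term's score are all made up for the sketch.

```java
/** Toy model only: not Lucene code, and not Lucene's full scoring formula. */
final class SummedScoreSketch {

    /**
     * Sums per-term scores into one document score.
     * termScores[i] is the (hypothetical) score term i earned in the document,
     * boosts[i] is the ^boost attached to that term in the query (1 if none).
     */
    static double documentScore(double[] termScores, double[] boosts) {
        if (termScores.length != boosts.length) {
            throw new IllegalArgumentException("one boost is needed per term score");
        }
        double total = 0.0;
        for (int i = 0; i < termScores.length; i++) {
            // Assumption of this sketch: a boost scales its own term's contribution.
            total += termScores[i] * boosts[i];
        }
        return total;
    }
}
```

With made-up term scores of 0.5 for title:hello and 0.25 for heading:dolly, boosts of 3 and 4 give a different total than boosts of 1 and 4, which matches the behaviour Peter observed.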


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: Weighted index

Posted by Peter Carlson <ca...@bookandhammer.com>.
I don't know the actual algorithm, but the search

title:hello^3 AND heading:dolly^4

will produce different document scores than

title:hello AND heading:dolly^4

Lucene computes a score for each document, not for each field, so it does
combine the results of the two fields. Again, I don't know how it
combines them.

I just tried it on my project, and the overall score does change depending
on the boost factor given to each field.

I hope this helps

--Peter



On 6/21/02 1:33 PM, "Mike Tinnes" <ti...@ecliptictech.com> wrote:

> 
> But is there a way to combine the scores of the individual fields to create
> one total score? The problem is that the highest ranking document for the
> 'title' query will not necessarily match the highest ranking document for
> the 'heading' query. I suppose I could simply add them up, but that would
> mean iterating through all the results of both queries and adding the scores
> to find the highest combined total. I don't suppose Lucene has a method of
> ANDing in this manner to create a combined total field score?
> 
> - Mike
> 


Re: Weighted index

Posted by Mike Tinnes <ti...@ecliptictech.com>.
But is there a way to combine the scores of the individual fields to create
one total score? The problem is that the highest ranking document for the
'title' query will not necessarily match the highest ranking document for
the 'heading' query. I suppose I could simply add them up, but that would
mean iterating through all the results of both queries and adding the scores
to find the highest combined total. I don't suppose Lucene has a method of
ANDing in this manner to create a combined total field score?

- Mike

----- Original Message -----
From: "Peter Carlson" <ca...@bookandhammer.com>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Friday, June 21, 2002 2:35 PM
Subject: Re: Weighted index


> Or you could convert the query to
>
> title:"user search string"^5
> Or
> heading:"user search string"^3
>
> The ^ symbol applies the boost factor to that clause's score.
>
> You could also set the boost after the QueryParser has created the Query;
> whether you do it before or after parsing is up to you. Either way, I
> would probably have Lucene do the scoring.
>
> --Peter
>


Re: Weighted index

Posted by Peter Carlson <ca...@bookandhammer.com>.
Or you could convert the query to

title:"user search string"^5
Or
heading:"user search string"^3

The ^ symbol applies the boost factor to that clause's score.

You could also set the boost after the QueryParser has created the Query;
whether you do it before or after parsing is up to you. Either way, I
would probably have Lucene do the scoring.

--Peter
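
One way to do the rewriting before the QueryParser sees the string is sketched below in plain Java. It only builds the query text and does not call Lucene. The class and method names are hypothetical, and joining the clauses with OR is one reading of the suggestion above, not something the thread confirms.

```java
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper: turns raw user input into a boosted multi-field query string. */
final class BoostedQuerySketch {

    /**
     * fieldBoosts maps a field name to its boost, e.g. title -> 5.
     * Pass a map with a stable iteration order (such as LinkedHashMap)
     * if the order of the clauses matters to you.
     */
    static String build(String userInput, Map<String, Integer> fieldBoosts) {
        // Drop double quotes from the input so the generated phrase stays well-formed.
        String phrase = userInput.replace("\"", " ").trim();
        StringJoiner query = new StringJoiner(" OR ");
        for (Map.Entry<String, Integer> entry : fieldBoosts.entrySet()) {
            // Produces clauses of the form  field:"phrase"^boost
            query.add(entry.getKey() + ":\"" + phrase + "\"^" + entry.getValue());
        }
        return query.toString();
    }
}
```

The resulting string can then be handed to the QueryParser as a single query, so that Lucene does the scoring and ranking in one pass.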

On 6/21/02 12:15 PM, "Mike Tinnes" <ti...@ecliptictech.com> wrote:

> 
> I suppose I could take the user's search string and use it to query on
> multiple fields using the query syntax, then weight the resulting scores for
> the individual fields as I wish. Something like this?
> 
> score_1 = results of searching with 'title:"user search string"'
> score_2 = results of searching with 'heading:"user search string"'
> ...
> score = ((score_1 * 5) + (score_2 * 10)) / 15
> 
> - Mike
> 
> 


Re: Weighted index

Posted by Mike Tinnes <ti...@ecliptictech.com>.
I suppose I could take the user's search string and use it to query on
multiple fields using the query syntax, then weight the resulting scores for
the individual fields as I wish. Something like this?

score_1 = results of searching with 'title:"user search string"'
score_2 = results of searching with 'heading:"user search string"'
 ...
score = ((score_1 * 5) + (score_2 * 10)) / 15

- Mike
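
For comparison, the manual approach in the formula above could be sketched like this in plain Java. Nothing here is Lucene API: the per-query results are represented as hypothetical maps from a document id to that query's score, and a document missing from one result set is assumed to score 0 for that field.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: merges two per-field result sets into one weighted score per document. */
final class WeightedCombineSketch {

    /** Computes (title * titleWeight + heading * headingWeight) / (titleWeight + headingWeight) per document. */
    static Map<String, Double> combine(Map<String, Double> titleScores, double titleWeight,
                                       Map<String, Double> headingScores, double headingWeight) {
        double weightSum = titleWeight + headingWeight;
        Map<String, Double> combined = new HashMap<>();
        // Each pass adds one field's weighted share; merge() sums shares for documents found by both queries.
        titleScores.forEach((doc, score) ->
                combined.merge(doc, score * titleWeight / weightSum, Double::sum));
        headingScores.forEach((doc, score) ->
                combined.merge(doc, score * headingWeight / weightSum, Double::sum));
        return combined;
    }
}
```

This has to visit every hit of both queries before the best combined document is known, which is the cost of doing the weighting outside the search engine.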


----- Original Message -----
From: "Peter Carlson" <ca...@bookandhammer.com>
To: "Lucene Users List" <lu...@jakarta.apache.org>
Sent: Friday, June 21, 2002 1:48 PM
Subject: Re: Weighted index


> Lucene supports boosting terms in a query, which increases their
> relative weight in the search results, but boosting is not supported at
> the index level.
>
> So you could take the user's query and add a boost factor to the fields you
> want to have higher relevancy. There has been some discussion of how to do
> this at the indexing level, but no work has started yet.
>
> --Peter
>
>


Re: Weighted index

Posted by Peter Carlson <ca...@bookandhammer.com>.
Lucene supports boosting terms in a query, which increases their
relative weight in the search results, but boosting is not supported at
the index level.

So you could take the user's query and add a boost factor to the fields you
want to have higher relevancy. There has been some discussion of how to do
this at the indexing level, but no work has started yet.

--Peter


On 6/21/02 11:39 AM, "Mike Tinnes" <ti...@ecliptictech.com> wrote:

> 
> Hey all, is there any method that allows for the weighting of indexed fields?
> I'd like to implement a web search in which keywords occurring in certain
> elements (title, heading, metatags) score higher than others (body, links,
> etc).
> 
> Thanks, Mike
> 
> 

