Posted to solr-user@lucene.apache.org by ayyanar <ay...@aspiresys.com> on 2008/12/10 13:24:45 UTC

Value based boosting - Design Help

We have a requirement for keyword search in one of our projects, and we are
using Solr/Lucene for it.

Our data consists of link_id, title, url and a collection of keywords
associated with each link_id. Right now we index link_id, title, url and
keywords (a multivalued field) in a single index.

Also, in our requirement each keyword value has a weight associated with it.
This weight is calculated from certain factors (for example, if the keyword
exists in the title, it gets a specific weight). The weight should drive the
relevancy of the search results. For example, when a user enters the keyword
“Biology” and clicks search, we search the keywords field in the index. The
document that contains the searched keyword with the higher weight should
come first.

 

Eg:

Document 1:
  LinkID = 100
  Title = Biology
  Keywords = Biology, BioNews, Bio, Bio chemistry

Document 2:
  LinkID = 102
  Title = Nutrition
  Keywords = Biology, Nutrition, Dietics

In the above example, document 1 should come first because we will associate
more weight with the keyword Biology for link id 100 in document 1.

We understand that this weight can be applied as a boost to a field. The
problem is that in Solr/Lucene we cannot associate a different boost with
different values of the same field.

It would be very helpful for us if you could provide your thoughts/inputs on
how to achieve this requirement in Lucene:

Do we have a way to associate a different boost with different values of the
same field?

Can we maintain the list of keywords associated with each link_id in a
separate index, so that we can associate a weight with each keyword value? If
so, how do we relate the main index and the keyword index?




Re: Value based boosting - Design Help

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
On Wed, Dec 10, 2008 at 5:54 PM, ayyanar <ay...@aspiresys.com> wrote:

>
> Also, in our requirement each keyword value has a weight associated with it.
> This weight is calculated from certain factors (for example, if the keyword
> exists in the title, it gets a specific weight). The weight should drive the
> relevancy of the search results. For example, when a user enters the keyword
> "Biology" and clicks search, we search the keywords field in the index. The
> document that contains the searched keyword with the higher weight should
> come first.
>
> It would be very helpful for us if you could provide your thoughts/inputs on
> how to achieve this requirement in Lucene:
>
> Do we have a way to associate a different boost with different values of the
> same field?


So you are searching only on the keywords field and not the title field? You
can search on both the title and the keywords fields and give the title
field a different (higher) boost.

Why do you want to assign weights to keywords? If every keyword that appears
in the title is supposed to be more relevant than a keyword that appears only
in the keywords field, then assigning a boost to the title field is enough.
Is there any other use case?
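
As a minimal sketch of the field-boost approach, the Python snippet below
builds a query in standard Lucene query syntax (field:term^boost) that
searches both fields and boosts matches in title. The boost value 5.0, the
helper names and the localhost URL are assumptions for illustration, not
values from this thread; link_id, title and keywords are the field names
from the original post.

```python
from urllib.parse import urlencode


def build_boosted_query(term, title_boost=5.0):
    """Return a Lucene-syntax query that favours matches in the title field.

    Assumes `term` is a single word with no query-syntax special characters.
    The default boost of 5.0 is an arbitrary illustrative value.
    """
    return f"title:{term}^{title_boost} OR keywords:{term}"


def build_select_url(base_url, term, title_boost=5.0):
    """Build a URL for a Solr select request (base_url is hypothetical)."""
    params = {
        "q": build_boosted_query(term, title_boost),
        # Ask for the score so the effect of the boost can be inspected.
        "fl": "link_id,title,score",
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical local Solr instance; adjust to your own deployment.
    print(build_select_url("http://localhost:8983/solr/select", "biology"))
```

With a query like this, a document whose title matches should normally
outscore one that matches only in keywords, but the exact ordering still
depends on the other scoring factors and on how each field is analyzed.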


>
> Can we maintain the list of keywords associated to each link_id in a
> separate index, so that we can associate weight to each keyword value? If
> so, how do we relate the main index and the keyword index?
>

No, joins like these are not possible in Lucene/Solr. Lucene has payloads,
which can be used to boost a particular term, but that functionality is not
available in Solr. See BoostingTermQuery in Lucene for how to use it.

http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/payloads/BoostingTermQuery.html
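
To make the per-term boosting idea concrete, here is a toy model in plain
Python. It does not use Lucene or the BoostingTermQuery API; it only
imitates the effect of a payload by storing a weight next to each keyword and
ranking matching documents by the weight of the searched keyword. The weight
values and the function name are invented for illustration; the documents
are the two from the original post.

```python
def rank_by_keyword_weight(docs, term):
    """Return link_ids of docs containing `term`, highest keyword weight first.

    Toy model only: each keyword carries its own weight, much as a payload
    would carry a per-term boost in a real index.
    """
    wanted = term.lower()
    hits = []
    for doc in docs:
        # Compare keywords case-insensitively.
        weights = {kw.lower(): w for kw, w in doc["keywords"].items()}
        if wanted in weights:
            hits.append((weights[wanted], doc["link_id"]))
    # Sort by weight, largest first.
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return [link_id for _, link_id in hits]


# Weights below are assumed values, e.g. 3.0 when the keyword is also the title.
docs = [
    {"link_id": 102, "title": "Nutrition",
     "keywords": {"Biology": 1.0, "Nutrition": 3.0, "Dietics": 1.0}},
    {"link_id": 100, "title": "Biology",
     "keywords": {"Biology": 3.0, "BioNews": 1.0, "Bio": 1.0,
                  "Bio chemistry": 1.0}},
]

if __name__ == "__main__":
    print(rank_by_keyword_weight(docs, "Biology"))
```

In this model a search for "Biology" ranks link id 100 ahead of 102, which
is the ordering the original post asks for.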

-- 
Regards,
Shalin Shekhar Mangar.