Posted to solr-user@lucene.apache.org by Brett <br...@chopshop.org> on 2012/01/12 01:57:04 UTC

Search Specific Boosting

I'm implementing a feature where admins have the ability to control the 
order of the results by adding a boost to any specific search.

The search is a faceted interface (no text input). We take a 
hash of the search parameters to form a unique search id, and then 
boost that field for the document.

The field is a wildcard field, so it might look like this:

<field name="search395eff966b26a91d82935c8e1197330c_boost" 
boost="90">true</field>

The problem is that in these search results the documents are being 
grouped, and the individual boost values are not having the granular 
effect I am looking for.

For example, on a result set of 75 documents, I see results with search 
boosts of 60-70 receiving the same score even though they were indexed 
with different boost values.  There is always more than one group.

Does anyone know what might be causing this?  Is there a better way to 
do what I am looking for?

Thanks,

Brett


Field Definition:

     <fieldType name="boost" class="solr.TextField"
                sortMissingLast="true" omitNorms="false" omitTermFreqAndPositions="true">
       <analyzer type="index">
         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
         <filter class="solr.LowerCaseFilterFactory"/>
       </analyzer>
       <analyzer type="query">
         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
         <filter class="solr.LowerCaseFilterFactory"/>
       </analyzer>
     </fieldType>


Re: Search Specific Boosting

Posted by Brett <br...@chopshop.org>.
Hi Erick,

Yeah, I've reviewed the debug output and can't make sense of why they 
are scoring the same.  I have double-checked that they are being indexed 
with different boost values for the search field.  I've also increased 
the factors to try to make them more granular, so instead of boosting 
1,2,3,4,5 I did 100,200,300,400,500... Same result.

Here's an example of the debug output with two documents that have 
different field boost values but receive the same score.

Does anything stick out?  Any other ideas on how to get the results I am 
looking for?


69.694855 = (MATCH) product of:
  104.54228 = (MATCH) sum of:
    0.08869071 = (MATCH) MatchAllDocsQuery, product of:
      0.08869071 = queryNorm
    104.45359 = (MATCH) weight(searchfe2684d248eab25404c3668711d4642e_boost:true in 4016) [DefaultSimilarity], result of:
      104.45359 = score(doc=4016,freq=1.0 = termFreq=1 ), product of:
        0.48125002 = queryWeight, product of:
          5.4261603 = idf(docFreq=81, maxDocs=6856)
          0.08869071 = queryNorm
        217.04642 = fieldWeight in 4016, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1
          5.4261603 = idf(docFreq=81, maxDocs=6856)
          40.0 = fieldNorm(doc=4016)
  0.6666667 = coord(2/3)



69.694855 = (MATCH) product of:
  104.54228 = (MATCH) sum of:
    0.08869071 = (MATCH) MatchAllDocsQuery, product of:
      0.08869071 = queryNorm
    104.45359 = (MATCH) weight(searchfe2684d248eab25404c3668711d4642e_boost:true in 4106) [DefaultSimilarity], result of:
      104.45359 = score(doc=4106,freq=1.0 = termFreq=1 ), product of:
        0.48125002 = queryWeight, product of:
          5.4261603 = idf(docFreq=81, maxDocs=6856)
          0.08869071 = queryNorm
        217.04642 = fieldWeight in 4106, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1
          5.4261603 = idf(docFreq=81, maxDocs=6856)
          40.0 = fieldNorm(doc=4106)
  0.6666667 = coord(2/3)
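
As a rough cross-check of the two explanations above, the arithmetic can be replayed in a few lines of Python. This is only a sketch: the function name `explain_score` is made up for illustration, the constants are copied from the debug output, and the idf formula is assumed to be the usual DefaultSimilarity one, 1 + ln(maxDocs / (docFreq + 1)). The point it makes is that fieldNorm is the only per-document input, so two documents with the same fieldNorm (40.0 here) have to end up with the same score.

```python
import math


def explain_score(field_norm, doc_freq=81, max_docs=6856,
                  query_norm=0.08869071, coord=2.0 / 3.0):
    """Recompute the score from the debug output for a given fieldNorm.

    All defaults are taken from the debug output; the idf formula is an
    assumption (the classic DefaultSimilarity definition).
    """
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    query_weight = idf * query_norm          # 0.48125002 in the output
    field_weight = 1.0 * idf * field_norm    # tf is 1.0 for a single match
    term_score = query_weight * field_weight
    match_all_score = query_norm             # the MatchAllDocsQuery clause
    return (match_all_score + term_score) * coord


if __name__ == "__main__":
    # Both documents report fieldNorm=40.0, so they share one score.
    print(explain_score(40.0))
    # A different stored norm would change the result.
    print(explain_score(48.0))
```

If the index-time boosts had survived intact, the fieldNorm values of the two documents would differ; identical values suggest the difference was lost before scoring.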


On 1/11/2012 9:46 PM, Erick Erickson wrote:
> Boosts are fairly coarse-grained. I suspect your boost factors are just
> being rounded into the same buckets. Attaching &debugQuery=on and
> looking at how the scores were calculated should help you figure out
> if this is the case.
>
> Best
> Erick

Re: Search Specific Boosting

Posted by Erick Erickson <er...@gmail.com>.
Boosts are fairly coarse-grained. I suspect your boost factors are just
being rounded into the same buckets. Attaching &debugQuery=on and
looking at how the scores were calculated should help you figure out
if this is the case.
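
To illustrate the bucketing, here is a small Python sketch. It assumes norms are stored in a single byte using the same bit layout as Lucene's `SmallFloat.floatToByte315` / `byte315ToFloat`, and that the field holds one token so the length norm is 1.0 and the stored norm is just the boost. The Python function names are hypothetical ports written for this example, not part of any Solr or Lucene API.

```python
import struct


def float_to_byte315(value):
    """Encode a float into one byte (sketch of SmallFloat.floatToByte315)."""
    bits = struct.unpack(">i", struct.pack(">f", value))[0]
    small = bits >> 21                # drop the low 21 bits of the float
    zero_point = (63 - 15) << 3
    if small <= zero_point:
        return 0 if bits <= 0 else 1  # zero/negative, or underflow
    if small >= zero_point + 0x100:
        return 255                    # overflow: clamp to the largest byte
    return small - zero_point


def byte315_to_float(encoded):
    """Decode the byte back to a float (sketch of byte315ToFloat)."""
    if encoded == 0:
        return 0.0
    bits = ((encoded & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def stored_norm(boost):
    """What a boost looks like after a round trip through one byte."""
    return byte315_to_float(float_to_byte315(boost))


if __name__ == "__main__":
    for boost in range(60, 71):
        print(boost, "->", stored_norm(float(boost)))
```

Under these assumptions only four distinct values survive per power of two, and the encoding truncates rather than rounds, so boosts of 60 through 70 collapse into just two stored norms. That would match the pattern of a few groups of identical scores.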

Best
Erick
