Posted to solr-user@lucene.apache.org by Chris Hostetter <ho...@fucit.org> on 2009/12/01 00:04:54 UTC

RE: schema-based Index-time field boosting

: I am talking about field boosting rather than document boosting, ie. I
: would like some fields (say eg. title) to be "louder" than others,
: across ALL documents.  I believe you are at least partially talking
: about document boosting, which clearly applies on a per-document basis.

Index-time boosts are all the same -- it doesn't matter if they are field 
boosts or document boosts -- a document boost is just a field boost for 
every field in the document.
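
As a rough picture of that statement, here is a minimal sketch. It assumes the per-field index-time multiplier is simply the document boost times the field boost, and it leaves length normalization out. The function name and the values are hypothetical, not part of Solr.

```python
def effective_field_boosts(doc_boost, field_boosts):
    """Fold a document boost into every field's index-time boost.

    Assumption for illustration: the multiplier stored for each field is
    doc_boost * field_boost (length normalization is ignored here).
    """
    return {name: doc_boost * boost for name, boost in field_boosts.items()}


# Hypothetical document with two fields and a document boost of 2.0.
fields = {"title": 1.0, "summary": 1.5}
print(effective_field_boosts(2.0, fields))
```

In this model a document boost never reaches the index as a separate value; it only changes each field's multiplier.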

: If it helps, consider a schema version of the following, from
: org.apache.solr.common.SolrInputDocument:
: 
:   /**
:    * Adds a field with the given name, value and boost.  If a field with
:    * the name already exists, then it is updated to the new value and boost.
:    *
:    * @param name Name of the field to add
:    * @param value Value of the field
:    * @param boost Boost value for the field
:    */
:   public void addField(String name, Object value, float boost ) 

	...

: Where a constant boost value is applied consistently to a given field.
: That is what I was mistakenly hoping to achieve in the schema.  I still
: think it would be a good idea BTW.  Regards,

But now we're right back to what I was trying to explain before: index 
time boost values like these are only used as a multiplier in the 
fieldNorm.  When included as part of the document data, you can specify a 
fieldBoost for fieldX of docA that's greater than the boost for fieldX 
of docB, and that will make docA score higher than docB when fieldX 
contains the same number of matches and is the same length.  But if you 
apply a constant boost of B to fieldX for every doc (which is what a 
feature to hardcode boosts in schema.xml might give you) then the net 
effect would be zero when scoring docA against docB, because the fieldNorms 
for fieldX in both docs would include the exact same multiplier.
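
The cancellation can be sketched with a simplified scoring model. It assumes a score of sqrt(tf) * idf * fieldNorm, with fieldNorm taken as boost / sqrt(field length). Lucene's real norm is also squeezed into a single byte, which this sketch ignores. The function names and document data are made up for illustration.

```python
import math


def field_norm(field_boost, num_terms):
    # Simplified norm: index-time boost times a 1/sqrt(length) factor.
    return field_boost / math.sqrt(num_terms)


def score(tf, idf, field_boost, num_terms):
    # Simplified single-field score for a one-term query.
    return math.sqrt(tf) * idf * field_norm(field_boost, num_terms)


def rank(docs, constant_boost):
    """Order doc names by score when every doc gets the same boost on fieldX."""
    scored = {
        name: score(tf, 1.0, constant_boost, length)
        for name, (tf, length) in docs.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Hypothetical (term frequency, field length) for fieldX in each doc.
docs = {"docA": (3, 10), "docB": (1, 10), "docC": (2, 40)}

print(rank(docs, 1.0))  # no boost
print(rank(docs, 5.0))  # constant boost of 5 on fieldX for every doc
```

Both calls print the same ordering: the constant multiplies every document's score for fieldX by the same amount, so the ranking does not move.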



-Hoss

Re: schema-based Index-time field boosting

Posted by Walter Underwood <wu...@wunderwood.org>.

An index-time boost means "this document is a better answer, regardless of the query."

To weight title matches higher than summary matches, use field boosts at query time. They work great. There is no need to modify Solr to get that behavior. 
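
A toy illustration of that idea, not Solr's actual scoring formula: assume each field contributes a match score, and the query supplies a per-field weight. All names and numbers below are hypothetical.

```python
def weighted_score(field_matches, query_boosts):
    """Sum per-field match scores, each scaled by its query-time boost.

    Fields without an explicit boost default to a weight of 1.0.
    """
    return sum(
        query_boosts.get(field, 1.0) * match
        for field, match in field_matches.items()
    )


# Query-time weights: title matches count double.
query_boosts = {"title": 2.0, "summary": 1.0}

# Two otherwise equal documents: one matches in title, one in summary.
title_hit = {"title": 1.0}
summary_hit = {"summary": 1.0}

print(weighted_score(title_hit, query_boosts) > weighted_score(summary_hit, query_boosts))
```

Because the weights differ per field rather than per document, the title match wins with nothing special stored in the index.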

wunder

On Dec 3, 2009, at 12:37 AM, Ian Smith wrote:

> Aaaaaaaargh!  OK, I would like a document with (eg.) a title containing
> a term to score higher than one on (eg.) a summary containing the same
> term, all other things being "equal".  You seem to be arguing against
> field boosting in general, and I don't understand why!
> 
> May as well let this drop since we don't seem to be talking about the
> same thing . . . but thanks anyway,
> 
> Ian.

Re: schema-based Index-time field boosting

Posted by Erik Hatcher <er...@gmail.com>.

Ian - now you're talking *term* boosting, which is a dynamic query-time  
factor, not something specified at index time.

Here's what I suggest as a starting point for this sort of thing, in  
Solr request format:

    http://localhost:8983/solr/select?defType=dismax&q=apple&qf=name^2+manu

Here the term "apple" is queried against both the name and  
manu(facturer) fields, and matches in the name field get boosted by a  
factor of 2.  This uses the dismax query parser.
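
For anyone building that request from a script, here is a small sketch using only Python's standard library. The helper name dismax_url is hypothetical. The base URL and parameters are the ones from the request above, and urlencode percent-encodes the ^ and turns the space in qf into +.

```python
from urllib.parse import urlencode


def dismax_url(base, query, field_boosts):
    """Build a dismax select URL; a boost of 1 is written as a bare field name."""
    qf = " ".join(
        name if boost == 1 else f"{name}^{boost:g}"
        for name, boost in field_boosts.items()
    )
    params = {"defType": "dismax", "q": query, "qf": qf}
    return f"{base}?{urlencode(params)}"


url = dismax_url(
    "http://localhost:8983/solr/select",
    "apple",
    {"name": 2, "manu": 1},
)
print(url)
```

Changing the weights in the dictionary is all it takes to try a different balance between fields, with no reindexing.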

Index-time boosts are becoming less and less favored; there is rarely  
any need for them given the more flexible dynamic control you can  
have over scoring at query time.

And I'm sure Hoss isn't arguing against field boosting, given he's one  
of the gurus behind the magic of dismax.  He's simply saying if you  
apply a constant boost to all documents, you've effectively done  
nothing.

	Erik

On Dec 3, 2009, at 3:37 AM, Ian Smith wrote:

> Aaaaaaaargh!  OK, I would like a document with (eg.) a title containing
> a term to score higher than one on (eg.) a summary containing the same
> term, all other things being "equal".  You seem to be arguing against
> field boosting in general, and I don't understand why!
>
> May as well let this drop since we don't seem to be talking about the
> same thing . . . but thanks anyway,
>
> Ian.

RE: schema-based Index-time field boosting

Posted by Ian Smith <Ia...@gossinteractive.com>.

Aaaaaaaargh!  OK, I would like a document with (eg.) a title containing
a term to score higher than one on (eg.) a summary containing the same
term, all other things being "equal".  You seem to be arguing against
field boosting in general, and I don't understand why!

May as well let this drop since we don't seem to be talking about the
same thing . . . but thanks anyway,

Ian.