Posted to solr-user@lucene.apache.org by Chris Hostetter <ho...@fucit.org> on 2009/12/01 00:04:54 UTC
RE: schema-based Index-time field boosting
: I am talking about field boosting rather than document boosting, ie. I
: would like some fields (say eg. title) to be "louder" than others,
: across ALL documents. I believe you are at least partially talking
: about document boosting, which clearly applies on a per-document basis.
Index-time boosts are all the same -- it doesn't matter if they are field
boosts or document boosts -- a document boost is just a field boost for
every field in the document.
: If it helps, consider a schema version of the following, from
: org.apache.solr.common.SolrInputDocument:
:
: /**
: * Adds a field with the given name, value and boost. If a field with
: * the name already exists, then it is updated to the new value and boost.
: *
: * @param name Name of the field to add
: * @param value Value of the field
: * @param boost Boost value for the field
: */
: public void addField(String name, Object value, float boost )
...
: Where a constant boost value is applied consistently to a given field.
: That is what I was mistakenly hoping to achieve in the schema. I still
: think it would be a good idea BTW. Regards,
But now we're right back to what I was trying to explain before: index-time
boost values like these are only used as a multiplier in the fieldNorm.
When included as part of the document data, you can specify a fieldBoost
for fieldX of docA that's greater than the boost for fieldX of docB, and
that will make docA score higher than docB when fieldX contains the same
number of matches and is the same length. But if you apply a constant
boost of B to fieldX for every doc (which is what a feature to hardcode
boosts in schema.xml might give you), then the net effect would be zero
when scoring docA against docB, because the fieldNorms for fieldX in both
docs would include the exact same multiplier.
-Hoss
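Hoss's argument can be checked with a little arithmetic. The sketch below is a simplified model, not Lucene's actual code: it assumes the classic TF-IDF form in which the fieldNorm is the index-time boost multiplied by 1/sqrt(number of terms in the field), and it ignores the lossy single-byte encoding Lucene applies to stored norms. The function names and sample numbers are made up for illustration.

```python
import math

def field_norm(num_terms, index_boost=1.0):
    """Simplified fieldNorm: index-time boost times a length normalization."""
    return index_boost / math.sqrt(num_terms)

def score(freq, idf, num_terms, index_boost=1.0):
    """Simplified single-term score for one field of one document."""
    return math.sqrt(freq) * idf * idf * field_norm(num_terms, index_boost)

# Hypothetical fieldX statistics: same match count, same length.
FREQ, IDF, LENGTH = 2, 1.5, 16

# Per-document boosts: docA's fieldX is boosted more than docB's.
per_doc_a = score(FREQ, IDF, LENGTH, index_boost=2.0)
per_doc_b = score(FREQ, IDF, LENGTH, index_boost=1.0)

# Constant boost applied to fieldX of every document.
const_a = score(FREQ, IDF, LENGTH, index_boost=3.0)
const_b = score(FREQ, IDF, LENGTH, index_boost=3.0)

# The same constant boost on documents of different lengths:
# the ratio is what it would have been with no boost at all.
ratio_boosted = score(FREQ, IDF, 10, 3.0) / score(FREQ, IDF, 40, 3.0)
ratio_plain = score(FREQ, IDF, 10) / score(FREQ, IDF, 40)

print(per_doc_a > per_doc_b)                     # per-document boost separates A and B
print(const_a == const_b)                        # constant boost leaves them tied
print(math.isclose(ratio_boosted, ratio_plain))  # relative scores are unchanged
```

Under these assumptions the constant multiplier appears in every document's fieldNorm for fieldX, so it cancels out of any comparison between two documents on that field.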
Re: schema-based Index-time field boosting
Posted by Walter Underwood <wu...@wunderwood.org>.
An index-time boost means "this document is a better answer, regardless of the query."
To weight title matches higher than summary matches, use field boosts at query time. They work great. There is no need to modify Solr to get that behavior.
wunder
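As a rough illustration of the query-time approach, the sketch below scores one term match in a title field and one in a summary field with everything else held equal. It assumes a simplified classic TF-IDF contribution (sqrt(freq) * idf * query boost / sqrt(field length)); the boost values, field statistics and function name are hypothetical and chosen only to show the effect.

```python
import math

def match_score(freq, idf, num_terms, query_boost=1.0):
    """Simplified contribution of one term match in one field."""
    return math.sqrt(freq) * idf * query_boost / math.sqrt(num_terms)

# Assumed query-time boosts: title matches count double.
QUERY_BOOSTS = {"title": 2.0, "summary": 1.0}

# "All other things being equal": same frequency, idf and field length.
FREQ, IDF, LENGTH = 1, 1.5, 8

title_match = match_score(FREQ, IDF, LENGTH, QUERY_BOOSTS["title"])
summary_match = match_score(FREQ, IDF, LENGTH, QUERY_BOOSTS["summary"])

print(title_match > summary_match)
print(title_match / summary_match)
```

Because the boost is attached to the query rather than stored in the index, it can be changed per request without reindexing.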
On Dec 3, 2009, at 12:37 AM, Ian Smith wrote:
> Aaaaaaaargh! OK, I would like a document with (eg.) a title containing
> a term to score higher than one on (eg.) a summary containing the same
> term, all other things being "equal". You seem to be arguing against
> field boosting in general, and I don't understand why!
>
> May as well let this drop since we don't seem to be talking about the
> same thing . . . but thanks anyway,
>
> Ian.
Re: schema-based Index-time field boosting
Posted by Erik Hatcher <er...@gmail.com>.
Ian - now you're talking about *term* boosting, which is a dynamic
query-time factor, not something specified at index time.
Here's what I suggest as a starting point for this sort of thing, in
Solr request format:
http://localhost:8983/solr/select?defType=dismax&q=apple&qf=name^2+manu
Here the term "apple" is queried against both the name and
manu(facturer) fields, and matches in the name field are boosted by a
factor of 2. This uses the dismax query parser.
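For anyone building such a request programmatically, here is a minimal sketch using only the Python standard library. The helper name dismax_url and the dictionary of field boosts are assumptions for illustration; only the defType, q and qf parameters and the localhost URL come from the example above.

```python
from urllib.parse import urlencode

def dismax_url(base, query, field_boosts):
    """Build a dismax select URL; field_boosts maps field name -> boost."""
    # A boost of 1 is the default, so such a field is listed without "^boost".
    qf = " ".join(
        name if boost == 1 else f"{name}^{boost:g}"
        for name, boost in field_boosts.items()
    )
    params = {"defType": "dismax", "q": query, "qf": qf}
    return f"{base}?{urlencode(params)}"

url = dismax_url(
    "http://localhost:8983/solr/select",
    "apple",
    {"name": 2, "manu": 1},
)
print(url)
```

urlencode percent-encodes "^" as %5E and turns the space between fields into "+", so the qf value decodes on the server to "name^2 manu".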
Index-time boosts are becoming less and less favorable - there is rarely
any need to use them, given the more flexible, dynamic control you can
have over scoring at query time.
And I'm sure Hoss isn't arguing against field boosting, given he's one
of the gurus behind the magic of dismax. He's simply saying if you
apply a constant boost to all documents, you've effectively done
nothing.
Erik
On Dec 3, 2009, at 3:37 AM, Ian Smith wrote:
> Aaaaaaaargh! OK, I would like a document with (eg.) a title containing
> a term to score higher than one on (eg.) a summary containing the same
> term, all other things being "equal". You seem to be arguing against
> field boosting in general, and I don't understand why!
>
> May as well let this drop since we don't seem to be talking about the
> same thing . . . but thanks anyway,
>
> Ian.
RE: schema-based Index-time field boosting
Posted by Ian Smith <Ia...@gossinteractive.com>.
Aaaaaaaargh! OK, I would like a document with (eg.) a title containing
a term to score higher than one on (eg.) a summary containing the same
term, all other things being "equal". You seem to be arguing against
field boosting in general, and I don't understand why!
May as well let this drop since we don't seem to be talking about the
same thing . . . but thanks anyway,
Ian.