Posted to solr-user@lucene.apache.org by David Philip <da...@gmail.com> on 2013/03/04 11:38:11 UTC

Re: debugQuery, explain tag - What does the fieldWeight value refer to?,

Hi Chris,

   Thank you for the reply. Okay, understood about *fieldWeight*.

I am actually curious to know how the documents are ordered when the
product of tf, idf and fieldNorm is the same for both documents.

AFAIK, as a first step, documents are ordered by fieldWeight (the product
of tf, idf and fieldNorm) in descending order [correct?]. But if both are
the same, what is the next factor taken into consideration for ordering?

In the case below, why does doc 1 come first and then doc 2 when both
scores are the same?

:  1.0469098 = (MATCH) fieldWeight(title:updated in 7), product of:
:    1.0 = tf(termFreq(title:updated)=1)
:    2.7917595 = idf(docFreq=2, maxDocs=18)
:    0.375 = fieldNorm(field=title, doc=7)
:
:  1.0469098 = (MATCH) fieldWeight(title:updated in 9), product of:
:    1.0 = tf(termFreq(title:updated)=1)
:    2.7917595 = idf(docFreq=2, maxDocs=18)
:    0.375 = fieldNorm(field=title, doc=7)




Thanks  - David









On Sat, Mar 2, 2013 at 12:23 PM, Chris Hostetter
<ho...@fucit.org> wrote:

>
> :  In the explain tag  (debugQuery=true)
> : what does the *fieldWeight* value refer to?,
>
> fieldWeight is just a label being put on the product of the tf, idf,
> and fieldNorm for that term.  (I don't remember why it's referred to as
> the "fieldWeight" ... I think it may just be historical, since these are
> all factors of the "field query", i.e. a "term query", as opposed to a
> "boolean query" across multiple fields.)
>
>
> : *1.0469098* is the product of tf, idf and fieldNorm, for both the records.
> : But field weight is different. I would like to know what is the field
>
> What do you mean by "field weight is different"? In both of the examples
> you posted, the fieldWeight is 1.0469098.
>
> Are you perhaps referring to the numbers "7" and "9" that appear inside the
> fieldWeight(...) label?  Those just refer to the (internal)
> docids (just like in the "fieldNorm(...)").
>
> :  1.0469098 = (MATCH) fieldWeight(title:updated in 7), product of:
> :    1.0 = tf(termFreq(title:updated)=1)
> :    2.7917595 = idf(docFreq=2, maxDocs=18)
> :    0.375 = fieldNorm(field=title, doc=7)
> :
> :  1.0469098 = (MATCH) fieldWeight(title:updated in 9), product of:
> :    1.0 = tf(termFreq(title:updated)=1)
> :    2.7917595 = idf(docFreq=2, maxDocs=18)
> :    0.375 = fieldNorm(field=title, doc=7)
>
>
> -Hoss
>
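
The arithmetic behind the explain output above can be checked by hand. The
sketch below is only an illustration, not Lucene code: it assumes the classic
TF-IDF similarity that was Lucene's default at the time, where
tf = sqrt(termFreq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function
names are made up for this example, and fieldNorm is taken directly from the
explain output rather than recomputed.

```python
import math


def classic_tf(term_freq):
    # Assumed classic TF-IDF term-frequency factor: sqrt(termFreq).
    return math.sqrt(term_freq)


def classic_idf(doc_freq, max_docs):
    # Assumed classic TF-IDF inverse document frequency:
    # 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(term_freq, doc_freq, max_docs, field_norm):
    # fieldWeight is just the product of the three factors.
    return classic_tf(term_freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the explain output for internal docids 7 and 9.
weight_doc7 = field_weight(term_freq=1, doc_freq=2, max_docs=18, field_norm=0.375)
weight_doc9 = field_weight(term_freq=1, doc_freq=2, max_docs=18, field_norm=0.375)

print(classic_idf(2, 18))
print(weight_doc7, weight_doc9)
```

Since all three factors are identical for the two documents, the products are
identical too, which is why both hits report the same fieldWeight. Small
differences in the last digits compared with the explain output are expected,
because Lucene works with single-precision floats.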

Re: debugQuery, explain tag - What does the fieldWeight value refer to?,

Posted by Upayavira <uv...@odoko.co.uk>.
In the order they appear in the index. This used to be the order in
which they were indexed, but merging strategies can now change that
order. You can control the order by sorting on the pseudo-field
'score', then on another field or function. Or you can include other
functions or fields in the score to influence it in the direction you want.

Upayavira
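
The tie-breaking behaviour described above can be pictured with a small
simulation. This is a sketch in plain Python, not Solr code: the hits, the
"id" values and the function names are hypothetical, and index order is
modelled simply as ascending internal docid.

```python
# Hypothetical hits: two share the same score, as in the explain output.
hits = [
    {"docid": 9, "id": "doc-a", "score": 1.0469098},
    {"docid": 7, "id": "doc-b", "score": 1.0469098},
    {"docid": 3, "id": "doc-c", "score": 2.5},
]


def rank_default(hits):
    # Score descending; ties fall back to index order (internal docid).
    return sorted(hits, key=lambda h: (-h["score"], h["docid"]))


def rank_with_tiebreak(hits, field):
    # Score descending; ties are resolved by an explicit secondary field.
    return sorted(hits, key=lambda h: (-h["score"], h[field]))


default_order = [h["docid"] for h in rank_default(hits)]
explicit_order = [h["docid"] for h in rank_with_tiebreak(hits, "id")]

print(default_order)
print(explicit_order)
```

In Solr itself the explicit variant corresponds to a sort parameter such as
sort=score desc, id asc, assuming the schema has a sortable field named "id".
Relying on the default is fragile because internal docids can change when
segments are merged.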

On Wed, Mar 13, 2013, at 04:49 AM, David Philip wrote:
> Hi,
> 
>   Any reply on this: how are the documents ordered when the product of
>  tf, idf, coord and fieldNorm is the same for both documents?
> 
> Thanks - David
> 
> 
> 
> P.S.: This link was very useful for understanding the scoring in detail:
> http://mail-archives.apache.org/mod_mbox/lucene-java-user/201008.mbox/%3CAANLkTi=JPph3X5TLkbJ_rax5qhEx6zRcgUiuNhQBF4xu@mail.gmail.com%3E
> 
> 
> 
> 
> 

Re: debugQuery, explain tag - What does the fieldWeight value refer to?,

Posted by David Philip <da...@gmail.com>.
Hi,

  Any reply on this: how are the documents ordered when the product of
 tf, idf, coord and fieldNorm is the same for both documents?

Thanks - David



P.S.: This link was very useful for understanding the scoring in detail:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/201008.mbox/%3CAANLkTi=JPph3X5TLkbJ_rax5qhEx6zRcgUiuNhQBF4xu@mail.gmail.com%3E




