Posted to solr-user@lucene.apache.org by Robert Young <bu...@gmail.com> on 2007/11/01 10:15:45 UTC

fieldNorm seems to be killing my score

Hi,

I've been trying to debug why one of my test cases doesn't work. I
have an index with two documents in, one talking mostly about apples
and one talking mostly about oranges (for the sake of this test case)
both of which have 'test_site' in their site field. If I run the query
+(apple^4 orange) +(site:"test_site") I would expect the document
which talks about apples to always appear first but it does not.
Looking at the debug output (below) it looks like fieldNorm is killing
the first part of the query. Why is this and how can I stop it?

<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">4</int>
 <lst name="params">
  <str name="rows">10</str>
  <str name="start">0</str>

  <str name="indent">on</str>
  <str name="q">+(apple^4 orange) +(site:"test_site")</str>
  <str name="debugQuery">on</str>
  <str name="version">2.2</str>
 </lst>
</lst>
<result name="response" numFound="2" start="0">
 <doc>

  <str name="guid">test_index-test_site-integration:124</str>
  <str name="index">test_index</str>
  <str name="link">/oranges</str>
  <str name="site">test_site</str>
  <str name="snippet">orange orange orange</str>
  <str name="title">orange</str>

 </doc>
 <doc>
  <str name="guid">test_index-test_site-integration:123</str>
  <str name="index">test_index</str>
  <str name="link">/me</str>
  <str name="site">test_site</str>
  <str name="snippet">apple apple apple</str>

  <str name="title">apple</str>
 </doc>
</result>
<lst name="debug">
 <str name="rawquerystring">+(apple^4 orange) +(site:"test_site")</str>
 <str name="querystring">+(apple^4 orange) +(site:"test_site")</str>
 <str name="parsedquery">+(text:appl^4.0 text:orang) +site:test_site</str>
 <str name="parsedquery_toString">+(text:appl^4.0 text:orang) +site:test_site</str>

 <lst name="explain">
  <str name="id=test_index-test_site-integration:124,internal_docid=13">
0.14332592 = (MATCH) sum of:
  0.0 = (MATCH) product of:
    0.0 = (MATCH) sum of:
      0.0 = (MATCH) weight(text:orang in 13), product of:
        0.24034579 = queryWeight(text:orang), product of:
          1.9162908 = idf(docFreq=5)
          0.1254224 = queryNorm
        0.0 = (MATCH) fieldWeight(text:orang in 13), product of:
          2.236068 = tf(termFreq(text:orang)=5)
          1.9162908 = idf(docFreq=5)
          0.0 = fieldNorm(field=text, doc=13)
    0.5 = coord(1/2)
  0.14332592 = (MATCH) weight(site:test_site in 13), product of:
    0.13407566 = queryWeight(site:test_site), product of:
      1.0689929 = idf(docFreq=13)
      0.1254224 = queryNorm
    1.0689929 = (MATCH) fieldWeight(site:test_site in 13), product of:
      1.0 = tf(termFreq(site:test_site)=1)
      1.0689929 = idf(docFreq=13)
      1.0 = fieldNorm(field=site, doc=13)
</str>
  <str name="id=test_index-test_site-integration:123,internal_docid=14">
0.14332592 = (MATCH) sum of:
  0.0 = (MATCH) product of:
    0.0 = (MATCH) sum of:
      0.0 = (MATCH) weight(text:appl^4.0 in 14), product of:
        0.96138316 = queryWeight(text:appl^4.0), product of:
          4.0 = boost
          1.9162908 = idf(docFreq=5)
          0.1254224 = queryNorm
        0.0 = (MATCH) fieldWeight(text:appl in 14), product of:
          2.236068 = tf(termFreq(text:appl)=5)
          1.9162908 = idf(docFreq=5)
          0.0 = fieldNorm(field=text, doc=14)
    0.5 = coord(1/2)
  0.14332592 = (MATCH) weight(site:test_site in 14), product of:
    0.13407566 = queryWeight(site:test_site), product of:
      1.0689929 = idf(docFreq=13)
      0.1254224 = queryNorm
    1.0689929 = (MATCH) fieldWeight(site:test_site in 14), product of:
      1.0 = tf(termFreq(site:test_site)=1)
      1.0689929 = idf(docFreq=13)
      1.0 = fieldNorm(field=site, doc=14)
</str>
 </lst>
</lst>
</response>

Re: fieldNorm seems to be killing my score

Posted by Robert Young <bu...@gmail.com>.
Oooh! I think I'll just get my coat...

My indexer was defaulting to zero for document boosts rather than 1.
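
For anyone hitting the same thing, here is a minimal sketch of an indexer-side guard. It assumes documents are posted as Solr XML update messages, where `<doc>` accepts an optional `boost` attribute; the thread does not say how the indexer actually sends data. The function name `build_add_message` and the constant `DEFAULT_BOOST` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: 1.0 is the neutral boost; 0.0 would zero the norms at index time.
DEFAULT_BOOST = 1.0


def build_add_message(fields, doc_boost=None):
    """Build a Solr <add> message, falling back to a neutral document boost.

    `fields` maps field names to string values. `doc_boost` is optional;
    a missing boost becomes 1.0 rather than 0.
    """
    boost = DEFAULT_BOOST if doc_boost is None else float(doc_boost)
    if boost <= 0:
        # A zero boost wipes out the score contribution of every normed field.
        raise ValueError("document boost must be positive, got %r" % boost)

    add = ET.Element("add")
    doc = ET.SubElement(add, "doc", {"boost": repr(boost)})
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = value
    return ET.tostring(add, encoding="unicode")


message = build_add_message({
    "guid": "test_index-test_site-integration:123",
    "site": "test_site",
    "title": "apple",
    "snippet": "apple apple apple",
})
print(message)
```

Rejecting a non-positive boost outright is a design choice for this sketch; an indexer could instead log a warning and substitute 1.0.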

On 11/1/07, Yonik Seeley <yo...@apache.org> wrote:
> Hmmm, a norm of 0.0???  That implies that the boost for that field
> (text) was set to zero when it was indexed.
> How did you index the data (straight HTTP, SolrJ, etc)?  What does
> your schema for this field (and copyFields) look like?
>
> -Yonik

Re: fieldNorm seems to be killing my score

Posted by Yonik Seeley <yo...@apache.org>.
Hmmm, a norm of 0.0???  That implies that the boost for that field
(text) was set to zero when it was indexed.
How did you index the data (straight HTTP, SolrJ, etc)?  What does
your schema for this field (and copyFields) look like?

-Yonik
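
To make the arithmetic concrete, here is a rough sketch of how a zero boost propagates through the explain tree in the original post. It assumes the classic Lucene formula, in which the stored norm is roughly document boost times field boost times 1/sqrt(number of terms in the field), and it ignores the lossy one-byte encoding Lucene applies to norms. The idf, queryNorm, tf and coord values come from the debug output; `num_terms=8` and the function names are hypothetical.

```python
import math

# Values copied from the explain output in the original post.
IDF_TEXT = 1.9162908
QUERY_NORM = 0.1254224
TF = 2.236068            # sqrt(termFreq=5)
COORD = 0.5              # coord(1/2): one of two optional clauses matched
SITE_SCORE = 0.13407566 * 1.0689929  # queryWeight * fieldWeight for site:test_site


def field_norm(doc_boost, field_boost, num_terms):
    """Approximate classic norm: boosts times length normalisation."""
    return doc_boost * field_boost / math.sqrt(num_terms)


def term_score(query_boost, norm):
    """queryWeight * fieldWeight for one term in the text field."""
    query_weight = query_boost * IDF_TEXT * QUERY_NORM
    field_weight = TF * IDF_TEXT * norm
    return query_weight * field_weight


def doc_score(query_boost, norm):
    """Coord-scaled text clause plus the site clause."""
    return COORD * term_score(query_boost, norm) + SITE_SCORE


# Document boost of 0, as the indexer was sending: the norm is 0.
broken_norm = field_norm(0.0, 1.0, num_terms=8)
apple_broken = doc_score(4.0, broken_norm)
orange_broken = doc_score(1.0, broken_norm)

# Document boost of 1 (hypothetical field length of 8 terms).
fixed_norm = field_norm(1.0, 1.0, num_terms=8)
apple_fixed = doc_score(4.0, fixed_norm)
orange_fixed = doc_score(1.0, fixed_norm)

print(apple_broken, orange_broken)   # both reduce to the site clause alone
print(apple_fixed > orange_fixed)    # the ^4 boost now matters
```

With a zero norm both documents are left with only the `site:test_site` contribution, which is why they tie at the same score and the `apple^4` boost has no effect on ordering.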
