
Posted to java-user@lucene.apache.org by jason <gi...@gmail.com> on 2006/02/06 10:19:15 UTC

understand the queryNorm and the fieldNorm.

Hi,

I am having trouble understanding queryNorm and fieldNorm.

Below is an example. I tried to follow what the Javadoc says,
"Computes the normalization value for a query given the sum of the squared
weights of each of the query terms", but my result is different.

ID:0 C:/PDF2Text/SearchEngine/File/SIG/sigkdd/p374-zhang.pdf|initial rank: 0
0.31900567 = sum of:
  0.03968133 = weight(contents:associ in 920), product of:
    0.60161763 = queryWeight(contents:associ), product of:
      1.326625 = idf(docFreq=830)
      0.45349488 = queryNorm
    0.065957725 = fieldWeight(contents:associ in 920), product of:
      4.2426405 = tf(termFreq(contents:associ)=18)
      1.326625 = idf(docFreq=830)
      0.01171875 = fieldNorm(field=contents, doc=920)
  0.27932435 = weight(contents:rule in 920), product of:
    0.7987842 = queryWeight(contents:rule), product of:
      1.7613963 = idf(docFreq=537)
      0.45349488 = queryNorm
    0.34968686 = fieldWeight(contents:rule in 920), product of:
      16.941074 = tf(termFreq(contents:rule)=287)
      1.7613963 = idf(docFreq=537)
      0.01171875 = fieldNorm(field=contents, doc=920)

regards
jiang xing

Re: understand the queryNorm and the fieldNorm.

Posted by jason <gi...@gmail.com>.

Hi, thanks.

I think I forgot the ^0.5.

cheers
Jason
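
For anyone who hits the same mismatch, here is a minimal sketch of the
difference the ^0.5 makes. It uses the two idf values from the explain output
in this thread and assumes every term has boost 1.0, so each term's weight is
just its idf. The class and method names (QueryNormCheck, sumOfSquaredWeights,
queryNorm) are made up for illustration and are not Lucene's API.

```java
// Sketch only: compares 1/sum with 1/sqrt(sum) for the idf values quoted
// in this thread. Names are hypothetical, not taken from Lucene.
class QueryNormCheck {

    /** Sum of squared term weights; each weight is the idf (boost 1.0 assumed). */
    static double sumOfSquaredWeights(double[] idfs) {
        double sum = 0.0;
        for (double idf : idfs) {
            sum += idf * idf;
        }
        return sum;
    }

    /** 1 / sqrt(sumOfSquaredWeights): the value that matches the explain output. */
    static double queryNorm(double sumOfSquaredWeights) {
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }

    public static void main(String[] args) {
        double[] idfs = {1.326625, 1.7613963}; // contents:associ, contents:rule
        double sum = sumOfSquaredWeights(idfs);
        System.out.println("sum of squared weights = " + sum);
        System.out.println("1 / sum (sqrt omitted) = " + (1.0 / sum));
        System.out.println("1 / sqrt(sum)          = " + queryNorm(sum));
    }
}
```

Without the square root the result is roughly 0.206; with it the result is
roughly 0.4535, which agrees with the queryNorm line in the explain output.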


On 2/6/06, Yonik Seeley <ys...@gmail.com> wrote:
>
> Hi Jason,
> I get the same thing for the queryNorm when I calculate it by hand:
> 1/((1.7613963**2 + 1.326625**2)**.5)  = 0.45349488111693986
>
> -Yonik

Re: Search time for an index with a very large .fdt file

Posted by Yonik Seeley <ys...@gmail.com>.

20 seconds does seem like a long time to retrieve the stored fields of
the 3000 documents.  However, you should also step back and determine
if you really need to do that, or if there is another way to narrow
the number of documents that need to be read from disk.

-Yonik
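
One way to narrow the number of documents read from disk is to load stored
fields only for the slice of results that is actually displayed. The sketch
below is deliberately library-free: PagedFetch, fetchPage and the loader
callback are hypothetical names, and the loader stands in for whatever call
reads a document's stored fields in the real application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Sketch only: fetch stored content for one page of hits instead of all of them.
// Nothing here is Lucene API; the loader is a placeholder for the expensive read.
class PagedFetch {

    /**
     * Calls the loader only for ids in [offset, offset + pageSize),
     * clipped to the bounds of matchingIds.
     */
    static <T> List<T> fetchPage(int[] matchingIds, int offset, int pageSize,
                                 IntFunction<T> loader) {
        List<T> page = new ArrayList<>();
        int start = Math.max(0, offset);
        int end = Math.min(matchingIds.length, start + Math.max(0, pageSize));
        for (int i = start; i < end; i++) {
            page.add(loader.apply(matchingIds[i])); // the only place the disk is touched
        }
        return page;
    }

    public static void main(String[] args) {
        int[] ids = new int[3000]; // pretend these are the 3000 matching document ids
        for (int i = 0; i < ids.length; i++) {
            ids[i] = i;
        }
        int[] reads = {0};
        List<String> page = fetchPage(ids, 0, 25, id -> {
            reads[0]++; // count simulated stored-field reads
            return "stored value of doc " + id;
        });
        System.out.println(page.size() + " documents loaded, " + reads[0] + " reads");
    }
}
```

With a page size of 25, only 25 of the 3000 matches trigger a stored-field
read. Whether paging is acceptable depends on what the application does with
the results.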



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Search time for an index with a very large .fdt file

Posted by Antonio Bruno <nt...@yahoo.it>.

Hi,
I have an index with 2.5 million documents.
Each document is structured as follows:
- 15 indexed fields
- 1 field that is stored but not indexed, whose value is a 500-byte string.
A search returns about 3000 documents on average. The 3000 document IDs come back very quickly, but retrieving the 3000 documents themselves takes at least 20 seconds!
When I tried a 5-byte string instead of the 500-byte one, the 3000 documents came back in only 1 second!
The problem seems to be related to the size of the .fdt file.
How can I reduce the retrieval time in the first case?




             
   
  Ing. Antonio Bruno
  Software Analyst
  http://xoomer.virgilio.it/lnb
  cell: (+39) 3402347684
  T&S S.r.l - Technologies And Solutions
  email T&S: bruno@tessrl.com
  email Yahoo: ntnbrn80@yahoo.it

Re: understand the queryNorm and the fieldNorm.

Posted by Yonik Seeley <ys...@gmail.com>.

Hi Jason,
I get the same thing for the queryNorm when I calculate it by hand:
1/((1.7613963**2 + 1.326625**2)**.5)  = 0.45349488111693986

-Yonik
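
The rest of the explain output can be checked by hand in the same way. The
sketch below assumes the default similarity's tf, which is the square root of
the term frequency (sqrt(18) = 4.2426405 and sqrt(287) = 16.941074 agree with
the tf lines above), and takes the idf and fieldNorm values straight from the
output rather than recomputing them. ExplainCheck and its methods are
hypothetical names, not Lucene classes.

```java
// Sketch only: rebuilds the explain() tree from the numbers quoted in this thread.
// idf and fieldNorm are copied from the output; nothing is read from an index.
class ExplainCheck {

    /** tf as shown in the output: square root of the raw term frequency. */
    static double tf(int termFreq) {
        return Math.sqrt(termFreq);
    }

    /** weight = queryWeight * fieldWeight, as in the "product of" lines. */
    static double termScore(int termFreq, double idf, double queryNorm, double fieldNorm) {
        double queryWeight = idf * queryNorm;                // idf * queryNorm
        double fieldWeight = tf(termFreq) * idf * fieldNorm; // tf * idf * fieldNorm
        return queryWeight * fieldWeight;
    }

    public static void main(String[] args) {
        double idfAssoci = 1.326625;
        double idfRule = 1.7613963;
        double fieldNorm = 0.01171875;

        // queryNorm = 1 / sqrt(sum of squared idf values), boost 1.0 assumed
        double queryNorm = 1.0 / Math.sqrt(idfAssoci * idfAssoci + idfRule * idfRule);

        double associ = termScore(18, idfAssoci, queryNorm, fieldNorm);
        double rule = termScore(287, idfRule, queryNorm, fieldNorm);

        System.out.println("weight(contents:associ) = " + associ);
        System.out.println("weight(contents:rule)   = " + rule);
        System.out.println("sum                     = " + (associ + rule));
    }
}
```

The results should agree with the explain output to about six decimal places;
small differences are expected because this sketch uses double arithmetic
while the output shows float values.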

