Posted to java-user@lucene.apache.org by Clara Vania <ch...@yahoo.com> on 2010/04/27 12:02:35 UTC

Range Score in Lucene

Hi all,

I am new to Lucene and would like to ask about the range of the scores that Lucene produces, because I got a score greater than 1.
I'm using lucene-3.0.1, with MoreLikeThis to compute document similarity and the ScoreDoc class to get the hits of my search.

 Thanks,


-Clara Vania-


      

Re: Range Score in Lucene

Posted by Clara Vania <ch...@yahoo.com>.
Thanks again for your help!! :)

 
Regards,

-Clara Vania-





________________________________
From: Uwe Schindler <uw...@thetaphi.de>
To: java-user@lucene.apache.org
Sent: Wed, April 28, 2010 12:38:04 AM
Subject: RE: Range Score in Lucene

This has to do with combining multiple terms in a Boolean query. If you have only one term and no boost factors involved, you will get 1. To repeat: the score numbers are on an arbitrary scale and are only comparable within one query.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


      

RE: Range Score in Lucene

Posted by Uwe Schindler <uw...@thetaphi.de>.
This has to do with combining multiple terms in a Boolean query. If you have only one term and no boost factors involved, you will get 1. To repeat: the score numbers are on an arbitrary scale and are only comparable within one query.
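
To illustrate the point about combining multiple terms: a Boolean query built from several term clauses adds the clause scores together, so the total can exceed 1 even when every individual clause scores below 1. The following is only a minimal sketch in plain Java, not Lucene's scoring code; the class name, method name, and the per-term values are made up for illustration, and real Lucene scoring applies further factors that are left out here.

```java
// Illustrative sketch only: this is NOT Lucene's scoring implementation.
// SummedClauseScores and combine are hypothetical names.
class SummedClauseScores {

    // Add up the scores of the individual clauses, which is the basic way
    // a Boolean query over several terms arrives at its total score.
    static double combine(double... clauseScores) {
        double total = 0.0;
        for (double score : clauseScores) {
            total += score;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical per-term scores, each below 1.
        System.out.println("one term:    " + combine(0.9));
        System.out.println("three terms: " + combine(0.9, 0.6, 0.4));
    }
}
```

With more matching terms (as in a query generated by MoreLikeThis), more clause scores are added, so the total grows with the number of matching clauses.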

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Clara Vania [mailto:chubz_fun@yahoo.com]
> Sent: Tuesday, April 27, 2010 7:32 PM
> To: java-user@lucene.apache.org
> Subject: Re: Range Score in Lucene
> 
> Thanks a lot for the quick reply.
> 
> I want to find documents similar to one document (let's call it
> document A) in my index. To do this I use the MoreLikeThis class to
> help create a query from document A. Document A is itself in my
> index, so I expected it to come back at the first rank. When I run
> the search, I print the top 10 documents along with their scores. As
> expected, document A is at the first rank, but its score is above 1
> (score=1.841656).
> 
> I thought Lucene scores have the range [0..1] because Lucene uses
> cosine similarity, where a score of 1 means that two documents are
> perfectly similar and 0 means that there is no similarity between
> them. So why do I get a document score above 1?
> 
> 
> 
> Thanks,
> 
> -Clara
> 
> 
> 




Re: Range Score in Lucene

Posted by Clara Vania <ch...@yahoo.com>.
Thanks a lot for the quick reply.

I want to find documents similar to one document (let's call it document A) in my index. To do this I use the MoreLikeThis class to help create a query from document A. Document A is itself in my index, so I expected it to come back at the first rank. When I run the search, I print the top 10 documents along with their scores. As expected, document A is at the first rank, but its score is above 1 (score=1.841656).

I thought Lucene scores have the range [0..1] because Lucene uses cosine similarity, where a score of 1 means that two documents are perfectly similar and 0 means that there is no similarity between them. So why do I get a document score above 1?
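
For reference, the [0..1] expectation described above does hold for textbook cosine similarity over non-negative term-weight vectors. The following is a minimal sketch in plain Java of that textbook formula only; the class name, method name, and example vectors are hypothetical, and nothing here is taken from Lucene's code or claims to reproduce its score.

```java
// Textbook cosine similarity; CosineSketch and cosine are hypothetical names.
class CosineSketch {

    // Returns dot(a, b) / (|a| * |b|), or 0 if either vector is all zeros.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] docA = {2.0, 1.0, 0.0};   // made-up term weights
        double[] docB = {0.0, 0.0, 3.0};   // shares no terms with docA
        System.out.println("A vs A: " + cosine(docA, docA));
        System.out.println("A vs B: " + cosine(docA, docB));
    }
}
```

A document compared with itself comes out at (approximately) 1 and two documents with no terms in common come out at 0, which is the behaviour assumed in the question.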


 
Thanks,

-Clara


      

RE: Range Score in Lucene

Posted by Uwe Schindler <uw...@thetaphi.de>.
The score is an arbitrary number > 0. It is not normalized to anything and should only be used to, e.g., sort the results. You cannot even compare scores between two searches; they should only be used to compare hits *within* one result set (e.g. for sorting, as is done in top docs).
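
If a value between 0 and 1 is needed for display, one possible approach (an assumption, not something stated in this thread) is to divide each score by the top score of the same result set. The sketch below is plain Java with hypothetical names; only the first score, 1.841656, comes from this thread, and the others are made up. The resulting values are still only meaningful relative to the other hits of that one search.

```java
// Hypothetical helper: rescales the scores of ONE result set relative to its top hit.
class RelativeScores {

    // Divides every score by the largest score in the array.
    // Returns all zeros if there are no positive scores.
    static float[] relativeToTop(float[] scores) {
        float max = 0f;
        for (float s : scores) {
            max = Math.max(max, s);
        }
        float[] relative = new float[scores.length];
        if (max <= 0f) {
            return relative;
        }
        for (int i = 0; i < scores.length; i++) {
            relative[i] = scores[i] / max;
        }
        return relative;
    }

    public static void main(String[] args) {
        // First value is the score reported in this thread; the rest are invented.
        float[] scores = {1.841656f, 0.92f, 0.46f};
        for (float r : relativeToTop(scores)) {
            System.out.println(r);
        }
    }
}
```

The ranking is unchanged by this rescaling, and a rescaled value from one search still cannot be compared with a rescaled value from another.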

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de






Re: Range Score in Lucene

Posted by Anshum <an...@gmail.com>.
Hi Clara,
Any particular reason why you'd need the score?  Perhaps this would be of
help
http://lucene.apache.org/java/2_9_1/scoring.html
http://lucene.apache.org/java/2_3_2/scoring.pdf

Hope this explains whatever you were looking for.

--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw............

