Posted to java-user@lucene.apache.org by Michael Hartmann <mi...@web.de> on 2004/12/24 17:24:20 UTC

Re: Poor Lucene Ranking for Short Text

Hi Kevin,

It seems you have some knowledge of the lengthNorm value in Lucene.
Compared with the formula in "Modern Information Retrieval", does it correspond
to the denominator sqrt(sum((tf_d*idf_t)²)) * sqrt(sum((tf_q*idf_t)²))?

A quick note would be fine.

Besides that, could you invite me to Rojo? Their beta phase seems to be taking
quite long.

Thanks
Michael
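
For reference, the denominator asked about above is the normalization term of the vector-space cosine measure. A sketch in the same notation follows; the symbols sim, d, q and the index t over terms are assumptions added here for illustration, not taken from the message.

```latex
\[
\mathrm{sim}(d, q) =
  \frac{\sum_{t} \left(tf_{t,d}\, idf_t\right)\left(tf_{t,q}\, idf_t\right)}
       {\sqrt{\sum_{t} \left(tf_{t,d}\, idf_t\right)^{2}}\;
        \sqrt{\sum_{t} \left(tf_{t,q}\, idf_t\right)^{2}}}
\]
```

The first square root depends only on the document and the second only on the query. As far as I know, the default lengthNorm in Lucene releases of that era was 1/sqrt(numTokens); it plays a role comparable to the document factor, but it is not the same quantity.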

| -----Original Message-----
| From: 
| lucene-user-return-10902-michael.hartmann=web.de@jakarta.apache.org 
| [mailto:lucene-user-return-10902-michael.hartmann=web.de@jakarta.apache.org]
| On behalf of Kevin A. Burton
| Sent: Wednesday, 27 October 2004 22:48
| To: Lucene Users List
| Subject: Re: Poor Lucene Ranking for Short Text
| 
| Daniel Naber wrote:
| 
| > (Kevin complains about shorter documents ranked higher)
| >
| >This is something that can easily be fixed. Just use a Similarity 
| >implementation that extends DefaultSimilarity and that overrides
| >lengthNorm: just return 1.0f there. You need to use that Similarity
| >for indexing and searching, i.e. it requires reindexing.
| >  
| >
| What happens when I do this with an existing index? I don't 
| want to have to rewrite this index as it will take FOREVER.
| 
| If the current behavior is all that happens this is fine... 
| this way I can just get this behavior for new documents that 
| are added.
| 
| Also... why isn't this the default?
| 
| Kevin
| 
| -- 
| 
| Use Rojo (RSS/Atom aggregator).  Visit http://rojo.com. Ask 
| me for an invite!  Also see irc.freenode.net #rojo if you 
| want to chat.
| 
| Rojo is Hiring! - http://www.rojonetworks.com/JobsAtRojo.html
| 
| If you're interested in RSS, Weblogs, Social Networking, 
| etc... then you should work for Rojo!  If you recommend 
| someone and we hire them you'll get a free iPod!
|     
| Kevin A. Burton, Location - San Francisco, CA
|        AIM/YIM - sfburtonator,  Web - http://peerfear.org/ 
| GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
| 
| 
| ---------------------------------------------------------------------
| To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
| For additional commands, e-mail: lucene-user-help@jakarta.apache.org
| 
| 
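
The suggestion quoted above (override lengthNorm and return 1.0f) can be sketched as follows. To keep the sketch runnable without Lucene on the classpath, `DefaultLikeSimilarity` is a hypothetical stand-in for Lucene's DefaultSimilarity, and the names `LengthNormSketch` and `FlatLengthSimilarity` are made up for illustration. The 1/sqrt(numTokens) formula is, to my knowledge, what the default implementation used in releases of that era; treat it as an assumption.

```java
// Sketch only: DefaultLikeSimilarity is a hypothetical stand-in, NOT Lucene's class.
class LengthNormSketch {

    /** Stand-in mirroring the assumed default: shorter fields get a larger norm. */
    static class DefaultLikeSimilarity {
        public float lengthNorm(String fieldName, int numTokens) {
            return (float) (1.0 / Math.sqrt(numTokens));
        }
    }

    /** The override suggested in the thread: field length no longer affects the norm. */
    static class FlatLengthSimilarity extends DefaultLikeSimilarity {
        @Override
        public float lengthNorm(String fieldName, int numTokens) {
            return 1.0f;
        }
    }

    public static void main(String[] args) {
        DefaultLikeSimilarity byLength = new DefaultLikeSimilarity();
        DefaultLikeSimilarity flat = new FlatLengthSimilarity();
        int[] fieldLengths = {4, 100, 10000};
        // Print both norms side by side for short, medium and long fields.
        for (int n : fieldLengths) {
            System.out.println(n + " tokens: default-like norm = "
                    + byLength.lengthNorm("body", n)
                    + ", flat norm = " + flat.lengthNorm("body", n));
        }
    }
}
```

With real Lucene of that period, the subclass would extend org.apache.lucene.search.DefaultSimilarity and be passed to both IndexWriter.setSimilarity and Searcher.setSimilarity. The norm is computed when a document is indexed, so documents already in the index keep their old norms until they are reindexed.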


