Posted to java-user@lucene.apache.org by "Armbrust, Daniel C." <Ar...@mayo.edu> on 2004/04/14 18:16:25 UTC
Result scoring question
I know that the Lucene scoring algorithm is pretty complicated, and I know I don't understand all the pieces. But given these documents:
A) - <preferred_designation> left renal calculus
B) - <other_designation> renal calculus
Should a query of
other_designation:("renal calculus") OR preferred_designation:("renal calculus")
Score document B higher than document A?
Those documents are a made-up example. Here are the documents and scores I am getting back from the query on my real index:
Score 1.0 - Document<Text<first_word:left> Text<preferred_designation:left renal calculus in calyceal diverticulum> Unindexed<frequency:4> Text<codeTokenized:M00004001> Keyword<code:M00004001> Keyword<UNIQUE_DOCUMENT_IDENTIFIER_FIELD:48270>>
Score 0.85714287 - Document<Keyword<UNIQUE_DOCUMENT_IDENTIFIER_FIELD:514631> Keyword<code:M00035214> Text<codeTokenized:M00035214> Unindexed<frequency:4> Text<preferred_designation:left renal calculus in a solitary left kidney> Text<first_word:left>>
Score 0.7409672 - Document<Text<first_word:renal> Text<other_designation:renal calculus> Unindexed<frequency:3> Text<codeTokenized:M00032753> Keyword<code:M00032753> Keyword<UNIQUE_DOCUMENT_IDENTIFIER_FIELD:481129>>
Am I just making a dumb mistake somewhere?
Thanks,
Dan
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
RE: Software for suggesting alternative words or sentences
Posted by Tate Avery <ta...@nstein.com>.
Also...
http://jazzy.sourceforge.net/
-----Original Message-----
From: Felix Huber [mailto:huberfelix@webtopia.de]
Sent: Friday, April 16, 2004 1:17 PM
To: Lucene Users List
Subject: Re: Software for suggesting alternative words or sentences
Check http://www.iu.hio.no/~frodes/sprell/sprell.html - it includes a German
and a Norwegian dictionary.
Regards,
Felix Huber
Venu Durgam wrote:
> I was wondering if there is any open source software for suggesting
> alternative words or sentences for search queries like Google.
>
> Thanks
> Venu Durgam
Re: Software for suggesting alternative words or sentences
Posted by Felix Huber <hu...@webtopia.de>.
Check http://www.iu.hio.no/~frodes/sprell/sprell.html - it includes a German
and a Norwegian dictionary.
Regards,
Felix Huber
Venu Durgam wrote:
> I was wondering if there is any open source software for suggesting
> alternative words or sentences for search queries like Google.
>
> Thanks
> Venu Durgam
Software for suggesting alternative words or sentences
Posted by Venu Durgam <vd...@yahoo.com>.
I was wondering if there is any open source software for suggesting alternative words or sentences for search queries, the way Google does.
Thanks
Venu Durgam
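For readers wondering what such a suggester does at its core, here is a minimal Java sketch of one common idea: pick the dictionary word with the smallest edit (Levenshtein) distance to the typed word. This is only an illustration under stated assumptions. It is not how Google, Jazzy, or sprell work internally, and the class name, method names, and the tiny word list are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration: suggest the closest dictionary word by edit distance.
final class SuggestSketch {

    // Classic Levenshtein distance (insert, delete, substitute each cost 1),
    // computed with two rolling rows instead of a full matrix.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the closest word within maxDistance edits, or null if none qualifies.
    static String suggest(String word, List<String> dictionary, int maxDistance) {
        String best = null;
        int bestDistance = maxDistance + 1;
        for (String candidate : dictionary) {
            int d = editDistance(word, candidate);
            if (d < bestDistance) {
                best = candidate;
                bestDistance = d;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up word list; a real one could come from the terms in an index.
        List<String> words = Arrays.asList("renal", "calculus", "kidney");
        System.out.println(suggest("calculs", words, 2));
    }
}
```

A linear scan like this only suits small word lists; dedicated spell checkers typically add phonetic matching or indexing tricks to avoid comparing against every word.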
Re: Result scoring question
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
Try using IndexSearcher.explain (and then a toString on the resulting
Explanation object) to see the details of why things are scoring how
they are. This can be most enlightening!
Erik
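As a rough illustration of the kind of factors an Explanation breaks a score into, here is a minimal Java sketch. It is not Lucene code. It assumes the classic TF-IDF similarity (tf = sqrt(freq), idf = ln(numDocs / (docFreq + 1)) + 1, lengthNorm = 1 / sqrt(number of terms in the field)), treats a phrase's idf as the sum of its terms' idfs, and leaves out queryNorm and coord (which would be equal for the two documents compared here) as well as boosts and the lossy norm encoding. The class name, method names, index size, and all document frequencies are hypothetical; the real values would come from calling explain on the actual index.

```java
// Hypothetical sketch of classic TF-IDF scoring factors; every number is made up.
final class ScoringSketch {

    // Term-frequency factor: grows with the square root of the match count.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: terms that are rare in a field weigh more.
    // Statistics are per field, so the same word can weigh differently in
    // preferred_designation and in other_designation.
    static double idf(int docFreq, int numDocs) {
        return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
    }

    // Length normalization: shorter fields weigh more.
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Simplified score of one phrase clause against one field:
    // tf * (sum of term idfs)^2 * lengthNorm.
    static double phraseScore(int phraseFreq, int[] termDocFreqs, int numDocs, int fieldLength) {
        double idfSum = 0.0;
        for (int docFreq : termDocFreqs) {
            idfSum += idf(docFreq, numDocs);
        }
        return tf(phraseFreq) * idfSum * idfSum * lengthNorm(fieldLength);
    }

    public static void main(String[] args) {
        int numDocs = 100000; // assumed index size

        // Assumption: "renal" and "calculus" are rare in preferred_designation
        // but common in other_designation.
        double a = phraseScore(1, new int[] {50, 20}, numDocs, 6);     // 6-term field
        double b = phraseScore(1, new int[] {2000, 1500}, numDocs, 2); // 2-term field

        System.out.println("A (preferred_designation, longer field):  " + a);
        System.out.println("B (other_designation, shorter field):     " + b);
    }
}
```

With these made-up statistics the longer preferred_designation field still comes out ahead, because the idf difference outweighs the length normalization. Whether that is what happens in the real index is something only the explain output can confirm.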
On Apr 14, 2004, at 12:16 PM, Armbrust, Daniel C. wrote:
> I know that the lucene scoring algorithm is pretty complicated, I know
> I don't understand all the pieces. But given these documents:
>
> A) - <preferred_designation> left renal calculus
> B) - <other_designation> renal calculus
>
> Should a query of
>
> other_designation:("renal calculus") OR preferred_designation:("renal
> calculus")
>
> Score document B higher than document A?
>
> Those documents are a made up example. Here are the documents and
> scores I am getting back from the query on my real index:
>
> Score 1.0 - Document<Text<first_word:left>
> Text<preferred_designation:left renal calculus in calyceal
> diverticulum> Unindexed<frequency:4> Text<codeTokenized:M00004001>
> Keyword<code:M00004001>
> Keyword<UNIQUE_DOCUMENT_IDENTIFIER_FIELD:48270>>
>
> Score 0.85714287 -
> Document<Keyword<UNIQUE_DOCUMENT_IDENTIFIER_FIELD:514631>
> Keyword<code:M00035214> Text<codeTokenized:M00035214>
> Unindexed<frequency:4> Text<preferred_designation:left renal calculus
> in a solitary left kidney> Text<first_word:left>>
>
> Score 0.7409672 - Document<Text<first_word:renal>
> Text<other_designation:renal calculus> Unindexed<frequency:3>
> Text<codeTokenized:M00032753> Keyword<code:M00032753>
> Keyword<UNIQUE_DOCUMENT_IDENTIFIER_FIELD:481129>>
>
>
> Am I just making a dumb mistake somewhere?
>
> Thanks,
>
> Dan