Posted to solr-user@lucene.apache.org by pcmanprogrammeur <pc...@neuf.fr> on 2010/02/26 15:30:45 UTC

Highest frequency

Hello all (sorry if my English is bad, I'm French)!

I have a Solr index of ads, each with a title and a description.
For example:
ad 1 : title = test / description = [empty]
ad 2 : title = test on test / description = this is a test
If I run the query "test" in solr/admin, ad 1 is the first result, although
ad 2 is more relevant because the word "test" occurs more often in it.
So, is it possible to tell Solr to sort the results by word frequency?

Thanks for your help!
-- 
View this message in context: http://old.nabble.com/Highest-frequency-tp27718930p27718930.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Highest frequency

Posted by Erick Erickson <er...@gmail.com>.
The underlying Lucene automatically takes this into account: it scores the term
frequency in relation to the length of the field, rather than just a term count.
So in your example, doc 1 has a complete field match on title, so it scores higher.
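
To make the point above concrete, here is a rough sketch of the two factors involved, applied to the title fields from the original question. It assumes the classic Lucene defaults (term-frequency factor = sqrt(count), length normalization = 1/sqrt(number of terms)) and plain whitespace tokenization with no stop-word removal. The function names are made up for illustration, and the real score also includes idf, boosts, query normalization and a lossy encoding of the norm, all of which are left out here.

```python
import math

def tf(freq):
    # Classic term-frequency factor: square root of the raw count.
    return math.sqrt(freq)

def length_norm(num_terms):
    # Classic length normalization: shorter fields weigh more.
    return 1.0 / math.sqrt(num_terms)

def field_score(text, term):
    # Assumption: whitespace tokenization, no stop-word removal.
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return tf(tokens.count(term)) * length_norm(len(tokens))

ad1 = field_score("test", "test")          # 1 match in a 1-term title
ad2 = field_score("test on test", "test")  # 2 matches in a 3-term title
print(ad1, ad2)
```

Under these assumptions ad 1 comes out ahead: its single match covers the whole field, while the two matches in ad 2 are diluted by the longer title.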

Also, depending upon how you set things up, you may not be searching
on description. Unless you specify a field, searches only go against the default
field (see your schema for the default field).
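
As a small illustration of searching both fields explicitly, the sketch below builds a query URL that names title and description instead of relying on the default field. The host, port and handler path are assumptions about a local install, and build_query_url is a made-up helper; q and debugQuery are standard Solr request parameters, and debugQuery=on asks Solr to include the score explanation in the response.

```python
from urllib.parse import urlencode

# Assumption: a local Solr instance with the stock /select handler.
SOLR_SELECT = "http://localhost:8983/solr/select"

def build_query_url(term, fields):
    # Name each field explicitly instead of relying on the default field.
    query = " OR ".join(f"{field}:{term}" for field in fields)
    params = {"q": query, "debugQuery": "on"}
    return SOLR_SELECT + "?" + urlencode(params)

url = build_query_url("test", ["title", "description"])
print(url)
```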

Which brings up the question whether you really want to override this behavior.
Do you really want a document with 10,000 tokens in it that mentions "test"
five times to score higher than a document with 3 tokens that mentions "test"
three times?

This page may help you resolve this kind of question...

http://lucene.apache.org/java/2_4_0/scoring.html

HTH
Erick


Re: Highest frequency

Posted by Marc Sturlese <ma...@gmail.com>.
As far as I know, it's not supported by default. I think you should implement
your own custom Lucene Similarity class and plug it into Solr via the
<similarity> element in schema.xml.
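
What such an override would change can be sketched outside of Lucene. The Python classes below are stand-ins, not the Lucene API: SqrtSimilarity imitates the classic defaults (sqrt of the count, 1/sqrt of the field length), and RawCountSimilarity is a hypothetical replacement that ranks purely by how often the term occurs. A real implementation would be a Java subclass of Lucene's Similarity.

```python
import math

class SqrtSimilarity:
    # Stand-in for the classic defaults: sqrt(count) and 1/sqrt(length).
    def tf(self, freq):
        return math.sqrt(freq)

    def length_norm(self, num_terms):
        return 1.0 / math.sqrt(num_terms)

    def score(self, freq, num_terms):
        return self.tf(freq) * self.length_norm(num_terms)

class RawCountSimilarity(SqrtSimilarity):
    # Hypothetical override: use the raw count and ignore field length.
    def tf(self, freq):
        return float(freq)

    def length_norm(self, num_terms):
        return 1.0

# (occurrences of "test", number of terms) for the two title fields.
ads = {"ad 1": (1, 1), "ad 2": (2, 3)}

def ranking(similarity):
    # Highest score first.
    return sorted(ads, key=lambda name: similarity.score(*ads[name]), reverse=True)

print(ranking(SqrtSimilarity()))
print(ranking(RawCountSimilarity()))
```

With the raw-count variant ad 2 moves to the top, but the same change would also let a 10,000-token document that mentions "test" five times outrank a 3-token document that mentions it three times, so it is worth checking against real data before adopting it.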
