You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Philippe <ma...@gmail.com> on 2010/01/21 15:15:49 UTC

Sort by the number of matching fields

Hi everyone,

I want to sort my results independent from my query string. Matching 
"documents" should be ordered by the number of fields containing a 
specific String.

Let's assume my query returns 2 Documents:

Doc1 contains 5 "ID"-fields (1,2,3,4,5)
Doc2 contains 3 "ID"-fields (5,6,7)

I'm more interested in documents containing the ids (5,6) so Document2 
should be ranked higher than Document1.

What would be the best way to perform this?

Regards,
    Philippe

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Sort by the number of matching fields

Posted by Philippe <ma...@gmail.com>.
Hi Ian,

you are probably right that doc2 will be ranked higher as it contains 
both id-fields. However, I am sure that this is not always the case in 
the default TF/IDF-ranking scheme.

As you mentioned in the second part, I could add a new field holding the 
number of matching fields. The Problem is that the list  of relevant ids 
is often changing...

Regards,
    Philippe

Ian Lea schrieb:
> I'm not sure I understand what you are asking, but if you search for
> id:5 id:6 then I think doc2 will be ranked higher, because it contains
> both fields.
>
> Or are you saying you want to rank based on the number of id fields in
> the document i.e. doc2 better than doc1 because 3 better than 5?  If
> that is the case you could add a new field to each document holding
> the number of id fields in that document and sort on that.
>
> Or maybe you just need to boost searches for preferred id values.
> id:5^4 or Query.setBoost().
>
>
>
> --
> Ian.
>
> On Thu, Jan 21, 2010 at 2:15 PM, Philippe <ma...@gmail.com> wrote:
>   
>> Hi everyone,
>>
>> I want to sort my results independent from my query string. Matching
>> "documents" should be ordered by the number of fields containing a specific
>> String.
>>
>> Let's assume my query returns 2 Documents:
>>
>> Doc1 contains 5 "ID"-fields (1,2,3,4,5)
>> Doc2 contains 3 "ID"-fields (5,6,7)
>>
>> I'm more interested in documents containing the ids (5,6) so Document2
>> should be ranked higher than Document1.
>>
>> What would be the best way to perform this?
>>
>> Regards,
>>   Philippe
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>     
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>   


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Sort by the number of matching fields

Posted by Ian Lea <ia...@gmail.com>.
I'm not sure I understand what you are asking, but if you search for
id:5 id:6 then I think doc2 will be ranked higher, because it contains
both fields.

Or are you saying you want to rank based on the number of id fields in
the document i.e. doc2 better than doc1 because 3 better than 5?  If
that is the case you could add a new field to each document holding
the number of id fields in that document and sort on that.

Or maybe you just need to boost searches for preferred id values.
id:5^4 or Query.setBoost().



--
Ian.

On Thu, Jan 21, 2010 at 2:15 PM, Philippe <ma...@gmail.com> wrote:
> Hi everyone,
>
> I want to sort my results independent from my query string. Matching
> "documents" should be ordered by the number of fields containing a specific
> String.
>
> Let's assume my query returns 2 Documents:
>
> Doc1 contains 5 "ID"-fields (1,2,3,4,5)
> Doc2 contains 3 "ID"-fields (5,6,7)
>
> I'm more interested in documents containing the ids (5,6) so Document2
> should be ranked higher than Document1.
>
> What would be the best way to perform this?
>
> Regards,
>   Philippe
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org