Posted to java-user@lucene.apache.org by "M. Mokotov" <mk...@mokotov.org> on 2005/06/09 13:14:50 UTC

OR query on multiple fields causes low coord

Hi,
 
I have a question regarding an OR query on multiple fields.

It seems that the more fields I split the documents into, the lower
the coord gets.
As a result, when I query the string S on many fields (a query like
F1:(S) F2:(S) ... Fn:(S) ), I get close-to-zero coords, which causes a
poor matching score.
I assume (and forgive me for assuming) that the reason is that in the call
coord( overlap, maxOverlap ), maxOverlap=|S|*n (where |S| is the number of
terms in S and n is the number of fields in the query).
 
Is there any way to avoid that?
Can I have the coord computed per field? 
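
To make the arithmetic concrete, here is a minimal sketch in plain Java (it
does not use Lucene). It assumes the default coord is simply
overlap / maxOverlap and that maxOverlap = |S|*n as guessed above; the class
and method names are made up for illustration.

```java
public class CoordDemo {
    // Assumed default formula: the fraction of query clauses that matched.
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // Coord for a document that matches every term of S in exactly one of
    // the n fields, under the assumption maxOverlap = |S| * n.
    static float coordForOneFieldMatch(int termsInS, int n) {
        return coord(termsInS, termsInS * n);
    }

    public static void main(String[] args) {
        int termsInS = 3;
        // Doubling the number of fields halves the coord each time.
        for (int n = 1; n <= 16; n *= 2) {
            System.out.println("n=" + n + " coord=" + coordForOneFieldMatch(termsInS, n));
        }
    }
}
```

Under these assumptions a document that matches S completely in a single
field gets a coord of 1/n, so the factor shrinks as fields are added.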
 
Thanks a lot for the help!

RE: OR query on multiple fields causes low coord

Posted by "M. Mokotov" <mk...@mokotov.org>.
Hi Paul,

Thanks for the help.

-----Original Message-----
From: Paul Elschot [mailto:paul.elschot@xs4all.nl] 
Sent: Thursday, June 09, 2005 8:11 PM
To: java-user@lucene.apache.org
Subject: Re: OR query on multiple fields causes low coord


On Thursday 09 June 2005 13:14, M. Mokotov wrote:
> Hi,
>  
> I have a question regarding an OR query on multiple fields.
>
> It seems that the more fields I split the documents into, the lower
> the coord gets. As a result, when I query the string S on many fields
> (a query like F1:(S) F2:(S) ... Fn:(S) ), I get close-to-zero coords,
> which causes a poor matching score. I assume (and forgive me for
> assuming) that the reason is that in the call
> coord( overlap, maxOverlap ), maxOverlap=|S|*n (where |S| is the
> number of terms in S and n is the number of fields in the query).
>  
> Is there any way to avoid that?
> Can I have the coord computed per field?

Yes. For the query above, use a BooleanQuery with a Similarity whose
coord() method returns a constant. This is difficult to do with the
QueryParser, but it is easy to construct the query in your own code. For the
subqueries on the fields you can still use the default similarity, as you
see fit.

Have a look at the MultiFieldQueryParser in the source:
http://svn.apache.org/viewcvs.cgi/lucene/java/tags/lucene_1_4_3/src/java/org/apache/lucene/queryParser/
Instead of the BooleanQuery constructed there, use a BooleanQuery that
overrides getSimilarity().

Regards,
Paul Elschot


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org





Re: OR query on multiple fields causes low coord

Posted by Paul Elschot <pa...@xs4all.nl>.
On Thursday 09 June 2005 13:14, M. Mokotov wrote:
> Hi,
>  
> I have a question regarding an OR query on multiple fields.
>
> It seems that the more fields I split the documents into, the lower
> the coord gets.
> As a result, when I query the string S on many fields (a query like
> F1:(S) F2:(S) ... Fn:(S) ), I get close-to-zero coords, which causes a
> poor matching score.
> I assume (and forgive me for assuming) that the reason is that in the call
> coord( overlap, maxOverlap ), maxOverlap=|S|*n (where |S| is the number of
> terms in S and n is the number of fields in the query).
>  
> Is there any way to avoid that?
> Can I have the coord computed per field? 

Yes. For the query above, use a BooleanQuery with a Similarity
whose coord() method returns a constant. This is difficult to do with the
QueryParser, but it is easy to construct the query in your own code.
For the subqueries on the fields you can still use the default similarity,
as you see fit.

Have a look at the MultiFieldQueryParser in the source:
http://svn.apache.org/viewcvs.cgi/lucene/java/tags/lucene_1_4_3/src/java/org/apache/lucene/queryParser/
Instead of the BooleanQuery constructed there, use a BooleanQuery
that overrides getSimilarity().
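
As a rough illustration of what a constant coord() changes, here is a
simplified model in plain Java. The classes are stand-ins written for this
sketch, not Lucene's own, and the scoring rule (sum of the matching field
scores times coord) is an assumed simplification of what the top-level OR
does. In Lucene itself the constant coord() would live in a Similarity
subclass returned from the overridden getSimilarity().

```java
public class ConstantCoordDemo {
    // Stand-in for a similarity with the assumed default coord.
    static class DefaultCoord {
        float coord(int overlap, int maxOverlap) {
            return overlap / (float) maxOverlap;
        }
    }

    // The suggested variant: coord() ignores its arguments.
    static class ConstantCoord extends DefaultCoord {
        @Override
        float coord(int overlap, int maxOverlap) {
            return 1.0f;
        }
    }

    // Simplified top-level OR over fields: add up the scores of the fields
    // that matched, then scale by coord(matched fields, total fields).
    static float topLevelScore(DefaultCoord sim, float[] fieldScores) {
        float sum = 0f;
        int matched = 0;
        for (float s : fieldScores) {
            if (s > 0f) {
                sum += s;
                matched++;
            }
        }
        return sum * sim.coord(matched, fieldScores.length);
    }

    public static void main(String[] args) {
        // Hypothetical per-field scores: only the first of four fields matched.
        float[] fieldScores = {0.8f, 0f, 0f, 0f};
        System.out.println("default coord:  " + topLevelScore(new DefaultCoord(), fieldScores));
        System.out.println("constant coord: " + topLevelScore(new ConstantCoord(), fieldScores));
    }
}
```

With the constant variant the combined score no longer depends on how many
fields the query spans, while each field's own score is left untouched.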

Regards,
Paul Elschot

