Posted to java-user@lucene.apache.org by Rich Livingstone <na...@bananasoft.net> on 2009/10/17 22:24:21 UTC

How to make each term of query appear in at least one field of multiple field query

I am a bit stumped by how to ensure that, where there are multiple terms in
my query, each term appears at least once across the specified fields of my
document. I am using MultiFieldQueryParser with BooleanClause.Occur.SHOULD
in the query. I tried putting all fields into a single giant field and
prefixing each term with +, but that loses the ability to boost individual
fields in the search (unless there is a way to add a per-term boost at
index time?), so it doesn't seem a good solution, as I need the boosting
too.

Does anyone have any idea how to do this? BooleanClause.Occur.MUST would
force all terms to be in all fields, I would have thought, and the ability
to do weighting is important to me.

I'm using Lucene 2.2.0.

Thanks for any help!
-- 
View this message in context: http://www.nabble.com/How-to-make-each-term-of-query-appear-in-at-least-one-field-of-multiple-field-query-tp25941499p25941499.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: How to make each term of query appear in at least one field of multiple field query

Posted by Rich Livingstone <na...@bananasoft.net>.
Good idea, Steve, thanks for that. It could end up being a very large query,
though, and my search is further complicated by the fact that I have to do
synonym expansion on the search too, so

+(field1:term1^field1boost field2:term1^field2boost ...)
+(field1:term2^field1boost field2:term2^field2boost ...)

could turn into

+(field1:term1^field1boost field1:term1synonym1^field1boost
field1:term1synonym2^field1boost field2:term1^field2boost ...)
...
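
For illustration, here is a minimal sketch of how the expanded query string above could be generated. It is plain Java with no Lucene dependency, and it only assembles text in the query syntax shown in this thread. The class name SynonymExpandedQuery, the build method, the example field names and boosts, and the in-memory synonym map are all assumptions made for the example; it also assumes Java 8 or later, and that terms contain no characters that are special to the query parser (those would need escaping first).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Lucene: it only builds a query string.
class SynonymExpandedQuery {

    /**
     * Emits one required group "+( ... )" per user term. Inside a group,
     * every field gets an optional clause for the term and for each of
     * its synonyms, all carrying that field's boost.
     */
    static String build(List<String> terms,
                        Map<String, List<String>> synonyms,
                        Map<String, Float> fieldBoosts) {
        StringJoiner query = new StringJoiner(" ");
        for (String term : terms) {
            // The term itself comes first, followed by its synonyms, if any.
            List<String> variants = new ArrayList<>();
            variants.add(term);
            variants.addAll(synonyms.getOrDefault(term, Collections.emptyList()));

            StringJoiner group = new StringJoiner(" ", "+(", ")");
            for (Map.Entry<String, Float> field : fieldBoosts.entrySet()) {
                for (String variant : variants) {
                    group.add(field.getKey() + ":" + variant + "^" + field.getValue());
                }
            }
            query.add(group.toString());
        }
        return query.toString();
    }

    public static void main(String[] args) {
        // Example data only; LinkedHashMap keeps the fields in insertion order.
        Map<String, Float> boosts = new LinkedHashMap<>();
        boosts.put("title", 2.0f);
        boosts.put("description", 1.0f);

        Map<String, List<String>> synonyms = new LinkedHashMap<>();
        synonyms.put("sofa", Arrays.asList("couch", "settee"));

        System.out.println(build(Arrays.asList("sofa", "red"), synonyms, boosts));
    }
}
```

Each group contains (1 + number of synonyms) x (number of fields) clauses, which is why the query grows quickly. Since the expansion happens at search time, the synonym map can change without reindexing.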

I thought about doing the expansion at indexing time, but that makes it less
dynamic, as changes require reindexing, and this is driving an ecommerce
website which might not take kindly to frequent search downtime.

The javadocs for MultiFieldQueryParser also suggest that creating a parser
object and then calling parse on it, rather than using the static
MultiFieldQueryParser.parse as I do now, could achieve the same effect. At
the moment I'm sort of working around this by doing two queries: one on a
concatenated field, which determines the valid set of hits, and a union of
that with a search that actually takes the weightings into account properly.
It seems to work, but two searches is obviously not so good!

I'll give it a whirl and post my results.


Steven A Rowe wrote:
> 
> Hi Rich,
> 
> On 10/17/2009 at 4:24 PM, Rich Livingstone wrote:
>> I am a bit stumped by how to ensure that, where there are multiple
>> terms in my query, that each term must appear at least once across
>> all specified fields of my document.
> 
> You can create a MUST clause for each term, with a boosted SHOULD term
> query for each field, e.g.:
> 
> +(field1:term1^field1boost field2:term1^field2boost ...)
> +(field1:term2^field1boost field2:term2^field2boost ...)
> ...
> 
> Steve





RE: How to make each term of query appear in at least one field of multiple field query

Posted by Steven A Rowe <sa...@syr.edu>.
Hi Rich,

On 10/17/2009 at 4:24 PM, Rich Livingstone wrote:
> I am a bit stumped by how to ensure that, where there are multiple
> terms in my query, that each term must appear at least once across
> all specified fields of my document.

You can create a MUST clause for each term, with a boosted SHOULD term query for each field, e.g.:

+(field1:term1^field1boost field2:term1^field2boost ...)
+(field1:term2^field1boost field2:term2^field2boost ...)
...

Steve
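
A minimal sketch of how a query string in the shape suggested above might be assembled for an arbitrary list of terms and fields. This is plain Java with no Lucene dependency: it only produces text that could then be handed to a query parser. The class name PerTermFieldQuery, the build method, and the example fields and boost values are assumptions made for illustration; the sketch assumes Java 8 or later, a non-empty field map, and terms that contain no query-syntax special characters.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Lucene: it only builds a query string.
class PerTermFieldQuery {

    /**
     * For every term, emits a required group "+(f1:term^b1 f2:term^b2 ...)".
     * The leading + makes the group mandatory, while the clauses inside it
     * are optional, so the term has to match in at least one field.
     */
    static String build(List<String> terms, Map<String, Float> fieldBoosts) {
        StringJoiner query = new StringJoiner(" ");
        for (String term : terms) {
            StringJoiner group = new StringJoiner(" ", "+(", ")");
            for (Map.Entry<String, Float> field : fieldBoosts.entrySet()) {
                group.add(field.getKey() + ":" + term + "^" + field.getValue());
            }
            query.add(group.toString());
        }
        return query.toString();
    }

    public static void main(String[] args) {
        // Example data only; LinkedHashMap keeps the fields in insertion order.
        Map<String, Float> boosts = new LinkedHashMap<>();
        boosts.put("title", 2.0f);
        boosts.put("description", 1.0f);

        // prints: +(title:red^2.0 description:red^1.0) +(title:shoes^2.0 description:shoes^1.0)
        System.out.println(build(Arrays.asList("red", "shoes"), boosts));
    }
}
```

Because the boost is attached to each field clause, per-field weighting is preserved while every term is still required somewhere in the document.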

