Posted to general@lucene.apache.org by Otis Gospodnetic <ot...@yahoo.com> on 2010/06/01 07:20:53 UTC

Re: Should I avoid MultiFieldQueryParser?

What you lose by aggregating all of the real fields into one field is the ability to give each field its own scoring weight.
Is a match in the post title as important as a match in the body or in one of the comments?
If yes, then aggregate. If not, keep the fields separate so that each one can be weighted on its own.
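
To make the trade-off concrete, here is a toy sketch in plain Java. It is not Lucene and does not reproduce Lucene's scoring. The class FieldWeightSketch, its methods, the sample posts, and the weights 3.0 / 1.0 / 0.5 are all made up for illustration. It only counts whole-word matches, once with a weight per field and once over a single aggregated field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldWeightSketch {

    // Builds a toy "document" using the three field names from Bob's post.
    static Map<String, String> doc(String title, String content, String comment) {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("BlogPostTitle", title);
        d.put("BlogPostContent", content);
        d.put("BlogComment", comment);
        return d;
    }

    // Hypothetical weights: a title match counts most, a comment match least.
    static Map<String, Double> exampleWeights() {
        Map<String, Double> w = new LinkedHashMap<>();
        w.put("BlogPostTitle", 3.0);
        w.put("BlogPostContent", 1.0);
        w.put("BlogComment", 0.5);
        return w;
    }

    // Counts case-insensitive whole-word occurrences of term in text.
    static int countMatches(String text, String term) {
        String needle = term.toLowerCase();
        int count = 0;
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.equals(needle)) {
                count++;
            }
        }
        return count;
    }

    // Separate fields: each field's match count is scaled by that field's weight.
    static double weightedScore(Map<String, String> doc, Map<String, Double> weights, String term) {
        double score = 0.0;
        for (Map.Entry<String, String> field : doc.entrySet()) {
            double weight = weights.getOrDefault(field.getKey(), 1.0);
            score += weight * countMatches(field.getValue(), term);
        }
        return score;
    }

    // One aggregated field: all text is concatenated, so every match counts the same.
    static double aggregatedScore(Map<String, String> doc, String term) {
        return countMatches(String.join(" ", doc.values()), term);
    }

    public static void main(String[] args) {
        Map<String, String> titleHit = doc("Summer picnic", "We went to the park.", "Sounds fun");
        Map<String, String> commentHit = doc("Weekend notes", "We went to the park.", "Was it a picnic?");
        Map<String, Double> weights = exampleWeights();

        System.out.println(weightedScore(titleHit, weights, "picnic"));   // 3.0
        System.out.println(weightedScore(commentHit, weights, "picnic")); // 0.5
        System.out.println(aggregatedScore(titleHit, "picnic"));          // 1.0
        System.out.println(aggregatedScore(commentHit, "picnic"));        // 1.0
    }
}
```

With separate fields the post that has "picnic" in its title ranks above the one that only mentions it in a comment. With one aggregated field the two posts tie, because the information about where the match occurred is gone. In Lucene itself the weighting would be done with boosts, for example a query such as BlogPostTitle:picnic^3.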

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: Bob Eastbrook <ba...@gmail.com>
> To: general@lucene.apache.org
> Sent: Mon, May 17, 2010 12:49:32 AM
> Subject: Should I avoid MultiFieldQueryParser?
> 
> Imagine a blog that needs to be searched.  I first thought I'd index
> posts and comments using these fields:
>
> BlogPostTitle
> BlogPostContent
> BlogComment
>
> There could be any number of BlogComments.
>
> I have this working fine and use MultiFieldQueryParser to generate a
> query.  It seems to work.  A search for "picnic" matches that term in
> post titles, post contents, and comments.
>
> However, "Lucene in Action" (2nd edition MEAP proof, chapter 5 section
> 4) seems to advocate against using MultiFieldQueryParser and instead
> suggests using a single synthetic field to hold all searchable text.
> Perhaps this field would be called "contents" or "keywords".
>
> Is this accepted to be a best practice?  Should I dump a
> BlogPostTitle, BlogPostContent, and its BlogComments into a single
> field?
>
> Bob