Posted to java-user@lucene.apache.org by Joe Attardi <ja...@gmail.com> on 2007/07/25 23:05:14 UTC

Assembling a query from multiple fields

Hi all,

Apologies for the cryptic subject line, but I couldn't think of a more
descriptive one-liner to describe my problem/question to you all. Still
fairly new to Lucene here, although I'm hoping to have more of a clue once I
get a chance to read "Lucene In Action".

I am implementing a search engine using Lucene for a web application. It is
not really a free-text search like some other, more standard implementations.
The requirement is for the search to be as easy and user-friendly as
possible, so instead of specifying the field in the query itself (such as
ip:192.168.102.230) and parsing it with QueryParser, the field is selected
via an HTML <select> element, and the search keywords are entered in a text
field.

As far as I can tell, I basically have two options:
(1) Manually prepend the field identifier to the query text, for example:
          String fullQuery = field + ":" + queryText;
     then parse this query normally with QueryParser, OR
(2) Since I know it is only going to be searching one term, manually create
a TermQuery with a Term object representing what the user typed in, for
example:
          Query query = new TermQuery(new Term(field, queryText));

Is there any advantage or disadvantage to any of these, or is one preferable
over the other? My gut tells me that directly creating the TermQuery is more
efficient since it doesn't have to perform parsing, but I'm not sure.

I have other questions, too, but I don't want to get ahead of myself. One at
a time... :)

Appreciate any help you all might have!

-- 
Joe Attardi

Re: Assembling a query from multiple fields

Posted by Erick Erickson <er...@gmail.com>.
Since you say you're new, I'll risk stating the obvious <G>. Do you
know about the BooleanQuery class?

But do note that a TermQuery isn't quite what
you might expect. For instance, making one TermQuery from the
string "this is junk" is different from making three TermQuery
objects, one for each word: a TermQuery is not analyzed, so the
whole string is looked up as a single term. If that's confusing, it'll
save you some grief to understand why before pushing on.
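
To make the difference concrete, here is a minimal sketch. It is a toy
model written in plain Java, not Lucene's API: the class ToyTermIndex and
its methods are made up for illustration, and "analysis" is assumed to be
nothing more than splitting on whitespace. A single exact lookup stands in
for one TermQuery; intersecting one lookup per word stands in for several
TermQuery objects combined as required clauses of a BooleanQuery.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy stand-in for an analyzed index field. Hypothetical class, not part of Lucene.
final class ToyTermIndex {
    // term -> ids of the documents that contain that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // "Analyze" the text by splitting on whitespace, then index each token.
    void add(int docId, String text) {
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new HashSet<>()).add(docId);
            }
        }
    }

    // In the spirit of a single TermQuery: one exact lookup, no analysis.
    Set<Integer> lookup(String term) {
        return new HashSet<>(postings.getOrDefault(term, Collections.emptySet()));
    }

    // In the spirit of one required clause per word: intersect the lookups.
    Set<Integer> lookupAll(String... terms) {
        Set<Integer> result = null;
        for (String term : terms) {
            Set<Integer> hits = lookup(term);
            if (result == null) {
                result = hits;
            } else {
                result.retainAll(hits);
            }
        }
        return result == null ? new HashSet<>() : result;
    }

    public static void main(String[] args) {
        ToyTermIndex index = new ToyTermIndex();
        index.add(1, "this is junk");

        // The whole string was never indexed as one term, so nothing matches.
        System.out.println(index.lookup("this is junk"));            // []
        // Each word was indexed as its own term, so document 1 matches.
        System.out.println(index.lookupAll("this", "is", "junk"));   // [1]
    }
}
```

The same reasoning applies to a real index: whatever the analyzer emitted
at index time is what a TermQuery has to match exactly.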

But do pay attention to Erik's response. The QueryParser
will give you surprising results if, say, there's a colon in the data
the user typed in. And manually making TermQueries will give
you surprising results if the case of the term differs from the case
that was indexed.
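
As a rough illustration of the colon problem: if the user's text is
concatenated straight into a query string, any syntax character in it
changes how the string parses. Lucene provides a static
QueryParser.escape(String) for this. The sketch below is a hand-rolled
stand-in in plain Java so it runs without Lucene on the classpath; the
class name QueryTextEscaper, the "mac" field, and the exact set of special
characters are assumptions for the example, and the real set depends on
your Lucene version.

```java
// Hypothetical helper that mimics the idea behind QueryParser.escape.
final class QueryTextEscaper {
    // Characters treated as query syntax in this sketch (assumed set).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    // Prefix every special character with a backslash.
    static String escape(String userText) {
        StringBuilder sb = new StringBuilder(userText.length() + 8);
        for (int i = 0; i < userText.length(); i++) {
            char c = userText.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String field = "mac";                    // hypothetical field name
        String queryText = "00:1a:2b:3c:4d:5e";  // user input containing colons

        // Unescaped: the extra colons collide with the field:term syntax.
        System.out.println(field + ":" + queryText);
        // Escaped: the colons in the user's text are treated as literal characters.
        System.out.println(field + ":" + escape(queryText));
    }
}
```

Building the Query object directly sidesteps the issue, since nothing is
parsed in the first place.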

Best
Erick


Re: Assembling a query from multiple fields

Posted by Askar Zaidi <as...@gmail.com>.
I did this yesterday. Manually appended an extra field to the query. It
works fine.


Re: Assembling a query from multiple fields

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Jul 25, 2007, at 5:05 PM, Joe Attardi wrote:
> As far as I can tell, I basically have two options:
> (1) Manually prepend the field identifier to the query text, for example:
>          String fullQuery = field + ":" + queryText;
>     then parse this query normally with QueryParser, OR
> (2) Since I know it is only going to be searching one term, manually create
> a TermQuery with a Term object representing what the user typed in, for
> example:
>          Query query = new TermQuery(new Term(field, queryText));
>
> Is there any advantage or disadvantage to any of these, or is one preferable
> over the other? My gut tells me that directly creating the TermQuery is more
> efficient since it doesn't have to perform parsing, but I'm not sure.

I recommend constructing the Query manually whenever possible to
avoid the possibility of QueryParser escaping or other syntax getting
in the way. The one caveat is to make sure that the terms you
pass to things like TermQuery are in the same state as when they were
indexed (lowercased, stemmed, whatever). You can manually run the
text through an Analyzer if you need to get the terms normalized in
some fashion.
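
A small sketch of why the normalization has to match, written in plain
Java so it runs on its own. The normalize method here is only a stand-in
for whatever your Analyzer does (it is assumed to trim and lowercase,
nothing more), and the class name and sample values are made up. In a real
application you would take the tokens from the same Analyzer you indexed
with, via its tokenStream method, rather than re-implementing its rules.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical example class, not part of Lucene.
final class NormalizedLookup {
    // Stand-in for the analyzer's normalization (assumed: trim + lowercase only).
    static String normalize(String raw) {
        return raw.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Index time: the term is stored in its normalized form.
        Set<String> indexedTerms = new HashSet<>();
        indexedTerms.add(normalize("Gateway-01"));

        String userInput = "GATEWAY-01";

        // Raw input used as the term: no match, the case differs.
        System.out.println(indexedTerms.contains(userInput));             // false
        // Input passed through the same normalization first: match.
        System.out.println(indexedTerms.contains(normalize(userInput)));  // true
    }
}
```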

	Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org