You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by Dmitry Serebrennikov <dm...@earthlink.net> on 2004/06/14 19:32:46 UTC
Re: multiple fields to be indexed
jitender ahuja wrote:
>Hi all,
>
>I am trying to do a search on multiple fields of which some are indexed. Now, a query can be posed to search in the indexed fields but the Hits class object can have only one query object and the query class similarly can have only one field on which the query is performed.
>
Take a look at BooleanQuery. That will allow you to combine multiple
TermQueries, such that you can create a single query on different
fields. That should do the trick.
Good luck!
Dmitry.
>Now, to have multiple queries I need to obtain multiple Hits class objects that are then merged together in one array. But, I do not know how to maintain the scorings obtained by different Hits class objects. If somebody has tried such a thing or a different mechanism to feed a query to different query objects, then pl. respond.
>Actually I have not tried the above option of merging the results in a final array but I feel that it should work.
>
>Regards
>Jitender
>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
Re: multiple fields to be indexed
Posted by Dmitry Serebrennikov <dm...@earthlink.net>.
Jitender,
I hope you don't mind if I move this thread to the user list.
A "term" in lucene specifies not only the text string, but also a field
in which it occurs. In the query parser terms are notated as <fieldname>
":" <text>, so "name:blue" will be a string 'blue' in field 'name',
whereas "desc:blue" will be the same string but in the 'desc' field. So
if you type something like the following into the query parser you will
have a query that matches occurances of 'blue' in either field:
"name:blue desc:blue"
If you want to match documents that have 'blue' in both fields, the
query would look like this:
"+name:blue +desc:blue"
There are other notations that may also be useful. A good explanation of
the query parser can be found here:
http://jakarta.apache.org/lucene/docs/queryparsersyntax.html.
In the API, the same thing can be achieved with two TermQuery objects
joined together with a BooleanQuery. For example:
TermQuery tq1 = new TermQuery( new Term("name", "blue"));
TermQuery tq2 = new TermQuery( new Term("desc", "blue"));
BooleanQuery q = new BooleanQuery();
q.add(tq1, false, false);
q.add(tq2, false, false);
// At this point, query q will match documents that have the word
'blue' in either 'name' or 'desc' field.
// So this is essentially an "or" query. If you want an "and" query,
change the last two lines as follows:
// q.add(tq1, true, false);
// q.add(tq2, true, false);
Does this make more sense now?
The boolean logic at the API level is a bit unusual, if you are used to
relational databases. Lucene is more like Google in this respect, rather
than an SQL database. A boolean query component can be either required,
prohibited, or neither. If it is required (+ prefix in the query
parser), it MUST match for the entire query to match. If it is
prohibited (- prefix in the query parser), it MUST NOT match for the
query to match. If it is neither (no prefix in the query parser), it is
not required to match for the query to match, provided some other
component of the query does. This last one may seem useless, except that
if this query component does match, the score will be boosted. So
documents that do match this component of the query will be returned
ahead of those that do not.
Good luck!
Dmitry
jitender ahuja wrote:
>Hi,
> Can u pl. clarify some more how to use the BooleanQuery class as I am
>clueless still.
>As far as I can gather it deals with multiple query terms and not with
>searching a (may be single) given query term in multiple (indexed) fields.
>
>Regards,
>Jitender
>----- Original Message -----
>From: "Dmitry Serebrennikov" <dm...@earthlink.net>
>To: "Lucene Developers List" <lu...@jakarta.apache.org>
>Sent: Monday, June 14, 2004 11:02 PM
>Subject: Re: multiple fields to be indexed
>
>
>
>
>>jitender ahuja wrote:
>>
>>
>>
>>>Hi all,
>>>
>>>I am trying to do a search on multiple fields of which some are indexed.
>>>
>>>
>Now, a query can be posed to search in the indexed fields but the Hits class
>object can have only one query object and the query class similarly can have
>only one field on which the query is performed.
>
>
>>Take a look at BooleanQuery. That will allow you to combine multiple
>>TermQueries, such that you can create a single query on different
>>fields. That should do the trick.
>>
>>Good luck!
>>Dmitry.
>>
>>
>>
>>>Now, to have multiple queries I need to obtain multiple Hits class
>>>
>>>
>objects that are then merged together in one array. But, I do not know how
>to maintain the scorings obtained by different Hits class objects. If
>somebody has tried such a thing or a different mechanism to feed a query to
>different query objects, then pl. respond.
>
>
>>>Actually I have not tried the above option of merging the results in a
>>>
>>>
>final array but I feel that it should work.
>
>
>>>Regards
>>>Jitender
>>>
>>>
>>>
>>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>>
>>
>>
>>
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>
>
>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
Re: multiple fields to be indexed
Posted by jitender ahuja <aj...@aalayance.com>.
Hi,
Can u pl. clarify some more how to use the BooleanQuery class as I am
clueless still.
As far as I can gather it deals with multiple query terms and not with
searching a (may be single) given query term in multiple (indexed) fields.
Regards,
Jitender
----- Original Message -----
From: "Dmitry Serebrennikov" <dm...@earthlink.net>
To: "Lucene Developers List" <lu...@jakarta.apache.org>
Sent: Monday, June 14, 2004 11:02 PM
Subject: Re: multiple fields to be indexed
> jitender ahuja wrote:
>
> >Hi all,
> >
> >I am trying to do a search on multiple fields of which some are indexed.
Now, a query can be posed to search in the indexed fields but the Hits class
object can have only one query object and the query class similarly can have
only one field on which the query is performed.
> >
> Take a look at BooleanQuery. That will allow you to combine multiple
> TermQueries, such that you can create a single query on different
> fields. That should do the trick.
>
> Good luck!
> Dmitry.
>
> >Now, to have multiple queries I need to obtain multiple Hits class
objects that are then merged together in one array. But, I do not know how
to maintain the scorings obtained by different Hits class objects. If
somebody has tried such a thing or a different mechanism to feed a query to
different query objects, then pl. respond.
> >Actually I have not tried the above option of merging the results in a
final array but I feel that it should work.
> >
> >Regards
> >Jitender
> >
> >
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org