Posted to java-user@lucene.apache.org by Deepak Shakya <ju...@gmail.com> on 2012/07/22 17:17:35 UTC

QueryParser and BooleanQuery

Hi,

I have the following dataset indexed in Lucene.
2010-04-21 02:24:01 GET /blank 200 120
2010-04-21 02:24:01 GET /US/registrationFrame 200 605
2010-04-21 02:24:02 GET /US/kids/boys 200 785
2010-04-21 02:24:02 POST /blank 304 56
2010-04-21 02:24:04 GET /blank 304 233
2010-04-21 02:24:04 GET /blank 500 567
2010-04-21 02:24:04 GET /blank 200 897
2010-04-21 02:24:04 POST /blank 200 567
2010-04-21 02:24:05 GET /US/search 200 658
2010-04-21 02:24:05 POST /US/shop 200 768
2010-04-21 02:24:05 GET /blank 200 347

I am querying it in two ways: first with QueryParser and then with
BooleanQuery.

*QueryParser version:*
Query q = new QueryParser(version, "cs-method", new
StandardAnalyzer(version)).parse("cs-method:GET AND cs-uri:/blank");

*BooleanQuery version:*
BooleanQuery q = new BooleanQuery();
q.add(new TermQuery(new Term("cs-method", "GET")),
BooleanClause.Occur.SHOULD);
q.add(new TermQuery(new Term("cs-uri", "/blank")),
BooleanClause.Occur.SHOULD);

When I run the two versions, I am able to match the documents with the
QueryParser version, but not with BooleanQuery. The output is as follows:

*QueryParser output:*
Total Number of Documents - 11
Query --> +cs-method:get +cs-uri:blank
Total Clues Found - 5

*BooleanQuery output:*
Total Number of Documents - 11
Query --> cs-method:GET cs-uri:/blank
Total Clues Found - 0

Does anybody know why the BooleanQuery doesn't return any documents while
QueryParser does? Also, how can I change the BooleanQuery to work for the
above case?

-- 
With Regards,
Deepak Shakya

Re: QueryParser and BooleanQuery

Posted by Ian Lea <ia...@gmail.com>.
QueryParser returns a query.  Just add that to the BooleanQuery.

QueryParser qp = ...;
BooleanQuery bq = new BooleanQuery();
Query parsedq = qp.parse("...");
bq.add(parsedq, ...);



--
Ian.


On Mon, Jul 23, 2012 at 1:16 PM, Deepak Shakya <ju...@gmail.com> wrote:
> Hey Jack,
>
> Can you let me know how I should do that? I am using Lucene 3.6
> and I don't see any parse() method on StandardAnalyzer.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: QueryParser and BooleanQuery

Posted by Deepak Shakya <ju...@gmail.com>.
Hi Trejkaz,

I am using the StandardAnalyzer for my indexing. Can you provide an
example of how to build a BooleanQuery on both fields, with the values in
the same form as above, that still finds the documents?

I would also be really grateful for an example of how to analyze the text
that is passed to the fields at search time.



-- 
With Regards,
Deepak Shakya
http://www.google.com/profiles/justdpk

Re: QueryParser and BooleanQuery

Posted by Trejkaz <tr...@trypticon.org>.
On Mon, Jul 23, 2012 at 10:16 PM, Deepak Shakya <ju...@gmail.com> wrote:
> Hey Jack,
>
> Can you let me know how I should do that? I am using Lucene 3.6
> and I don't see any parse() method on StandardAnalyzer.

In your case, presumably at indexing time you should be using a
PerFieldAnalyzerWrapper with cs-uri getting a KeywordAnalyzer.

If you pass that analyser in when you construct the QueryParser, it
won't remove the slash.

The main thing is that you should use the same analyser for indexing
and searching.

TX
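
As an illustration of the per-field idea, here is a minimal sketch in plain Java. It is a toy model and not Lucene code: `PerFieldAnalysisSketch`, `KEYWORD`, `LOWERCASE_SPLIT`, `use` and `analyze` are hypothetical names, and the lower-case-and-split function is only an assumed stand-in for the default analyzer. The point it shows is that cs-uri keeps its slash when it is treated as a single keyword token, provided the same mapping is applied when indexing and when searching.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Toy model of per-field analysis; none of these names are Lucene classes.
final class PerFieldAnalysisSketch {
    // Keeps the whole input as one token, which is the idea behind a keyword analyzer.
    static final Function<String, List<String>> KEYWORD =
            text -> Collections.singletonList(text);

    // Hypothetical default: lower-case, then split on anything that is not a-z or 0-9.
    static final Function<String, List<String>> LOWERCASE_SPLIT = text -> {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    };

    private final Function<String, List<String>> fallback;
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    PerFieldAnalysisSketch(Function<String, List<String>> fallback) {
        this.fallback = fallback;
    }

    // Register a field-specific analysis function; returns this for chaining.
    PerFieldAnalysisSketch use(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
        return this;
    }

    // Call this one method for both indexing and querying so the tokens always agree.
    List<String> analyze(String field, String text) {
        return perField.getOrDefault(field, fallback).apply(text);
    }

    public static void main(String[] args) {
        PerFieldAnalysisSketch analysis =
                new PerFieldAnalysisSketch(LOWERCASE_SPLIT).use("cs-uri", KEYWORD);
        System.out.println(analysis.analyze("cs-uri", "/blank"));  // [/blank]
        System.out.println(analysis.analyze("cs-method", "GET"));  // [get]
    }
}
```

With this arrangement a term for cs-uri would be built from "/blank" unchanged, while a term for cs-method would still need the lower-cased "get".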



Re: QueryParser and BooleanQuery

Posted by Deepak Shakya <ju...@gmail.com>.
Hey Jack,

Can you let me know how I should do that? I am using Lucene 3.6
and I don't see any parse() method on StandardAnalyzer.


-- 
With Regards,
Deepak Shakya
http://www.google.com/profiles/justdpk

Re: QueryParser and BooleanQuery

Posted by Jack Krupansky <ja...@basetechnology.com>.
Yes, I failed to notice that the removal of the slash was yet another 
instance of the analyzer transforming its input. But the bottom line is that 
you must do 100% of the same steps that analysis performs. If in doubt, pass 
your literals through the standard analyzer itself.

-- Jack Krupansky
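
A minimal sketch of "pass your literals through the analyzer", written in plain Java rather than against Lucene. Everything named here (`QueryTermSketch`, `TOY_ANALYZER`, `termsFor`) is hypothetical, and the toy analysis function is only an assumed approximation of the lower-casing and slash removal seen in this thread. In Lucene itself the tokens would come from the analyzer's token stream (`Analyzer.tokenStream(...)`) rather than from a parse() method. The sketch also shows that one literal can turn into several terms, each of which would need its own clause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

final class QueryTermSketch {
    // Hypothetical stand-in for the analyzer used at index time (not Lucene code):
    // lower-case, then split on anything that is not a-z or 0-9.
    static final Function<String, List<String>> TOY_ANALYZER = text -> {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    };

    // Turn one query literal into "field:term" pairs, one per token the analyzer emits.
    static List<String> termsFor(String field, String literal,
                                 Function<String, List<String>> analyzer) {
        List<String> terms = new ArrayList<>();
        for (String token : analyzer.apply(literal)) {
            terms.add(field + ":" + token);
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(termsFor("cs-method", "GET", TOY_ANALYZER));
        // [cs-method:get]
        System.out.println(termsFor("cs-uri", "/US/kids/boys", TOY_ANALYZER));
        // [cs-uri:us, cs-uri:kids, cs-uri:boys]
    }
}
```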



Re: QueryParser and BooleanQuery

Posted by Deepak Shakya <ju...@gmail.com>.
I tried changing the case to lower case, but still the BooleanQuery doesn't
return any documents.

I see that the text "/blank" is converted to "blank" in the QueryParser.
But in BooleanQuery it remains the same. When I remove the forward slash
sign from the input string, I get the matched documents with BooleanQuery.
Does the StandardAnalyzer strip special characters like this as
well?



-- 
With Regards,
Deepak Shakya
http://www.google.com/profiles/justdpk

Re: QueryParser and BooleanQuery

Posted by Jack Krupansky <ja...@basetechnology.com>.
The query parser/analyzer is lower-casing the query terms automatically. You
have to do the same with the terms for BooleanQuery: Term("cs-method",
"GET") should be Term("cs-method", "get").

StandardAnalyzer is doing the lower-casing.

-- Jack Krupansky
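
To see why the un-analyzed terms miss, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: `ToyTermIndex`, `analyze`, `lookup`, `both` and `sample` are hypothetical names, and the tokenizer (lower-case, split on anything that is not an ASCII letter or digit) only approximates what the analyzer did to these particular values. It indexes the method and URI columns of the 11 log lines from the original post and then performs exact term lookups, which is how a TermQuery treats its term.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy inverted index; a stand-in for illustration, not Lucene's implementation.
final class ToyTermIndex {
    // "field:term" -> ids of the documents containing that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Assumed analysis step: lower-case, then split on non-alphanumerics.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Index time: the text is analyzed before it is stored.
    void add(int docId, String field, String text) {
        for (String token : analyze(text)) {
            postings.computeIfAbsent(field + ":" + token, k -> new HashSet<>()).add(docId);
        }
    }

    // Exact lookup: the term is used exactly as given, with no analysis.
    Set<Integer> lookup(String field, String term) {
        return postings.getOrDefault(field + ":" + term, Collections.<Integer>emptySet());
    }

    // Documents present in both sets (both clauses required).
    static Set<Integer> both(Set<Integer> a, Set<Integer> b) {
        Set<Integer> out = new HashSet<>(a);
        out.retainAll(b);
        return out;
    }

    // The cs-method and cs-uri values of the 11 log lines from the original post.
    static ToyTermIndex sample() {
        String[][] rows = {
            {"GET", "/blank"}, {"GET", "/US/registrationFrame"}, {"GET", "/US/kids/boys"},
            {"POST", "/blank"}, {"GET", "/blank"}, {"GET", "/blank"}, {"GET", "/blank"},
            {"POST", "/blank"}, {"GET", "/US/search"}, {"POST", "/US/shop"}, {"GET", "/blank"}
        };
        ToyTermIndex index = new ToyTermIndex();
        for (int i = 0; i < rows.length; i++) {
            index.add(i, "cs-method", rows[i][0]);
            index.add(i, "cs-uri", rows[i][1]);
        }
        return index;
    }

    public static void main(String[] args) {
        ToyTermIndex index = sample();
        System.out.println(index.lookup("cs-method", "GET").size());  // 0
        System.out.println(index.lookup("cs-uri", "/blank").size());  // 0
        System.out.println(both(index.lookup("cs-method", "get"),
                                index.lookup("cs-uri", "blank")).size());  // 5
    }
}
```

The raw literals find nothing because the index only holds the normalized tokens; the normalized pair matches the same 5 documents that the QueryParser version reported.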
