Posted to java-user@lucene.apache.org by Martin Dietze <di...@fh-wedel.de> on 2007/10/09 09:55:47 UTC

Weird operator precedence with default operator AND

Hi,

I've been going nuts trying to get Lucene's QueryParser to parse
query strings correctly with the default operator AND:

String queryString = getQueryString();
QueryParser parser = new QueryParser("text", new StandardAnalyzer());
parser.setDefaultOperator(QueryParser.AND_OPERATOR);
try {
  Query q = parser.parse(queryString);
  LOG.info("q: " + q.toString());
  /* [...] */

Here are two example queries and the results I get with and
without the `setDefaultOperator()' statement:

Query: hose AND cat:Wohnen cat:Mode OR color:blau

- Default-Op OR:  (+text:hose +cat:Wohnen) cat:Mode color:blau
- Default-Op AND: +(+text:hose +cat:Wohnen) cat:Mode color:blau

Query: hose AND ( cat:Wohnen cat:Mode ) OR color:blau

- Default-Op OR:  (+text:hose +(cat:Wohnen cat:Mode)) color:blau
- Default-Op AND: (+text:hose +(+cat:Wohnen +cat:Mode)) color:blau

It seems like the parser handles the default case well, but what
I get with the default operator set to AND is completely
incorrect. I've seen this behaviour with both version 2.1.0 and
2.2.0.

Any hints?

Cheers,

Martin

-- 
----------- / http://herbert.the-little-red-haired-girl.org / -------------
=+= 
I got it good, I got it bad. I got the sweetest sadness I ever had.
      --- the The

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Weird operator precedence with default operator AND

Posted by Mark Miller <ma...@gmail.com>.

Martin Dietze wrote:
> On Wed, October 10, 2007, Mark Miller wrote:
>
>   
>> Back in the day you might have been able to call Query.toString() as the 
>> Query contract says that toString() should output valid QueryParser syntax. 
>> This does not work for many queries though (most notably Span Queries -- 
>> QueryParser knows nothing about Span queries).
>>     
>
> I see, so my old code which was based on QueryParser was not
> completely flawed :) Are there any other queries besides span
> queries which can occur with qsol and do not produce valid
> QueryParser syntax? 
>   
I'm not sure, I'd have to look into it.
> Also I wonder why a facette query, like `foo:bar' results in a
> SpanQuery `+spanNear([foo, bar], 0, true)' (I may not understand
> the concept here).
>   
Qsol has a different field search syntax: foo(bar).

If you give it something like foo:bar or foo-bar, the result depends
on your analyzer. With the standard analyzer, the ':' or '-' is
thrown out and two tokens are generated: foo and bar. As in the
standard Lucene QueryParser, if a single 'queryparser token' produces
more than one token, those tokens are searched for next to each other.
The difference is that the standard Lucene QueryParser uses a
PhraseQuery for this, while Qsol uses a SpanQuery, so that results
stay consistent if the clause ends up inside a SpanQuery rather than
a BooleanQuery (a PhraseQuery cannot be nested in a SpanQuery). This
is required because Qsol allows mixing span and non-span queries.

If you want to get around this, I may be able to help.
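
To make the adjacency point above concrete, here is a minimal plain-Java
sketch. It is not Lucene or Qsol code: tokenize() is a crude, hypothetical
stand-in for an analyzer (it only lower-cases and splits on anything that
is not a letter or digit), and containsInOrder() mimics what a slop-0,
in-order match such as spanNear([foo, bar], 0, true) asks for.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class AdjacentTokens {
    // Hypothetical stand-in for an analyzer: lower-case the input and
    // split on every run of characters that are not letters or digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{Nd}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // True if the query tokens occur in the document directly next to
    // each other and in the same order (no gaps, no reordering).
    static boolean containsInOrder(List<String> doc, List<String> query) {
        return !query.isEmpty() && Collections.indexOfSubList(doc, query) >= 0;
    }

    public static void main(String[] args) {
        List<String> query = tokenize("foo:bar");
        System.out.println(query);
        System.out.println(containsInOrder(tokenize("some foo bar text"), query));
        System.out.println(containsInOrder(tokenize("bar then foo"), query));
    }
}
```

With this stand-in, foo:bar and foo-bar both become the two tokens foo and
bar, which is why the field name seems to vanish into the term list.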

- Mark



Re: Weird operator precedence with default operator AND

Posted by Martin Dietze <di...@fh-wedel.de>.
On Wed, October 10, 2007, Mark Miller wrote:

> Back in the day you might have been able to call Query.toString() as the 
> Query contract says that toString() should output valid QueryParser syntax. 
> This does not work for many queries though (most notably Span Queries -- 
> QueryParser knows nothing about Span queries).

I see, so my old code which was based on QueryParser was not
completely flawed :) Are there any other queries besides span
queries which can occur with qsol and do not produce valid
QueryParser syntax? 

Also I wonder why a facet query, like `foo:bar' results in a
SpanQuery `+spanNear([foo, bar], 0, true)' (I may not understand
the concept here).

Cheers,

Martin

-- 
----------- / http://herbert.the-little-red-haired-girl.org / -------------
=+= 
Who the fsck is "General Failure", and why is he reading my disk?



Re: Weird operator precedence with default operator AND

Posted by Martin Dietze <di...@fh-wedel.de>.
Chris,

On Thu, October 11, 2007, Chris Hostetter wrote:

> ... are you talking about preventing people from including field 
> specific queries in their query string? i'm guessing that you mean 
> something like this is okay...
> 
>         solr title:bobby body:boy
> 
> ...but this isn't...
> 
> 	solr title:bobby body:boy secret_field:xyzyq
> 
> ...is that the idea?

Yes, that's just about it. We have two search engines for
different purposes. The first one indexes more fields than the
second, and we want to prevent "good" search queries from failing
on the second. Supporting all these fields on the second SE is
not a good idea since indexing all this additional data would
have an impact on performance and index size.

> the easiest approach is to do your own simple pass over the query string, 
> and escape any metacharacters in clauses you don't like ... they'll be 
> treated as "terms" and either be ignored (if they are optional) or cause 
> the query to not match anything (if they are required)...

This is a very interesting idea. Yet I wonder how to deal with
such terms if they are part of an AND query (AND is actually our
default operator, so a query like "body:boy secret_field\:xyzyq"
would always fail). It seems obvious that in any case you end up
parsing the query in some way...
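
For illustration, one alternative to escaping is to drop the unwanted
clauses before the query is passed on. The following is a rough sketch
only, in plain Java with made-up names (BlacklistedClauseDropper,
dropBlacklisted). It assumes whitespace-separated clauses of the form
field:value and ignores parentheses, quoted phrases and +/- prefixes.
When it drops a clause it also drops the operator that introduced it,
which is a judgement call for queries that mix AND and OR; getting that
right in general does need real parsing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class BlacklistedClauseDropper {
    private static boolean isOperator(String token) {
        return token.equals("AND") || token.equals("OR");
    }

    private static boolean endsWithOperator(List<String> kept) {
        return !kept.isEmpty() && isOperator(kept.get(kept.size() - 1));
    }

    static String dropBlacklisted(String query, Set<String> blacklist) {
        List<String> kept = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            int colon = token.indexOf(':');
            boolean banned = colon > 0 && blacklist.contains(token.substring(0, colon));
            if (banned) {
                // Also remove the operator that introduced the dropped clause.
                if (endsWithOperator(kept)) {
                    kept.remove(kept.size() - 1);
                }
                continue;
            }
            // Skip operators left dangling at the start or after another operator.
            if (isOperator(token) && (kept.isEmpty() || endsWithOperator(kept))) {
                continue;
            }
            kept.add(token);
        }
        // Remove an operator left dangling at the end.
        if (endsWithOperator(kept)) {
            kept.remove(kept.size() - 1);
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        System.out.println(dropBlacklisted("body:boy secret_field:xyzyq", Set.of("secret_field")));
    }
}
```

An escaped clause such as secret_field\:xyzyq is left alone by this sketch,
because the text before the colon then ends in a backslash and does not
match any blacklisted name.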

Cheers,

Martin

-- 
----------- / http://herbert.the-little-red-haired-girl.org / -------------
=+= 
My family says I'm a psychopath, but the voices in my head disagree



Re: Weird operator precedence with default operator AND

Posted by Chris Hostetter <ho...@fucit.org>.
: This would be nice, but unfortunately I do not have direct access
: to the solr server in my application. I need to parse queries,
: filter out blacklisted facets and then pass them on to solr
: using solrj.

that depends ... what do you mean by a blacklisted facet?

facet counts are controlled by separate query params than the query string 
... are you talking about preventing people from including field 
specific queries in their query string? i'm guessing that you mean 
something like this is okay...

        solr title:bobby body:boy

...but this isn't...

	solr title:bobby body:boy secret_field:xyzyq

...is that the idea?

the easiest approach is to do your own simple pass over the query string, 
and escape any metacharacters in clauses you don't like ... they'll be 
treated as "terms" and either be ignored (if they are optional) or cause 
the query to not match anything (if they are required)...

        solr title:bobby body:boy secret_field\:xyzyq
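
A pass like this could look roughly as follows in Java. This is only a
sketch: the class and method names are made up, the set of unwanted field
names is assumed to be known up front, and the pattern treats any word
followed by ':' at the start of a clause as a field name, so it will
misfire on quoted phrases that contain a colon.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FieldClauseEscaper {
    // A clause start (beginning of input, whitespace, '(', '+', '-' or '!')
    // followed by a field name and an unescaped ':'.
    private static final Pattern FIELD_PREFIX =
            Pattern.compile("(^|[\\s(+\\-!])([A-Za-z_][A-Za-z0-9_]*):");

    // Returns the query with "field:" rewritten to "field\:" for every
    // field listed in 'unwanted'; all other text is copied unchanged.
    static String escapeUnwantedFields(String query, Set<String> unwanted) {
        Matcher m = FIELD_PREFIX.matcher(query);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            String field = m.group(2);
            out.append(query, last, m.start())
               .append(m.group(1))
               .append(field)
               .append(unwanted.contains(field) ? "\\:" : ":");
            last = m.end();
        }
        return out.append(query, last, query.length()).toString();
    }

    public static void main(String[] args) {
        // prints: solr title:bobby body:boy secret_field\:xyzyq
        System.out.println(escapeUnwantedFields(
                "solr title:bobby body:boy secret_field:xyzyq",
                Set.of("secret_field")));
    }
}
```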







-Hoss




Re: Weird operator precedence with default operator AND

Posted by Martin Dietze <di...@fh-wedel.de>.
On Wed, October 10, 2007, Chris Hostetter wrote:

> Eh ... not really.  it would be easier to just load the Qsol parser in 
> solr ... or toString() the query...

This would be nice, but unfortunately I do not have direct access
to the solr server in my application. I need to parse queries,
filter out blacklisted facets and then pass them on to solr
using solrj.

Maybe I am missing out on something obvious, and there's an
entirely simple way to accomplish this?

Cheers,

Martin

-- 
----------- / http://herbert.the-little-red-haired-girl.org / -------------
=+= 
Yoda of Borg I am. Assimilated you will be.



Re: Weird operator precedence with default operator AND

Posted by Chris Hostetter <ho...@fucit.org>.
: As usual, thank you to the gruff but brilliant Mr Hostetter.

Doh! ... sorry if i've been gruffer than usual ... i've been rotating my 
sleep schedule so my days start an hour earlier each day for the last 6 
days.  it's been throwing my psyche for a loop.


-Hoss




Re: Weird operator precedence with default operator AND

Posted by Mark Miller <ma...@gmail.com>.
As usual, thank you to the gruff but brilliant Mr Hostetter.

- Mark



Re: Weird operator precedence with default operator AND

Posted by Chris Hostetter <ho...@fucit.org>.
: I have only taken passing glances at Solr, so I am afraid I cannot be of much
: help. Certainly one of the Solr guys will be able to be of assistance though.

the StandardRequestHandler in solr will accept anything the lucene  
QueryParser will accept ... subclassing StandardRequestHandler to use the 
Qsol parser instead would be fairly easy (there are some open feature 
requests in Jira that will make it trivial, but they're still in flux)

: Since Qsol generates Query objects, you just need to find out how to bypass
: sending solr a query String and instead give it a Query object. I assume this
: must be possible.

Eh ... not really.  it would be easier to just load the Qsol parser in 
solr ... or toString() the query...

: Back in the day you might have been able to call Query.toString() as the Query
: contract says that toString() should output valid QueryParser syntax. This

Back in 1.4.3 it said "The representation used is one that is readable by 
QueryParser" but that wasn't really a "contract" as much as it was a 
statement about how the "core" queries behaved (hence the wording was 
changed) ... a contract would imply that *anyone* subclassing Query must 
obey the contract, and that would be an impossible contract for anyone but 
lucene committers to satisfy.


-Hoss




Re: Weird operator precedence with default operator AND

Posted by Mark Miller <ma...@gmail.com>.
I have only taken passing glances at Solr, so I am afraid I cannot be of 
much help. Certainly one of the Solr guys will be able to be of 
assistance though.

Since Qsol generates Query objects, you just need to find out how to 
bypass sending solr a query String and instead give it a Query object. I 
assume this must be possible.

Back in the day you might have been able to call Query.toString() as the 
Query contract says that toString() should output valid QueryParser 
syntax. This does not work for many queries though (most notably Span 
Queries -- QueryParser knows nothing about Span queries).

- Mark



Re: Weird operator precedence with default operator AND

Posted by Martin Dietze <di...@fh-wedel.de>.
Mark,

On Wed, October 10, 2007, Martin Dietze wrote:

> > Qsol: myhardshadow.com/qsol (A query parser I wrote that has fully 
> > customizable precedence support - don't be fooled by the stale website...I 
> > am actually working on version 2 as i have time)
> 
> That sounds promising, I will check this out right now!

As far as I can judge from what I've tested so far, qsol seems
to handle operator precedence correctly for my test cases.
However - excuse a possibly dumb question - how do I get my
query out in a form accepted by solr?

Cheers,

Martin

-- 
----------- / http://herbert.the-little-red-haired-girl.org / -------------
=+= 
I now declare this bizarre open!



Re: Weird operator precedence with default operator AND

Posted by Martin Dietze <di...@fh-wedel.de>.
Mark,

This reply was just in time :)

On Wed, October 10, 2007, Mark Miller wrote:

> Precedence QueryParser (I think its in Lucene contrib packages - I don't 
> believe its perfect but I have not tried it)

I checked that one out, and while it improves things with
default settings I found it to exhibit the same incorrect
behaviour with default operator AND.

> Qsol: myhardshadow.com/qsol (A query parser I wrote that has fully 
> customizable precedence support - don't be fooled by the stale website...I 
> am actually working on version 2 as i have time)

That sounds promising, I will check this out right now!

Thank you!

Martin

-- 
----------- / http://herbert.the-little-red-haired-girl.org / -------------
=+= 
To us, freedom is a beautiful woman.
She has an upper and a lower body.



Re: Weird operator precedence with default operator AND

Posted by Mark Miller <ma...@gmail.com>.
There is a lot on this topic if you search the archives.

Things to check out:

Precedence QueryParser (I think it's in the Lucene contrib packages - I 
don't believe it's perfect but I have not tried it)

Qsol: myhardshadow.com/qsol (A query parser I wrote that has fully 
customizable precedence support - don't be fooled by the stale 
website...I am actually working on version 2 as i have time)

- Mark



Re: Weird operator precedence with default operator AND

Posted by Martin Dietze <di...@fh-wedel.de>.
On Tue, October 09, 2007, Daniel Naber wrote:

> The operator precedence is known to be buggy. You need to use parenthesis, 
> e.g. (aa AND bb) OR (cc AND dd)

This would be fine with me but unfortunately not for my users.
More precisely, I need to analyze a query string from one search
engine, filter out a blacklist of facet queries and pass the
result on to a second search engine. This means that I have no
control over the way people enter their queries.

Is there any known query parser which handles this correctly?

Also, how does solr do this? It uses a parser derived from the
Lucene QueryParser, and I found it produces the same output.
The search queries are still handled correctly, though: the
results I get indicate that deep down inside it gets it right
in the end.

Cheers,

Martin

-- 
----------- / http://herbert.the-little-red-haired-girl.org / -------------
=+= 
My name is spelled Luxury Yacht but it's pronounced Throatwabbler Mangrove.



Re: Weird operator precedence with default operator AND

Posted by Daniel Naber <lu...@danielnaber.de>.
On Tuesday 09 October 2007 09:55, Martin Dietze wrote:

>  I've been going nuts trying to use LuceneParser parse query
> strings using the default operator AND correctly:

The operator precedence is known to be buggy. You need to use parentheses, 
e.g. (aa AND bb) OR (cc AND dd)
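
If users cannot be asked to type the parentheses themselves, a small
pre-pass could add them before the string reaches the parser. The sketch
below is plain Java with a made-up class name (ExplicitParens) and is not
part of Lucene. It assumes that AND binds tighter than OR and that two
terms separated only by whitespace are meant as AND. It does not handle
quoted phrases, NOT, +/- prefixes or field:(...) groups.

```java
import java.util.ArrayList;
import java.util.List;

final class ExplicitParens {
    private final String[] tokens;
    private int pos = 0;

    private ExplicitParens(String query) {
        // Pad parentheses so that they become tokens of their own.
        tokens = query.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
    }

    static String rewrite(String query) {
        ExplicitParens p = new ExplicitParens(query);
        String result = p.orExpr(false);
        if (p.pos != p.tokens.length) {
            throw new IllegalArgumentException("unexpected token: " + p.tokens[p.pos]);
        }
        return result;
    }

    private String peek() {
        return pos < tokens.length ? tokens[pos] : null;
    }

    // orExpr := andExpr ("OR" andExpr)*
    private String orExpr(boolean nested) {
        List<String> parts = new ArrayList<>();
        parts.add(andExpr());
        while ("OR".equals(peek())) {
            pos++;
            parts.add(andExpr());
        }
        if (parts.size() == 1) {
            return parts.get(0);
        }
        String joined = String.join(" OR ", parts);
        return nested ? "(" + joined + ")" : joined;
    }

    // andExpr := primary (["AND"] primary)*  (the AND keyword is optional)
    private String andExpr() {
        List<String> parts = new ArrayList<>();
        parts.add(primary());
        while (peek() != null && !"OR".equals(peek()) && !")".equals(peek())) {
            if ("AND".equals(peek())) {
                pos++;
            }
            parts.add(primary());
        }
        return parts.size() == 1 ? parts.get(0) : "(" + String.join(" AND ", parts) + ")";
    }

    // primary := "(" orExpr ")" | term
    private String primary() {
        String t = peek();
        if (t == null || t.equals(")") || t.equals("AND") || t.equals("OR")) {
            throw new IllegalArgumentException("term expected");
        }
        pos++;
        if (!t.equals("(")) {
            return t;
        }
        String inner = orExpr(true);
        if (!")".equals(peek())) {
            throw new IllegalArgumentException("missing ')'");
        }
        pos++;
        return inner;
    }
}
```

Under these assumptions, ExplicitParens.rewrite("aa AND bb OR cc AND dd")
returns "(aa AND bb) OR (cc AND dd)", and the first query from the original
post becomes "(hose AND cat:Wohnen AND cat:Mode) OR color:blau".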

regards
 Daniel

-- 
http://www.danielnaber.de
