Posted to dev@lucene.apache.org by "Doron Cohen (JIRA)" <ji...@apache.org> on 2007/06/20 02:09:25 UTC
[jira] Commented: (LUCENE-933) QueryParser can produce empty sub
BooleanQueries when Analyzer produces no tokens for input
[ https://issues.apache.org/jira/browse/LUCENE-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12506342 ]
Doron Cohen commented on LUCENE-933:
------------------------------------
> a) +foo:BBB +()
> I have no idea what the "right" thing to do for situation (a) is.
Interestingly, see TestQueryParser.testQPA():
assertQueryEquals("term +stop term", qpAnalyzer, "term term");
assertQueryEquals("term -stop term", qpAnalyzer, "term term");
So already today, requiring that word W appear (or not appear) becomes a non-requirement when W is a stopword.
Currently, adding any of these assertions would cause a test failure:
assertQueryEquals("term +(stop) term", qpAnalyzer, "term term");
assertQueryEquals("term -(stop) term", qpAnalyzer, "term term");
assertQueryEquals("term +(stop stop) term", qpAnalyzer, "term term");
assertQueryEquals("term -(stop stop) term", qpAnalyzer, "term term");
I feel comfortable with applying the logic we have for a single (stop)word to a group of (stop)words, i.e. making the added lines pass.
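To make the proposed rule concrete, here is a minimal sketch of "drop a group that ends up empty, whatever its +/- prefix". It is a toy model only: the Clause class, prune() and render() are hypothetical names invented for illustration, not QueryParser or BooleanQuery code, and the stopword set stands in for whatever the Analyzer removes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class EmptyGroupSketch {
    // Hypothetical stand-in for a parsed clause; this is NOT Lucene's BooleanClause.
    static final class Clause {
        final String occur;          // "+", "-", or "" for an optional clause
        final String term;           // non-null for a single term
        final List<Clause> children; // non-null for a parenthesized group

        Clause(String occur, String term, List<Clause> children) {
            this.occur = occur;
            this.term = term;
            this.children = children;
        }
    }

    static Clause term(String occur, String text) {
        return new Clause(occur, text, null);
    }

    static Clause group(String occur, Clause... clauses) {
        return new Clause(occur, null, List.of(clauses));
    }

    // Drops stopword terms, and drops any group left with no clauses,
    // regardless of whether the group was required, prohibited or optional.
    static List<Clause> prune(List<Clause> clauses, Set<String> stopWords) {
        List<Clause> kept = new ArrayList<>();
        for (Clause c : clauses) {
            if (c.term != null) {
                if (!stopWords.contains(c.term)) {
                    kept.add(c);
                }
            } else {
                List<Clause> inner = prune(c.children, stopWords);
                if (!inner.isEmpty()) {
                    kept.add(new Clause(c.occur, null, inner));
                }
            }
        }
        return kept;
    }

    // Renders the clauses back into query syntax for comparison.
    static String render(List<Clause> clauses) {
        StringBuilder sb = new StringBuilder();
        for (Clause c : clauses) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(c.occur);
            if (c.term != null) {
                sb.append(c.term);
            } else {
                sb.append('(').append(render(c.children)).append(')');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Set<String> stop = Set.of("stop");
        // Models: term +(stop stop) term
        List<Clause> query = List.of(
                term("", "term"),
                group("+", term("", "stop"), term("", "stop")),
                term("", "term"));
        System.out.println(render(prune(query, stop))); // prints: term term
    }
}
```

Under this model all four of the added assertions would reduce to "term term", the same result the existing single-stopword tests expect.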
Interestingly, consider this query:
A B +(+C -C)
Normally it would match nothing, because
X AND NOT X == FALSE
but if C is a stopword, with the fixed(?) logic the query would become:
A B
and might have matches.
Now is that a glitch? I'd like to think not.
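The difference between the two readings can be spelled out with a small sketch. It is a simplified model, not Lucene code: documents are just sets of terms, the method names are made up for illustration, and it assumes the usual boolean semantics where a required clause must match and a query made only of optional clauses needs at least one of them to match.

```java
import java.util.Set;

public class StopwordGroupSemantics {
    // The inner group +C -C: the document must both contain C and not contain C.
    static boolean matchesInnerGroup(Set<String> doc) {
        return doc.contains("C") && !doc.contains("C"); // always false
    }

    // A B +(+C -C) when C is an ordinary term: the group is required,
    // so A and B cannot rescue the match; they would only affect scoring.
    static boolean matchesWhenCIsKept(Set<String> doc) {
        return matchesInnerGroup(doc);
    }

    // A B, i.e. the query left after the empty group is dropped because C is a stopword:
    // with only optional clauses, at least one of them has to match.
    static boolean matchesWhenCIsStopword(Set<String> doc) {
        return doc.contains("A") || doc.contains("B");
    }

    public static void main(String[] args) {
        Set<String> doc = Set.of("A", "C");
        System.out.println(matchesWhenCIsKept(doc));     // prints: false
        System.out.println(matchesWhenCIsStopword(doc)); // prints: true
    }
}
```

So the same query text can go from "never matches" to "matches documents containing A or B" depending only on whether the analyzer removes C, which is the same shift that already happens today for a single +stop or -stop term.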
> QueryParser can produce empty sub BooleanQueries when Analyzer produces no tokens for input
> -------------------------------------------------------------------------------------------
>
> Key: LUCENE-933
> URL: https://issues.apache.org/jira/browse/LUCENE-933
> Project: Lucene - Java
> Issue Type: Bug
> Reporter: Hoss Man
>
> as triggered by SOLR-261, if you have a query like this...
> +foo:BBB +(yak:AAA baz:CCC)
> ...where the analyzer produces no tokens for the "yak:AAA" or "baz:CCC" portions of the query (possibly because they are stop words) the resulting query produced by the QueryParser will be...
> +foo:BBB +()
> ...that is a BooleanQuery with two required clauses, one of which is an empty BooleanQuery with no clauses.
> this does not appear to be "good" behavior.
> In general, QueryParser should be smarter about what it does when it encounters parens whose contents result in an empty BooleanQuery -- but what exactly it should do in the following situations...
> a) +foo:BBB +()
> b) +foo:BBB ()
> c) +foo:BBB -()
> ...is up for interpretation. I would think situation (b) clearly lends itself to dropping the sub-BooleanQuery completely. Situation (c) may also lend itself to that solution, since semantically it means "don't allow a match on any queries in the empty set of queries". I have no idea what the "right" thing to do for situation (a) is.