Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2013/11/09 12:47:18 UTC
[jira] [Commented] (LUCENE-5336) Add a simple QueryParser to parse
human-entered queries.
[ https://issues.apache.org/jira/browse/LUCENE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818114#comment-13818114 ]
Michael McCandless commented on LUCENE-5336:
--------------------------------------------
This is AWESOME. I love how the operators (even whitespace!) are
optional. And I love the name :) And it's great that it NEVER throws
an exception no matter how awful the input is. And I love that it does
not use a lexer/parser generator: this makes it much more approachable
to devs who don't have experience with parser generators.
Small javadoc fix: instead of "any {@code -} characters beyond the
first character in a term may not need to be escaped," I think it
should say "any {@code -} characters beyond the first character do not
need to be escaped" (and the same for the * operator)?
How does it handle malformed input, e.g. a missing closing " for a
phrase query? If I enter "foo bar (with no closing quote), will it just
make a term query for "foo and a term query for bar? Or does it strip
that " and query foo instead? (Same question for a missing closing
paren.) It looks like it drops the unmatched " or ( and does a simple
term query (good).
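To make the question concrete, here is a minimal sketch of the "drop the unmatched quote" behavior in plain Java. It is not the code from the patch: the class name LenientPhraseSketch, the tokenize method, and the TERM:/PHRASE: prefixes are all hypothetical, and it only handles quotes and whitespace, none of the other operators.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT the parser from the patch.
final class LenientPhraseSketch {

    /**
     * Splits the input into tokens. Quoted phrases come back with a
     * "PHRASE:" prefix and bare words with a "TERM:" prefix. A quote
     * with no matching closing quote is silently dropped.
     */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace only separates tokens
            } else if (c == '"') {
                int close = input.indexOf('"', i + 1);
                if (close < 0) {
                    // Unterminated phrase: drop the stray quote, keep parsing.
                    i++;
                } else {
                    String phrase = input.substring(i + 1, close).trim();
                    if (!phrase.isEmpty()) {
                        tokens.add("PHRASE:" + phrase);
                    }
                    i = close + 1;
                }
            } else {
                // Consume a bare term up to the next whitespace or quote.
                int start = i;
                while (i < input.length()
                        && !Character.isWhitespace(input.charAt(i))
                        && input.charAt(i) != '"') {
                    i++;
                }
                tokens.add("TERM:" + input.substring(start, i));
            }
        }
        return tokens;
    }
}
```

Under this sketch, the input "foo bar with no closing quote yields the two plain terms foo and bar, while "foo bar" baz yields one phrase followed by the term baz. No input can make it throw, since every branch of the loop advances the position.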
Maybe you could add fangs to the random test by more frequently mixing
in these operator characters ...
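Something along these lines is what I have in mind, as a rough sketch in plain Java rather than a change to the actual test in the patch. The class OperatorHeavyFuzzSketch, its method names, and the 50% operator ratio are all made up for illustration; the parser under test is passed in as a Consumer so the sketch stays self-contained.

```java
import java.util.Random;
import java.util.function.Consumer;

// Hypothetical illustration only; not taken from the patch's test code.
final class OperatorHeavyFuzzSketch {

    // Operator characters listed in the issue description, plus whitespace.
    private static final char[] OPERATORS =
        {'+', '|', '-', '"', '*', '(', ')', '\\', ' ', '\n', '\r', '\t'};

    /**
     * Builds a random string of at most maxLength characters in which
     * each character is an operator with probability operatorRatio and
     * a lowercase ASCII letter otherwise.
     */
    static String randomQuery(Random random, int maxLength, double operatorRatio) {
        int length = random.nextInt(maxLength + 1);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            if (random.nextDouble() < operatorRatio) {
                sb.append(OPERATORS[random.nextInt(OPERATORS.length)]);
            } else {
                sb.append((char) ('a' + random.nextInt(26)));
            }
        }
        return sb.toString();
    }

    /**
     * Feeds random operator-heavy queries to the parser under test.
     * Returns the first input that made it throw, or null if none did.
     */
    static String firstFailure(Consumer<String> parser, long seed, int iterations) {
        Random random = new Random(seed);
        for (int i = 0; i < iterations; i++) {
            String query = randomQuery(random, 20, 0.5);
            try {
                parser.accept(query);
            } catch (RuntimeException e) {
                return query;
            }
        }
        return null;
    }
}
```

A test would wrap the real parse call in the Consumer and assert that firstFailure returns null. Using a fixed seed keeps any failing input reproducible.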
> Add a simple QueryParser to parse human-entered queries.
> --------------------------------------------------------
>
> Key: LUCENE-5336
> URL: https://issues.apache.org/jira/browse/LUCENE-5336
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Jack Conradson
> Attachments: LUCENE-5336.patch
>
>
> I would like to add a new simple QueryParser to Lucene that is designed to parse human-entered queries. This parser will operate on an entire entered query using a specified single field or a set of weighted fields (using term boost).
> All features/operations in this parser can be enabled or disabled depending on what is necessary for the user. A default operator may be specified as either 'MUST' representing 'and' or 'SHOULD' representing 'or.' The features/operations that this parser will include are the following:
> * AND specified as '+'
> * OR specified as '|'
> * NOT specified as '-'
> * PHRASE surrounded by double quotes
> * PREFIX specified as '*'
> * PRECEDENCE surrounded by '(' and ')'
> * WHITESPACE specified as ' ' '\n' '\r' and '\t' will cause the default operator to be used
> * ESCAPE specified as '\' will allow operators to be used in terms
> The key differences between this parser and other existing parsers will be the following:
> * No exceptions will be thrown, and errors in syntax will be ignored. The parser will do a best-effort interpretation of any query entered.
> * It uses minimal syntax to express queries. All available operators are single characters or pairs of single characters.
> * The parser is hand-written and in a single Java file making it easy to modify.