Posted to dev@lucene.apache.org by Brian Goetz <br...@quiotix.com> on 2002/06/03 03:23:20 UTC
Re: Bug? QueryParser may not correctly interpret RangeQuery
>It's true that the unsophisticated end-user would not
>use SQL, but between range (inclusive, exclusive),
>boolean, fuzzy, etc., the simple query parser you have
>is evolving into something more complex than SQL.
Which is a reasonable argument that range queries are outside the scope of
what the query parser is supposed to do.
>While SQL supports them with key words, we are getting
>into an endless quest for unused characters to mark
>the latest variation of the query.
I wasn't too happy about having added ranges in the first place for exactly
this reason. The query parser is supposed to be a convenience, a 90/10
(actually, more like a 99/1) solution: one which handles 90% of the queries
with 10% of the work. Pushing for that last 10% at the expense of the
first 90% is a bad tradeoff IMO. The raw query classes still work fine for
that last 10% (or 1%).
>By the way, it
>seems that you already have support for the "WHERE
>..." part (AND, OR, NOT, NEAR). If we had "LIKE" and
>"BETWEEN ... AND ..." we would have almost everything
>SQL has for the matching part.
Two responses to this:
1. Wrong. We don't have NEAR at all, and AND, OR, and NOT are simple
operators which give hints to the BooleanQuery class; they don't impart
structure to the query. They are, in fact, a convenience for expressing
(+a +b)
as
a AND b
mostly because mainstream search engines support AND and OR.
2. The argument that "we already have half of it, let's go all the way" is
a siren song. In lots of cases, this is basically equivalent to "two
wrongs make a right." As in "we already violated the XYZ principle for
some purpose, so there's no point in letting principle get in the way of
further 'progress'".
In this particular case, it's not quite as bad as that, but it's taking us in
a dangerous direction. The query parser is not a structured language for
free text queries -- if it were, it should be designed from the ground up to
be so. In cramming in too many features, it would be easy for it to lose
its most valuable feature -- simplicity. We may have already done that,
but there's no point in pushing further just in case there's any doubt.
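To make point 1 above concrete, here is a toy sketch of what "AND is a convenience for +a +b" means. It is not the Lucene QueryParser and does not follow its grammar; the class name OperatorSugar and the method desugar are made up for illustration, and it only handles bare terms separated by whitespace. The point is that the operators only set required/prohibited flags on a flat list of clauses, with no nesting.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: NOT Lucene's QueryParser. Handles bare terms
// separated by whitespace; no phrases, fields, or parentheses.
class OperatorSugar {
    // One flat clause: a term plus the two flags a boolean query needs.
    static final class Clause {
        final String term;
        boolean required;
        boolean prohibited;

        Clause(String term) { this.term = term; }

        @Override
        public String toString() {
            return (prohibited ? "-" : required ? "+" : "") + term;
        }
    }

    // Rewrites operator words into +/- prefixes on a flat clause list.
    static String desugar(String query) {
        List<Clause> clauses = new ArrayList<>();
        boolean nextRequired = false;
        boolean nextProhibited = false;
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) continue;
            switch (token) {
                case "AND":
                    // AND also marks the clause before it as required.
                    if (!clauses.isEmpty()) {
                        Clause prev = clauses.get(clauses.size() - 1);
                        if (!prev.prohibited) prev.required = true;
                    }
                    nextRequired = true;
                    break;
                case "OR":
                    // OR leaves the next clause optional.
                    nextRequired = false;
                    break;
                case "NOT":
                    nextProhibited = true;
                    break;
                default:
                    Clause c = new Clause(token);
                    c.prohibited = nextProhibited;
                    c.required = nextRequired && !nextProhibited;
                    clauses.add(c);
                    nextRequired = false;
                    nextProhibited = false;
            }
        }
        StringBuilder sb = new StringBuilder();
        for (Clause c : clauses) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(desugar("a AND b"));
        System.out.println(desugar("a AND NOT b"));
    }
}
```

Note that nothing in this sketch builds a tree: each operator just flips a flag on a neighbouring clause, which is why these operators are hints rather than structure.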
>I think that the only way to have a query that does
>NOT look like a programming language is to have
>natural language understanding (which we won't have
>for a while.) Once the end user is forced to learn the
>difference between terms and operators, he already is
>in the realm of programming languages.
This is a strong argument for backing out some of the features already
added so far, but I'm sure that's not what you're suggesting (although
maybe you should be).
But I think this argument is basically hogwash. Don't forget we're arguing
about features which will be used by less than 1% of the user base, and
probably less than 100th or maybe 1000th of 1% of all queries entered
through the query parser.
Right now, we have several ways of building queries:
- a simple query parser, which can handle the basics (terms, phrases,
field search, slop, wildcards);
- a flexible and powerful set of query classes with which developers can
build arbitrary queries;
- we can combine the above, letting the user enter query terms and
produce a Query, and then combine that with other query terms based on
input in a user interface (such as a date picker).
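The third option is roughly the following shape. This is a self-contained sketch with stand-in classes, not the Lucene API: CombineSketch, Query, BoolQuery, term, range, parseUserText and withDateRange are all hypothetical names, and the "date" field and the date strings are assumptions for the example. In a real application the parsed query would come from the query parser and the range clause from the query classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT Lucene classes.
class CombineSketch {
    interface Query {
        String render();
    }

    // A single term in a field.
    static Query term(String field, String text) {
        return () -> field + ":" + text;
    }

    // A range over a field, built by code rather than typed by the user.
    static Query range(String field, String lower, String upper) {
        return () -> "range(" + field + ", " + lower + ", " + upper + ")";
    }

    // Flat boolean query: each clause is optional, required, or prohibited.
    static final class BoolQuery implements Query {
        private final List<String> parts = new ArrayList<>();

        BoolQuery add(Query q, boolean required, boolean prohibited) {
            String prefix = prohibited ? "-" : required ? "+" : "";
            parts.add(prefix + "(" + q.render() + ")");
            return this;
        }

        @Override
        public String render() {
            return String.join(" ", parts);
        }
    }

    // Stand-in for whatever the simple parser makes of the search box text.
    static Query parseUserText(String field, String text) {
        BoolQuery q = new BoolQuery();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                q.add(term(field, word), false, false);
            }
        }
        return q;
    }

    // The user's free-text query AND a date range taken from a UI widget.
    static Query withDateRange(Query userQuery, String from, String to) {
        return new BoolQuery()
                .add(userQuery, true, false)
                .add(range("date", from, to), true, false);
    }

    public static void main(String[] args) {
        Query user = parseUserText("body", "lucene parser");
        System.out.println(withDateRange(user, "20020101", "20020603").render());
    }
}
```

The user never types any range syntax here; the search box stays simple and the date restriction is attached in code.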
Now, if you want to design a new query language, one which is actually
designed for its intended purpose (rather than having features accreted
every time someone feels that XYZ query structure is critical enough to go
in the query parser), be my guest -- I'll help, I'll even write the parser
for you. We can call it the AdvancedQueryParser or whatever you want to
call it, and I won't throw stones at your design.
But I'm going to vigorously -1 any proposal for the query parser that
makes the Joe Users out there pay for features that are only of interest to
Joe Gooroo.
Nobody has convinced me at all that the existing query parser is inadequate
for its intended purpose.
--
Brian Goetz
Quiotix Corporation
brian@quiotix.com Tel: 650-843-1300 Fax: 650-324-8032
http://www.quiotix.com