Posted to java-user@lucene.apache.org by Erik Hatcher <li...@ehatchersolutions.com> on 2003/01/01 02:37:10 UTC
Re: QueryParser question

Doug,

Your points are well taken and I appreciate your time in replying to 
this.  I'm on the same wavelength with this thinking about QueryParser, 
and I realize I'm attempting to push it past its designed simplicity.  
I'm not as knowledgeable (and who is?!) on Lucene's API and design as 
you and many others here and I've learned a lot in the past couple of 
days.  I'll explore the Analyzer idea of returning different tokenizers 
based on the field that is being dealt with - that might just be the 
ticket.
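
To make the per-field idea concrete, here is a minimal sketch in plain Java. It does not use Lucene's Analyzer class; the class and method names (PerFieldTokenizing, register, keyword, simpleText) are hypothetical and only illustrate the dispatch: keyword-style fields pass through as a single term, and everything else goes through an ordinary text tokenizer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a per-field analyzer; not part of Lucene.
final class PerFieldTokenizing {
    private final Map<String, Function<String, List<String>>> byField = new HashMap<>();
    private final Function<String, List<String>> fallback;

    PerFieldTokenizing(Function<String, List<String>> fallback) {
        this.fallback = fallback;
    }

    // Associate a field name with the tokenizer to use for it.
    void register(String field, Function<String, List<String>> tokenizer) {
        byField.put(field, tokenizer);
    }

    // Pick the tokenizer by field name, falling back to the default.
    List<String> tokenize(String field, String text) {
        return byField.getOrDefault(field, fallback).apply(text);
    }

    // Keyword-style: the whole value is one term, left untouched.
    static List<String> keyword(String text) {
        List<String> terms = new ArrayList<>();
        terms.add(text);
        return terms;
    }

    // General text: lowercase, then split on runs of whitespace.
    static List<String> simpleText(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }
}
```

The same object would be used at index time and at query time, so a keyword field is never split differently on one side than on the other.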

I do think (though I haven't thought this through any further) that 
having a Keyword field stay a Keyword field, rather than allowing 
tokenized text to be added to it, is the better way.  I could change my 
mind on that as I gain experience with Lucene, and I will, of course, 
have to live with how it is now.

Since you brought up the dates with QueryParser - its implementation 
seems a bit rough.  What's the point in supporting the date ranges with 
QueryParser if you cannot use a human-readable date?  It's my 
understanding that you have to convert a Date to a collatable 
representation just to use it with QueryParser, right?  So it's got to 
be computer generated anyway, so I might as well use the API to 
construct the query for date ranges.  If I'm wrong in my understanding 
of QueryParser date support, please by all means correct me.
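
By "collatable representation" I mean a string whose lexicographic order matches chronological order. A minimal sketch of the idea in plain Java follows; it is not Lucene's own conversion code, and the class name SortableDate, the base-36 encoding, and the 13-character width are assumptions chosen for illustration.

```java
import java.util.Date;

// Hypothetical helper; illustrates a sortable date string, not Lucene's format.
final class SortableDate {
    // 13 base-36 digits are enough for any non-negative long.
    private static final int WIDTH = 13;

    // Encode milliseconds since the epoch as a zero-padded base-36 string,
    // so that comparing strings gives the same order as comparing dates.
    static String encode(Date date) {
        long millis = date.getTime();
        if (millis < 0) {
            throw new IllegalArgumentException("dates before 1970 are not handled in this sketch");
        }
        String digits = Long.toString(millis, Character.MAX_RADIX);
        StringBuilder sb = new StringBuilder(WIDTH);
        for (int i = digits.length(); i < WIDTH; i++) {
            sb.append('0');
        }
        return sb.append(digits).toString();
    }

    // Reverse of encode: parse the base-36 digits back into a Date.
    static Date decode(String encoded) {
        return new Date(Long.parseLong(encoded, Character.MAX_RADIX));
    }
}
```

Nobody is going to type a string like that into a search box, which is the point of my question.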

And for the record, I am constructing some queries through QueryParser, 
and some through the API and gluing them together as a BooleanQuery.  
My questions here are to increase my understanding of how to use the 
API more effectively, and leverage what is already easily available.  
And life is made easier by letting QueryParser take care of much of the 
dirty work, so you can't blame me for pushing the limits of what it 
can/should do.  :)

Thanks again for your time.

	Erik


On Tuesday, December 31, 2002, at 02:51  PM, Doug Cutting wrote:

> Doug Cutting wrote:
>> However, in most cases where this is an issue, the real problem is 
>> that folks are placing too much reliance on the query parser.  The 
>> query parser is designed for user-entered queries.  If you're 
>> programmatically generating query strings that are then fed to the 
>> query parser, then you would be better served by directly 
>> constructing queries.
>
> This bears emphasis.  Abuse of the query parser may be the single most 
> common source of problems with Lucene.  We should probably add 
> guidelines for query parser use to the FAQ and/or query parser 
> documentation.
>
> Some rules of thumb are:
>
> - If you are programmatically generating a query string and then 
> parsing it with the query parser then you should seriously consider 
> building your queries directly with the query API.  In other words, 
> the query parser is designed for human-entered text, not for 
> program-generated text.
>
> - Untokenized fields are best added directly to queries, and not 
> through the query parser.  If a field's values are generated 
> programmatically by the application, then so should query clauses for 
> this field. Analyzers, like the query parser, are designed to convert 
> human-entered text to terms.  Program-generated values, like dates, 
> keywords, etc., should be consistently program-generated.
>
> - In a query form, fields which are general text should use the query 
> parser.  All others, e.g., date ranges, keywords, etc. are better 
> added directly through the query API.  A field with a limited set of 
> values that can be specified with a pulldown menu should not be added 
> to a query string which is subsequently parsed, but rather added as a 
> TermQuery clause.
>
> I hope that by saying the same thing several times in slightly 
> different ways folks will get the idea!  Of course, these are not 
> absolute rules: there are exceptions.  The query parser can do more 
> than it should.  But when this is done, problems frequently occur.  
> Caveat emptor.
>
> Doug
>
>
> --
> To unsubscribe, e-mail:   
> <ma...@jakarta.apache.org>
> For additional commands, e-mail: 
> <ma...@jakarta.apache.org>
>
>