Posted to dev@lucene.apache.org by Brian Goetz <br...@quiotix.com> on 2002/08/28 22:03:15 UTC

Re: [Bug 12137] New: - Can '*' or '?' symbol be used as the first character of a search?

On Wed, Aug 28, 2002 at 07:52:01PM -0000, bugzilla@apache.org wrote:
> Don't get me wrong, I did read the Parser Syntax, and understand that:
> "Note: You cannot use a * or ? symbol as the first character of a search."
> However, it would have been nice to have this feature.  I made the following
> changes to QueryParser.jj, and it seems to work fine.  I am not sure if there is
> any side effect though.  Can someone verify this?

I think this is a bad idea.  

First of all, the query parser is a CONVENIENCE, not the only way to
build query objects.  If the query parser language is too restrictive,
then build the query objects programmatically.  It's not that hard.

There were reasons why the query language was designed this way.  If
you think that's an error, first you need to lobby for your position
to change the design, THEN we can think about changing the parser.

Parsers are tricky.  Small changes can have big, unexpected effects.
Let's make sure we want to do this first (which I think we don't), and
then we can look at the implementation.

--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Bug 12137] New: - Can '*' or '?' symbol be used as the first character of a search?

Posted by Peter Carlson <ca...@bookandhammer.com>.
Hi,

Is the rationale for calling this a "bad idea" mostly a performance
argument? That is, if you don't have to search through every term in the
index, the results come back much faster, right?
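
To make the performance concern concrete, here is a minimal sketch in plain
Java. It does not use Lucene; a sorted TreeSet stands in for the term
dictionary, and the class and method names (TermScanSketch, prefixMatches,
suffixMatches) are made up for illustration. It assumes the terms are kept in
sorted order, so a trailing wildcard can seek to its prefix and stop early,
while a leading wildcard has nothing to seek to and checks every term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class TermScanSketch {
    /** Number of terms examined by the most recent call (illustration only). */
    public static int examined;

    // Trailing wildcard ("luc*"): seek to the prefix in the sorted term set
    // and stop at the first term that no longer starts with it.
    public static List<String> prefixMatches(SortedSet<String> terms, String prefix) {
        examined = 0;
        List<String> hits = new ArrayList<>();
        for (String t : terms.tailSet(prefix)) {
            examined++;
            if (!t.startsWith(prefix)) {
                break;
            }
            hits.add(t);
        }
        return hits;
    }

    // Leading wildcard ("*ene"): sort order does not help, so every term is checked.
    public static List<String> suffixMatches(SortedSet<String> terms, String suffix) {
        examined = 0;
        List<String> hits = new ArrayList<>();
        for (String t : terms) {
            examined++;
            if (t.endsWith(suffix)) {
                hits.add(t);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical toy term dictionary.
        SortedSet<String> terms = new TreeSet<>(Arrays.asList(
            "apache", "index", "lucene", "lucid", "parser", "query", "scene"));
        System.out.println(prefixMatches(terms, "luc") + " examined=" + examined);
        System.out.println(suffixMatches(terms, "ene") + " examined=" + examined);
    }
}
```

In this toy model the prefix lookup touches only the matching terms plus one,
whereas the leading-wildcard lookup touches all of them regardless of how many
match. How closely a real index behaves like this is exactly what a benchmark
would have to show.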

I understand the concern, but we have no benchmark yet, and if the desired
result is beneficial to a user then we might want to explore it more.
Or should we just say that it's a bad idea based on the inherent
issues with the design?

I would like to have benchmarks for a few reasons:
1) To be able to help resolve these kinds of questions
2) To provide people with performance benchmarks when evaluating Lucene.

What would be a reasonable performance benchmark to test this against?

1) CPU speed - Pentium III/800MHz+? Pentium 4/1.5GHz+? UltraSPARC
IIi/440MHz+?
2) Index size (# terms) - 100K, 500K, 1M, 2M
   What does the index store - just the terms, or the terms and data?
3) Query - single term, 5 terms (AND), 5 terms (OR), wildcard (end),
wildcard (start), wildcard (start and end)
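
As a rough starting point, the wildcard rows of that matrix could be timed
with something like the sketch below. This is an assumption-laden stand-in,
not a Lucene benchmark: it generates pseudo-random terms into a sorted
TreeSet, and the names (WildcardBenchSketch, buildTerms, countPrefix,
countSuffix), the seed, the term lengths, and the "ab" pattern are all
invented for illustration. A real benchmark would run the same queries
against an actual index and would need warm-up runs and repeated
measurements.

```java
import java.util.Random;
import java.util.TreeSet;

public class WildcardBenchSketch {
    // Builds a sorted set of n distinct pseudo-random lowercase terms
    // (fixed seed so runs are repeatable).
    public static TreeSet<String> buildTerms(int n, long seed) {
        Random rnd = new Random(seed);
        TreeSet<String> terms = new TreeSet<>();
        while (terms.size() < n) {
            int len = 4 + rnd.nextInt(7); // term length 4..10
            StringBuilder sb = new StringBuilder(len);
            for (int i = 0; i < len; i++) {
                sb.append((char) ('a' + rnd.nextInt(26)));
            }
            terms.add(sb.toString());
        }
        return terms;
    }

    // Wildcard at the end ("ab*"): seek to the prefix, stop at the first miss.
    public static int countPrefix(TreeSet<String> terms, String prefix) {
        int hits = 0;
        for (String t : terms.tailSet(prefix)) {
            if (!t.startsWith(prefix)) {
                break;
            }
            hits++;
        }
        return hits;
    }

    // Wildcard at the start ("*ab"): every term has to be checked.
    public static int countSuffix(TreeSet<String> terms, String suffix) {
        int hits = 0;
        for (String t : terms) {
            if (t.endsWith(suffix)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Two of the proposed index sizes; single-shot timings are indicative only.
        for (int n : new int[] {100_000, 500_000}) {
            TreeSet<String> terms = buildTerms(n, 42L);
            long t0 = System.nanoTime();
            int p = countPrefix(terms, "ab");
            long t1 = System.nanoTime();
            int s = countSuffix(terms, "ab");
            long t2 = System.nanoTime();
            System.out.printf("terms=%d  end wildcard: %d hits, %d us  start wildcard: %d hits, %d us%n",
                n, p, (t1 - t0) / 1000, s, (t2 - t1) / 1000);
        }
    }
}
```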

Kelvin put out something a while ago on this.

Thoughts?

--Peter

On Wednesday, August 28, 2002, at 01:03 PM, Brian Goetz wrote:

> On Wed, Aug 28, 2002 at 07:52:01PM -0000, bugzilla@apache.org wrote:
>> Don't get me wrong, I did read the Parser Syntax, and understand that:
>> "Note: You cannot use a * or ? symbol as the first character of a search."
>> However, it would have been nice to have this feature.  I made the following
>> changes to QueryParser.jj, and it seems to work fine.  I am not sure if there is
>> any side effect though.  Can someone verify this?
>
> I think this is a bad idea.
>
> First of all, the query parser is a CONVENIENCE, not the only way to
> build query objects.  If the query parser language is too restrictive,
> then build the query objects programmatically.  It's not that hard.
>
> There were reasons why the query language was designed this way.  If
> you think that's an error, first you need to lobby for your position
> to change the design, THEN we can think about changing the parser.
>
> Parsers are tricky.  Small changes can have big, unexpected effects.
> Let's make sure we want to do this first (which I think we don't), and
> then we can look at the implementation.
