Posted to java-user@lucene.apache.org by Pravin Shinde <ge...@gmail.com> on 2006/07/28 11:46:47 UTC

Leading wildcard query

Hi,
I am trying to run a leading wildcard query, but I can't get it to work.
Any query with a leading wildcard fails with a lexical error.

query = parser.parse( "*hi" )
JavaError: org.apache.lucene.queryParser.ParseException:
Lexical error at line 1, column 1.  Encountered: "*" (42), after : ""


I came across some documentation in Lucene FAQ which says

http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-4d62118417eaef0dcb87f4370583f809848ea695

Note: Leading wildcards (e.g. *ook) are not supported by the
QueryParser (although Lucene could handle them -- see the comment in
QueryParser.jj to enable these kind of queries -- search for "OG: to
support prefix queries:").

Is there any way I can do a leading wildcard query?

-- 
Regards,
Pravin Shinde

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Leading wildcard query

Posted by Pravin Shinde <ge...@gmail.com>.
Thanks for the reply, Miles.
So avoiding leading wildcard queries was a design decision
for the sake of efficiency. Thanks for the information.

On 7/28/06, Miles Barr <mi...@magpie.net> wrote:
> Pravin Shinde wrote:
>
> > I am trying to use Leading wildcard query, but I am not able to do it.
> > Any query with leading wildcard is failing with lexical error.
> >
> > query = parser.parse( "*hi" )
> > JavaError: org.apache.lucene.queryParser.ParseException:
> > Lexical error at line 1, column 1.  Encountered: "*" (42), after : ""
> >
> >
> > I came across some documentation in Lucene FAQ which says
> >
> > http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-4d62118417eaef0dcb87f4370583f809848ea695
> >
> >
> > Note: Leading wildcards (e.g. *ook) are not supported by the
> > QueryParser (although Lucene could handle them -- see the comment in
> > QueryParser.jj to enable these kind of queries -- search for "OG: to
> > support prefix queries:").
> >
> > Is there any way I can do a leading wildcard query?
>
>
> You could implement something, but it would have to be done differently
> to how wildcard queries are currently done. A wildcard query expands to
> match all possible tokens that match that pattern currently in the index
> (restricted to that field). I think the way the index is set up makes it
> possible to build this list when you know at least the first character.
> By starting with a * you need to get the complete list of tokens, then
> filter out those that don't match. I imagine this would be quite slow,
> hence why it's not in Lucene at the moment.
>
>
>
>
> Miles


-- 
Regards,
Pravin Shinde



Re: Leading wildcard query

Posted by Miles Barr <mi...@magpie.net>.
Pravin Shinde wrote:

> I am trying to use Leading wildcard query, but I am not able to do it.
> Any query with leading wildcard is failing with lexical error.
>
> query = parser.parse( "*hi" )
> JavaError: org.apache.lucene.queryParser.ParseException:
> Lexical error at line 1, column 1.  Encountered: "*" (42), after : ""
>
>
> I came across some documentation in Lucene FAQ which says
>
> http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-4d62118417eaef0dcb87f4370583f809848ea695 
>
>
> Note: Leading wildcards (e.g. *ook) are not supported by the
> QueryParser (although Lucene could handle them -- see the comment in
> QueryParser.jj to enable these kind of queries -- search for "OG: to
> support prefix queries:").
>
> Is there any way I can do a leading wildcard query?


You could implement something, but it would have to be done differently 
to how wildcard queries are currently done. A wildcard query expands to 
match all possible tokens that match that pattern currently in the index 
(restricted to that field). I think the way the index is set up makes it 
possible to build this list when you know at least the first character. 
By starting with a * you need to get the complete list of tokens, then 
filter out those that don't match. I imagine this would be quite slow, 
hence why it's not in Lucene at the moment.
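
To make the difference concrete, here is a minimal sketch in plain Java.
It does not use Lucene at all: a TreeSet stands in for one field's sorted
term list, and the class name, method names and the "inspected" counter
are all made up for illustration. With a known prefix you can seek
straight to the first candidate and stop at the first non-match; with a
leading * every term has to be looked at.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Toy stand-in for one field's sorted term list; this is NOT Lucene's API.
final class TermDictionarySketch {
    private final NavigableSet<String> terms = new TreeSet<>();

    // Number of terms the most recent lookup had to inspect.
    int inspected;

    TermDictionarySketch(List<String> fieldTerms) {
        terms.addAll(fieldTerms);
    }

    // "hi*": seek to the first term >= prefix, stop at the first non-match.
    List<String> prefixMatches(String prefix) {
        inspected = 0;
        List<String> out = new ArrayList<>();
        for (String term : terms.tailSet(prefix, true)) {
            inspected++;
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can match either
            }
            out.add(term);
        }
        return out;
    }

    // "*hi": no known first character, so every term must be checked.
    List<String> suffixMatches(String suffix) {
        inspected = 0;
        List<String> out = new ArrayList<>();
        for (String term : terms) {
            inspected++;
            if (term.endsWith(suffix)) {
                out.add(term);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TermDictionarySketch dict = new TermDictionarySketch(
                List.of("chi", "delphi", "high", "hill", "hit", "zebra"));
        System.out.println(dict.prefixMatches("hi") + " inspected=" + dict.inspected);
        System.out.println(dict.suffixMatches("hi") + " inspected=" + dict.inspected);
    }
}
```

On a six-term list the gap is tiny, but the suffix lookup always touches
every term in the field, so its cost grows with the size of the index
rather than with the number of matches.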




Miles





Re: Leading wildcard query

Posted by Erick Erickson <er...@gmail.com>.
You could build a filter using WildcardTermEnum or RegexTermEnum and
then use the filter with a ConstantScoreQuery. You lose relevancy, but
relevancy is an ambiguous concept with wildcards anyway.
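
The idea can be sketched roughly like this in plain Java. This is not
Lucene code: a TreeMap of term to document IDs stands in for the index,
a regex built from the wildcard pattern plays the part of the term
enumeration, and a BitSet plays the part of the filter. All class and
method names here are invented for the example.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Toy model of "enumerate matching terms, collect their docs into a filter".
// None of this is Lucene's API.
final class WildcardFilterSketch {
    // term -> IDs of the documents containing it, for a single field
    private final TreeMap<String, int[]> postings = new TreeMap<>();

    void add(String term, int... docIds) {
        postings.put(term, docIds);
    }

    // Translate '*' (any run of characters) and '?' (one character) to a regex.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    // Walk every term once and set a bit for each document of each match.
    BitSet matchingDocs(String wildcard) {
        Pattern pattern = toRegex(wildcard);
        BitSet docs = new BitSet();
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            if (pattern.matcher(entry.getKey()).matches()) {
                for (int id : entry.getValue()) {
                    docs.set(id);
                }
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        WildcardFilterSketch index = new WildcardFilterSketch();
        index.add("chi", 0, 3);
        index.add("delphi", 1);
        index.add("hill", 2, 3);
        System.out.println(index.matchingDocs("*hi"));
    }
}
```

Every document whose bit is set is a hit with the same score, which is
the loss of relevancy mentioned above. The upside is that the result is
one bit per document however many terms matched, so nothing has to be
expanded into a large boolean query.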

Using the query parser with a leading wildcard, even if enabled, is almost
sure to give you a "TooManyClauses" exception unless your index is very,
very, very small. There's a thread I started on the list titled "I just
don't get wildcards at all" in which the guys generously gave me a tutorial
on what wildcards are all about. I suspect you'd find it interesting.
You might also want to search the archive for TooManyClauses, since this
has been discussed several times.
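
As a rough illustration of why the exception shows up, here is a toy
expansion in plain Java. It is not Lucene code: the class and method are
invented, and an IllegalStateException stands in for the real
TooManyClauses exception. The limit of 1024 is the default maximum
clause count of Lucene's BooleanQuery.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Toy model of expanding "*suffix" into one clause per matching term.
final class ClauseLimitSketch {
    // Default maximum clause count of Lucene's BooleanQuery.
    static final int MAX_CLAUSES = 1024;

    static List<String> expand(Collection<String> terms, String suffix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        for (String term : terms) {
            if (!term.endsWith(suffix)) {
                continue;
            }
            if (clauses.size() >= maxClauses) {
                // Stand-in for the TooManyClauses exception.
                throw new IllegalStateException(
                        "too many clauses: more than " + maxClauses);
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < 2000; i++) {
            terms.add("term" + i + "hi");
        }
        try {
            expand(terms, "hi", MAX_CLAUSES);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The number of clauses depends on how many distinct terms match, not on
how many documents you have, which is why a short pattern like *hi can
hit the limit on a fairly ordinary index.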

Best
Erick