Posted to java-user@lucene.apache.org by Oystein Reigem <oy...@aksis.uib.no> on 2007/03/13 19:30:32 UTC

Wildcard searches with * or ? as the first character

Hi,

I have read that with Lucene it is not possible to do wildcard searches 
with * or ? as the first character. Wildcard searches with * as the 
first character (or both first and last character) are useful for text 
in languages that have a lot of compound words, like German and the 
Scandinavian languages.

Some systems do offer such searches, but at a penalty. I assume such 
systems sometimes do a sequential search of the text, which is slow, and 
sometimes a sequential search of an index, which might be a bit faster, 
but still quite slow.
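
To illustrate the index case, here is a minimal sketch in plain Java. It is not how Lucene stores or enumerates its terms; it only assumes that the index keeps its terms in sorted order, modelled here with a TreeSet. The class and method names (WildcardScanSketch, prefixMatch, suffixMatch) are made up for the example. A trailing wildcard can jump to the first candidate term and stop early, while a leading wildcard has to look at every term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Illustrative sketch only; this is not Lucene's term dictionary.
final class WildcardScanSketch {

    // Trailing wildcard ("haus*"): a sorted term set lets us seek to the
    // first candidate and stop at the first term without the prefix.
    static List<String> prefixMatch(NavigableSet<String> terms, String prefix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can start with the prefix
            }
            hits.add(term);
        }
        return hits;
    }

    // Leading wildcard ("*haus"): sort order does not help, so every term
    // in the set has to be examined.
    static List<String> suffixMatch(NavigableSet<String> terms, String suffix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (term.endsWith(suffix)) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        NavigableSet<String> terms = new TreeSet<>(Arrays.asList(
                "haus", "hausboot", "haustier", "krankenhaus", "rathaus", "schule"));
        System.out.println(prefixMatch(terms, "haus"));
        System.out.println(suffixMatch(terms, "haus"));
    }
}
```

The cost of suffixMatch grows with the number of distinct terms, not with the size of the text, which is why scanning an index is usually cheaper than scanning the documents themselves.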

But a slow search might be better than no search, as long as the user is 
aware of the consequences of doing wildcard searches starting with a 
wildcard character.

Any comments?

Cheers,

- Øystein -

-- 
Øystein Reigem, The department of culture, language and information technology (Aksis), Allegt 27, N-5007 Bergen, Norway. Tel: +47 55 58 32 42. Fax: +47 55 58 94 70. E-mail: <oy...@aksis.uib.no>. Home tel: +47 56 14 06 11. Mobile: +47 97 16 96 64. Home e-mail: <or...@broadpark.no>. Aksis home page: <www.aksis.uib.no>.


Re: Wildcard searches with * or ? as the first character - Thanks

Posted by Oystein Reigem <oy...@aksis.uib.no>.
Thanks Steven and Antony.

I read the FAQ not long ago, but that entry escaped my attention. Or 
perhaps it's a recent change.

- Øystein -



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: Wildcard searches with * or ? as the first character

Posted by Steven Parkes <st...@esseff.org>.
It's possible to do leading wildcard searches in Lucene as of 2.1. See 
http://wiki.apache.org/lucene-java/LuceneFAQ#head-4d62118417eaef0dcb87f4370583f809848ea695
(http://tinyurl.com/366suf)





Re: Wildcard searches with * or ? as the first character

Posted by Antony Bowesman <ad...@teamware.com>.
> I have read that with Lucene it is not possible to do wildcard searches 
> with * or ? as the first character.

Lucene supports it. If you are using QueryParser to parse your queries, see:

http://lucene.apache.org/java/docs/api/org/apache/lucene/queryParser/QueryParser.html#setAllowLeadingWildcard(boolean)

Antony



