Posted to java-user@lucene.apache.org by Jo...@ph.lawson.com on 2009/02/26 13:59:21 UTC

contains functionality in Lucene

Hi all,

We have a business requirement for Lucene to support a search similar to 
SQL's contains, so that a query such as *ucen* returns lucene and 
lucent. Unfortunately, wildcards are not allowed at the start of the 
search keyword. How should I go about this? Is this possible in Lucene? 
Is it advisable (some forums suggest it is a huge performance hit)?

Thanks for all the help


"XP is making a bet. It is betting that it is better to do a simple thing 
today and pay a little more tomorrow to change it if it needs it, than to 
do a more complicated thing today that may never be used anyway" - Extreme 
Programming Explained, Embrace Change

Best Regards, 

Joseph F. Syjuco 
Team Lead 
M3 Alpha - e-Commerce 
__________________________________ 
Lawson PSSC, Inc. 
4th Floor, One World Square Building 
Upper McKinley Road, Taguig City 1634 
Philippines 

Work: +63 (2) 976-3600 loc 79396
Mobile: +63 (917) 855-3436
Web: http://www.lawson.com 
Email: joseph.syjuco@lawson.com 


-------------------- Internet e-Mail Disclaimer --------------------
This e-mail and any files transmitted with it are confidential and 
intended solely for the use of the individual or entity to which they are 
addressed. If you are not the intended recipient you are notified that any 
use, disclosure, copying or distribution of the information is prohibited. 
In such case, you should destroy this message and kindly notify the sender 
by reply e-mail. The views expressed in this e-mail and any attachments 
are personal and, unless stated explicitly, do not represent the views of 
Lawson Software, Inc. 

Re: contains functionality in Lucene

Posted by Danil ŢORIN <to...@gmail.com>.
You can generate n-grams: for example, when you index "lucene" you
create the tokens "luce", "ucen" and "cene".

This will increase the term count (and the index size), but at search
time you simply look up a single term, which is extremely fast.

Whether it is worth it depends on how many documents you have, the size
of each document, your query rate, expected latency, hardware...
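
To make the n-gram idea concrete, here is a minimal sketch in plain Java.
It does not use Lucene's analysis classes; the class name NGramSketch, the
methods ngrams and index, and the gram size of 4 are my own assumptions for
illustration only. It shows the trade-off described above: more terms at
index time, a single map lookup at search time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustration only: a toy in-memory index, not Lucene's API.
class NGramSketch {

    // Returns every substring of length n, in order of position.
    static List<String> ngrams(String word, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    // Maps each n-gram to the set of words that contain it.
    static Map<String, Set<String>> index(List<String> words, int n) {
        Map<String, Set<String>> idx = new TreeMap<>();
        for (String w : words) {
            for (String g : ngrams(w, n)) {
                idx.computeIfAbsent(g, k -> new TreeSet<>()).add(w);
            }
        }
        return idx;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> idx =
                index(Arrays.asList("lucene", "lucent", "solr"), 4);
        // Index time: "lucene" becomes three terms.
        System.out.println(ngrams("lucene", 4));
        // Search time: "*ucen*" becomes one exact lookup of "ucen".
        System.out.println(idx.getOrDefault("ucen", Collections.emptySet()));
    }
}
```

Note that the single-term lookup only works when the query has exactly the
gram length. A shorter query matches no gram, and a longer one has to be
split into grams that must all match, so the gram size (or a range of
sizes) should be chosen with the expected queries in mind.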

On Thu, Feb 26, 2009 at 15:36, Erick Erickson <er...@gmail.com> wrote:
> There is an option to turn leading wildcards on, see QueryParser.
>
> All the usual caveats about TooManyClauses apply....
>
> Best
> Erick



Re: contains functionality in Lucene

Posted by Erick Erickson <er...@gmail.com>.
There is an option to turn leading wildcards on, see QueryParser.

All the usual caveats about TooManyClauses apply....

Best
Erick
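
For reference, the option is, as far as I know,
QueryParser.setAllowLeadingWildcard(true). The sketch below is not Lucene
code; it is a plain-Java illustration, under my own assumptions, of why the
caveat applies. With no known prefix, every term in the dictionary has to
be inspected, and each matching term becomes one clause of the rewritten
query. The class name, the method expandContains and the exception used are
hypothetical; 1024 is, to my knowledge, Lucene's default clause limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Illustration only: mimics the cost of "*ucen*", not Lucene's implementation.
class LeadingWildcardSketch {

    // Assumed stand-in for a clause limit (believed to match Lucene's default).
    static final int MAX_CLAUSES = 1024;

    // Collects every term containing the infix, failing once the cap is exceeded.
    static List<String> expandContains(SortedSet<String> terms, String infix, int maxClauses) {
        List<String> matches = new ArrayList<>();
        // No prefix to seek to, so the sorted order does not help: full scan.
        for (String term : terms) {
            if (term.contains(infix)) {
                if (matches.size() == maxClauses) {
                    throw new IllegalStateException(
                            "more than " + maxClauses + " terms match *" + infix + "*");
                }
                matches.add(term);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>(Arrays.asList("lucene", "lucent", "solr"));
        System.out.println(expandContains(terms, "ucen", MAX_CLAUSES));
    }
}
```

On a small term dictionary this is harmless. On a large one, both the scan
and the number of matching terms grow with the index, which is where the
performance concern and the TooManyClauses caveat come from.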
