Posted to solr-user@lucene.apache.org by Joel Nylund <jn...@yahoo.com> on 2009/12/03 19:21:46 UTC

Re: how to do partial word searches?

Just for an update on this, I tried text_rev and it seems to work great.

So in summary, if you want partial word matches within a URL or a short
sentence (such as a title), here is what I did, and it seems to work
pretty well:

- create an extra field that is all lower case (I used MySQL LCASE in
  the query for DIH)
- make that field use the text_rev type in schema.xml
- make the query "sulli OR *sulli*" (on its own, *sulli* doesn't seem
  to match sulli when it is at the end of the field)
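
A minimal sketch of building that query from client code, assuming
Python on the client side. The field name textTitle and the local URL
are the ones used elsewhere in this thread; the helper names
partial_word_query and select_url are made up for illustration, and
scoping both clauses to one field with parentheses is an assumption
about how the query is meant to be applied. The term is lowercased
first because the field is indexed in lower case.

```python
from urllib.parse import urlencode

# Assumed values taken from the thread: a local Solr instance and the
# lowercased copy of the title field.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"
FIELD = "textTitle"


def partial_word_query(term: str, field: str = FIELD) -> str:
    """Return a q value matching the whole term or the term inside a longer word."""
    term = term.lower()  # the field only contains lower-case text
    return f"{field}:({term} OR *{term}*)"


def select_url(term: str) -> str:
    """URL-encode the query so '*', ':' and spaces survive the HTTP request."""
    return SOLR_SELECT_URL + "?" + urlencode({"q": partial_word_query(term)})


print(select_url("Sulli"))
```

Letting urlencode do the escaping avoids hand-written sequences such as
%22 in the URL.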

thanks
Joel



On Nov 25, 2009, at 9:21 AM, Robert Muir wrote:

> Hi, if you are using Solr 1.4, I think you might want to try type
> text_rev (look in the example schema.xml).
>
> Unless I am mistaken:
>
> this will enable leading wildcard support for that field.
> this doesn't do any stemming, which I think might be making your
> wildcards behave weird.
> it also enables reverse wildcard support, so some of your substring
> matches will be faster.
>
> On Tue, Nov 24, 2009 at 7:51 PM, Joel Nylund <jn...@yahoo.com> wrote:
>
>> Hi, I saw some older postings on this, but didn't see a resolution.
>>
>> I have a field called title, and I would like to be able to find
>> partial word matches within the title.
>>
>> For example:
>>
>> http://localhost:8983/solr/select?q=textTitle:%22*sulli*%22
>>
>> I would expect it to find:
>> <str name="textTitle">the daily dish | by andrew sullivan</str>
>>
>> but it doesn't. It does find sully (which is fine with me as a
>> bonus), but doesn't seem to get any of the partial word stuff. Oddly
>> enough, before I lowercased the title, the wildcard matching seemed
>> to work a bit better; it just didn't deal with case-sensitive queries.
>>
>> At first I had mixed case titles, and I read that wildcards don't
>> work with mixed case, so I created another field called "textTitle"
>> that is a lowercased version of the title. It is of type text.
>>
>> Is it possible with Solr to achieve what I am trying to do? If so,
>> how? If not, is there anything closer than what I have?
>>
>> thanks
>> Joel
>>
>>
>
>
> -- 
> Robert Muir
> rcmuir@gmail.com


Re: how to do partial word searches?

Posted by Rob Ganly <ro...@daft.ie>.
Hi all,

I was having the same problem: I needed to be able to search for a
substring anywhere within a word for a specific field. I used
NGramTokenizerFactory in my index analyzer and it seems to work well
(http://lucene.apache.org/solr/api/org/apache/solr/analysis/NGramTokenizerFactory.html).

I created a new field type based on this definition:
http://coderrr.wordpress.com/category/solr/#ngram_schema_xml

Apparently it will increase the size of your index and perhaps the
indexing time, but it is working fine at the moment (although I'm
currently only using a testbed of 20,000 records). I will report back
if I discover any painful issues with scaling up!
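
To illustrate why n-grams make substring search work, and why they
grow the index, here is a rough Python sketch of the idea. It is not
the Solr tokenizer itself: the function char_ngrams is hypothetical,
the gram sizes 3 to 5 are picked only for the example, and it handles a
single word rather than a whole field value.

```python
def char_ngrams(token: str, min_gram: int, max_gram: int) -> list[str]:
    """Return every substring of token whose length is in [min_gram, max_gram]."""
    grams = []
    for size in range(min_gram, max_gram + 1):
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


title_word = "sullivan"
# Gram sizes 3..5 are illustrative only, not settings taken from this thread.
grams = char_ngrams(title_word, 3, 5)

# The fragment "sulli" is itself one of the indexed grams, so it can be
# matched as an ordinary term with no wildcard in the query.
print("sulli" in grams)  # prints True

# One 8-character word turns into 15 terms, which is where the extra
# index size comes from.
print(len(grams))  # prints 15
```

Widening the range between the minimum and maximum gram size matches
more fragment lengths but adds more terms per word.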

rob ganly
