Posted to solr-user@lucene.apache.org by "Carr, Adrian" <Ad...@JTV.com> on 2009/09/24 21:28:27 UTC

Alphanumeric Wild Card Search Question

Hello Solr Users,
I've tried to find the answer to this question, and have tried changing my configuration several times, but to no avail. I think someone on this list will know the answer.

Here's my question:
I have some products that I want to allow people to search for with wild cards. For example, if my product is YBM354, I'd like for users to be able to search on "YBM*", "YBM3*", "YBM35*" and for any of these searches to return that product. I've found that I can search for "YBM*" and get the product, just not the other combinations.

I found this: http://www.nabble.com/Can%C2%B4t-use-wildcard-%22*%22-on-alphanumeric-values--td24369209.html, but adding preserveOriginal="1" doesn't seem to make a difference.

I found an example here:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory that is close, but I want to do the opposite. The example is:
"Super-Duper-XL500-42-AutoCoder!" -> 0:"Super", 1:"Duper", 2:"XL", 2:"SuperDuperXL", 3:"500" 4:"42", .....

In this example, I want to be able to find this record by searching for "XL5*".
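
The tokenization quoted above can be mimicked with a toy model to show why a search such as "XL5*" finds nothing. This is only a sketch, not Solr's implementation: `split_parts` and `prefix_hits` are hypothetical helper names, the regular expression is a simplified stand-in for word-delimiter splitting that ignores the catenate* options, and it assumes that a trailing-* wildcard is compared against the indexed terms as a plain prefix, without being analyzed itself.

```python
import re

# Toy stand-in for word-delimiter splitting (NOT Solr's code): break on
# punctuation, on letter/digit boundaries, and on lower-to-upper case changes.
# The catenate* and preserveOriginal options are deliberately ignored here.
PART = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")

def split_parts(text):
    """Return the word and number parts of text under the toy rules above."""
    return PART.findall(text)

def prefix_hits(terms, prefix):
    """Model a trailing-* wildcard as a prefix test on each indexed term."""
    return [t for t in terms if t.startswith(prefix)]

terms = split_parts("Super-Duper-XL500-42-AutoCoder!")
print(terms)                      # the parts that would be indexed in this model
print(prefix_hits(terms, "XL"))   # "XL*" finds the part "XL"
print(prefix_hits(terms, "XL5"))  # "XL5*" finds nothing: no single term starts with "XL5"
```

Under this model "YBM354" splits into "YBM" and "354" in the same way, which is consistent with "YBM*" matching while "YBM3*" and "YBM35*" do not.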

I appreciate the help. Please let me know if there are any questions.

Thanks,
Adrian Carr



RE: Alphanumeric Wild Card Search Question

Posted by "Carr, Adrian" <Ad...@JTV.com>.
In case it helps, here's what I have currently, but I've been messing with different options:

<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="0"
        generateNumberParts="0"
        catenateWords="1"
        catenateNumbers="1"
        catenateAll="1"
        splitOnNumerics="0"
        preserveOriginal="1"/>
 

-----Original Message-----
From: Carr, Adrian [mailto:Adrian.Carr@JTV.com] 
Sent: Friday, September 25, 2009 9:28 AM
To: solr-user@lucene.apache.org
Subject: RE: Alphanumeric Wild Card Search Question

Hi Ken,
I am using the WordDelimiterFilterFactory. I thought I needed it because it is what gives me control over how the words are split and indexed. I did try taking it out completely, but that didn't seem to help.

I'll try the analysis tool today. There has got to be a simple solution for this, but it is sure eluding me.
Thanks,
Adrian

-----Original Message-----
From: Ensdorf Ken [mailto:Ensdorf@zoominfo.com]
Sent: Thursday, September 24, 2009 5:03 PM
To: solr-user@lucene.apache.org
Subject: RE: Alphanumeric Wild Card Search Question

> Here's my question:
> I have some products that I want to allow people to search for with 
> wild cards. For example, if my product is YBM354, I'd like for users 
> to be able to search on "YBM*", "YBM3*", "YBM35*" and for any of these 
> searches to return that product. I've found that I can search for 
> "YBM*" and get the product, just not the other combinations.

Are you using WordDelimiterFilterFactory?  That would explain this behavior.

If so, do you need it? For the queries you describe, you don't need that kind of tokenization.

Also, have you tried the analysis tool on the admin page? It is a great help in debugging things like this.

-Ken

RE: Alphanumeric Wild Card Search Question

Posted by Ensdorf Ken <En...@zoominfo.com>.
> Here's my question:
> I have some products that I want to allow people to search for with
> wild cards. For example, if my product is YBM354, I'd like for users to
> be able to search on "YBM*", "YBM3*", "YBM35*" and for any of these
> searches to return that product. I've found that I can search for
> "YBM*" and get the product, just not the other combinations.

Are you using WordDelimiterFilterFactory?  That would explain this behavior.

If so, do you need it? For the queries you describe, you don't need that kind of tokenization.
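
A minimal sketch of that point, again as a toy model and not Solr code: if the product code is indexed as one unsplit term, each of the prefixes from the question matches it. The helper names `whitespace_tokens` and `prefix_hits` and the sample text "gemstone ring" are hypothetical, and the wildcard is assumed to behave as a plain, case-sensitive prefix test.

```python
def whitespace_tokens(text):
    """Split on whitespace only, so a code such as "YBM354" stays one term."""
    return text.split()

def prefix_hits(terms, prefix):
    """Model a trailing-* wildcard as a prefix test on each indexed term."""
    return [t for t in terms if t.startswith(prefix)]

# Hypothetical product text; only the code "YBM354" comes from the question.
terms = whitespace_tokens("YBM354 gemstone ring")

for query in ("YBM*", "YBM3*", "YBM35*"):
    # Strip the trailing * to get the prefix that is tested against the terms.
    print(query, prefix_hits(terms, query.rstrip("*")))
```

Because the comparison in this sketch is case-sensitive, a lowercasing step at index time would mean the prefix has to be lowercased as well before it can match.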

Also, have you tried the analysis tool on the admin page? It is a great help in debugging things like this.

-Ken