Posted to solr-user@lucene.apache.org by "Smiley, David W." <ds...@mitre.org> on 2010/03/15 21:26:58 UTC

NGram shortcomings

Hello all.

In a search app I'm working on, users are permitted to put a wildcard at the beginning of a search term, at the end, or at both ends.  For this to be fast, I need to use NGramFilterFactory.  But this isn't enough, apparently.  Firstly, my query parser needs to know to use a field indexed this way only for words that contain a wildcard.  I don't see anything that's going to make this happen, so I guess I'll have to modify a query parser plugin.  (It's a shame these aren't more extensible, BTW.)  Secondly, NGramFilterFactory does not mark whether a gram lies at the front or end of a word.  So it appears I'll need to insert special marker characters at the very beginning and end of each term, at both index and query time, by writing a filter that does this.  Does anyone know of existing functionality that will save me from doing these things, or perhaps something I'm overlooking in my approach?
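
To make the two points concrete, here is a rough sketch of the idea in plain Python. It is not Solr or Lucene code, and everything in it is an assumption for illustration: the marker characters "^" and "$", the field names "text" and "text_ngram", the gram sizes, and all function names are made up.

```python
# Illustration only: plain Python, not the Solr/Lucene API.
# START/END are hypothetical marker characters; a real filter would need
# characters that can never occur inside an indexed term.
START, END = "^", "$"


def ngrams(term, min_n=2, max_n=4):
    """Index side: wrap the term in markers, then emit every n-gram."""
    marked = START + term + END
    grams = set()
    for n in range(min_n, max_n + 1):
        for i in range(len(marked) - n + 1):
            grams.add(marked[i:i + n])
    return grams


def query_gram(pattern):
    """Query side: drop the wildcards and add a marker on each anchored end."""
    leading = pattern.startswith("*")
    trailing = pattern.endswith("*")
    core = pattern.strip("*")
    return ("" if leading else START) + core + ("" if trailing else END)


def choose_field(word):
    """Route only wildcard words to the (hypothetical) n-gram field."""
    return "text_ngram" if "*" in word else "text"


def matches(term, pattern, min_n=2, max_n=4):
    """True if the pattern's gram is among the term's indexed grams.

    Assumes the query gram is no longer than max_n; longer patterns
    would have to be split into several grams and combined.
    """
    return query_gram(pattern) in ngrams(term, min_n, max_n)


if __name__ == "__main__":
    for pattern in ("so*", "*lr", "*ol*", "ol*", "*so"):
        print(pattern, choose_field(pattern), matches("solr", pattern))
```

Without the markers, the query "ol*" would wrongly hit "solr", because the gram "ol" occurs in the middle of the word. With the markers, the query becomes "^ol", which is not among the grams of "^solr$", while "*ol*" still matches on the bare gram "ol".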

~ David Smiley
Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/