Posted to solr-user@lucene.apache.org by Vannia Rajan <va...@knackforge.com> on 2011/12/26 20:46:23 UTC

Custom Shingle Factory Filter Requirement

Hi,

  I'm trying to implement an advanced Auto-Suggest field. Consider an
example input String:

   "Word1 Word2 Word3 Word4 Word5 Word6"

   I want this field to auto-suggest content based on whatever I type,
no matter whether I start typing from Word1 or Word4. I tried using
ShingleFilterFactory, but it doesn't fully satisfy my requirements. With
ShingleFilterFactory, I'm able to get combinations like:

  "Word1 Word2 Word3..", "Word2 Word3 Word4...", etc., working. But it
does not create combinations like "Word1 Word4 Word5...". So it works as
long as I use consecutive words as in the input, but not if I skip any
words in between.
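
To make the gap concrete, here is a minimal plain-Java sketch (no Lucene
or Solr classes involved) that contrasts contiguous word n-grams, similar
in spirit to what a shingle filter emits, with order-preserving
combinations that may skip words. The class and method names are
hypothetical, and the shingle part is a simplification that omits
single-word tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the Lucene ShingleFilter implementation.
class SkipShingleSketch {

    // Contiguous runs of minSize..maxSize words, e.g. "Word2 Word3 Word4".
    static List<String> contiguousShingles(String[] words, int minSize, int maxSize) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < words.length; start++) {
            for (int size = minSize; size <= maxSize && start + size <= words.length; size++) {
                out.add(String.join(" ", Arrays.copyOfRange(words, start, start + size)));
            }
        }
        return out;
    }

    // Every order-preserving combination of two or more words, gaps allowed,
    // e.g. "Word1 Word4 Word5". Each bit of the mask selects one word.
    static List<String> orderedCombinations(String[] words) {
        List<String> out = new ArrayList<>();
        int n = words.length;
        for (int mask = 1; mask < (1 << n); mask++) {
            if (Integer.bitCount(mask) < 2) {
                continue; // skip single words
            }
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    if (sb.length() > 0) {
                        sb.append(' ');
                    }
                    sb.append(words[i]);
                }
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String[] words = "Word1 Word2 Word3 Word4 Word5 Word6".split(" ");
        System.out.println(contiguousShingles(words, 2, words.length).size());
        System.out.println(orderedCombinations(words).size());
    }
}
```

Note that the number of gap-tolerant combinations roughly doubles with
every additional word, so generating all of them is only practical for
short fields.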

  I'm trying to write a custom FilterFactory to satisfy this requirement.
Though I'm only a basic Java developer who can get things to compile (and
a PHP developer for several years), I'm not able to find an "example
filter" that I could extend to create a new filter/plugin. The source of
ShingleFilterFactory.java included with Solr just refers to a Lucene
class.

 --------

  There is also another requirement: to prepend each value of another
multi-valued field, one after another, to this single-valued field, to
create several more auto-suggest combinations.

  I hope someone can guide me in the right direction.

-- 
Thanks,
Vanniarajan

Re: Custom Shingle Factory Filter Requirement

Posted by Vannia Rajan <va...@knackforge.com>.
On Tue, Dec 27, 2011 at 1:10 PM, Ahmet Arslan <io...@yahoo.com> wrote:

>
> To achieve this behavior, you can use StandardTokenizerFactory and
> EdgeNGramFilterFactory and LowerCaseFilterFactory at index time.
>
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory
>

Thanks, but I ended up implementing a custom Transformer and used it as a
DataImport plugin (I used RegexTransformer's source code as a reference).

This also helped me merge another field's value into the current field in
the way I need.
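
For readers after the same result, the merge step might look roughly like
the sketch below. It is plain Java with no Solr dependency, so it only
mirrors the row-in, row-out shape of a DataImportHandler transformer (a
real one would extend org.apache.solr.handler.dataimport.Transformer and
also receive a Context). The column names "title", "categories" and
"suggest" are assumptions made up for illustration, not the actual
fields used here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the merge logic only; column names are assumed.
class PrefixMergeSketch {

    // Prepends each value of the multi-valued "categories" column to the
    // single-valued "title" column and stores the results in "suggest".
    static Map<String, Object> transformRow(Map<String, Object> row) {
        Object base = row.get("title");
        Object prefixes = row.get("categories");
        if (base == null || !(prefixes instanceof List)) {
            return row; // nothing to merge, leave the row untouched
        }
        List<String> merged = new ArrayList<>();
        for (Object prefix : (List<?>) prefixes) {
            merged.add(prefix + " " + base);
        }
        row.put("suggest", merged);
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("title", "Word1 Word2");
        row.put("categories", Arrays.asList("Red", "Blue"));
        System.out.println(transformRow(row).get("suggest"));
    }
}
```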

-- 
Thanks,
Vanniarajan

Re: Custom Shingle Factory Filter Requirement

Posted by Ahmet Arslan <io...@yahoo.com>.
>   I'm trying to implement an advanced Auto-Suggest
> field. Consider an
> example input String:
> 
>    "Word1 Word2 Word3 Word4 Word5 Word6"
> 
>    I just want this field to auto-suggest
> content based on whatever i type
> (no matter i start typing from word1 or word4). 


To achieve this behavior, you can use StandardTokenizerFactory and EdgeNGramFilterFactory and LowerCaseFilterFactory at index time.

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory
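
To illustrate why this chain lets a suggestion match from any word, here
is a rough plain-Java simulation of what ends up in the index. It is a
sketch under assumptions, not the Lucene code: splitting on whitespace
stands in for StandardTokenizerFactory, the minGram/maxGram values are
arbitrary, the class and method names are hypothetical, and the query
side is assumed to be lowercased but not n-grammed.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical simulation of tokenize -> lowercase -> edge n-gram.
class EdgeNGramSketch {

    // Index-time terms: every leading prefix (minGram..maxGram chars) of each word.
    static Set<String> indexTerms(String text, int minGram, int maxGram) {
        Set<String> terms = new LinkedHashSet<>();
        for (String token : text.trim().split("\\s+")) {
            String lower = token.toLowerCase(Locale.ROOT);
            for (int len = minGram; len <= Math.min(maxGram, lower.length()); len++) {
                terms.add(lower.substring(0, len));
            }
        }
        return terms;
    }

    // Query-time check: every typed word (lowercased) must be an indexed prefix.
    static boolean matches(Set<String> terms, String typed) {
        for (String token : typed.trim().split("\\s+")) {
            if (!terms.contains(token.toLowerCase(Locale.ROOT))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> terms = indexTerms("Word1 Word2 Word3 Word4 Word5 Word6", 1, 15);
        System.out.println(terms);
        System.out.println(matches(terms, "Word4 wo"));
        System.out.println(matches(terms, "Word1 Word4"));
    }
}
```

Because each word is indexed by its own prefixes, typing can start at
Word1 or Word4, and skipped words do not matter. The trade-off is that
word order is not enforced by this approach.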