Posted to solr-user@lucene.apache.org by "Nair, Manas" <Ma...@mtvnmix.com> on 2010/03/15 11:39:28 UTC

How to retrieve unique values in typeahead

Hi experts,
 
Please help me out on this.
 
I have a collection of about 30K documents that pertain to pop artists (e.g. Madonna, Michael Jackson). The artist names are indexed in a field named "artist_t", which matches the following dynamic field declaration:
<dynamicField name="*_t" type="text" indexed="true" stored="true"/>

Most of the documents have MJ as their artist. I am using EdgeNGramFilterFactory to get a typeahead implementation, i.e.

when I type in "m" I should get "madonna", "michael jackson", "miley cyrus" etc. as results. The problem that I have now is that these terms are repeated.

When I search for "m", instead of "madonna", "michael jackson", ... I am getting MJ repeated many times in the initial 10 docs that Solr returns by default.

I need to make these artists unique, i.e. if I search for "m", I should get each artist just once.

How should I change the schema file, and is any query tweaking required?

Any help would be dearly appreciated.

Thanks and regards,

Manas


Re: How to retrieve unique values in typeahead

Posted by Ahmet Arslan <io...@yahoo.com>.
> I have a collection of about 30K documents which pertain to
> pop artists (eg. Madonna, Michael Jackson). These artist
> names are indexed in the field named "artist_t" which has
> the following properties in dynamic field declaration:
> <dynamicField name="*_t" type="text" indexed="true"
> stored="true"/>
> 
> Most of the documents will have MJ as their artist. I am
> using EdgeNGram filter factory to get a typeahead
> implementation. i.e.
> 
> when I type in "m" I would get "madonna", "michael jackson",
> "miley cyrus" etc as results. The problem that I have now is
> that all these terms are repeated.
> 
> When I search for "m", instead of "madonna", "michael
> jackson".... I am getting MJ repeated many times in the
> initial 10 docs that solr brings by default.
> 
> I need to make all these artists unique i.e if I search
> "m", I should get individual results just once?
> 
> How should I change the schema file and is there a query
> tweaking required?

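TermsComponent (http://wiki.apache.org/solr/TermsComponent), which can be used for auto-suggest, can eliminate repeated terms: it returns each distinct indexed term once, together with its document count, rather than returning documents. With this solution you don't need EdgeNGram anymore. If you want to suggest more than one term (e.g. multi-word suggestions), you can add ShingleFilterFactory to your index analyzer chain.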
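
As a rough illustration of the TermsComponent approach, here is a minimal Python sketch that builds a prefix query and pairs up the returned terms with their counts. Several things here are assumptions rather than details from this thread: the host and the "/terms" handler path, the field name "artist_s" (a hypothetical untokenized copy of the artist name, so that whole names come back instead of single words), and the flat [term, count, term, count, ...] list shape of the JSON response. The parameters terms, terms.fl, terms.prefix and terms.limit are the ones described on the wiki page.

```python
from urllib.parse import urlencode


def build_terms_url(base_url, field, prefix, limit=10):
    """Build a TermsComponent request URL for a typeahead prefix.

    base_url and field are assumptions for illustration; adjust them
    to match your own solrconfig.xml and schema.xml.
    """
    params = {
        "terms": "true",         # enable the TermsComponent
        "terms.fl": field,       # field whose indexed terms are listed
        "terms.prefix": prefix,  # only terms starting with what the user typed
        "terms.limit": limit,    # maximum number of suggestions
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


def pair_terms(flat):
    """Turn a flat [term, count, term, count, ...] list into (term, count) pairs.

    The flat layout is an assumption about the JSON response shape.
    """
    return list(zip(flat[0::2], flat[1::2]))


if __name__ == "__main__":
    # Hypothetical host, handler path and field name.
    print(build_terms_url("http://localhost:8983/solr/terms", "artist_s", "m"))
    # Illustrative data only, not real output from any index.
    print(pair_terms(["madonna", 3, "michael jackson", 250]))
```

Because each term appears once regardless of how many documents contain it, an artist who dominates the index shows up as a single suggestion with a large count. Note that terms.prefix is matched against the terms as indexed, so the case of the typed prefix has to agree with how the field was analyzed.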