Posted to solr-user@lucene.apache.org by Doug McKenzie <do...@firebox.com> on 2011/09/22 19:02:32 UTC

Autosuggest best practice / feedback

Hi there,

I'm relatively new to Solr and have been playing around with it for a 
few weeks now. I've got a system set up that I'm currently quite 
happy with and that is returning some decent results (although there's 
always room for improvement). I'm just hoping to get some feedback on 
the setup.

I'm currently running 2 separate Solr engines: one is tasked with 
storing products and their various info, the other stores previous site 
searches and is used for auto suggest functionality.

The auto suggest schema:

<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Keep each previous search as a single token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Only drops a search whose entire text matches a stopword entry -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords_en.txt" enablePositionIncrements="true"/>
    <!-- Index the leading prefixes of 2 to 15 characters -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2"
            maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
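
For context, a field using this type might be declared roughly as below. 
This is only a sketch: the field name "suggest" is a hypothetical 
placeholder for illustration, not the name in the real schema. Because 
KeywordTokenizerFactory keeps each search as one token, a stored search 
such as "red dwarf" gets indexed as the prefixes "re", "red", "red ", 
"red d" and so on up to the full phrase, and a lowercased partial query 
matches when it equals one of those prefixes.

```xml
<!-- Hypothetical field name, for illustration only -->
<!-- Holds previous site searches, analyzed with the text_ngram type -->
<field name="suggest" type="text_ngram" indexed="true" stored="true"/>
```

With minGramSize="2" and maxGramSize="15", a single typed character or a 
typed prefix longer than 15 characters would not match anything in this 
field, since the query side is not n-grammed or truncated.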

The stopword list is being used to filter out rude words from previous 
searches (is this the best way of doing things?).

I'm also looking at implementing a "Did you mean?" suggester, which will 
probably search against a whitespace-tokenized field of the same data 
rather than this one.
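
A rough sketch of what that whitespace-tokenized field could look like 
is below. The type name "text_spell" and the field names "suggest" and 
"suggest_spell" are hypothetical, and it assumes the previous searches 
are indexed into a field called "suggest".

```xml
<!-- Sketch only: type and field names here are hypothetical -->
<fieldType name="text_spell" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split on whitespace so each word is a separate token -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="suggest_spell" type="text_spell" indexed="true" stored="false"/>
<!-- Copy the raw search text into the whitespace-tokenized field -->
<copyField source="suggest" dest="suggest_spell"/>
```

Using copyField keeps a single source of data while letting the two 
fields be analyzed differently; something like Solr's 
SpellCheckComponent could then presumably be pointed at the second 
field.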

Any thoughts / feedback / comments / criticism / biscuits appreciated.

Cheers
Doug
