Posted to solr-user@lucene.apache.org by kumar <pa...@gmail.com> on 2014/04/18 06:47:33 UTC

Filtering Solr Queries

Hi,

I am indexing the data using the title, city and location fields.

However, different cities have the same location names, and these names are
written in more than one way, for example "rajaji nagar" and "rajajinagar".

When a user types

computers in rajaji nagar

the results should include "computers in rajajinagar" as well as "computers
in rajaji nagar".

I am using the following schema.


<field name="city" type="string_ci" indexed="true" stored="false" />
<field name="locality" type="string_ci" indexed="true" stored="false" />
<field name="mytitle" type="textfullmatch" indexed="true" stored="false" multiValued="true" omitNorms="true" omitTermFreqAndPositions="true" />

<fieldType name="textfullmatch" class="solr.TextField">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([\.,;:-_])"
            replacement=" " replace="all"/>
    <filter class="solr.EdgeNGramFilterFactory" maxGramSize="50" minGramSize="2"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([^\w\d\*æøåÆØÅ ])"
            replacement="" replace="all"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true" />
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([\.,;:-_])"
            replacement=" " replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([^\w\d\*æøåÆØÅ ])"
            replacement="" replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^(.{30})(.*)?"
            replacement="$1" replace="all"/>
    <filter class="solr.SynonymFilterFactory" ignoreCase="true"
            synonyms="synonyms_fsw.txt" expand="true" />
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

--
View this message in context: http://lucene.472066.n3.nabble.com/Filtering-Solr-Queries-tp4131924.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Filtering Solr Queries

Posted by Erick Erickson <er...@gmail.com>.
Is this a manageable list? That is, not a zillion names? If so, it
seems like you could do this with synonyms. Assuming your string_ci
type is a "string" type (solr.StrField), which is not analyzed, you'd
need to change it to a solr.TextField that uses
KeywordTokenizerFactory followed by filters, and you might want to add
something like LowerCaseFilterFactory to the chain.
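
A minimal sketch of what that could look like, assuming the set of
locality variants is small enough to maintain by hand. The type name
"string_ci_syn" and the file name "locality_synonyms.txt" are made up
for illustration and do not come from the schema in the question. The
same analyzer is used at index and query time here, which is only one
possible choice.

```xml
<!-- Hypothetical type name; the locality field would reference it via type="string_ci_syn". -->
<fieldType name="string_ci_syn" class="solr.TextField">
  <analyzer>
    <!-- Keep the whole field value as a single token, as a string field would. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Case-insensitive matching. -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Hypothetical synonyms file with one line per group of variants, e.g.:
           rajaji nagar, rajajinagar
         tokenizerFactory makes the parser read each entry as one token. -->
    <filter class="solr.SynonymFilterFactory" synonyms="locality_synonyms.txt"
            ignoreCase="true" expand="true"
            tokenizerFactory="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```

The tokenizerFactory attribute matters because the tokenizer emits
"rajaji nagar" as one token. Without it, the synonym parser would
likely split the entry on whitespace and the multi-word variant would
not match. The data has to be reindexed after changing the field type.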

Best,
Erick
