Posted to solr-user@lucene.apache.org by Alexandre Rafalovitch <ar...@gmail.com> on 2013/09/20 02:52:38 UTC
Re: Question on ICUFoldingFilterFactory
What do you mean by "output"? Are you looking at fields in the returned
documents? In that case you should see the original stored value. Or are you,
for example, looking at facet/group values, which come from the tokenized,
post-processed (indexed) terms?
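If it is the facet/group case, one common workaround is to keep the folded field for searching and copy the raw value into an unanalyzed field for display-oriented features. The fragment below is only a sketch under assumptions: the field names title_phrase and title_exact are hypothetical, and it assumes a "string" type (solr.StrField) is defined in the schema, as it is in the stock example schema. Only text_phrase comes from the original message.

```xml
<!-- Hypothetical field names; only the text_phrase type is from the original message. -->
<!-- Searched field: folded by ICUFoldingFilterFactory, so nbae/Nbae/NBAE all match. -->
<field name="title_phrase" type="text_phrase" indexed="true" stored="true"/>

<!-- Unanalyzed copy (assumes a "string" StrField type exists in the schema). -->
<!-- Its indexed terms keep the original casing, e.g. NBAE. -->
<field name="title_exact" type="string" indexed="true" stored="false"/>

<!-- copyField copies the raw input value, before any analysis is applied. -->
<copyField source="title_phrase" dest="title_exact"/>
```

With something like this in place, you would query against title_phrase but facet or group on title_exact, which should return NBAE rather than nbae.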
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
On Fri, Sep 20, 2013 at 2:22 AM, Nemani, Raj <Ra...@turner.com> wrote:
> Hello,
>
> I was wondering if anybody who has experience with ICUFoldingFilterFactory
> can help out with the following issue. Thank you so much in advance.
>
> Raj
>
> ------------------------------------------------------------------
>
> Problem:
> When a document is created/updated, the value's casing is indexed
> properly. However, when it's queried, the value is returned in lowercase.
> Example:
> Document input: NBAE
> Document value: NBAE
> Query input: NBAE,nbae,Nbae...etc
> Query Output: nbae
>
> If I remove the ICUFoldingFilterFactory filter, the casing problem goes
> away, but then searches for nbae (lowercase) or Nbae (mixed case) return no
> values.
>
>
> Field Type:
> <fieldType name="text_phrase" class="solr.TextField"
> positionIncrementGap="20" autoGeneratePhraseQueries="true">
> <analyzer>
> <filter
> class="solr.PatternReplaceFilterFactory" pattern="\s&\s"
> replacement="\sand\s"/>
> <charFilter
> class="solr.PatternReplaceCharFilterFactory"
> pattern="[\p{Punct}\u00BF\u00A1]" replaceWith=" "/>
> <tokenizer class="solr.KeywordTokenizerFactory"/>
> <filter class="solr.TrimFilterFactory" />
> <filter
> class="solr.PatternReplaceFilterFactory" pattern="[\p{Cntrl}]"
> replacement=""/>
> <filter class="solr.ICUFoldingFilterFactory"/>
> <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords_en.txt" enablePositionIncrements="true" />
> </analyzer>
> </fieldType>
>
>
> Let me know if that makes sense. I'm curious if the
> solr.ICUFoldingFilterFactory has additional attributes that I can use to
> control the casing behavior but retain its other filtering properties
> (ASCIIFoldingFilter and ICUNormalizer2Filter).
>
> Thanks!!!
>
>