Posted to solr-user@lucene.apache.org by Viresh Modi <vi...@highq.com> on 2013/11/28 14:47:03 UTC

How solr text search finding work

For instance, when we searched for “Kredit”, we received hits containing the
words “Kredit”, “Kredite” and “Kredit-”. However, entries containing the
word “Kreditgeber” do not come up in the results list. Would you know
why?


Regards
Viresh Modi

Re: How solr text search finding work

Posted by Paul Libbrecht <pa...@hoplahup.net>.

Viresh,

There are two ways to solve this.

- Using the CompoundWordsAnalyzer. I still haven't found an easy way to get started with it. It would decompose, at indexing and query time, the term Kreditgeber into kredit and geber. For higher precision, you probably want to do it at indexing time only.

- At query-expansion time, using a QueryComponent in which you turn each word (that is, each TermQuery object produced by the QueryParser) into a wildcard query.

I'd prefer the first.
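
A minimal sketch of the first option, assuming that Solr's stock solr.DictionaryCompoundWordTokenFilterFactory is the kind of compound-word decomposition meant here. The dictionary file name compound_dict_de.txt and the size limits are illustrative placeholders, not values from this thread. The filter sits in the index analyzer only, in line with the precision remark above:

```xml
<!-- Sketch only: decompose compounds such as Kreditgeber at index time. -->
<!-- compound_dict_de.txt is a hypothetical word list (one lowercase subword
     per line, e.g. kredit, geber) that you would have to supply yourself. -->
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- Lowercase first so tokens can match the lowercase dictionary entries. -->
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
          dictionary="compound_dict_de.txt"
          minWordSize="5" minSubwordSize="4" maxSubwordSize="15"
          onlyLongestMatch="false"/>
</analyzer>
```

With a filter like this, the indexed token stream for Kreditgeber should carry kredit and geber alongside the original token, so a query for Kredit could match it. It is worth confirming the actual output on the analysis page before relying on it.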

paul



Re: How solr text search finding work

Posted by Viresh Modi <vi...@highq.com>.

Declaration:
<field name="content" type="text_en_splitting" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>

My field type definition is as below:

<fieldType name="text_en_splitting" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
            ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
         add enablePositionIncrements=true in both the index and query
         analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="lang/stopwords_en.txt" enablePositionIncrements="true"/>
    <!--
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="0"/>
    -->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="lang/stopwords_en.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>


On 28 November 2013 19:31, Ahmet Arslan <io...@yahoo.com> wrote:

> Hi Viresh,
>
> It is all about the analysis (field type) of the text you are indexing and
> searching. The http://localhost:8983/solr/#/collection1/analysis page is
> very helpful to observe/debug how your text is indexed and analyzed. It is
> probably because of your stemmer. Can you paste your field type definition?



-- 

Regards,
Viresh Modi
Software Engineer (Publisher)

Email: viresh.modi@highq.com <pu...@highqsolutions.com>
Mobile: +919714567430

Re: How solr text search finding work

Posted by Ahmet Arslan <io...@yahoo.com>.

Hi Viresh,

It is all about the analysis (field type) of the text you are indexing and searching. The http://localhost:8983/solr/#/collection1/analysis page is very helpful to observe/debug how your text is indexed and analyzed. It is probably because of your stemmer. Can you paste your field type definition?
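
On the stemmer point: the field type posted in this thread uses solr.PorterStemFilterFactory, which targets English, while the example words are German. Below is a rough sketch of a German-oriented chain, modeled on the text_de type in Solr's example schema. The field type name text_de_sketch is made up for illustration, and the stopword file is assumed to exist in the core's conf directory:

```xml
<!-- Sketch only: a German analysis chain, modeled on Solr's example text_de type. -->
<fieldType name="text_de_sketch" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- lang/stopwords_de.txt is an assumed path; adjust to your own conf layout. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="lang/stopwords_de.txt" format="snowball"/>
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <filter class="solr.GermanLightStemFilterFactory"/>
  </analyzer>
</fieldType>
```

Note that a stemmer only normalizes inflected forms such as Kredite. It is not expected to split a compound like Kreditgeber, so matching that word on a query for Kredit would still need decomposition or a wildcard query.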



