Posted to solr-user@lucene.apache.org by Viresh Modi <vi...@highq.com> on 2013/11/28 14:47:03 UTC
How solr text search finding work
For instance, when we searched for “Kredit”, we received hits containing the
words “Kredit”, “Kredite” and “Kredit-”. However, entries containing the
word “Kreditgeber” do not come up in the results list. Would you know
why?
Regards
Viresh Modi
Re: How solr text search finding work
Posted by Paul Libbrecht <pa...@hoplahup.net>.
Viresh,
there are two ways to solve this:
- Using a compound-word analyzer (in Solr, for example, the compound word token filters such as solr.DictionaryCompoundWordTokenFilterFactory). I still haven't found an easy way to get started with it. It would decompose the term Kreditgeber into kredit and geber, at indexing and query time. For higher precision, you probably want to do it at indexing time only.
- At query expansion time, using a QueryComponent, where you'd turn each word (that is, each TermQuery object produced by the QueryParser) into a wildcard query.
I'd prefer the first.
paul
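A minimal sketch of the first option (index-time decompounding), not taken from the thread. It assumes Solr's solr.DictionaryCompoundWordTokenFilterFactory; the field type name text_de_decompound and the dictionary file compound_parts_de.txt (one lowercase subword per line, e.g. kredit and geber) are hypothetical, and the size limits are illustrative values only.

```xml
<!-- Hypothetical field type: the name and the dictionary file are assumptions -->
<fieldType name="text_de_decompound" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercase first so tokens line up with a lowercase dictionary -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Keeps the original token and adds dictionary subwords found inside it -->
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
            dictionary="compound_parts_de.txt"
            minWordSize="5" minSubwordSize="4" maxSubwordSize="15"
            onlyLongestMatch="false"/>
  </analyzer>
  <analyzer type="query">
    <!-- No decompounding at query time, as suggested above for higher precision -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a setup along these lines, Kreditgeber should be indexed as kreditgeber plus the subwords kredit and geber, so a query for Kredit would be expected to match it. The Analysis page of the Solr admin UI can be used to check the tokens actually produced.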
Re: How solr text search finding work
Posted by Viresh Modi <vi...@highq.com>.
Field declaration:
<field name="content" type="text_en_splitting" indexed="true"
       stored="true" termVectors="true" termPositions="true" termOffsets="true" />
My field type definition is as below:
<fieldType name="text_en_splitting" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory"
            synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
         add enablePositionIncrements=true in both the index and query
         analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="lang/stopwords_en.txt"
            enablePositionIncrements="true"
    />
    <!--
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="0"/>
    -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="lang/stopwords_en.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
--
Regards,
Viresh Modi
Re: How solr text search finding work
Posted by Ahmet Arslan <io...@yahoo.com>.
Hi Viresh,
It is all about the analysis (field type) of the text you are indexing and searching. The http://localhost:8983/solr/#/collection1/analysis page is very helpful to observe/debug how your text is indexed and analyzed. It is probably because of your stemmer. Can you paste your field type definition?