Posted to solr-user@lucene.apache.org by Anupam Bhattacharya <an...@gmail.com> on 2018/02/02 07:34:50 UTC
No Suggestions from SpellCheck when _text_ field tokenizer set to solr.NGramTokenizerFactory
I have configured the Solr managed-schema as follows.
The configuration below is for full-text search:
<fieldType name="text_general" class="solr.TextField"
positionIncrementGap="100">
<analyzer type="index">
<!-- <tokenizer class="solr.StandardTokenizerFactory"/> -->
<tokenizer class="solr.NGramTokenizerFactory" minGramSize="2"
maxGramSize="10"/>
<filter class="solr.StopFilterFactory" words="stopwords.txt"
ignoreCase="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<!-- <tokenizer class="solr.StandardTokenizerFactory"/> -->
<tokenizer class="solr.NGramTokenizerFactory" minGramSize="2"
maxGramSize="10"/>
<filter class="solr.StopFilterFactory" words="stopwords.txt"
ignoreCase="true"/>
<filter class="solr.SynonymGraphFilterFactory" expand="true"
ignoreCase="true" synonyms="synonyms.txt"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
The following is the field type used for spell checking:
<fieldType name="text_spellcheck" class="solr.TextField"
positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StandardFilterFactory"/>
<filter class="solr.StopFilterFactory" words="stopwords.txt"
ignoreCase="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
Below is the field on which I run the spell check:
<field name="title_txt_spellcheck_ja" type="text_spellcheck"
omitNorms="true" indexed="true" stored="true"/>
Below is the full text field
<field name="_text_" type="text_general" multiValued="true" indexed="true"
stored="false"/>
I copy text from another field for suggestions.
<copyField source="title_txt_ja" dest="title_txt_spellcheck_ja"/>
This is the request I send:
/spell?fl=id,title_txt_spellcheck_ja&wt=json&defType=edismax&q=te&spellcheck=on&spellcheck.count=10&spellcheck.collate=true&spellcheck.dictionary=title_txt_spellcheck_ja&spellcheck.collateExtendedResults=true&spellcheck.maxCollations=3
Everything was working fine until I changed
<tokenizer class="solr.StandardTokenizerFactory"/> to
<tokenizer class="solr.NGramTokenizerFactory" minGramSize="2"
maxGramSize="10"/>
in the text_general field type. Since then the spell checker returns no suggestions.
Any clues?
Regards
Anupam Bhattacharya
Re: No Suggestions from SpellCheck when _text_ field tokenizer set to solr.NGramTokenizerFactory
Posted by Alessandro Benedetti <a....@sease.io>.
How is the field type textSpell defined?
Can you detail what is not working as expected?
Thanks
-----
---------------
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: No Suggestions from SpellCheck when _text_ field tokenizer set to solr.NGramTokenizerFactory
Posted by Anupam Bhattacharya <an...@gmail.com>.
The following is the spellcheck configuration in the
solrconfig.xml file.
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">textSpell</str>
<lst name="spellchecker">
<str name="name">title_txt_spellcheck_ja</str>
<str name="field">title_txt_spellcheck_ja</str>
<str name="buildOnOptimize">true</str>
<str name="buildOnCommit">true</str>
<str name="spellcheckIndexDir">./spellchecker_en</str>
</lst>
<lst name="spellchecker">
<str name="name">title_txt_spellcheck_en</str>
<str name="field">title_txt_spellcheck_en</str>
<str name="buildOnOptimize">true</str>
<str name="buildOnCommit">true</str>
<str name="spellcheckIndexDir">./spellchecker_de</str>
</lst>
............
............
</searchComponent>
<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<!-- Solr will use suggestions from both the 'default' spellchecker
and from the 'wordbreak' spellchecker and combine them.
collations (re-written queries) can include a combination of
corrections from both spellcheckers -->
<str name="spellcheck.dictionary">default</str>
<str name="spellcheck">on</str>
<str name="spellcheck.extendedResults">true</str>
<str name="spellcheck.count">10</str>
<str name="spellcheck.alternativeTermCount">5</str>
<str name="spellcheck.maxResultsForSuggest">5</str>
<str name="spellcheck.collate">true</str>
<str name="spellcheck.collateExtendedResults">true</str>
<str name="spellcheck.maxCollationTries">10</str>
<str name="spellcheck.maxCollations">5</str>
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
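One thing that may be worth checking in the configuration above: queryAnalyzerFieldType refers to a field type named textSpell, while the schema posted earlier defines text_spellcheck. If textSpell is not defined elsewhere in the managed-schema (an assumption, since only part of the schema is shown), a sketch of the component pointing at the type that is shown would look like this:

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- Assumption: text_spellcheck (from the schema above) is the intended
       analyzer for the spellcheck query; textSpell is not shown anywhere. -->
  <str name="queryAnalyzerFieldType">text_spellcheck</str>
  <lst name="spellchecker">
    <str name="name">title_txt_spellcheck_ja</str>
    <str name="field">title_txt_spellcheck_ja</str>
    <str name="buildOnOptimize">true</str>
    <str name="buildOnCommit">true</str>
    <str name="spellcheckIndexDir">./spellchecker_en</str>
  </lst>
</searchComponent>
```

This only aligns the names. It is not confirmed to be the cause of the missing suggestions.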
Regards,
Anupam
On Mon, Feb 5, 2018 at 5:45 PM, Alessandro Benedetti <a....@sease.io>
wrote:
> Hi, how is your spellcheck dictionary :
> "spellcheck.dictionary=title_txt_spellcheck_ja" defined in the
> solrconfig.xml?
Re: No Suggestions from SpellCheck when _text_ field tokenizer set to solr.NGramTokenizerFactory
Posted by Alessandro Benedetti <a....@sease.io>.
Hi, how is your spellcheck dictionary :
"spellcheck.dictionary=title_txt_spellcheck_ja" defined in the
solrconfig.xml?
Regards
Re: No Suggestions from SpellCheck when _text_ field tokenizer set to solr.NGramTokenizerFactory
Posted by Anupam Bhattacharya <an...@gmail.com>.
Please help me understand the root cause of this behavior.
The text_spellcheck fieldType uses solr.StandardTokenizerFactory
and has no relation to the text_general field type, which uses the
solr.NGramTokenizerFactory tokenizer, so why is the Solr spellcheck
service not working as expected?
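A possible line of investigation, offered as an assumption rather than a confirmed diagnosis: the /spell handler sets spellcheck.maxResultsForSuggest to 5, and that parameter suppresses suggestions when the query itself returns more than that many hits. If q=te is searched against _text_ (an assumption; the default search field is not shown in this thread), the n-gram index will likely match far more documents than before, because te is now an indexed token for every word that contains it. A diagnostic sketch of the handler with that default disabled:

```xml
<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="spellcheck">on</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">10</str>
    <!-- Diagnostic only: with this default commented out, suggestions are
         no longer gated on the number of hits the main query returns. -->
    <!-- <str name="spellcheck.maxResultsForSuggest">5</str> -->
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

If suggestions come back with this change, the hit count of the n-gram query, not the text_spellcheck analysis, would be the part to look at. Comparing numFound for q=te before and after the tokenizer change is a cheap way to check the same thing without editing the config.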
Regards,
Anupam