Posted to solr-user@lucene.apache.org by "Hodder, Rick" <RH...@navg.com> on 2019/07/30 20:30:24 UTC
SOLR 8.1.1 EdgeNGramFilterFactory parsing query
I have a SOLR 4.10.2 core, and I am upgrading to 8.1.1.
I created an 8.1.1 core manually using the default_config set, and then brought over settings into the 8.1.1 schema.
I have adjusted the schema.xml and solrconfig.xml, and I have the core queryable in 8.1.1.
I have a field named Company:
<field name="Company" type="string" indexed="true" stored="true"/>
<field name="IDX_Company" type="text_general" indexed="true" stored="false" multiValued="true" />
<copyField source="Company" dest="IDX_Company"/>
In 4.10.2 when I run the query:
IDX_Company:blue
with debugQuery on, I see the query parsed into pieces (correctly):
"debug": {
"rawquerystring": "IDX_Company:blue",
"querystring": "IDX_Company:blue",
"parsedquery": "(IDX_Company:b IDX_Company:bl IDX_Company:blu IDX_Company:blue)/no_coord",
...
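For reference, the four terms in the 4.10.2 parsed query are the front edge n-grams of "blue" for minGramSize="1" and maxGramSize="15". The following is only a minimal Python sketch of that expansion, not Lucene's implementation; the function name edge_ngrams is hypothetical and is used purely for illustration.

```python
from typing import List


def edge_ngrams(token: str, min_gram: int = 1, max_gram: int = 15) -> List[str]:
    """Return the front edge n-grams of `token`, shortest first.

    Illustrative only: mirrors the minGramSize/maxGramSize idea of an
    edge n-gram filter, not the actual Lucene code.
    """
    # A gram can never be longer than the token itself.
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


print(edge_ngrams("blue"))  # ['b', 'bl', 'blu', 'blue']
```

If the query analyzer applies the filter, each of these grams should appear as a separate term in parsedquery, as it does in the 4.10.2 output.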
When I run this against 8.1.1, with debugQuery on, I get the following:
"debug":{
"rawquerystring":"IDX_Company:blue",
"querystring":"IDX_Company:blue",
"parsedquery":"IDX_Company:blue",
...
It seems not to be applying the EdgeNGramFilterFactory. The only change I made to the EdgeNGramFilterFactory configuration was to remove the "side" attribute, per the documentation.
Also, per the documentation, I replaced the SynonymFilterFactory with SynonymGraphFilterFactory, and added the FlattenGraphFilterFactory.
I have tried removing the FlattenGraphFilterFactory, clearing and repopulating the core (reindexing), and stopping and restarting SOLR 8.1.1, with no difference.
Here is the definition of text_general I am using in schema.xml:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15"/> <!-- RDH - removed side="front"-->
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<!-- RDH SynonymFilterFactory has been deprecated, replace with SynonymGraphFilterFactory -->
<filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<!-- RDH https://lucene.apache.org/solr/guide/8_1/filter-descriptions.html
Flatten Graph Filter
This filter must be included on INDEX-time analyzer specifications that include at least one graph-aware filter, including Synonym Graph Filter and Word Delimiter Graph Filter.
-->
<filter class="solr.FlattenGraphFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<!-- strip all punctuation -->
<filter class="solr.PatternReplaceFilterFactory" pattern="[^\p{L}\p{N} ]" replacement=" " replace="all" /> <!-- RDH -->
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="15"/> <!-- RDH - removed side="front"-->
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<!-- RDH SynonymFilterFactory is deprecated, replace with SynonymGraphFilterFactory -->
<filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<!-- RDH https://lucene.apache.org/solr/guide/8_1/filter-descriptions.html
Flatten Graph Filter
This filter must be included on INDEX-time analyzer specifications that include at least one graph-aware filter, including Synonym Graph Filter and Word Delimiter Graph Filter.
-->
<filter class="solr.FlattenGraphFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<!-- strip all punctuation -->
<filter class="solr.PatternReplaceFilterFactory" pattern="[^\p{L}\p{N} ]" replacement=" " replace="all" /> <!-- RDH -->
</analyzer>
</fieldType>
Re: SOLR 8.1.1 EdgeNGramFilterFactory parsing query
Posted by Erick Erickson <er...@gmail.com>.
This works fine for me. Are you completely sure that
1> you pushed the changed config to the right place
2> you reloaded your server?
One thing I do is go to the admin UI, select the collection (or core), and bring up the schema file just to be sure that I’m using the schema I think I am.
I’d also check the admin/analysis page to see what that shows; sometimes I can get hints there.
And I’m assuming you’ve completely re-indexed your data, although for query parsing that shouldn’t be relevant.
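The two checks above (which schema is really loaded, and what the analysis chain does with the query text) can also be done over HTTP instead of through the admin UI. Here is a rough Python sketch that only builds the request URLs. The host, port, and the core name "mycore" are assumptions, and it presumes the Schema API and the implicit /analysis/field handler are available on the core; adjust as needed.

```python
from urllib.parse import urlencode

# Assumptions: a local Solr on the default port and a hypothetical core name.
SOLR_BASE = "http://localhost:8983/solr"
CORE = "mycore"


def fieldtype_url(core: str, field_type: str) -> str:
    """URL that asks the Schema API for the field type Solr actually loaded."""
    return f"{SOLR_BASE}/{core}/schema/fieldtypes/{field_type}"


def analysis_url(core: str, field_name: str, query_text: str) -> str:
    """URL that runs `query_text` through the query analyzer of `field_name`."""
    params = urlencode({
        "analysis.fieldname": field_name,
        "analysis.query": query_text,
        "wt": "json",
    })
    return f"{SOLR_BASE}/{core}/analysis/field?{params}"


print(fieldtype_url(CORE, "text_general"))
print(analysis_url(CORE, "IDX_Company", "blue"))
```

Fetching the first URL (with curl or a browser) shows whether the loaded text_general really contains the EdgeNGramFilterFactory in its query analyzer; the second shows the tokens produced at each stage of that analyzer for "blue".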
Best,
Erick