Posted to issues@lucene.apache.org by "Profimedia (Jira)" <ji...@apache.org> on 2019/10/23 09:02:00 UTC
[jira] [Created] (SOLR-13861) SynonymGraphFilterFactory - with pattern tokenizer - not able to start

Profimedia created SOLR-13861:
---------------------------------

             Summary: SynonymGraphFilterFactory - with pattern tokenizer - not able to start
                 Key: SOLR-13861
                 URL: https://issues.apache.org/jira/browse/SOLR-13861
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: search
    Affects Versions: 7.7.2
            Reporter: Profimedia


Hi,

we have a problem defining a SynonymGraphFilterFactory that uses SimplePatternTokenizerFactory. It seems that Solr loses the tokenizerFactory.pattern attribute while processing the schema.

 
{code:xml}
<fieldType name="text_synonym" class="solr.TextField"  >
		<analyzer type="index">
			<tokenizer class="solr.SimplePatternTokenizerFactory" pattern="[^,]+"/>
		</analyzer>
		<analyzer type="query">
			<tokenizer class="solr.SimplePatternTokenizerFactory" pattern="[^,]+"/>
			<filter class="solr.SynonymGraphFilterFactory"
					synonyms="synonyms.txt"
					expand="false"
					tokenizerFactory="solr.SimplePatternTokenizerFactory"
					tokenizerFactory.pattern="[^,]+" />
		</analyzer>
	</fieldType>
{code}
We get an exception like this:
{code:java}
Caused by: java.lang.IllegalArgumentException: Configuration Error: missing parameter 'pattern'
        at org.apache.lucene.analysis.util.AbstractAnalysisFactory.require(AbstractAnalysisFactory.java:97)
        at org.apache.lucene.analysis.pattern.SimplePatternTokenizerFactory.<init>(SimplePatternTokenizerFactory.java:68)
        ... 58 more
{code}
We debugged this issue and found that the problem is in this method, which is called more than once:
{code:java}
// (there are no tests for this functionality)
  private TokenizerFactory loadTokenizerFactory(ResourceLoader loader, String cname) throws IOException {
    Class<? extends TokenizerFactory> clazz = loader.findClass(cname, TokenizerFactory.class);
    try {
      TokenizerFactory tokFactory = clazz.getConstructor(Map.class).newInstance(tokArgs);
      if (tokFactory instanceof ResourceLoaderAware) {
        ((ResourceLoaderAware) tokFactory).inform(loader);
      }
      return tokFactory;
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }
{code}
On the first call, the tokArgs map is cleared. On the second call, the map is therefore empty and Solr reports that the parameter 'pattern' is missing.

We applied a workaround like this:
{code:java}
TokenizerFactory tokFactory = clazz.getConstructor(Map.class).newInstance(new HashMap<>(tokArgs));
{code}
This creates a new map from tokArgs for each call, so only the copy can be cleared and tokArgs itself stays intact. However, I think there should be a better solution to this issue than creating a copy of the tokArgs map.

With this change, the filter shown above runs without problems.
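
To illustrate the behaviour outside Solr, here is a minimal, self-contained sketch. TokArgsDemo and ConsumingFactory are hypothetical stand-ins, not Lucene classes. The sketch assumes that the factory constructor removes each parameter it reads from the map it receives, which matches the cleared tokArgs we saw in the debugger. It only shows why sharing one map between two constructor calls fails and why a per-call copy avoids that.

```java
import java.util.HashMap;
import java.util.Map;

public class TokArgsDemo {

  // Hypothetical stand-in for a tokenizer factory (not a Lucene class).
  // Assumption: the constructor removes the parameters it reads from the map.
  static final class ConsumingFactory {
    final String pattern;

    ConsumingFactory(Map<String, String> args) {
      String value = args.remove("pattern"); // consumes the caller's entry
      if (value == null) {
        throw new IllegalArgumentException(
            "Configuration Error: missing parameter 'pattern'");
      }
      this.pattern = value;
    }
  }

  // Passes the same map to two constructor calls; the second call finds it empty.
  static boolean secondCallFailsWithSharedMap() {
    Map<String, String> tokArgs = new HashMap<>();
    tokArgs.put("pattern", "[^,]+");
    new ConsumingFactory(tokArgs); // first call succeeds and empties the map
    try {
      new ConsumingFactory(tokArgs);
      return false;
    } catch (IllegalArgumentException expected) {
      return true;
    }
  }

  // Gives every constructor call its own copy, as in the workaround above.
  static boolean secondCallSucceedsWithCopy() {
    Map<String, String> tokArgs = new HashMap<>();
    tokArgs.put("pattern", "[^,]+");
    new ConsumingFactory(new HashMap<>(tokArgs));
    ConsumingFactory second = new ConsumingFactory(new HashMap<>(tokArgs));
    // The original map is untouched, so later calls would also succeed.
    return "[^,]+".equals(second.pattern) && tokArgs.containsKey("pattern");
  }

  public static void main(String[] args) {
    System.out.println("shared map, second call fails: "
        + secondCallFailsWithSharedMap());
    System.out.println("copied map, second call succeeds: "
        + secondCallSucceedsWithCopy());
  }
}
```

The copy is shallow, which is sufficient here because both keys and values are immutable strings.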


