Posted to issues@lucene.apache.org by "Profimedia (Jira)" <ji...@apache.org> on 2019/10/23 09:03:00 UTC

[jira] [Updated] (SOLR-13861) SynonymGraphFilterFactory - with pattern tokenizer - not able to start

     [ https://issues.apache.org/jira/browse/SOLR-13861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Profimedia updated SOLR-13861:
------------------------------
    Labels: synonyms  (was: )

> SynonymGraphFilterFactory - with pattern tokenizer - not able to start
> ----------------------------------------------------------------------
>
>                 Key: SOLR-13861
>                 URL: https://issues.apache.org/jira/browse/SOLR-13861
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: search
>    Affects Versions: 7.7.2
>            Reporter: Profimedia
>            Priority: Major
>              Labels: synonyms
>
> Hi,
> we have a problem defining a SynonymGraphFilterFactory that uses SimplePatternTokenizerFactory. It seems that Solr loses the tokenizerFactory.pattern attribute while processing the schema.
>  
> {code:xml}
> <fieldType name="text_synonym" class="solr.TextField"  >
> 		<analyzer type="index">
> 			<tokenizer class="solr.SimplePatternTokenizerFactory" pattern="[^,]+"/>
> 		</analyzer>
> 		<analyzer type="query">
> 			<tokenizer class="solr.SimplePatternTokenizerFactory" pattern="[^,]+"/>
> 			<filter class="solr.SynonymGraphFilterFactory"
> 					synonyms="synonyms.txt"
> 					expand="false"
> 					tokenizerFactory="solr.SimplePatternTokenizerFactory"
> 					tokenizerFactory.pattern="[^,]+" />
> 		</analyzer>
> 	</fieldType>
> {code}
> We get an exception like this:
> {code:java}
> Caused by: java.lang.IllegalArgumentException: Configuration Error: missing parameter 'pattern'
>         at org.apache.lucene.analysis.util.AbstractAnalysisFactory.require(AbstractAnalysisFactory.java:97)
>         at org.apache.lucene.analysis.pattern.SimplePatternTokenizerFactory.<init>(SimplePatternTokenizerFactory.java:68)
>         ... 58 more
> {code}
> We debugged the issue and found that the problem is in this method, which is called more than once:
> {code:java}
> // (there are no tests for this functionality)
>   private TokenizerFactory loadTokenizerFactory(ResourceLoader loader, String cname) throws IOException {
>     Class<? extends TokenizerFactory> clazz = loader.findClass(cname, TokenizerFactory.class);
>     try {
>       TokenizerFactory tokFactory = clazz.getConstructor(Map.class).newInstance(tokArgs);
>       if (tokFactory instanceof ResourceLoaderAware) {
>         ((ResourceLoaderAware) tokFactory).inform(loader);
>       }
>       return tokFactory;
>     } catch (Exception e) {
>       throw new RuntimeException(e);
>     }
>   }
> {code}
> On the first call, the tokArgs map is cleared. On the second call, Solr then reports the missing 'pattern' parameter.
> We worked around it like this:
> {code:java}
> TokenizerFactory tokFactory = clazz.getConstructor(Map.class).newInstance(new HashMap<>(tokArgs));
> {code}
> This creates a new map from tokArgs for each call, so only the copy can be cleared. I think there should be a better solution to this issue than copying the tokArgs map, though.
> With this change we can run the filter mentioned above without problems.
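
The failure mode and the workaround described above can be reproduced outside Solr with a minimal sketch. ConsumingFactory and TokArgsDemo are hypothetical names, not Lucene or Solr classes. The sketch assumes, as the stack trace suggests, that the factory constructor removes each parameter it reads from the map it is given, so a map shared between calls no longer holds 'pattern' on the second call.

```java
import java.util.HashMap;
import java.util.Map;

final class TokArgsDemo {

    // Hypothetical stand-in for a tokenizer factory whose constructor
    // consumes (removes) the parameters it reads from the argument map.
    static final class ConsumingFactory {
        final String pattern;

        ConsumingFactory(Map<String, String> args) {
            // remove() mutates the caller's map, not a private copy.
            String value = args.remove("pattern");
            if (value == null) {
                throw new IllegalArgumentException(
                        "Configuration Error: missing parameter 'pattern'");
            }
            this.pattern = value;
        }
    }

    // Passes the shared map directly: the first call consumes 'pattern'.
    static ConsumingFactory loadShared(Map<String, String> tokArgs) {
        return new ConsumingFactory(tokArgs);
    }

    // Passes a defensive copy: the shared map is left untouched.
    static ConsumingFactory loadCopied(Map<String, String> tokArgs) {
        return new ConsumingFactory(new HashMap<>(tokArgs));
    }

    public static void main(String[] args) {
        Map<String, String> tokArgs = new HashMap<>();
        tokArgs.put("pattern", "[^,]+");

        loadCopied(tokArgs);
        loadCopied(tokArgs); // works again, tokArgs still holds 'pattern'

        loadShared(tokArgs); // works once, but removes 'pattern' from tokArgs
        try {
            loadShared(tokArgs);
        } catch (IllegalArgumentException e) {
            System.out.println("Second call failed: " + e.getMessage());
        }
    }
}
```

The copy costs one small map allocation per call. Whether that is the right fix for Solr itself, rather than avoiding the repeated call, is left open here, as in the report.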


