Posted to issues@lucene.apache.org by "Mike Drob (Jira)" <ji...@apache.org> on 2019/09/26 22:12:00 UTC

[jira] [Created] (SOLR-13797) SolrResourceLoader produces inconsistent results when given bad arguments

Mike Drob created SOLR-13797:
--------------------------------

             Summary: SolrResourceLoader produces inconsistent results when given bad arguments
                 Key: SOLR-13797
                 URL: https://issues.apache.org/jira/browse/SOLR-13797
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
    Affects Versions: 8.2, 7.7.2
            Reporter: Mike Drob
            Assignee: Mike Drob


SolrResourceLoader attempts some magic to infer what the user wanted when loading TokenFilter and Tokenizer classes. However, this can put the wrong class in the cache, so that a request succeeds the first time but fails on subsequent calls. It should either succeed or fail consistently on every call.

This can be triggered in a variety of ways, but perhaps the simplest is to specify the wrong element type in an index analyzer chain. Consider the field type definition:

{code:xml}
<fieldType name="text_en_partial" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.NGramTokenizerFactory" minGramSize="1" maxGramSize="2"/>
  </analyzer>
</fieldType>
{code}

If loaded by itself (e.g. in a Docker container for standalone validation), the schema will pass and the collection will succeed, with Solr actually figuring out that it needs an {{NGramTokenFilterFactory}}. However, if it is loaded on a cluster with other collections where {{NGramTokenizerFactory}} has already been loaded correctly, we get a {{ClassCastException}}. Conversely, if this collection is loaded first, the other collections that use the Tokenizer will fail instead.
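
The order dependence described above can be reproduced with a small stand-alone model. This is only a sketch under stated assumptions, not the actual SolrResourceLoader code: the class {{LoaderCacheSketch}}, the helper {{lookupByShortName}}, and the nested factory types are hypothetical stand-ins. The assumption being illustrated is a cache keyed by the requested class name alone, combined with a lookup that infers the class from the expected type.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of a name-keyed class cache. All names here are
// hypothetical stand-ins; this is not Solr's implementation.
class LoaderCacheSketch {
    interface TokenizerFactory {}
    interface TokenFilterFactory {}
    static class NGramTokenizerFactory implements TokenizerFactory {}
    static class NGramTokenFilterFactory implements TokenFilterFactory {}

    // Assumption: the cache key is the requested name only,
    // so the expected type is not part of the key.
    private final Map<String, Class<?>> cache = new HashMap<>();

    // The "magic": resolve a short name against the expected type.
    private static Class<?> lookupByShortName(String shortName, Class<?> expected) {
        if (shortName.equals("NGram") && expected == TokenizerFactory.class) {
            return NGramTokenizerFactory.class;
        }
        if (shortName.equals("NGram") && expected == TokenFilterFactory.class) {
            return NGramTokenFilterFactory.class;
        }
        throw new IllegalArgumentException("Unknown class: " + shortName);
    }

    <T> Class<? extends T> findClass(String name, Class<T> expected) {
        Class<?> cached = cache.get(name);
        if (cached != null) {
            // Throws ClassCastException when an earlier call cached a
            // class of a different type under the same name.
            return cached.asSubclass(expected);
        }
        // Strip the "solr." prefix and the factory suffix: "NGramTokenizerFactory" -> "NGram".
        String shortName = name.replaceFirst("^solr\\.", "")
                .replaceFirst("(Tokenizer|TokenFilter|Filter)Factory$", "");
        Class<?> found = lookupByShortName(shortName, expected);
        cache.put(name, found);
        return found.asSubclass(expected);
    }

    public static void main(String[] args) {
        LoaderCacheSketch loader = new LoaderCacheSketch();
        // The <filter> element asks for a TokenFilterFactory: inferred and cached.
        System.out.println(loader.findClass("solr.NGramTokenizerFactory", TokenFilterFactory.class));
        try {
            // A correct <tokenizer> element now hits the poisoned cache entry.
            loader.findClass("solr.NGramTokenizerFactory", TokenizerFactory.class);
        } catch (ClassCastException e) {
            System.out.println("Second lookup failed: " + e.getMessage());
        }
    }
}
```

In this model, whichever request arrives first decides what is cached under the name, so the second request with a different expected type fails regardless of which of the two was correctly configured.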

I'd argue that succeeding on both calls is the better approach because it does what the user likely wants instead of what the user explicitly asks for, and creates a nicer user experience that is marginally less pedantic.
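
One possible way to get the "succeed on both calls" behavior is to make the expected type part of the cache key. The following is a minimal sketch of that idea only, not a proposed patch and not necessarily how this would be resolved in Solr: {{TypedCacheSketch}}, {{resolve}}, and the nested factory types are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a cache keyed by (expected type, requested name).
// All names are hypothetical stand-ins; this is not Solr's implementation.
class TypedCacheSketch {
    interface TokenizerFactory {}
    interface TokenFilterFactory {}
    static class NGramTokenizerFactory implements TokenizerFactory {}
    static class NGramTokenFilterFactory implements TokenFilterFactory {}

    // Outer key: expected base type. Inner key: requested class name.
    private final Map<Class<?>, Map<String, Class<?>>> cache = new HashMap<>();

    // Same inference as before: strip prefix and suffix, then pick by expected type.
    private static Class<?> resolve(String name, Class<?> expected) {
        String shortName = name.replaceFirst("^solr\\.", "")
                .replaceFirst("(Tokenizer|TokenFilter|Filter)Factory$", "");
        if (shortName.equals("NGram") && expected == TokenizerFactory.class) {
            return NGramTokenizerFactory.class;
        }
        if (shortName.equals("NGram") && expected == TokenFilterFactory.class) {
            return NGramTokenFilterFactory.class;
        }
        throw new IllegalArgumentException("Unknown class: " + name);
    }

    <T> Class<? extends T> findClass(String name, Class<T> expected) {
        Map<String, Class<?>> perType = cache.computeIfAbsent(expected, k -> new HashMap<>());
        // An entry cached for one expected type can no longer shadow
        // a lookup of the same name under a different expected type.
        Class<?> found = perType.computeIfAbsent(name, n -> resolve(n, expected));
        return found.asSubclass(expected);
    }
}
```

With this keying, the result of a lookup no longer depends on which collection was loaded first. The trade-off is that the mismatched {{<filter>}} element keeps working silently, which is the behavior argued for above.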



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
