Posted to issues@lucene.apache.org by "Mike Drob (Jira)" <ji...@apache.org> on 2019/10/01 17:57:00 UTC

[jira] [Commented] (SOLR-13797) SolrResourceLoader produces inconsistent results when given bad arguments

    [ https://issues.apache.org/jira/browse/SOLR-13797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942192#comment-16942192 ] 

Mike Drob commented on SOLR-13797:
----------------------------------

The change here is pretty minor but also touches some really core code so I'd feel more comfortable if somebody else takes a look at the patch before I push this. Thanks!

> SolrResourceLoader produces inconsistent results when given bad arguments
> -------------------------------------------------------------------------
>
>                 Key: SOLR-13797
>                 URL: https://issues.apache.org/jira/browse/SOLR-13797
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 7.7.2, 8.2
>            Reporter: Mike Drob
>            Assignee: Mike Drob
>            Priority: Major
>         Attachments: SOLR-13797.v1.patch, SOLR-13797.v2.patch
>
>
> SolrResourceLoader will attempt to do some magic to infer what the user wanted when loading TokenFilter and Tokenizer classes. However, this can end up putting the wrong class in the cache, such that the request succeeds the first time but fails on subsequent calls. It should either succeed or fail consistently on every call.
> This can be triggered in a variety of ways, but perhaps the simplest is specifying the wrong element type in an analysis chain. Consider the field type definition:
> {code:xml}
> <fieldType name="text_en_partial" class="solr.TextField">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.NGramTokenizerFactory" minGramSize="1" maxGramSize="2"/>
>   </analyzer>
> </fieldType>
> {code}
> If this is loaded by itself (e.g. in a Docker container for standalone validation), the schema will validate and collection creation will succeed, with Solr actually figuring out that it needs an {{NGramTokenFilterFactory}}. However, if it is loaded on a cluster with other collections where {{NGramTokenizerFactory}} has already been loaded correctly, we get a {{ClassCastException}}. Or, if this collection is loaded first, other collections using the Tokenizer will fail instead.
> I'd argue that succeeding on both calls is the better approach: it does what the user likely wants rather than what the user explicitly asked for, and creates a nicer user experience that is marginally less pedantic.
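For anyone reviewing the patch, the failure mode described in the issue can be sketched roughly as follows. This is a minimal, self-contained illustration, not Solr's actual code: the names {{SimpleLoader}}, {{findClass}}, and {{infer}} are hypothetical, and it assumes a resolution cache keyed only by the requested class name, not by the expected supertype.

```java
import java.util.HashMap;
import java.util.Map;

public class SimpleLoader {
    // Hypothetical stand-ins for Lucene's TokenizerFactory / TokenFilterFactory.
    interface TokenizerFactory {}
    interface TokenFilterFactory {}
    static class NGramTokenizerFactory implements TokenizerFactory {}
    static class NGramTokenFilterFactory implements TokenFilterFactory {}

    private final Map<String, Class<?>> cache = new HashMap<>();

    // Bug sketch: the expected supertype is not part of the cache key, so
    // whichever class the first lookup resolves to is handed to every later
    // caller, regardless of the type that caller expects.
    <T> Class<? extends T> findClass(String name, Class<T> expected) {
        Class<?> clazz = cache.computeIfAbsent(name, n -> infer(n, expected));
        // Succeeds or throws ClassCastException depending on which caller
        // populated the cache first.
        return clazz.asSubclass(expected);
    }

    // "Magic" inference: if asked for the Tokenizer's name as a filter,
    // quietly substitute the TokenFilter class instead.
    private Class<?> infer(String name, Class<?> expected) {
        if (name.equals("NGramTokenizerFactory") && expected == TokenFilterFactory.class) {
            return NGramTokenFilterFactory.class;
        }
        return NGramTokenizerFactory.class;
    }

    public static void main(String[] args) {
        SimpleLoader loader = new SimpleLoader();
        // First lookup uses the Tokenizer's name in a <filter>; inference
        // "helps" and caches NGramTokenFilterFactory under that name.
        loader.findClass("NGramTokenizerFactory", TokenFilterFactory.class);
        try {
            // A second collection asks for the same name as a <tokenizer>
            // and hits the poisoned cache entry.
            loader.findClass("NGramTokenizerFactory", TokenizerFactory.class);
        } catch (ClassCastException e) {
            System.out.println("second lookup failed: " + e.getMessage());
        }
    }
}
```

A fix presumably needs to either include the expected type in the cache key or resolve the name the same way on every call, so that the first caller's arguments cannot leak into later lookups.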



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
