You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Gregory Chanan (JIRA)" <ji...@apache.org> on 2014/07/15 21:11:04 UTC

[jira] [Created] (SOLR-6250) Schemaless parsing does not work on a consistent schema

Gregory Chanan created SOLR-6250:
------------------------------------

             Summary: Schemaless parsing does not work on a consistent schema
                 Key: SOLR-6250
                 URL: https://issues.apache.org/jira/browse/SOLR-6250
             Project: Solr
          Issue Type: Improvement
          Components: Schema and Analysis
            Reporter: Gregory Chanan


See this comment (https://issues.apache.org/jira/browse/SOLR-6137?focusedCommentId=14044366&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14044366), reproduced here:

bq. One small issue I noticed is that there is a race between parsing and schema addition. The AddSchemaFieldsUpdateProcessor handles this by only working on a fixed schema, so the schema doesn't change underneath it. If it decides on a schema addition and that fails (because another addition beat it), it will grab the latest schema and retry. But the parsers don't do that so the core's schema can change in the middle of parsing. It may make sense to defend against that by moving the retry code from the AddSchemaFieldsUpdateProcessor to some processor that runs before all the parsers. The downside is if the schema addition fails, you have to rerun all the parsers, but that may be a minor concern.
bq. This may not actually matter. Consider the case tested at the end of the test: two documents are simultaneously inserted with the same field having a Long and Date value. Assume the Date wins the schema "race" and is updated first. While parsing the Long, each parser may see the schema as having a date field or no field. If a valid parser (that is, one that can modify the field value) sees a date field, it won't do any modifications because shouldMutate will fail, leaving the object in whatever state the serializer left it (either Long or String). If it sees no field, it will mutate the object to create a Long object. In either case, we should get an error at the point we actually create the lucene document, because neither a Long nor String-representation-of-a-long can be stored in a Date field. This is pretty difficult to reason about though.




--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org