Posted to issues@lucene.apache.org by "David Smiley (Jira)" <ji...@apache.org> on 2021/01/05 22:26:00 UTC
[jira] [Commented] (SOLR-15018) Atomic update deletes child documents if schema has catch-all ignore field

    [ https://issues.apache.org/jira/browse/SOLR-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17259253#comment-17259253 ] 

David Smiley commented on SOLR-15018:
-------------------------------------

Perhaps we should have a more structured schema that reflects child doc relations, which would solve this.  This idea has come up a few times in conversations about nested docs.  In retrospect, I've also been torn on whether the child documents ought to have been put into a map "to the side" of the normal field values.  There are pros and cons; perhaps that should have been done?  It's not too late for a refactor of that nature, since 9.0 is on the horizon.  Or maybe we need an easier way to iterate over a document's fields so that you can pre-filter the child documents out if you don't want them?  Shrug.

In the meantime, "patches welcome": either a fix here, or code that detects the problem and throws an exception instead of silently deleting the child documents.

> Atomic update deletes child documents if schema has catch-all ignore field
> --------------------------------------------------------------------------
>
>                 Key: SOLR-15018
>                 URL: https://issues.apache.org/jira/browse/SOLR-15018
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: update
>    Affects Versions: 8.6.3
>            Reporter: Andreas Hubold
>            Priority: Major
>              Labels: AtomicUpdate, ChildDocuments, NestedDocuments
>
> Nested child documents disappear when some unrelated fields of a parent document are atomically updated, if the schema contains a catch-all dynamic field to ignore unknown fields like:
> {noformat}
> <dynamicField name="*" type="ignored" />
> <fieldType name="ignored" stored="false" indexed="false" multiValued="true" class="solr.StrField" /> {noformat}
> {{DistributedUpdateProcessor#getUpdatedDocument}} tries to reconstruct the original document, but it does not receive nested documents from {{RealTimeGetComponent#getInputDocument}}. Nested documents are correctly found in the index but get lost when {{RealTimeGetComponent#toSolrInputDocument}} creates a SolrInputDocument for them. The problematic code is:
> {code:java}
> SchemaField sf = schema.getFieldOrNull(fname);
> if (sf != null) {
>   if ((!sf.hasDocValues() && !sf.stored()) || schema.isCopyFieldTarget(sf)) continue;
> } {code}
> The schema lookup resolves the nested documents' field name (loaded from _nest_path_) to the catch-all "ignored" SchemaField. Because that field matches the skip condition above, the nested documents are never added to the SolrInputDocument.
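
The skip described above can be reproduced outside Solr with a small model. The sketch below is only an illustration under stated assumptions: FieldInfo, ChildDoc, getFieldOrNull, rebuild and the keepChildren flag are all hypothetical stand-ins written for this example, not Solr classes or methods, and the copy-field part of the original condition is left out. It shows that a catch-all field which is neither stored nor has docValues makes the same condition drop the child documents, and that exempting child-document values from the check is one possible way to keep them. It is not the fix that was applied to Solr.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: none of these types are Solr classes.
class NestedDocFilterSketch {

    /** Hypothetical stand-in for SchemaField, reduced to the two flags the quoted check reads. */
    static final class FieldInfo {
        final boolean stored;
        final boolean docValues;
        FieldInfo(boolean stored, boolean docValues) {
            this.stored = stored;
            this.docValues = docValues;
        }
    }

    /** Hypothetical stand-in for a nested child document. */
    static final class ChildDoc {
        final String id;
        ChildDoc(String id) { this.id = id; }
    }

    /** Models the catch-all "ignored" type: stored=false and no docValues. */
    static final FieldInfo IGNORED = new FieldInfo(false, false);

    /** Explicitly declared fields of the sample schema (assumed names). */
    static Map<String, FieldInfo> sampleSchema() {
        Map<String, FieldInfo> schema = new LinkedHashMap<>();
        schema.put("id", new FieldInfo(true, false));
        schema.put("title", new FieldInfo(true, false));
        return schema;
    }

    /** A parent document whose "comments" key holds nested child documents. */
    static Map<String, Object> sampleDoc() {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "parent-1");
        doc.put("title", "Parent");
        doc.put("comments", Arrays.asList(new ChildDoc("child-1"), new ChildDoc("child-2")));
        return doc;
    }

    /** Explicit fields win; otherwise fall back to the catch-all (null if the schema has none). */
    static FieldInfo getFieldOrNull(Map<String, FieldInfo> explicit, FieldInfo catchAll, String name) {
        FieldInfo field = explicit.get(name);
        return field != null ? field : catchAll;
    }

    /** True if the value is a child document or a non-empty list of them. */
    static boolean isChildDocs(Object value) {
        if (value instanceof ChildDoc) return true;
        if (value instanceof List) {
            List<?> list = (List<?>) value;
            return !list.isEmpty() && list.get(0) instanceof ChildDoc;
        }
        return false;
    }

    /**
     * Rebuilds the document using the same shape of check as the quoted snippet.
     * keepChildren=true is the hypothetical guard that exempts child documents.
     */
    static Map<String, Object> rebuild(Map<String, Object> doc, Map<String, FieldInfo> explicit,
                                       FieldInfo catchAll, boolean keepChildren) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : doc.entrySet()) {
            FieldInfo sf = getFieldOrNull(explicit, catchAll, entry.getKey());
            if (sf != null) {
                boolean exempt = keepChildren && isChildDocs(entry.getValue());
                if (!exempt && !sf.docValues && !sf.stored) continue; // value is dropped here
            }
            out.put(entry.getKey(), entry.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        // No catch-all: "comments" matches no field, so the children survive.
        System.out.println(rebuild(sampleDoc(), sampleSchema(), null, false).keySet());    // [id, title, comments]
        // Catch-all "ignored" field: "comments" resolves to it and is dropped.
        System.out.println(rebuild(sampleDoc(), sampleSchema(), IGNORED, false).keySet()); // [id, title]
        // Same schema with the hypothetical guard: the children are kept.
        System.out.println(rebuild(sampleDoc(), sampleSchema(), IGNORED, true).keySet());  // [id, title, comments]
    }
}
```

The guard here keys off the value type rather than the field name, so it does not depend on how the schema resolves the name. Whether that is appropriate inside Solr itself is an open question.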



