Posted to issues@solr.apache.org by "Chris M. Hostetter (Jira)" <ji...@apache.org> on 2021/03/25 18:25:00 UTC

[jira] [Created] (SOLR-15293) Deprecate/remove overwriteDupes option from SignatureUpdateProcessorFactory

Chris M. Hostetter created SOLR-15293:
-----------------------------------------

             Summary: Deprecate/remove overwriteDupes option from SignatureUpdateProcessorFactory
                 Key: SOLR-15293
                 URL: https://issues.apache.org/jira/browse/SOLR-15293
             Project: Solr
          Issue Type: Sub-task
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Chris M. Hostetter


The design of the {{overwriteDupes}} option of SignatureUpdateProcessorFactory is only viable in single-shard use cases, and even then it currently doesn't work, because UpdateCommand "options" are not included when shard leaders write updates to the tlog or forward them to other replicas (SOLR-8030). With multiple shards it can never be viable without broadcasting a "Delete By Query" to every replica on every document add/update (SOLR-3473), which is vastly less efficient than the current low-level {{updateDocument(Term,...)}} support provided by IndexWriter for replacing documents by uniqueKey.

I think in general we should remove the {{overwriteDupes}} option completely. If SignatureUpdateProcessorFactory is used to generate a synthetic uniqueKey field, then the existing Solr/Lucene behavior of routing the document to the correct shard and replacing any prior instances of that doc will work fine.

The functionality of SignatureUpdateProcessorFactory should be constrained *solely* to generating a signature – if that signature is put in the unique key field, then de-duplication will happen automatically.
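
To illustrate the idea (this is only a rough, self-contained sketch, not Solr or Lucene code): if the signature computed from a document's fields is stored as the document's key, then adding a second document with the same signature simply replaces the first, with no separate delete step. Everything below is hypothetical: the class {{SignatureDedupSketch}}, the helpers {{signature}} and {{add}}, the field names, the use of MD5, and the in-memory Map standing in for an index keyed by uniqueKey.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SignatureDedupSketch {

    // Hypothetical helper: hash the chosen fields of a document into a hex string.
    // MD5 is an arbitrary choice for this sketch.
    static String signature(Map<String, String> doc, List<String> sigFields) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            for (String field : sigFields) {
                // A zero byte separates names and values so that
                // ("ab", "c") and ("a", "bc") do not hash identically.
                md.update(field.getBytes(StandardCharsets.UTF_8));
                md.update((byte) 0);
                md.update(doc.getOrDefault(field, "").getBytes(StandardCharsets.UTF_8));
                md.update((byte) 0);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical stand-in for "add a document": the signature becomes the key,
    // so a later document with the same signature replaces the earlier one.
    static String add(Map<String, Map<String, String>> index,
                      Map<String, String> doc, List<String> sigFields) {
        String id = signature(doc, sigFields);
        Map<String, String> stored = new LinkedHashMap<>(doc);
        stored.put("id", id);
        index.put(id, stored); // replace-by-key; no delete-by-query involved
        return id;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> index = new LinkedHashMap<>();
        List<String> sigFields = List.of("title", "body");
        add(index, Map.of("title", "a", "body", "b", "source", "x"), sigFields);
        add(index, Map.of("title", "a", "body", "b", "source", "y"), sigFields);
        System.out.println("documents in index: " + index.size());
    }
}
```

In this sketch the two documents added in {{main}} differ only in a field that is not part of the signature, so they share a key and the second replaces the first. In a sharded setup the same key would presumably also drive routing, which is why no cross-shard delete would be needed.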



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
