Posted to issues@lucene.apache.org by "Michael (Jira)" <ji...@apache.org> on 2020/02/19 13:26:00 UTC
[jira] [Commented] (SOLR-14262) local commit is (silently - no rf support) ignored during replay
[ https://issues.apache.org/jira/browse/SOLR-14262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040026#comment-17040026 ]
Michael commented on SOLR-14262:
--------------------------------
Some possibly related issues:
Open:
https://issues.apache.org/jira/browse/SOLR-3888 - "need beter handling of external add/commit requests during tlog recovery"
Closed:
https://issues.apache.org/jira/browse/SOLR-12011
https://issues.apache.org/jira/browse/SOLR-9366
> local commit is (silently - no rf support) ignored during replay
> ----------------------------------------------------------------
>
> Key: SOLR-14262
> URL: https://issues.apache.org/jira/browse/SOLR-14262
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Chris M. Hostetter
> Priority: Major
>
> Summarizing an issue discovered by Michael Frank and reported to the solr-user mailing list in this thread...
> [https://lists.apache.org/thread.html/%3CCAGgV7souCSbhM4+CnhVVTRJxtZBBVpnaXsY-7VSKSfPaR_aHVQ@mail.gmail.com%3E]
> Situation:
> * chaos testing of add+commit while randomly bringing nodes up/down
> * test client checks rf of every add
> ** commit does not support rf
> * after adding a doc (and confirming expected rf) + committing, it's possible to issue a search that gets back a "stale" version of the doc
> Analysis by Michael...
> {quote}
> We traced the problem down to DistributedUpdateProcessor.doLocalCommit()
> which is *silently* dropping all commits while the replica is currently
> inactive and replaying; it immediately returns and still reports status=0.
> ...
> The issue we have is the "silent" part. If upon receiving a commit request
> the replica
> * would either wait to become healthy and then commit and return,
> honoring waitSearcher=true (which is what we expected from reading the
> documentation)
> * or at least behave consistently the same way as all other
> UpdateRequests and report back the achieved replication factor with the
> "rf" response parameter
> we could easily detect the degraded cluster state in the client and keep
> re-trying the commit till "rf" matches the number of replicas.
> {quote}
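
For illustration only, the client-side retry described at the end of the quote could look roughly like the sketch below. It assumes a commit call that reports the achieved replication factor, which is exactly what commit does not do today. The class name, method name and the IntSupplier standing in for the commit call are all hypothetical and are not SolrJ API.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch: not part of Solr or SolrJ.
final class CommitRetrySketch {

    /**
     * Re-issues a commit until the reported rf reaches expectedReplicas
     * or maxAttempts is exhausted.
     *
     * commitReturningRf is an assumed stand-in for "send a commit and
     * return the achieved rf"; commit responses carry no rf today.
     */
    static boolean commitUntilFullyReplicated(IntSupplier commitReturningRf,
                                              int expectedReplicas,
                                              int maxAttempts,
                                              long backoffMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int achievedRf = commitReturningRf.getAsInt();
            if (achievedRf >= expectedReplicas) {
                return true; // every replica acknowledged the commit
            }
            if (attempt < maxAttempts) {
                try {
                    // Degraded cluster: wait before re-trying the commit.
                    Thread.sleep(backoffMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return false;
                }
            }
        }
        return false; // rf never matched the number of replicas
    }
}
```

A test client would call commitUntilFullyReplicated after each add whose rf it has already confirmed, and only search once it returns true.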
--
This message was sent by Atlassian Jira
(v8.3.4#803005)