Posted to issues@lucene.apache.org by "Michael (Jira)" <ji...@apache.org> on 2020/02/19 13:26:00 UTC

[jira] [Commented] (SOLR-14262) local commit is (silently - no rf support) ignored during replay

    [ https://issues.apache.org/jira/browse/SOLR-14262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040026#comment-17040026 ] 

Michael commented on SOLR-14262:
--------------------------------

Some possibly related issues:
Open:
https://issues.apache.org/jira/browse/SOLR-3888  - "need better handling of external add/commit requests during tlog recovery"
Closed:
https://issues.apache.org/jira/browse/SOLR-12011
https://issues.apache.org/jira/browse/SOLR-9366

> local commit is (silently - no rf support) ignored during replay
> ----------------------------------------------------------------
>
>                 Key: SOLR-14262
>                 URL: https://issues.apache.org/jira/browse/SOLR-14262
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Chris M. Hostetter
>            Priority: Major
>
> Summarizing an issue discovered by Michael Frank and reported to the solr-user mailing list in this thread...
> [https://lists.apache.org/thread.html/%3CCAGgV7souCSbhM4+CnhVVTRJxtZBBVpnaXsY-7VSKSfPaR_aHVQ@mail.gmail.com%3E]
> Situation:
> * chaos testing of add+commit while randomly bringing nodes up/down
> * test client checks rf of every add
> ** commit does not support rf
> * after adding a doc (and confirming expected rf) + committing, it's possible to issue a search that gets back a "stale" version of the doc
> Analysis by Michael...
> {quote}
> We traced the problem down to DistributedUpdateProcessor.doLocalCommit()
> which is *silently* dropping all commits while the replica is currently
> inactive and replaying; it immediately returns and still reports status=0.
> ...
> The issue we have is the "silent" part. If upon receiving a commit request
> the replica
> * would either wait to become healthy and then commit and return,
> honoring waitSearcher=true (which is what we expected from reading the
> documentation)
> * or at least behave consistently the same way as all other
> UpdateRequests and report back the achieved replication factor with the
> "rf" response parameter
> we could easily detect the degraded cluster state in the client and keep
> re-trying the commit till "rf" matches the number of replicas.
> {quote}
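
For illustration only, the client-side retry described in the quote could look roughly like the sketch below. It assumes commits reported an achieved replication factor, which per this issue they currently do not. CommitRetry, commitUntilFullyReplicated and the commitAndGetRf callback are hypothetical names, not SolrJ API; the callback stands in for "send a commit and read back rf".

```java
import java.util.function.IntSupplier;

/** Hypothetical client-side helper; not part of Solr or SolrJ. */
final class CommitRetry {

    /**
     * Re-issues a commit until the reported rf reaches expectedReplicas.
     *
     * @param commitAndGetRf   assumed callback: sends one commit, returns the achieved rf
     * @param expectedReplicas number of replicas the client expects to acknowledge
     * @param maxAttempts      upper bound on commit attempts (must be >= 1)
     * @param pauseMillis      pause between attempts, in milliseconds
     * @return the number of attempts that were needed
     * @throws IllegalStateException if rf never reached expectedReplicas
     */
    static int commitUntilFullyReplicated(IntSupplier commitAndGetRf,
                                          int expectedReplicas,
                                          int maxAttempts,
                                          long pauseMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int rf = commitAndGetRf.getAsInt();
            if (rf >= expectedReplicas) {
                return attempt; // every expected replica acknowledged the commit
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis); // give a recovering replica time to catch up
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    throw new IllegalStateException("interrupted while waiting to retry commit", e);
                }
            }
        }
        throw new IllegalStateException(
                "commit did not reach rf=" + expectedReplicas + " after " + maxAttempts + " attempts");
    }
}
```

A test client would pass its own commit call as the callback, e.g. CommitRetry.commitUntilFullyReplicated(myCommit, 2, 10, 500L), and treat the IllegalStateException as "cluster still degraded".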


