Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2018/04/09 22:16:12 UTC

[GitHub] nickva opened a new issue #1276: Return "crashing" state from `_scheduler/docs` immediately after the first crash

URL: https://github.com/apache/couchdb/issues/1276
 
 
   When a replication job that has been running for a while crashes with an error and is stopped, its status in the `_scheduler/docs` endpoint response should be `crashing` instead of `pending`.
   
   The `crashing` vs `pending` status is driven by the "consecutive errors" count. This is computed as the number of crashes that occur in a row, for example a crash soon after a job start or a crash soon after another crash. If a crash happens long enough after a job start or a previous crash, the consecutive errors count is reset to 0 and the job is considered healthy.
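
To make the counting rule concrete, here is a rough sketch in Python. It is not the actual replicator code: the history format (a list of `(timestamp, event)` tuples), the event names `started` / `crashed`, and the `threshold` parameter standing in for "long enough" are all assumptions made for illustration.

```python
def consecutive_errors(history, threshold):
    """Count crashes in a row from a chronological job history.

    history   -- list of (timestamp_seconds, event) tuples; event names
                 "started" and "crashed" are hypothetical
    threshold -- hypothetical number of seconds after which a job that
                 kept running is considered healthy again
    """
    count = 0
    prev_time = None  # time of the previous start or crash
    for ts, event in history:
        if event == "crashed":
            if prev_time is not None and ts - prev_time < threshold:
                # Crash soon after a start or another crash: one more in a row.
                count += 1
            else:
                # The job ran long enough before crashing: reset the count.
                count = 0
        if event in ("started", "crashed"):
            prev_time = ts
    return count
```

With an assumed threshold of 60 seconds, a job that starts at t=0 and crashes at t=300 ends up with a count of 0, which is the case this issue is about.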
   
   So the following sequence is possible: a user starts a job, lets it run for 5 minutes, then deletes the source. The output of `_scheduler/docs` will show state = `pending` since the consecutive errors count is still 0, but the user would rather see the state as `crashing` in the result.
   
   The fix is to return `crashing` if a crash is the last event in the job history, even if the errors count is still 0.
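
A rough sketch of the proposed change, in Python rather than the replicator's own code. The function names, the `(timestamp, event)` history format, and the `crashed` event name are assumptions for illustration only, and the sketch covers only jobs that are not currently running.

```python
def state_before_fix(error_count):
    # Current behavior: the state depends only on the consecutive errors count.
    return "crashing" if error_count > 0 else "pending"


def state_after_fix(error_count, history):
    """Proposed behavior for a job that is not running.

    history -- chronological list of (timestamp, event) tuples; the
               "crashed" event name is hypothetical
    """
    # Also report "crashing" when the most recent history event is a crash,
    # even though the consecutive errors count is still 0.
    last_is_crash = bool(history) and history[-1][1] == "crashed"
    if error_count > 0 or last_is_crash:
        return "crashing"
    return "pending"
```

For the scenario above (start, run for 5 minutes, crash), `state_before_fix(0)` gives `pending`, while `state_after_fix(0, [(0, "started"), (300, "crashed")])` gives `crashing`.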
