Posted to dev@mesos.apache.org by "Yan Xu (JIRA)" <ji...@apache.org> on 2013/12/18 07:09:08 UTC

[jira] [Comment Edited] (MESOS-883) Group's handling of non-retryable errors and local timeout is incorrect

    [ https://issues.apache.org/jira/browse/MESOS-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848997#comment-13848997 ] 

Yan Xu edited comment on MESOS-883 at 12/18/13 6:07 AM:
--------------------------------------------------------

In suggested order:

https://reviews.apache.org/r/16332/
https://reviews.apache.org/r/16333/
https://reviews.apache.org/r/16290/
https://reviews.apache.org/r/16291/
https://reviews.apache.org/r/16345/


was (Author: xujyan):
https://reviews.apache.org/r/16290/
https://reviews.apache.org/r/16291/

> Group's handling of non-retryable errors and local timeout is incorrect
> -----------------------------------------------------------------------
>
>                 Key: MESOS-883
>                 URL: https://issues.apache.org/jira/browse/MESOS-883
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Yan Xu
>            Assignee: Yan Xu
>             Fix For: 0.16.0
>
>
> Currently, both non-retryable errors and local timeouts result in failed Futures, so the client cannot differentiate between the two.
> The clients of the ZK master contender/detector need to terminate when facing non-retryable errors such as authentication failures, and retry when the ZK session times out locally (except for the leading master, which should terminate instead).
> The solution is to make Group's local session timeout behave exactly the same as server-side session expiration, which does not lead to failures. Clients of the contender and detector should therefore terminate when a failure is returned.
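
The proposed behaviour in the description above can be illustrated with a small sketch. This is a simplified model only, not the actual Mesos Group or libprocess Future code: the names Event, Result, Action, report and react are hypothetical and were chosen for illustration. It assumes that, after the change, only non-retryable errors are surfaced to clients as failures.

```cpp
// Simplified model only: these names are hypothetical and are not the
// actual Mesos Group / libprocess Future API.

// Events that a ZooKeeper-backed group might observe.
enum class Event {
  ServerSessionExpired,  // The ZK server reports the session as expired.
  LocalSessionTimeout,   // The client timed the session out locally.
  NonRetryableError      // For example, an authentication failure.
};

// What the group reports to its clients (contender/detector).
enum class Result {
  Recovered,  // Handled inside the group; no failure is surfaced.
  Failed      // Surfaced to the client as a failure.
};

// What the client does in response.
enum class Action { Continue, Terminate };

// Proposed behaviour: a local timeout is treated exactly like a
// server-side expiration, so only non-retryable errors become failures.
constexpr Result report(Event event) {
  return event == Event::NonRetryableError ? Result::Failed
                                           : Result::Recovered;
}

// With that in place a failure is unambiguous, so the client terminates.
constexpr Action react(Result result) {
  return result == Result::Failed ? Action::Terminate : Action::Continue;
}
```

In this model, react(report(Event::LocalSessionTimeout)) evaluates to Action::Continue, while react(report(Event::NonRetryableError)) evaluates to Action::Terminate. The special case of the leading master, which should terminate on a session timeout, is not modelled here.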



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)