Posted to reviews@mesos.apache.org by Klaus Ma <kl...@gmail.com> on 2016/02/21 15:55:55 UTC
Re: Review Request 37531: Fix master CHECK failure if a framework uses
duplicated task id.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37531/#review120073
-----------------------------------------------------------
ping @jieyu/vinodkone.
- Klaus Ma
On Jan. 13, 2016, 10:06 p.m., Klaus Ma wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/37531/
> -----------------------------------------------------------
>
> (Updated Jan. 13, 2016, 10:06 p.m.)
>
>
> Review request for mesos, Jie Yu and Vinod Kone.
>
>
> Bugs: MESOS-3070
> https://issues.apache.org/jira/browse/MESOS-3070
>
>
> Repository: mesos
>
>
> Description
> -------
>
> __Phenomenon:__
> The master crashes with a CHECK failure when a framework uses a duplicated task ID.
>
> __Root Cause:__
> Task IDs are stored on the slave (agent). After a master failover, there is a time window in which a new slave can launch a task with the same task ID as a task on a slave that has not yet re-registered. When the old slave then re-registers with its task, the master crashes because of the duplicated task ID.
>
> __Solution:__
> Store task info in Master::Framework keyed by SlaveID, so that the same task ID on different slaves no longer causes a duplicate.
>
>
> Diffs
> -----
>
> src/master/http.cpp bcafc7aff89659a68352f3876ce6042f8b34bd5d
> src/master/master.hpp f02d165874fa8023675e545793de699aeecae29b
> src/master/master.cpp c122c30d943813fc3ce9e7025783c7231809b022
> src/tests/master_tests.cpp 223b9d20a3a8a8194a3a6a605ec2394c37ab5957
>
> Diff: https://reviews.apache.org/r/37531/diff/
>
>
> Testing
> -------
>
> make
> make check
>
>
> Thanks,
>
> Klaus Ma
>
>
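For readers skimming the thread, the Solution in the quoted request can be illustrated roughly as follows. This is a minimal sketch only, not the code from the diff: `Framework`, `Task`, `addTask`, `removeTask` and `taskCount` are hypothetical, simplified stand-ins, and `SlaveID`/`TaskID` are plain strings here rather than the Mesos protobuf types. The point is just that keying tasks by slave first and task ID second lets the same task ID exist on two slaves, and that a true duplicate on one slave can be reported instead of aborting the process.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-ins for the Mesos ID types (assumption: plain strings).
using SlaveID = std::string;
using TaskID = std::string;

struct Task {
  TaskID id;
  SlaveID slaveId;
};

// Simplified, hypothetical model of the per-framework task bookkeeping.
class Framework {
public:
  // Records a task under the slave it runs on. Returns false, rather than
  // aborting, if that slave already has a task with the same ID.
  bool addTask(const Task& task) {
    std::map<TaskID, Task>& perSlave = tasks[task.slaveId];
    return perSlave.emplace(task.id, task).second;
  }

  // Removes a task from the given slave. Returns true if a task was removed.
  bool removeTask(const SlaveID& slaveId, const TaskID& taskId) {
    auto it = tasks.find(slaveId);
    if (it == tasks.end()) {
      return false;
    }
    const bool removed = it->second.erase(taskId) > 0;
    if (it->second.empty()) {
      tasks.erase(it);  // Drop the empty per-slave map.
    }
    return removed;
  }

  // Total number of tasks across all slaves.
  std::size_t taskCount() const {
    std::size_t n = 0;
    for (const auto& entry : tasks) {
      n += entry.second.size();
    }
    return n;
  }

private:
  // Outer key: slave; inner key: task ID.
  std::map<SlaveID, std::map<TaskID, Task>> tasks;
};
```

With this layout, adding "task-1" on "agent-A" and again on "agent-B" both succeed, while a second "task-1" on "agent-A" is rejected by `addTask` rather than crashing.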