Posted to dev@mesos.apache.org by David Greenberg <ds...@gmail.com> on 2014/03/07 23:05:40 UTC

What happens when I call reconcileTasks and database divergence

I am trying to figure out how to use reconcileTasks to ensure that my DB of
tasks is synchronized with Mesos's tasks. Right now, I first commit the
fact that I ran a task to the DB, then I call launchTasks. My concern is
that the launchTasks call could have failed by the time I use
reconcileTasks to check that the DB state matches the Mesos state, and I'm
not sure how the application can discover that a task it thought it
submitted was never submitted.
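
To make the divergence concrete, here is a rough sketch of the check the
application would need. It does not use the Mesos API: TaskRecord,
find_unconfirmed, last_status and grace_seconds are all hypothetical names,
and treating "no status update within a grace period" as suspicious is an
assumption, not a guarantee Mesos provides.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class TaskRecord:
    """A row in the framework's own task DB (hypothetical schema)."""
    task_id: str
    committed_at: float  # seconds since epoch when the row was written


def find_unconfirmed(
    db_tasks: Iterable[TaskRecord],
    last_status: Dict[str, str],
    now: float,
    grace_seconds: float = 60.0,
) -> List[str]:
    """Return IDs of tasks recorded in the DB that were never reported on.

    last_status maps task_id -> most recent state string seen in a status
    update. A task absent from it after the grace period is only a suspect:
    the launch may have been lost, or updates may simply be delayed.
    """
    suspects = []
    for task in db_tasks:
        if task.task_id in last_status:
            continue  # some status update has been seen for this task
        if now - task.committed_at >= grace_seconds:
            suspects.append(task.task_id)
    return sorted(suspects)


# Example data: task-1 was confirmed, task-2 never was, task-3 is too
# recent to judge.
db = [
    TaskRecord("task-1", committed_at=0.0),
    TaskRecord("task-2", committed_at=0.0),
    TaskRecord("task-3", committed_at=95.0),
]
statuses = {"task-1": "TASK_RUNNING"}
print(find_unconfirmed(db, statuses, now=100.0))
```

The open question is what to do with the suspects this returns, since the
DB alone cannot tell a lost launch apart from a delayed update.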

How do other frameworks deal with synchronizing their state with the Mesos
state?

Re: What happens when I call reconcileTasks and database divergence

Posted by Benjamin Mahler <be...@gmail.com>.
This is a great question and is the primary motivation behind:
https://issues.apache.org/jira/browse/MESOS-295

To guarantee frameworks can maintain a consistent view of their tasks
(without a custom reconciliation mechanism, as is used in Aurora), we will
be implementing the Registrar to persist a small amount of state in the
Master:
https://issues.apache.org/jira/browse/MESOS-764

I'm also planning to write a document describing how to properly implement
state reconciliation from a framework developer's perspective, once the
Registrar is released.

After discussing with other committers, it's likely that the Registrar will
be released as follows to provide the smoothest upgrade path:
  1. Initial Registrar release. By default, this will provide the same
     semantics as before (--strict=false).
  2. Subsequent release. By default, this will provide strict semantics
     (--strict=true).

Only when we're operating in a --strict manner can frameworks fully
reconcile state against the Master.

The design doc may shed some light here, but some things are out-of-date
(including --strict, which does not correspond to what I've described here):
https://cwiki.apache.org/confluence/display/MESOS/Registrar+Design+Document

I'll update the doc in the coming weeks. Let me know if you have other
questions!


On Sun, Mar 9, 2014 at 9:47 PM, Vinod Kone <vi...@gmail.com> wrote:

> Hey David,
>
> You might want to look at Aurora and Marathon to see how they do state
> reconciliation.
>
> We are working on a new feature, adding persistent state to master
> (MESOS-764) <https://issues.apache.org/jira/browse/MESOS-764>, that should
> make reconciliation even easier.
>

Re: What happens when I call reconcileTasks and database divergence

Posted by Vinod Kone <vi...@gmail.com>.
Hey David,

You might want to look at Aurora and Marathon to see how they do state
reconciliation.

We are working on a new feature, adding persistent state to master
(MESOS-764) <https://issues.apache.org/jira/browse/MESOS-764>, that should
make reconciliation even easier.