Posted to dev@mesos.apache.org by Benjamin Mahler <bm...@apache.org> on 2016/03/08 03:25:50 UTC

Re: State of registrar

Apologies for the long delay.

I wouldn't call it experimental (that comment is stale); you should feel
free to turn on strictness. Strictness enforces that agents removed by an
old master cannot re-join under a new master. This preserves the
steady-state behavior: if the master removes an agent, it does not
allow it to return. Ideally, the flag would be removed and strictness made
the default, but we didn't feel comfortable doing that until we had state
backup support in the master. Turning off strictness provides an escape
hatch if state is lost. However, now that we persist more information than
just the list of agents, this escape hatch doesn't restore the other state
(such as maintenance schedules and quota information).
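
To make the behavior concrete, here is a minimal sketch in Python of the
re-registration decision described above. It is a toy model, not Mesos
code: `Registry`, `Master`, and `allow_reregistration` are hypothetical
names, and the registry is assumed to be nothing more than a set of
admitted agent IDs that survives master failover.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """Toy stand-in for the persisted registry (assumption: a set of agent IDs)."""
    admitted: set = field(default_factory=set)

    def admit(self, agent_id: str) -> None:
        self.admitted.add(agent_id)

    def remove(self, agent_id: str) -> None:
        self.admitted.discard(agent_id)


class Master:
    """Hypothetical master that consults the registry when an agent re-registers."""

    def __init__(self, registry: Registry, strict: bool) -> None:
        self.registry = registry
        self.strict = strict

    def allow_reregistration(self, agent_id: str) -> bool:
        # An agent still listed in the registry may always return.
        if agent_id in self.registry.admitted:
            return True
        # An unknown agent was either removed earlier or the state was lost.
        # Strict mode refuses it; non-strict mode lets it back in.
        return not self.strict


# The old master admits two agents, then removes one of them.
registry = Registry()
registry.admit("agent-1")
registry.admit("agent-2")
registry.remove("agent-2")

# After failover, a new master reads the same registry.
strict_master = Master(registry, strict=True)
lenient_master = Master(registry, strict=False)
```

In this sketch `strict_master` refuses "agent-2" while `lenient_master`
accepts it. If the registry were lost entirely (an empty `Registry`), the
lenient master would still accept every returning agent, which is the
escape hatch, but nothing else that had been persisted would come back.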

As for why it's not on by default today, we found that many frameworks,
like Aurora and Marathon, are capable of handling a removed agent
re-surfacing in the cluster, so it wasn't critical to turn this on.
We also realized that we need to re-work the partition handling in
Mesos in order to give frameworks control over how to react to an
unreachable agent.

Does that clarify things?

On Mon, Feb 1, 2016 at 11:04 AM, Zhitao Li <zh...@uber.com> wrote:

> Hi,
>
> I've been reading related documentation on Mesos website and trying to
> understand the current status of registrar.
>
> I noticed that we still consider "--registrar_strict" as experimental, but
> I can't find the back story of what's needed to finish the project or the
> JIRA tracking that.
>
> Also, does anyone have recommendations on whether we should turn this flag
> on, and what benefits cluster operators would get?
>
> Thanks.