Posted to user@mesos.apache.org by Neil Conway <ne...@gmail.com> on 2016/07/11 16:03:17 UTC

RFC: partitioned tasks and the strict registry

Folks,

We're working on some Mesos features that will allow frameworks to
control how partitioned tasks are handled [1]. As part of designing
how this will work, I'd love to hear from users and framework
developers about how they handle partitioned tasks/agents. Specifically:

(a) Have you enabled the strict registry? ('--registry_strict' master flag)

(b) If so, do any of your frameworks _depend_ on the semantics
provided by the strict registry? [2]

(c) Does your framework handle LOST tasks? For example, does your
framework account for the fact that LOST tasks might transition back
to RUNNING in certain circumstances?

(d) Suppose we changed the semantics of LOST in the following way: (1)
strict registry is no longer supported, and (2) LOST tasks will
*always* be allowed to reregister with the master and resume running
(even if the master has not failed over). Would this change cause
problems for any of your frameworks?
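
To make (c) and (d) concrete, here is a rough sketch of the kind of
bookkeeping a framework might need if LOST is not treated as terminal.
It is only an illustration: the TaskTracker class, its method, and the
returned action strings are hypothetical and are not part of the Mesos
scheduler API. Only the TASK_LOST and TASK_RUNNING state names are taken
from Mesos.

```python
from dataclasses import dataclass, field


@dataclass
class TaskTracker:
    """Hypothetical framework-side bookkeeping; not a Mesos API."""

    # Last known state for each task ID.
    states: dict = field(default_factory=dict)
    # Maps a LOST task ID to the ID of the replacement launched for it.
    replacements: dict = field(default_factory=dict)

    def on_status_update(self, task_id, state):
        """Record a status update and return a suggested action."""
        previous = self.states.get(task_id)
        self.states[task_id] = state

        if state == "TASK_LOST":
            # The task may still be running on a partitioned agent,
            # so LOST is not treated as a terminal state here.
            return "launch_replacement"

        if state == "TASK_RUNNING" and previous == "TASK_LOST":
            # The agent came back and the task resumed running.
            if task_id in self.replacements:
                # A replacement exists, so two copies are now running.
                return "kill_duplicate"
            return "resume"

        return "none"
```

A framework that instead assumes LOST is final would skip the second
branch and could end up with two copies of the same task.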

Answering "I don't know" to any of these questions is fine :) Feel
free to respond to me privately if you'd prefer.

If you have any other feedback or questions, please contact me.

Thanks!

Neil

[1] More information on the proposed changes can be found here:
https://goo.gl/7dRw4Q

[2] e.g., your framework assumes that LOST tasks will never go back to RUNNING.