Posted to user@mesos.apache.org by Ying Ji <ji...@gmail.com> on 2015/07/02 02:08:38 UTC

Questions about Framework/Scheduler

Hey, I am new to Mesos and have just started to investigate it. I have a
fundamental question about frameworks.

Assume I am using a long-lived framework. How does the Mesos master detect
that the framework is unavailable, for example because of a network error
or an internal error in the framework? (I cannot find it in master.cpp.
Could you please point me to the source code?) The framework has registered
with the master successfully and has been running for a while.

Thanks

Ying

Re: Questions about Framework/Scheduler

Posted by Adam Bordelon <ad...@mesosphere.io>.
See Master::exited()
https://github.com/apache/mesos/blob/0.22.1/src/master/master.cpp#L878
which overrides Process::exited()
https://github.com/apache/mesos/blob/0.22.1/3rdparty/libprocess/include/process/process.hpp#L55
In the event of a temporary network partition, the Mesos master will
continue trying to send offers, status updates, and other messages to the
framework scheduler. Because status updates use reliable at-least-once
delivery, they are queued (per task) on the slave until the scheduler
acknowledges them.
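
To make the at-least-once behaviour concrete, here is a minimal Python
sketch of a per-task queue that keeps resending the oldest status update
until it is acknowledged. This is only a toy model, not the Mesos
implementation: the names StatusUpdateQueue, deliver, send, and
max_attempts are hypothetical, and the real slave retries on a timer
rather than in a loop.

```python
from collections import deque


class StatusUpdateQueue:
    """Toy per-task queue of status updates awaiting acknowledgement.

    Hypothetical illustration only; not the Mesos data structure.
    """

    def __init__(self):
        # task_id -> deque of (update_id, state), oldest first
        self.pending = {}

    def enqueue(self, task_id, update_id, state):
        self.pending.setdefault(task_id, deque()).append((update_id, state))

    def next_to_send(self, task_id):
        # Only the oldest unacknowledged update for a task is sent.
        queue = self.pending.get(task_id)
        return queue[0] if queue else None

    def acknowledge(self, task_id, update_id):
        # Drop the head of the queue only if the ack matches it.
        queue = self.pending.get(task_id)
        if queue and queue[0][0] == update_id:
            queue.popleft()
            return True
        return False


def deliver(queue, task_id, send, max_attempts=5):
    """Resend the head update until `send` reports an acknowledgement.

    `send(update)` is an assumed callback that returns True when the
    scheduler acknowledged the update and False when the message or
    its ack was lost. Returns the number of send attempts made.
    """
    attempts = 0
    while attempts < max_attempts:
        update = queue.next_to_send(task_id)
        if update is None:
            break  # nothing left to deliver for this task
        attempts += 1
        if send(update):
            queue.acknowledge(task_id, update[0])
    return attempts
```

Because an update is removed only after its acknowledgement arrives, the
scheduler can see the same update more than once (for example when the
ack is lost), so it should handle duplicates.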
