Posted to user@mesos.apache.org by Ruturaj Dhekane <ru...@gmail.com> on 2016/05/10 16:54:05 UTC

Configuring failover timeouts?

Hi all,

I have a Mesos+Marathon setup deployed with 3 masters and 6 slaves. I have
deployed a MySQL Docker container and a separate front-end container.

I have a health check enabled on the MySQL container.

When I turn off the slave that hosts the MySQL container, DCOS detects the
bad health and marks the slave as unhealthy, but it takes about 15 minutes
before Marathon reschedules the container onto a new host.

How can I reduce this time to, say, 2 minutes? Why doesn't Marathon
reschedule the container the moment its health goes bad? Are there any
parameters I can change to enable this?

I checked the documentation and asked on IRC, but did not get much help.
Please point me to the relevant documentation if I have missed something.

Thank you,
Ruturaj

Re: Configuring failover timeouts?

Posted by Ruturaj Dhekane <ru...@gmail.com>.

This is the health check configuration for the MySQL container:

{
    "protocol": "TCP",
    "portIndex": 0,
    "gracePeriodSeconds": 100,
    "intervalSeconds": 10,
    "timeoutSeconds": 20,
    "maxConsecutiveFailures": 3,
    "ignoreHttp1xx": false
}


DCOS detects the failure fairly quickly, but Marathon takes a long time
to fail the task over. Does Mesos report to Marathon at the same time
that DCOS reports the host as unhealthy?

BTW, is there a difference between a node marked as unhealthy and a
node considered lost?


On Tue, May 10, 2016 at 10:55 PM, Tomek Janiszewski <ja...@gmail.com>
wrote:

> Hi
>
> What is the configuration of the health check? If it is configured with a
> 5 min interval and allows 3 consecutive failures, you end up with a 15 min
> delay between the task being marked unhealthy and a new one being spawned.
>
> Best
> Tomek

Re: Configuring failover timeouts?

Posted by Tomek Janiszewski <ja...@gmail.com>.

Hi

What is the configuration of the health check? If it is configured with a
5 min interval and allows 3 consecutive failures, you end up with a 15 min
delay between the task being marked unhealthy and a new one being spawned.

Best
Tomek
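
The arithmetic in the reply above (interval times allowed consecutive
failures) can be sketched as follows. This is only a rough estimate, not
Marathon's actual scheduling logic: the function name is made up for
illustration, and adding timeoutSeconds to each round is an assumption
about the worst case.

```python
def detection_delay_seconds(interval_seconds, max_consecutive_failures,
                            timeout_seconds=0):
    """Rough estimate of how long repeated failed checks take to add up.

    Assumption (not taken from Marathon's source): each failed check costs
    at most one interval plus one timeout, and the task is acted on only
    after max_consecutive_failures such checks in a row.
    """
    return max_consecutive_failures * (interval_seconds + timeout_seconds)


# Hypothetical configuration from the reply: 5 min interval, 3 failures.
print(detection_delay_seconds(300, 3) / 60)  # 15.0 (minutes)

# Values from the MySQL health check posted earlier in this thread.
print(detection_delay_seconds(10, 3, timeout_seconds=20))  # 90 (seconds)
```

With the posted values (intervalSeconds 10, timeoutSeconds 20,
maxConsecutiveFailures 3) this estimate comes to about 90 seconds, well
short of the 15 minutes observed.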