Posted to user@mesos.apache.org by Jeff Schroeder <je...@computer.org> on 2016/03/31 03:56:47 UTC

Mesos agents across a WAN?

Given regional bare metal Mesos clusters on multiple continents, are there
any known issues running some of the agents over the WAN? Is anyone else
doing it, or is this a terrible idea that I should tell management no on?

A few specifics:

1. Are there any known limitations or configuration gotchas I might
encounter?
2. Does setting up ZK observers in each non-primary dc and pointing the
agents at them exclusively make sense?
3. Are there plans for a Mesos equivalent of something like Ubernetes[1], or
would that be up to each framework?
4. Any suggestions on how best to do agent attributes / constraints for
something like this? I was planning on having the config management add a
"data_center" agent attribute to match on.

Thanks!

[1]
https://github.com/kubernetes/kubernetes/blob/8813c955182e3c9daae68a8257365e02cd871c65/release-0.19.0/docs/proposals/federation.md#kubernetes-cluster-federation

-- 
Jeff Schroeder

Don't drink and derive, alcohol and analysis don't mix.
http://www.digitalprognosis.com

Re: Mesos agents across a WAN?

Posted by Jeff Schroeder <je...@computer.org>.
Interesting, thanks Alex. Is anyone at Mesosphere or in the community
actively working on this and willing to chat about it?

On Thursday, March 31, 2016, Alex Rukletsov <al...@mesosphere.com> wrote:

> Jeff,
>
> regarding 3: we are investigating this:
> https://issues.apache.org/jira/browse/MESOS-3548
>
> On Thu, Mar 31, 2016 at 3:56 AM, Jeff Schroeder <jeffschroeder@computer.org>
> wrote:
>
>> Given regional bare metal Mesos clusters on multiple continents, are
>> there any known issues running some of the agents over the WAN? Is anyone
>> else doing it, or is this a terrible idea that I should tell management no
>> on?
>>
>> A few specifics:
>>
>> 1. Are there any known limitations or configuration gotchas I might
>> encounter?
>> 2. Does setting up ZK observers in each non-primary dc and pointing the
>> agents at them exclusively make sense?
>> 3. Are there plans on a mesos equivalent of something like ubernetes[1],
>> or would that be up to each framework?
>> 4. Any suggestions on how best to do agent attributes / constraints for
>> something like this? I was planning on having the config management add a
>> "data_center" agent attribute to match on.
>>
>> Thanks!
>>
>> [1]
>> https://github.com/kubernetes/kubernetes/blob/8813c955182e3c9daae68a8257365e02cd871c65/release-0.19.0/docs/proposals/federation.md#kubernetes-cluster-federation
>>
>> --
>> Jeff Schroeder
>>
>> Don't drink and derive, alcohol and analysis don't mix.
>> http://www.digitalprognosis.com
>>
>
>

-- 
Text by Jeff, typos by iPhone

Re: Mesos agents across a WAN?

Posted by Alex Rukletsov <al...@mesosphere.com>.
Jeff,

regarding 3: we are investigating this:
https://issues.apache.org/jira/browse/MESOS-3548

On Thu, Mar 31, 2016 at 3:56 AM, Jeff Schroeder <je...@computer.org>
wrote:

> Given regional bare metal Mesos clusters on multiple continents, are there
> any known issues running some of the agents over the WAN? Is anyone else
> doing it, or is this a terrible idea that I should tell management no on?
>
> A few specifics:
>
> 1. Are there any known limitations or configuration gotchas I might
> encounter?
> 2. Does setting up ZK observers in each non-primary dc and pointing the
> agents at them exclusively make sense?
> 3. Are there plans on a mesos equivalent of something like ubernetes[1],
> or would that be up to each framework?
> 4. Any suggestions on how best to do agent attributes / constraints for
> something like this? I was planning on having the config management add a
> "data_center" agent attribute to match on.
>
> Thanks!
>
> [1]
> https://github.com/kubernetes/kubernetes/blob/8813c955182e3c9daae68a8257365e02cd871c65/release-0.19.0/docs/proposals/federation.md#kubernetes-cluster-federation
>
> --
> Jeff Schroeder
>
> Don't drink and derive, alcohol and analysis don't mix.
> http://www.digitalprognosis.com
>

Re: Mesos agents across a WAN?

Posted by Evan Krall <kr...@yelp.com>.
With Marathon, if you use "command" health checks, the checks are run by
the agents, so they continue to operate during a partition. As far as I can
tell, Marathon doesn't consider delayed healthcheck messages as failures.
Be aware that if you use the Docker executor, your health checks run within
your containers.
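
For reference, a command health check in a Marathon app definition looks
roughly like this (the endpoint, port, and timing values are just examples,
and since the check runs inside the container, the image needs curl or an
equivalent installed):

    "healthChecks": [
      {
        "protocol": "COMMAND",
        "command": { "value": "curl -f http://localhost:8080/health" },
        "gracePeriodSeconds": 300,
        "intervalSeconds": 60,
        "timeoutSeconds": 20,
        "maxConsecutiveFailures": 3
      }
    ]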

On Fri, Apr 1, 2016 at 7:28 AM, Jeff Schroeder <je...@computer.org>
wrote:

> On Thursday, March 31, 2016, Evan Krall <kr...@yelp.com> wrote:
>
>> On Wed, Mar 30, 2016 at 6:56 PM, Jeff Schroeder <
>> jeffschroeder@computer.org> wrote:
>>
>>> Given regional bare metal Mesos clusters on multiple continents, are
>>> there any known issues running some of the agents over the WAN? Is anyone
>>> else doing it, or is this a terrible idea that I should tell management no
>>> on?
>>>
>>> A few specifics:
>>>
>>> 1. Are there any known limitations or configuration gotchas I might
>>> encounter?
>>>
>>
>> One thing to keep in mind is that the masters maintain a distributed log
>> through a consensus protocol, so there needs to be a quorum of masters that
>> can talk to each other in order to operate. Consensus protocols tend to be
>> very latency-sensitive, so you probably want to keep masters near each
>> other.
>>
>>
>> Some of our clusters span semi-wide geographical regions (in production,
>> up to about 5 milliseconds RTT between master and some slaves). So far, we
>> haven't seen any issues caused by that amount of latency, and I believe we
>> have clusters in non-production environments which have even higher round
>> trip between slaves and masters, and work fine. I haven't benchmarked task
>> launch time or anything like that, so I can't say how much it affects the
>> speed of operations.
>>
>> Mesos generally does the right thing around network partitions (changes
>> won't propagate, but it won't kill your tasks), but if you're running
>> things in Marathon and using TCP or HTTP healthchecks, be aware that
>> Marathon does not rate limit itself on issuing task kills
>> <https://github.com/mesosphere/marathon/issues/3317> for healthcheck
>> failures. This means during a network partition, your applications will be
>> fine, but once the network partition heals (or if you're experiencing
>> packet loss but not total failure), Marathon will suddenly kill all of the
>> tasks on the far side of the partition. A workaround for that is to use
>> command health checks, which are run by the mesos slave.
>>
>
> Right. Due to this I didn't plan on having masters across a WAN, just
> agents. That makes a ton of sense about Marathon due to the elected leader
> doing health checks, thanks for pointing it out. I'm also assuming if using
> something like Aurora where the health checks are part of the executor this
> would not be an issue. Any idea if there is a way to distribute the
> healthchecks with Marathon or should I ask on their ML?
>
>
>> 2. Does setting up ZK observers in each non-primary dc and pointing the
>>> agents at them exclusively make sense?
>>>
>>
>> My understanding of ZK observers is that they proxy writes to the actual
>> ZK quorum members, so this would probably be fine. mesos-slave uses ZK to
>> discover masters, and mesos-master uses ZK to do leader election; only
>> mesos-master is doing any writes to ZK.
>>
>> I'm not sure how often mesos-slave reads from ZK to get the list of
>> masters; I assume it doesn't bother if it has a live connection to a master.
>>
>>
>>> 4. Any suggestions on how best to do agent attributes / constraints for
>>> something like this? I was planning on having the config management add a
>>> "data_center" agent attribute to match on.
>>>
>>
>> If you're running services on Marathon or similar, I'd definitely
>> recommend exposing the location of the slaves as an attribute, and having
>> constraints to keep different instances of your application spread across
>> the different locations. The "correct" constraints to apply depends on your
>> application and latency / failure sensitivity.
>>
>> Evan
>>
>
> This is exactly what I was looking for, thank you for sharing your
> experience.
>
>
> --
> Text by Jeff, typos by iPhone
>

Re: Mesos agents across a WAN?

Posted by Jeff Schroeder <je...@computer.org>.
On Thursday, March 31, 2016, Evan Krall <kr...@yelp.com> wrote:

> On Wed, Mar 30, 2016 at 6:56 PM, Jeff Schroeder <jeffschroeder@computer.org>
> wrote:
>
>> Given regional bare metal Mesos clusters on multiple continents, are
>> there any known issues running some of the agents over the WAN? Is anyone
>> else doing it, or is this a terrible idea that I should tell management no
>> on?
>>
>> A few specifics:
>>
>> 1. Are there any known limitations or configuration gotchas I might
>> encounter?
>>
>
> One thing to keep in mind is that the masters maintain a distributed log
> through a consensus protocol, so there needs to be a quorum of masters that
> can talk to each other in order to operate. Consensus protocols tend to be
> very latency-sensitive, so you probably want to keep masters near each
> other.
>
>
> Some of our clusters span semi-wide geographical regions (in production,
> up to about 5 milliseconds RTT between master and some slaves). So far, we
> haven't seen any issues caused by that amount of latency, and I believe we
> have clusters in non-production environments which have even higher round
> trip between slaves and masters, and work fine. I haven't benchmarked task
> launch time or anything like that, so I can't say how much it affects the
> speed of operations.
>
> Mesos generally does the right thing around network partitions (changes
> won't propagate, but it won't kill your tasks), but if you're running
> things in Marathon and using TCP or HTTP healthchecks, be aware that
> Marathon does not rate limit itself on issuing task kills
> <https://github.com/mesosphere/marathon/issues/3317> for healthcheck
> failures. This means during a network partition, your applications will be
> fine, but once the network partition heals (or if you're experiencing
> packet loss but not total failure), Marathon will suddenly kill all of the
> tasks on the far side of the partition. A workaround for that is to use
> command health checks, which are run by the mesos slave.
>

Right. Due to this I didn't plan on having masters across a WAN, just
agents. That makes a ton of sense about Marathon due to the elected leader
doing health checks, thanks for pointing it out. I'm also assuming that with
something like Aurora, where the health checks are part of the executor,
this would not be an issue. Any idea if there is a way to distribute the
health checks with Marathon, or should I ask on their ML?


> 2. Does setting up ZK observers in each non-primary dc and pointing the
>> agents at them exclusively make sense?
>>
>
> My understanding of ZK observers is that they proxy writes to the actual
> ZK quorum members, so this would probably be fine. mesos-slave uses ZK to
> discover masters, and mesos-master uses ZK to do leader election; only
> mesos-master is doing any writes to ZK.
>
> I'm not sure how often mesos-slave reads from ZK to get the list of
> masters; I assume it doesn't bother if it has a live connection to a master.
>
>
>> 4. Any suggestions on how best to do agent attributes / constraints for
>> something like this? I was planning on having the config management add a
>> "data_center" agent attribute to match on.
>>
>
> If you're running services on Marathon or similar, I'd definitely
> recommend exposing the location of the slaves as an attribute, and having
> constraints to keep different instances of your application spread across
> the different locations. The "correct" constraints to apply depends on your
> application and latency / failure sensitivity.
>
> Evan
>

This is exactly what I was looking for, thank you for sharing your
experience.


-- 
Text by Jeff, typos by iPhone

Re: Mesos agents across a WAN?

Posted by Brian Devins <ba...@gmail.com>.
I would recommend looking at what Yelp did with their PaaSTA project. They
have an interesting approach to multi-region orchestration.

Re: Mesos agents across a WAN?

Posted by Vinod Kone <vi...@apache.org>.
This is great info, Evan, especially coming from production experience.
Thanks for sharing it!

On Thu, Mar 31, 2016 at 1:49 PM, Evan Krall <kr...@yelp.com> wrote:

> On Wed, Mar 30, 2016 at 6:56 PM, Jeff Schroeder <
> jeffschroeder@computer.org> wrote:
>
>> Given regional bare metal Mesos clusters on multiple continents, are
>> there any known issues running some of the agents over the WAN? Is anyone
>> else doing it, or is this a terrible idea that I should tell management no
>> on?
>>
>> A few specifics:
>>
>> 1. Are there any known limitations or configuration gotchas I might
>> encounter?
>>
>
> One thing to keep in mind is that the masters maintain a distributed log
> through a consensus protocol, so there needs to be a quorum of masters that
> can talk to each other in order to operate. Consensus protocols tend to be
> very latency-sensitive, so you probably want to keep masters near each
> other.
>
> Some of our clusters span semi-wide geographical regions (in production,
> up to about 5 milliseconds RTT between master and some slaves). So far, we
> haven't seen any issues caused by that amount of latency, and I believe we
> have clusters in non-production environments which have even higher round
> trip between slaves and masters, and work fine. I haven't benchmarked task
> launch time or anything like that, so I can't say how much it affects the
> speed of operations.
>
> Mesos generally does the right thing around network partitions (changes
> won't propagate, but it won't kill your tasks), but if you're running
> things in Marathon and using TCP or HTTP healthchecks, be aware that
> Marathon does not rate limit itself on issuing task kills
> <https://github.com/mesosphere/marathon/issues/3317> for healthcheck
> failures. This means during a network partition, your applications will be
> fine, but once the network partition heals (or if you're experiencing
> packet loss but not total failure), Marathon will suddenly kill all of the
> tasks on the far side of the partition. A workaround for that is to use
> command health checks, which are run by the mesos slave.
>
>
>> 2. Does setting up ZK observers in each non-primary dc and pointing the
>> agents at them exclusively make sense?
>>
>
> My understanding of ZK observers is that they proxy writes to the actual
> ZK quorum members, so this would probably be fine. mesos-slave uses ZK to
> discover masters, and mesos-master uses ZK to do leader election; only
> mesos-master is doing any writes to ZK.
>
> I'm not sure how often mesos-slave reads from ZK to get the list of
> masters; I assume it doesn't bother if it has a live connection to a master.
>
>
>> 4. Any suggestions on how best to do agent attributes / constraints for
>> something like this? I was planning on having the config management add a
>> "data_center" agent attribute to match on.
>>
>
> If you're running services on Marathon or similar, I'd definitely
> recommend exposing the location of the slaves as an attribute, and having
> constraints to keep different instances of your application spread across
> the different locations. The "correct" constraints to apply depends on your
> application and latency / failure sensitivity.
>
> Evan
>
>
>> Thanks!
>>
>> [1]
>> https://github.com/kubernetes/kubernetes/blob/8813c955182e3c9daae68a8257365e02cd871c65/release-0.19.0/docs/proposals/federation.md#kubernetes-cluster-federation
>>
>> --
>> Jeff Schroeder
>>
>> Don't drink and derive, alcohol and analysis don't mix.
>> http://www.digitalprognosis.com
>>
>
>

Re: Mesos agents across a WAN?

Posted by Evan Krall <kr...@yelp.com>.
On Wed, Mar 30, 2016 at 6:56 PM, Jeff Schroeder <je...@computer.org>
wrote:

> Given regional bare metal Mesos clusters on multiple continents, are there
> any known issues running some of the agents over the WAN? Is anyone else
> doing it, or is this a terrible idea that I should tell management no on?
>
> A few specifics:
>
> 1. Are there any known limitations or configuration gotchas I might
> encounter?
>

One thing to keep in mind is that the masters maintain a distributed log
through a consensus protocol, so there needs to be a quorum of masters that
can talk to each other in order to operate. Consensus protocols tend to be
very latency-sensitive, so you probably want to keep masters near each
other.
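
As a rough illustration, a three-master quorum kept within a single region
might be started along these lines (hostnames are placeholders):

    # with --quorum=2, two of the three masters must stay reachable,
    # which is why WAN latency between masters hurts
    mesos-master --zk=zk://zk1.dc1:2181,zk2.dc1:2181,zk3.dc1:2181/mesos \
                 --quorum=2 \
                 --work_dir=/var/lib/mesos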

Some of our clusters span semi-wide geographical regions (in production, up
to about 5 milliseconds RTT between master and some slaves). So far, we
haven't seen any issues caused by that amount of latency, and I believe we
have clusters in non-production environments with even higher round-trip
times between slaves and masters, and they work fine. I haven't benchmarked task
launch time or anything like that, so I can't say how much it affects the
speed of operations.

Mesos generally does the right thing around network partitions (changes
won't propagate, but it won't kill your tasks), but if you're running
things in Marathon and using TCP or HTTP healthchecks, be aware that
Marathon does not rate limit itself on issuing task kills
<https://github.com/mesosphere/marathon/issues/3317> for healthcheck
failures. This means during a network partition, your applications will be
fine, but once the network partition heals (or if you're experiencing
packet loss but not total failure), Marathon will suddenly kill all of the
tasks on the far side of the partition. A workaround for that is to use
command health checks, which are run by the mesos slave.


> 2. Does setting up ZK observers in each non-primary dc and pointing the
> agents at them exclusively make sense?
>

My understanding of ZK observers is that they proxy writes to the actual ZK
quorum members, so this would probably be fine. mesos-slave uses ZK to
discover masters, and mesos-master uses ZK to do leader election; only
mesos-master is doing any writes to ZK.

I'm not sure how often mesos-slave reads from ZK to get the list of
masters; I assume it doesn't bother if it has a live connection to a master.
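
If it helps, a sketch of that observer setup, with placeholder hostnames:

    # zoo.cfg on the observer node in dc2 (relevant lines only); the
    # server.4 line, with its :observer suffix, also goes into the config
    # of every voting member
    peerType=observer
    server.1=zk1.dc1:2888:3888
    server.2=zk2.dc1:2888:3888
    server.3=zk3.dc1:2888:3888
    server.4=zk-obs1.dc2:2888:3888:observer

    # an agent in dc2 pointed only at the local observer
    mesos-slave --master=zk://zk-obs1.dc2:2181/mesos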


> 4. Any suggestions on how best to do agent attributes / constraints for
> something like this? I was planning on having the config management add a
> "data_center" agent attribute to match on.
>

If you're running services on Marathon or similar, I'd definitely recommend
exposing the location of the slaves as an attribute, and having constraints
to keep different instances of your application spread across the different
locations. The "correct" constraints to apply depends on your application
and latency / failure sensitivity.
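
For example, assuming the "data_center" attribute Jeff mentioned, spreading
an app evenly across locations in Marathon would look like:

    "constraints": [["data_center", "GROUP_BY"]]

or, to pin an app to a single dc (placeholder value):

    "constraints": [["data_center", "CLUSTER", "dc1"]]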

Evan


> Thanks!
>
> [1]
> https://github.com/kubernetes/kubernetes/blob/8813c955182e3c9daae68a8257365e02cd871c65/release-0.19.0/docs/proposals/federation.md#kubernetes-cluster-federation
>
> --
> Jeff Schroeder
>
> Don't drink and derive, alcohol and analysis don't mix.
> http://www.digitalprognosis.com
>