Posted to user@mesos.apache.org by Bill Green <bg...@newrelic.com> on 2016/03/02 03:21:33 UTC

marathon-lb at scale

For folks using marathon-lb, how far have you scaled it? I’d be very interested to hear your experiences with it, especially in the area of partition tolerance.

Any insights would be greatly appreciated, thanks!

--
Bill Green
SRE, New Relic
@cloudangst




Re: marathon-lb at scale

Posted by Jeff Schroeder <je...@computer.org>.
Being able to set HAPROXY_0_VHOST to the mesos-dns name and have
everything just magically work is a pretty fantastic user experience,
however — especially for users who would otherwise need to ask a SysAdmin
team to manually change DNS. Any alternatives? We have marathon-lb running
in a container with keepalived on every Mesos agent. That way, users who
want static names can ask to have a name created, which is just a CNAME to
the VIP. Otherwise they can use the mesos-dns name and manage the entire
lifecycle themselves.
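For readers unfamiliar with the mechanism: marathon-lb is driven entirely by labels on the Marathon app definition. A minimal sketch of what Jeff describes might look like the following (the app id, image, and vhost name are made up for illustration; the HAPROXY_* label names are marathon-lb's real convention):

```python
import json

# Hypothetical Marathon app definition. The HAPROXY_0_VHOST label tells
# marathon-lb to route HTTP traffic for that virtual host to the app's
# first service port; HAPROXY_GROUP selects which marathon-lb instances
# pick the app up.
app = {
    "id": "/my-web-service",
    "container": {
        "type": "DOCKER",
        "docker": {"image": "nginx:1.9", "network": "BRIDGE"},
    },
    "instances": 2,
    "labels": {
        "HAPROXY_GROUP": "external",
        # Point this at the mesos-dns name (or a CNAME to the keepalived VIP)
        "HAPROXY_0_VHOST": "my-web-service.marathon.mesos",
    },
}

# This JSON would be POSTed to Marathon's /v2/apps endpoint.
print(json.dumps(app, indent=2))
```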

On Thursday, March 3, 2016, Brenden Matthews <br...@diddyinc.com> wrote:

> As a sidenote, I wouldn't recommend running marathon-lb on every node in
> the cluster. Running it on 3-5 should be sufficient for HA. You can simply
> round-robin (with DNS) between the marathon-lb instances (using, say,
> Mesos-DNS). The additional round trip delay you save by running marathon-lb
> on each machine is likely inconsequential.

-- 
Text by Jeff, typos by iPhone

Re: marathon-lb at scale

Posted by Brenden Matthews <br...@diddyinc.com>.
As a sidenote, I wouldn't recommend running marathon-lb on every node in
the cluster. Running it on 3-5 should be sufficient for HA. You can simply
round-robin (with DNS) between the marathon-lb instances (using, say,
Mesos-DNS). The additional round trip delay you save by running marathon-lb
on each machine is likely inconsequential.
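The DNS round-robin Brenden describes amounts to clients rotating through a small, fixed set of marathon-lb addresses rather than talking to a local instance. A minimal client-side sketch of that idea (the addresses are placeholders; in practice a resolver rotating Mesos-DNS A records does this for you):

```python
import itertools

# Hypothetical set of 3 marathon-lb instances, e.g. the A records that
# Mesos-DNS would return for the load balancer's service name.
LB_ADDRESSES = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

# Cycle through the instances in round-robin order, the way a DNS
# resolver rotating A records effectively spreads clients across them.
_rotation = itertools.cycle(LB_ADDRESSES)

def next_lb() -> str:
    """Return the next load balancer address in round-robin order."""
    return next(_rotation)

picks = [next_lb() for _ in range(6)]
print(picks)
```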

On Thu, Mar 3, 2016 at 8:13 AM, Alfredo Carneiro <
alfredo@simbioseventures.com> wrote:

> Craig, soon I can increase my cluster and I will let you know okay? For
> now, it is in production like a test!hehe

Re: marathon-lb at scale

Posted by Alfredo Carneiro <al...@simbioseventures.com>.
Craig, I should be able to grow my cluster soon and I'll let you know how
it goes, okay? For now it's in production, but more as a test, hehe.

On Thu, Mar 3, 2016 at 1:06 PM, craig w <co...@gmail.com> wrote:

> Alfredo -- i'll be curious to hear how it goes if you scale it up. I had
> initially tested with 12 nodes and it seemed fine, then when i went to 90
> it became an issue. Again, this was with an older Marathon, so things could
> be much better now.


-- 
Alfredo Miranda

Re: marathon-lb at scale

Posted by craig w <co...@gmail.com>.
Alfredo -- I'll be curious to hear how it goes if you scale it up. I had
initially tested with 12 nodes and it seemed fine, but when I went to 90
it became an issue. Again, this was with an older Marathon, so things could
be much better now.

On Thu, Mar 3, 2016 at 11:02 AM, Alfredo Carneiro <
alfredo@simbioseventures.com> wrote:

> I've just started to use it on production with 14 nodes and it is scaling
> well! I had no problems yet. And different from haproxy-marathon-bridge, it
> supports natively a loop every second, so you don't have any delay, at
> least I haven't seen.



-- 

https://github.com/mindscratch
https://www.google.com/+CraigWickesser
https://twitter.com/mind_scratch
https://twitter.com/craig_links

Re: marathon-lb at scale

Posted by Alfredo Carneiro <al...@simbioseventures.com>.
I've just started using it in production with 14 nodes and it is scaling
well! I've had no problems yet. And unlike haproxy-marathon-bridge, it
natively supports updating every second, so there's no delay — at least
none that I've seen.

On Thu, Mar 3, 2016 at 12:23 PM, craig w <co...@gmail.com> wrote:

> That was using the marathon-haproxy-bridge script, so it was polling
> marathon every minute to try and keep up to date. Though a minute lag in
> updating the proxy wasn't sufficient.
>



-- 
Alfredo Miranda

Re: marathon-lb at scale

Posted by craig w <co...@gmail.com>.
That was using the marathon-haproxy-bridge script, so it was polling
Marathon every minute to try to keep up to date, though a one-minute lag in
updating the proxy wasn't sufficient for us.
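For context, the bridge approach craig describes boils down to a cron-style loop: fetch Marathon's task list, regenerate the HAProxy backend list, reload. A simplified sketch of that polling pattern (the Marathon URL and the config-rendering step are placeholders; the real bridge was a shell script, not Python):

```python
import json
import time
import urllib.request

MARATHON_TASKS_URL = "http://marathon.example:8080/v2/tasks"  # placeholder host
POLL_INTERVAL_SECONDS = 60  # the cron-driven bridge refreshed about once a minute

def fetch_tasks(url: str) -> dict:
    """Fetch the current task list from Marathon's REST API."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read())

def regenerate_haproxy_config(tasks: dict) -> str:
    """Placeholder: render HAProxy server lines from the task list."""
    backends = [
        f"{t['host']}:{p}"
        for t in tasks.get("tasks", [])
        for p in t.get("ports", [])
    ]
    return "\n".join(f"server {b} {b}" for b in backends)

def poll_loop() -> None:
    """Poll Marathon forever, regenerating the proxy config each pass."""
    while True:
        config = regenerate_haproxy_config(fetch_tasks(MARATHON_TASKS_URL))
        # In the real bridge, this is where HAProxy would be reloaded.
        time.sleep(POLL_INTERVAL_SECONDS)
```

With 90 nodes each running this loop, the Marathon leader fields 90 polls per interval, which matches craig's report of the leader being crushed.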

Re: marathon-lb at scale

Posted by Chris Baker <ch...@galacticfog.com>.
Craig, was that using polling via a cron job over the
marathon-haproxy-bridge script?

On Wed, Mar 2, 2016 at 8:53 PM craig w <co...@gmail.com> wrote:

> For what it's worth, I had tried the haproxy bridge with marathon 0.11
> back when that was the latest release. I had the bridge running on 90 nodes
> and it crushed the marathon leader.
>
> Marathon had made lots of improvements so it might be better now, but
> figured I'd share.
> On Mar 2, 2016 7:59 PM, "Brenden Matthews" <br...@diddyinc.com> wrote:

Re: marathon-lb at scale

Posted by craig w <co...@gmail.com>.
For what it's worth, I had tried the HAProxy bridge with Marathon 0.11 back
when that was the latest release. I had the bridge running on 90 nodes and
it crushed the Marathon leader.

Marathon has made lots of improvements since, so it might be better now, but
I figured I'd share.
On Mar 2, 2016 7:59 PM, "Brenden Matthews" <br...@diddyinc.com> wrote:

> I'd suggest you also try posting on the Marathon group:
> https://groups.google.com/forum/#!forum/marathon-framework

Re: marathon-lb at scale

Posted by Brenden Matthews <br...@diddyinc.com>.
I'd suggest you also try posting on the Marathon group:
https://groups.google.com/forum/#!forum/marathon-framework

On Tue, Mar 1, 2016 at 6:21 PM, Bill Green <bg...@newrelic.com> wrote:

> For folks using marathon-lb, how far have you scaled it? I’d be very
> interested to hear your experiences with it, especially in the area of
> partition tolerance.
>
> Any insights would be greatly appreciated, thanks!
>
> --
> Bill Green
> SRE, New Relic
> @cloudangst