Posted to user@mesos.apache.org by Itamar Ostricher <it...@yowza3d.com> on 2015/01/06 10:12:58 UTC

Recommended resources for master / scheduler machines

Are there recommendations regarding master / scheduler machine resources
as a function of cluster size?

Say I have a cluster with hundreds of slave machines and thousands of CPUs,
with a single framework that will schedule millions of tasks.
How does the strength of the master & scheduler machines affect the overall
cluster performance?

Thanks,
- Itamar.

Re: Recommended resources for master / scheduler machines

Posted by Shuai Lin <li...@gmail.com>.

Hi Itamar,

You should really run ZooKeeper on more than one node (3, 5, or 7 nodes
is typical). Otherwise, in your case, if the node running your
ZooKeeper service goes down for any reason, your whole Mesos installation
will stop working until you bring that node back.
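
To illustrate why odd ensemble sizes such as 3, 5, or 7 are the usual choice, here is a minimal Python sketch of the majority-quorum arithmetic ZooKeeper relies on. The function names are made up for this example and are not part of any ZooKeeper or Mesos API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble with the given size."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can fail while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5, 7):
        print(f"{n} server(s): quorum={quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

A single server tolerates no failures, and going from 3 to 4 servers (or 5 to 6) adds no extra failure tolerance, which is why even sizes are rarely used.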

Regards,
Shuai



On Sat, Jan 10, 2015 at 9:56 PM, Itamar Ostricher <it...@yowza3d.com>
wrote:

> Interesting. I knew I needed to look into ZooKeeper more than I did :-)
>
> I don't know what "distributed mode" is in ZooKeeper. I can tell you we use
> a single host for the master, and configure all machines with
> "zk://master-host-name:2181/mesos" in /etc/mesos/zk before the mesos
> services are started.
>
> We don't assign a dedicated device to ZooKeeper, so maybe it bites us...

Re: Recommended resources for master / scheduler machines

Posted by Itamar Ostricher <it...@yowza3d.com>.

Interesting. I knew I needed to look into ZooKeeper more than I did :-)

I don't know what "distributed mode" is in ZooKeeper. I can tell you we use
a single host for the master, and configure all machines with
"zk://master-host-name:2181/mesos" in /etc/mesos/zk before the mesos
services are started.
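
For comparison with the single-host URL above, the following is a rough sketch of how the contents of /etc/mesos/zk might be assembled if ZooKeeper ran on several hosts. Mesos accepts a comma-separated list of host:port pairs in the zk:// URL; the helper name and the host names zk1, zk2, zk3 are hypothetical.

```python
def mesos_zk_url(hosts, port=2181, path="/mesos"):
    """Build a zk:// URL listing every ZooKeeper server in the ensemble."""
    if not hosts:
        raise ValueError("at least one ZooKeeper host is required")
    endpoints = ",".join(f"{host}:{port}" for host in hosts)
    return f"zk://{endpoints}{path}"


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    print(mesos_zk_url(["zk1", "zk2", "zk3"]))
```

With one host in the list this reproduces the existing single-host form, so the same helper would cover both setups.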

We don't assign a dedicated device to ZooKeeper, so maybe it bites us...

On Thu, Jan 8, 2015 at 9:33 PM, Tomas Barton <ba...@gmail.com> wrote:

> Is ZooKeeper running in distributed mode?
>
> ZooKeeper periodically writes all data to disk (the transaction log), so
> the bottleneck could be ZooKeeper rather than a shortage of CPUs. ZooKeeper
> limits each key to 1MB; typically 512MB should be enough for ZooKeeper (or
> 4GB might not be enough, depending on your use case).
>
> From the ZooKeeper docs:
>
> ZooKeeper's transaction log must be on a dedicated device. (A dedicated
> partition is not enough.) ZooKeeper writes the log sequentially, without
> seeking. Sharing your log device with other processes can cause seeks and
> contention, which in turn can cause multi-second delays.
>
> In particular, you should not create a situation in which ZooKeeper swaps
> to disk. The disk is death to ZooKeeper. Everything is ordered, so if
> processing one request swaps the disk, all other queued requests will
> probably do the same. DON'T SWAP.

Re: Recommended resources for master / scheduler machines

Posted by Tomas Barton <ba...@gmail.com>.

Is ZooKeeper running in distributed mode?

ZooKeeper periodically writes all data to disk (the transaction log), so the
bottleneck could be ZooKeeper rather than a shortage of CPUs. ZooKeeper
limits each key to 1MB; typically 512MB should be enough for ZooKeeper (or
4GB might not be enough, depending on your use case).

From the ZooKeeper docs:

ZooKeeper's transaction log must be on a dedicated device. (A dedicated
partition is not enough.) ZooKeeper writes the log sequentially, without
seeking. Sharing your log device with other processes can cause seeks and
contention, which in turn can cause multi-second delays.

In particular, you should not create a situation in which ZooKeeper swaps
to disk. The disk is death to ZooKeeper. Everything is ordered, so if
processing one request swaps the disk, all other queued requests will
probably do the same. DON'T SWAP.
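
As a quick first check of the "dedicated device" advice, here is a small Python sketch that compares the device IDs of two directories, for example ZooKeeper's dataDir and dataLogDir. The function name and the example paths are hypothetical. Note the limits of the check: an equal st_dev means the two paths definitely share a filesystem, but a different st_dev can still mean two partitions of the same physical disk, which the docs say is not enough.

```python
import os


def share_device(path_a: str, path_b: str) -> bool:
    """Return True if both paths live on the same filesystem device ID."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


# Example usage (hypothetical paths; adjust to your zoo.cfg settings):
#   share_device("/var/lib/zookeeper", "/var/lib/zookeeper-txlog")
# A result of True suggests the transaction log is not on its own device.
```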

On 8 January 2015 at 16:47, Itamar Ostricher <it...@yowza3d.com> wrote:

> Thanks Tomas.
>
> We're still quite far from the 10k-20k machines limit :-)
>
> Currently, our framework scheduler generates many (millions of) mostly
> small tasks (some take ~100ms, some a few seconds).
> I understand that the network is the main bottleneck, but we sometimes
> experience lost tasks, and sometimes I see master logs indicating that the
> master is unable to talk with the zookeeper service (which is on the same
> host), and I was wondering if it's related to CPU/RAM of the master machine.
> Is 1 CPU enough? 2? 4?
> 1GiB RAM? 4? 8?

Re: Recommended resources for master / scheduler machines

Posted by Itamar Ostricher <it...@yowza3d.com>.

Thanks Tomas.

We're still quite far from the 10k-20k machines limit :-)

Currently, our framework scheduler generates many (millions of) mostly
small tasks (some take ~100ms, some a few seconds).
I understand that the network is the main bottleneck, but we sometimes
experience lost tasks, and sometimes I see master logs indicating that the
master is unable to talk with the zookeeper service (which is on the same
host), and I was wondering if it's related to CPU/RAM of the master machine.
Is 1 CPU enough? 2? 4?
1GiB RAM? 4? 8?

On Thu, Jan 8, 2015 at 5:00 PM, Tomas Barton <ba...@gmail.com> wrote:

> Hi Itamar,
>
> There's definitely a limit to the number of machines a Mesos master can
> handle. This limit is between 10,000 and 20,000 (that's the number
> reported by Twitter). The bottleneck is caused by the event loop that
> handles communication at the master.
>
> With hundreds of machines you should be fine. You might run into problems
> only if your framework scheduler demands too many resources for computing
> allocations.
>
>> How does the strength of the master & scheduler machines affect the
>> overall cluster performance?
>
> I would say that the network is usually the main bottleneck. Adding extra
> RAM won't improve mesos-master performance. Of course, if there's high CPU
> load on the master you might observe a performance regression. This also
> depends on the granularity of your tasks: whether you have a few
> long-running tasks or many short tasks (which run for just hundreds of ms).
>
> Tomas

Re: Recommended resources for master / scheduler machines

Posted by Tomas Barton <ba...@gmail.com>.

Hi Itamar,

There's definitely a limit to the number of machines a Mesos master can
handle. This limit is between 10,000 and 20,000 (that's the number
reported by Twitter). The bottleneck is caused by the event loop that
handles communication at the master.

With hundreds of machines you should be fine. You might run into problems
only if your framework scheduler demands too many resources for computing
allocations.

> How does the strength of the master & scheduler machines affect the overall
> cluster performance?

I would say that the network is usually the main bottleneck. Adding extra
RAM won't improve mesos-master performance. Of course, if there's high CPU
load on the master you might observe a performance regression. This also
depends on the granularity of your tasks: whether you have a few
long-running tasks or many short tasks (which run for just hundreds of ms).
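
To make the task-granularity point concrete, here is a back-of-the-envelope Python sketch of the task launch rate needed to keep every CPU busy. The cluster size of 2,000 CPUs is an assumption (the thread only says "thousands"), the function name is hypothetical, and the estimate ignores offer cycles and status-update traffic, so treat it as a rough upper bound rather than a measurement.

```python
def launches_per_second(total_cpus: float, task_seconds: float,
                        cpus_per_task: float = 1.0) -> float:
    """Task launches per second needed to keep all CPUs occupied."""
    if task_seconds <= 0 or cpus_per_task <= 0:
        raise ValueError("task_seconds and cpus_per_task must be positive")
    concurrent_tasks = total_cpus / cpus_per_task
    return concurrent_tasks / task_seconds


if __name__ == "__main__":
    # Assumed cluster size; the thread only mentions "thousands of CPUs".
    for seconds in (0.1, 5.0):
        rate = launches_per_second(2000, seconds)
        print(f"{seconds}s tasks: about {rate:.0f} launches/s")
```

Under these assumptions, 100ms tasks call for roughly fifty times the launch rate of 5-second tasks, so very short tasks put far more load on the master and scheduler than a few long-running ones.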

Tomas

On 6 January 2015 at 10:12, Itamar Ostricher <it...@yowza3d.com> wrote:

> Are there recommendations regarding master / scheduler machine resources
> as a function of cluster size?
>
> Say I have a cluster with hundreds of slave machines and thousands of
> CPUs, with a single framework that will schedule millions of tasks.
> How does the strength of the master & scheduler machines affect the
> overall cluster performance?
>
> Thanks,
> - Itamar.
>