Posted to user@mesos.apache.org by "Erb, Stephan" <St...@blue-yonder.com> on 2016/03/21 13:54:01 UTC

Current state of the oversubscription feature

Hi everyone,

I am interested in the current state of the Mesos oversubscription feature [1]. In particular, I would like to know if anyone has taken a closer look at non-compressible resources such as memory. 

Anything I should be aware of?

Thanks and Best Regards,
Stephan

[1] http://mesos.apache.org/documentation/latest/oversubscription/

Re: Current state of the oversubscription feature

Posted by Niklas Nielsen <ni...@qni.dk>.
Hi everyone,

For now (and in the current Serenity[1]), we have avoided oversubscribing
non-compressible resources.
If you look at a system like Heracles[2], the polling interval is on the
order of seconds, which may not be fast enough to react to an OOM. It
shouldn't be impossible, though, given all the work VMware has been putting
into it. We postponed it for now.
There is, however, nothing that prevents you from doing that with the
current resource estimator and QoS controller APIs.

In the first version of Serenity, we relied on slack in CPU time to
determine how many resources should be oversubscribed, and then used
hardware performance counters to monitor (and protect) latency-critical
workloads.
It turned out, however, to be extremely difficult to make this universal
without access to the APMs (SLI/SLO) from the application itself.
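
To make the slack idea concrete, here is a minimal sketch in Python. It is
not Serenity's code and does not use the Mesos resource estimator API; the
names TaskSample, oversubscribable_cpus and safety_margin are hypothetical
and only illustrate "allocated minus used" as a basis for an estimate.

```python
from dataclasses import dataclass


@dataclass
class TaskSample:
    """One usage sample for a task (hypothetical structure)."""
    cpus_allocated: float  # CPUs the task was launched with
    cpus_used: float       # CPUs actually consumed over the sampling window


def oversubscribable_cpus(samples, safety_margin=0.2):
    """Estimate revocable CPUs as the summed slack, reduced by a margin.

    safety_margin is an assumed knob: the fraction of the measured slack
    that is held back instead of being offered as revocable.
    """
    # Slack per task is allocation minus usage, never negative.
    slack = sum(max(s.cpus_allocated - s.cpus_used, 0.0) for s in samples)
    return slack * (1.0 - safety_margin)


if __name__ == "__main__":
    samples = [TaskSample(4.0, 1.0), TaskSample(2.0, 1.5)]
    print(oversubscribable_cpus(samples))
```

With the two samples above the raw slack is 3.5 CPUs, of which roughly 2.8
would be offered as revocable under the assumed 20% margin.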

Niklas

[1] https://github.com/mesosphere/serenity

-- 
Niklas

Re: Current state of the oversubscription feature

Posted by Zhitao Li <zh...@gmail.com>.
Hi Stephan,

Glad someone else shares an interest in this topic; my company is also very
interested in it. A couple of thoughts:

1. I believe the real difficulties here come from isolation: how would Mesos
handle overcommitted memory, given that it cannot be throttled like CPU?
2. Handling this within a single Mesos framework could differ from the
case of running multiple frameworks.
3. I know you are active on Apache Aurora. I believe Aurora currently does
not consider RAM a revocable resource, but we could probably work together
to expand that once we know the isolation story.
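
To illustrate point 1: because memory cannot be throttled, the only
corrective action left is to kill revocable tasks before the node runs out
of memory. Below is a minimal Python sketch of such a decision. It is not
the Mesos QoS controller API; Task, tasks_to_evict, node_mem_mb and
headroom_mb are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Current memory usage of one task (hypothetical structure)."""
    name: str
    mem_used_mb: int
    revocable: bool


def tasks_to_evict(tasks, node_mem_mb, headroom_mb):
    """Return names of revocable tasks to kill so usage fits the limit.

    The limit is node_mem_mb minus headroom_mb; the headroom is an assumed
    buffer that absorbs growth between two polling intervals.
    """
    limit = node_mem_mb - headroom_mb
    used = sum(t.mem_used_mb for t in tasks)
    victims = []
    # Evict the largest revocable tasks first so that few kills are needed.
    revocable = sorted(
        (t for t in tasks if t.revocable),
        key=lambda t: t.mem_used_mb,
        reverse=True,
    )
    for task in revocable:
        if used <= limit:
            break
        victims.append(task.name)
        used -= task.mem_used_mb
    return victims


if __name__ == "__main__":
    tasks = [
        Task("web", 6000, revocable=False),
        Task("batch-a", 1500, revocable=True),
        Task("batch-b", 500, revocable=True),
    ]
    print(tasks_to_evict(tasks, node_mem_mb=8192, headroom_mb=512))
```

Non-revocable tasks are never selected, so if they alone exceed the limit
the function returns every revocable task and the node is still over
budget; a real controller would have to handle that case separately.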


-- 
Cheers,

Zhitao Li

Re: Current state of the oversubscription feature

Posted by "Erb, Stephan" <St...@blue-yonder.com>.
Judging from the epic description, this seems to target the oversubscription of reserved resources on the framework level.


However, my question was targeting the task level, where one task of a framework requests more RAM than it actually uses, and other tasks from the same framework can be started as revocable and use those slack resources.


The latter is already possible with compressible resources such as CPU or bandwidth. I am now interested in non-compressible resources (i.e. memory).


Re: Current state of the oversubscription feature

Posted by Guangya Liu <gy...@gmail.com>.
https://issues.apache.org/jira/browse/MESOS-4967 plans to introduce
"Oversubscription for reservation". Could you please check whether this
helps?

Thanks,

Guangya
