Posted to dev@aurora.apache.org by "Erb, Stephan" <St...@blue-yonder.com> on 2017/03/02 13:35:19 UTC

Dynamic Reservations

Hi everyone,

There have been two documents on Dynamic Reservations as a first step towards persistent services:

·         RFC: https://docs.google.com/document/d/15n29HSQPXuFrnxZAgfVINTRP1Iv47_jfcstJNuMwr5A/edit#heading=h.hcsc8tda08vy

·         Technical Design Doc:  https://docs.google.com/document/d/1L2EKEcKKBPmuxRviSUebyuqiNwaO-2hsITBjt3SgWvE/edit#heading=h.klg3urfbnq3v

For a couple of days now, there have also been two patches online for an MVP by Dmitriy:

·         https://reviews.apache.org/r/56690/

·         https://reviews.apache.org/r/56691/

From reading the documents, I am under the impression that there is a rough consensus on the following points:

·         We want dynamic reservations. Our general goal is to enable the re-scheduling of tasks on the same host they used in a previous run.

·         Dynamic reservations are a best-effort feature. If in doubt, a task will be scheduled somewhere else.

·         Jobs opt into reserved resources using an appropriate tier config.

·         The tier config is supposed to be neither preemptible nor revocable. Reserving resources therefore requires appropriate quota.

·         Aurora will tag reserved Mesos resources by adding the unique instance key of the reserving task instance as a label. Only this task instance will be allowed to use those tagged resources.
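
To make the tagging concrete, here is a rough sketch of what such a label-tagged RESERVE operation could look like with the Mesos Java protobufs. The role name, label key, and helper are invented for illustration; the actual patches may do this differently:

    import org.apache.mesos.Protos;

    final class ReservationSketch {
      // Sketch only: build a RESERVE operation whose resources carry the
      // reserving task instance key as a label. Role name and label key are
      // hypothetical.
      static Protos.Offer.Operation reserveFor(String instanceKey, double cpus) {
        Protos.Resource resource = Protos.Resource.newBuilder()
            .setName("cpus")
            .setType(Protos.Value.Type.SCALAR)
            .setScalar(Protos.Value.Scalar.newBuilder().setValue(cpus))
            .setRole("aurora")  // assumes a dedicated, non-"*" framework role
            .setReservation(Protos.Resource.ReservationInfo.newBuilder()
                .setLabels(Protos.Labels.newBuilder()
                    .addLabels(Protos.Label.newBuilder()
                        .setKey("org.apache.aurora.instance_key")
                        .setValue(instanceKey))))
            .build();
        return Protos.Offer.Operation.newBuilder()
            .setType(Protos.Offer.Operation.Type.RESERVE)
            .setReserve(Protos.Offer.Operation.Reserve.newBuilder()
                .addResources(resource))
            .build();
      }
    }

The scheduler would submit such an operation via SchedulerDriver.acceptOffers and, on re-scheduling, hand offers carrying a matching label only to that instance.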

I am unclear on the following general questions as there is contradicting content:

a)       How does the user interact with reservations?  There are several proposals in the documents to auto-reserve on `aurora job create` or `aurora cron schedule` and to automatically un-reserve on the appropriate reverse actions. But will we also allow users further control over the reservations so that they can manage those independently of the task/job lifecycle? For example, how does Borg handle this?

b)       The implementation proposal and patches include an OfferReconciler, which implies we don’t want to offer any control to the user. The only control mechanism will be the cluster-wide offer wait time limiting the number of seconds unused reserved resources can linger before they are un-reserved (see the sketch after these questions).

c)       Will we allow ad-hoc/cron jobs to reserve resources? Does it even matter if we don’t give control to users and just rely on the OfferReconciler?
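
To make sure I am reading b) correctly, here is roughly the behaviour I would expect from the OfferReconciler. All interface names and the shape of the data below are invented for illustration and are not taken from the actual patches:

    import java.time.Duration;
    import java.time.Instant;

    // Sketch only: periodically un-reserve tagged resources that have lingered
    // unused for longer than the cluster-wide offer wait time.
    final class OfferReconcilerSketch implements Runnable {
      interface ReservedOffers { Iterable<ReservedOffer> getAll(); }
      interface PendingTasks { boolean hasPendingTask(String instanceKey); }
      interface Unreserver { void unreserve(ReservedOffer offer); }

      static final class ReservedOffer {
        final String instanceKey;  // the label on the reserved resources
        final Instant firstSeen;   // when the unused reservation was offered back
        ReservedOffer(String instanceKey, Instant firstSeen) {
          this.instanceKey = instanceKey;
          this.firstSeen = firstSeen;
        }
      }

      private final ReservedOffers offers;
      private final PendingTasks pending;
      private final Unreserver unreserver;
      private final Duration offerWaitTime;  // the single cluster-wide knob from b)

      OfferReconcilerSketch(ReservedOffers offers, PendingTasks pending,
                            Unreserver unreserver, Duration offerWaitTime) {
        this.offers = offers;
        this.pending = pending;
        this.unreserver = unreserver;
        this.offerWaitTime = offerWaitTime;
      }

      @Override
      public void run() {
        Instant now = Instant.now();
        for (ReservedOffer offer : offers.getAll()) {
          boolean waitedLongEnough =
              Duration.between(offer.firstSeen, now).compareTo(offerWaitTime) > 0;
          // Give up the reservation only if no matching instance is pending.
          if (waitedLongEnough && !pending.hasPendingTask(offer.instanceKey)) {
            unreserver.unreserve(offer);
          }
        }
      }
    }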


I have a couple of questions on the MVP and some implementation details. I will follow up with those in a separate mail.

Thanks and best regards,
Stephan

Re: Dynamic Reservations

Posted by Dmitriy Shirchenko <ca...@gmail.com>.
Yup, working on addressing all of the comments! Thanks for leaving them,
everyone.

Also, as @serb correctly pointed out (and Josh found out), I submitted an
updated patch [1] with an updated design document [2]:

[1] https://reviews.apache.org/r/57487/
[2]
https://docs.google.com/document/d/1L2EKEcKKBPmuxRviSUebyuqiNwaO-2hsITBjt3SgWvE/edit#

Re: Dynamic Reservations

Posted by Joshua Cohen <jc...@apache.org>.
Dmitriy,

There's a fair number of comments both here and on the doc. Will you have
time to respond to these so we can find a path forward?

Cheers,

Joshua

Re: Dynamic Reservations

Posted by David McLaughlin <dm...@apache.org>.
A ticket for the replace-task primitive already exists:
https://issues.apache.org/jira/browse/MESOS-1280

Re: Dynamic Reservations

Posted by David McLaughlin <dm...@apache.org>.
Spoke with Zameer offline and he asked me to post additional thoughts here.

My motivation for solving this without dynamic reservations is just the
sheer number of questions I have after reading the RFC and current design
doc. And most of them are not about the current proposal and goals or the
MVP but more about how this feature will scale into persistent storage.

I think best-effort dynamic reservations are a very different problem from
the reservations that would be needed to support persistent storage. My
primary concern is around things like quota. For the current proposal and
the small best-effort feature we're adding, it makes no sense to get into
the complexities of separate quota for reserved resources vs. preferred
resources, but the reality of exposing such a concept to a large
organisation where we can't automatically reclaim anything reserved means
we'd almost definitely want that. The issue with the iterative approach is
that decisions we take here could have a huge impact on those tasks later,
once we expose the reserved tier into the open. That means more upfront
design and planning, which so far has blocked a super useful feature that I
feel all of us want.

My gut feeling is we went about this all wrong. We started with dynamic
reservations and thought about how we could speed up task scheduling with
them. If we took the current problem brief and started from first
principles, then I think we'd naturally look for something like a
replaceTask(offerId, taskInfo) type API from Mesos.
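
To make that concrete, the primitive I have in mind would look roughly like the sketch below. This is purely hypothetical; Mesos exposes no such call today and the exact semantics would still need to be worked out:

    import org.apache.mesos.Protos.OfferID;
    import org.apache.mesos.Protos.TaskInfo;

    // Hypothetical only: not an existing Mesos API.
    interface ReplaceTask {
      // Launch 'taskInfo' on the resources backing 'offerId' (e.g. the offer that
      // comes back after the update loop kills the previous instance), without
      // those resources ever re-entering the general offer pool.
      void replaceTask(OfferID offerId, TaskInfo taskInfo);
    }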

I'll bring this up within our team and see if we can put resources on
adding such an API. Any feedback on this approach in the meantime is
welcome.

Re: Dynamic Reservations

Posted by David McLaughlin <dm...@apache.org>.
You don't have to store anything with my proposal. Preemption doesn't store
anything either. The whole point is that it's just best-effort: if the
Scheduler restarts, the worst that would happen is that part of the current
batch would have to go through the current scheduling loop that users
tolerate and deal with today.



Re: Dynamic Reservations

Posted by Zameer Manji <zm...@apache.org>.
David,

I have two concerns with that idea. First, it would require persisting the
relationship of <Hostname, Resources> to <Task> for every task. I'm not
sure if adding more storage and storage operations is the ideal way of
solving this problem. Second, in a multi-framework environment, a framework
needs to use dynamic reservations; otherwise the resources might be taken
by another framework.
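
To spell out the first concern, the per-task record that would have to go to storage would be something along these lines (names invented for illustration):

    import java.util.Map;

    // Sketch only: the <Hostname, Resources> -> <Task> relationship that would
    // need to survive a scheduler failover if we relied on it for placement.
    final class ReservedPlacement {
      final String hostname;
      final Map<String, Double> resources;  // e.g. {"cpus": 1.0, "mem": 1024.0}
      final String taskId;

      ReservedPlacement(String hostname, Map<String, Double> resources, String taskId) {
        this.hostname = hostname;
        this.resources = resources;
        this.taskId = taskId;
      }
    }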

Re: Dynamic Reservations

Posted by David McLaughlin <dm...@apache.org>.
So I read the docs again and I have one major question - do we even need
dynamic reservations for the current proposal?

The current goal of the proposed work is to keep an offer on a host and
prevent some other pending task from taking it before the next scheduling
round. This exact problem is solved in preemption and we could use a
similar technique for reserving offers after killing tasks when going
through the update loop. We wouldn't need to add tiers or reconciliation or
solve any of these other concerns. Reusing an offer skips so much of the
expensive stuff in the Scheduler that it would be a no-brainer for the
operator to turn it on for every single task in the cluster.
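
A minimal sketch of what I mean by reusing the offer, analogous to how preemption holds a slot today (all names here are invented for the example):

    import java.util.Map;
    import java.util.Optional;
    import java.util.concurrent.ConcurrentHashMap;
    import org.apache.mesos.Protos.Offer;

    // Sketch only: hold the offer that comes back after killing a task during an
    // update so the replacement instance can claim it before anything else does.
    final class HeldOffers {
      private final Map<String, Offer> byInstanceKey = new ConcurrentHashMap<>();

      // Called when the offer for a just-killed updating instance arrives.
      void hold(String instanceKey, Offer offer) {
        byInstanceKey.put(instanceKey, offer);
      }

      // Called by the scheduling loop before falling back to the normal offer queue.
      Optional<Offer> claim(String instanceKey) {
        return Optional.ofNullable(byInstanceKey.remove(instanceKey));
      }
    }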


Re: Dynamic Reservations

Posted by Steve Niemitz <sn...@apache.org>.
I read over the docs; it looks like a good start.  Personally I don't see
much of a benefit for dynamically reserved cpu/mem, but I'm excited about
the possibility of building off this for dynamically reserved persistent
volumes.

I would like to see more detail on how a reservation "times out", and the
configuration options per job around that, as I feel like it's the most
complicated part of all of this.  Ideally there would also be hooks into
the host maintenance APIs here.

I also didn't see any mention of it, but I believe Mesos requires the
framework to reserve resources with a role.  By default Aurora runs as the
special "*" role; does this mean Aurora will need to have a role specified
now for this to work?  Or does Mesos allow reserving resources without a
role?
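
For reference, this is roughly what registering the framework under a dedicated role looks like with the Mesos Java bindings; the role and framework name below are made up for the example:

    import org.apache.mesos.Protos.FrameworkInfo;

    final class RoleRegistrationSketch {
      // Sketch only: a FrameworkInfo registered under a dedicated role rather
      // than the default "*" role.
      static FrameworkInfo frameworkInfo() {
        return FrameworkInfo.newBuilder()
            .setUser("")        // empty user makes Mesos fill in the current user
            .setName("Aurora")  // illustrative framework name
            .setRole("aurora")  // hypothetical role; would come from scheduler config
            .build();
      }
    }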
