Posted to dev@mesos.apache.org by Zhitao Li <zh...@gmail.com> on 2018/04/20 16:40:10 UTC
Re: Reconsidering `allocatable` check in the allocator

To register our possible use case which also requires relaxing this check:

We are looking at whether it's feasible to use a custom resource type to
represent additional tasks in a group. This means our scheduler could add a
TaskGroup with no cpu/mem to an existing executor. Because 1) a new TaskGroup
cannot have empty resources and 2) it could have a new routing primitive, we
might add one or more ports as the only resources of the group.

To cover all corner cases, we'd like to ensure that a framework would always
see an offer from an agent, even if it has no cpu/mem left to allocate but
only some ports. (In our setup, the number of ports per host is much bigger
than the possible density of tasks, so we never run out of ports on any host.)

I think people who use custom resource types could face a similar issue
caused by this `allocatable` check.

Thanks.

On Wed, Mar 7, 2018 at 5:29 PM, Benjamin Mahler <bm...@apache.org> wrote:

> +1 about it not being about network traffic.
>
> I think the direction we want to head towards is to express and enforce a minimum
> granularity for scalar resources. For example:
>
> CPU: 0.001, if we say that we can only deal with milli-cpus.
> Disk: 1, if we say that we can only deal with the MB level of disk space
> isolation.
> GPU: 1, we can't let you consume a portion of a GPU.
>
> Note that this issue is caused by the lack of an "Integer" resource,
> because with an "Integer" resource we can just store the value based on the
> minimum granularity (e.g. 1 milli-cpus, 1 byte disk, etc). Note also that
> with Scalar resources, we currently only support three decimal points of
> precision:
> https://github.com/apache/mesos/blob/1.5.0/include/mesos/v1/mesos.proto#L1098-L1108
>
> If you were to check the minimum granularity on the input side (e.g.
> prevent frameworks from taking 4.0001 of 8 cpus in an offer), then you
> don't technically need to prevent the allocator from violating the minimum
> granularity (e.g. because we prevent 1.1 or 0.5 GPUs on the input side,
> only whole numbers of GPUs will be available).
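
A minimal sketch of the input-side check described above, in Python. The
table of granularities just copies the three examples from this message;
the function name and the choice to accept resources without a declared
granularity are assumptions for illustration, not existing Mesos behavior.

```python
from decimal import Decimal

# Per-resource minimum granularity, taken from the examples above.
MIN_GRANULARITY = {
    "cpus": Decimal("0.001"),  # milli-cpus
    "disk": Decimal("1"),      # whole MB
    "gpus": Decimal("1"),      # whole GPUs only
}


def respects_granularity(name: str, amount: str) -> bool:
    """Return True if `amount` is a whole multiple of the granularity of `name`.

    Amounts are passed as strings so that Decimal sees the exact value the
    framework asked for, without binary floating-point rounding.
    """
    step = MIN_GRANULARITY.get(name)
    if step is None:
        # Assumption: resources without a declared granularity are accepted.
        return True
    return Decimal(amount) % step == 0
```

With this, a request for "4.0001" cpus or "1.1" gpus would be rejected on
the way in, so whatever remains on the agent is always a multiple of the
granularity.
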
>
> Re: Filtering discussion
>
> I think this is a separate discussion, worth having, about how to let
> frameworks give more information about what they want in order to tame the
> churn of offers.
>
> On Wed, Mar 7, 2018 at 9:53 AM, James Peach <jo...@gmail.com> wrote:
>
> >
> >
> > > On Mar 7, 2018, at 5:52 AM, Benjamin Bannier <benjamin.bannier@mesosphere.io> wrote:
> > >
> > > Hi,
> > >
> > >> Chatted with BenM offline on this. There's another option that both of us
> > >> agreed is probably better than any of the ones mentioned above.
> > >>
> > >> The idea is to make `allocatable` return the portion of the input resources
> > >> that is allocatable, and strip the unallocatable portion.
> > >>
> > >> For example:
> > >> 1) If the input resources are "cpus:0.001,gpus:1", the `allocatable` method
> > >> will return "gpus:1".
> > >> 2) If the input resources are "cpus:1,mem:1", the `allocatable` method will
> > >> return "cpus:1".
> > >> 3) If the input resources are "cpus:0.001,mem:1", the `allocatable` method
> > >> will return an empty Resources object.
> > >>
> > >> Basically, the algorithm is like the following:
> > >>
> > >> allocatable = input
> > >> foreach known resource type t: do
> > >>   r = resources of type t from the input
> > >>   if r is less than the min resource of type t; then
> > >>     allocatable -= r
> > >>   fi
> > >> done
> > >> return allocatable
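
The quoted algorithm can be sketched in Python roughly as follows. The
minimums (0.01 cpus, 32 MB mem) are assumed values picked so that the three
examples above come out as stated; resources are modelled as a plain dict
of scalar amounts, which is a simplification of the real Resources type.

```python
from typing import Dict

# Assumed minimums per known resource type (illustrative values only).
MIN_RESOURCES: Dict[str, float] = {"cpus": 0.01, "mem": 32.0}


def allocatable(resources: Dict[str, float]) -> Dict[str, float]:
    """Return the input with every too-small known resource type stripped."""
    result = dict(resources)  # work on a copy, leave the input untouched
    for kind, minimum in MIN_RESOURCES.items():
        if kind in result and result[kind] < minimum:
            del result[kind]
    # Types without a known minimum (e.g. gpus here) pass through unchanged.
    return result
```
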
> > >
> > > I think that sounds like a faithful extension of the current behavior
> > > (removing too-small resources from the offerable pool), but I feel we
> > > should not just filter out any resource _kind_ below the minimum, but,
> > > inside a kind, each _addable_ subresource that is below it,
> > >
> > >    allocatable : Resources = input
> > >    for (resource: Resource) in input:
> > >      if resource < min(resource.kind):
> > >        allocatable -= resource
> > >
> > >    return allocatable
> > >
> > > This would have the effect of clumping together each distinguishable
> > > resource we care about instead of accumulating, say, different disks which
> > > in sum are potentially not that much more interesting to frameworks (they
> > > would prefer more of a particular disk than smaller pieces scattered across
> > > multiple disks).
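
To make the difference from the per-kind version concrete, here is a small
Python sketch of the per-resource variant. The `Resource` tuple, its
`source` field (standing in for whatever distinguishes one disk from
another) and the minimums are all hypothetical names and values for
illustration.

```python
from typing import List, NamedTuple


class Resource(NamedTuple):
    kind: str      # e.g. "cpus" or "disk"
    source: str    # hypothetical label telling one disk apart from another
    amount: float


# Assumed per-kind minimums (illustrative values only).
MIN_BY_KIND = {"cpus": 0.01, "disk": 32.0}


def allocatable_per_resource(resources: List[Resource]) -> List[Resource]:
    """Drop every individual resource that is below the minimum of its kind."""
    return [r for r in resources if r.amount >= MIN_BY_KIND.get(r.kind, 0.0)]
```

Under these assumptions, two 20 MB disks would both be dropped even though
their sum (40 MB) is above the disk minimum, while a single 64 MB disk
would be kept.
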
> > >
> > > @alexr
> > >> If we are about to offer some of the resources from a particular agent, why
> > >> would we filter anything at all? I doubt we should be concerned about the
> > >> size of the offer representation travelling through the network. If
> > >> available resources are "cpus:0.001,gpus:1" and we want to allocate GPU,
> > >> what is the benefit of filtering CPU?
> > >>
> > >> What about the following:
> > >> allocatable(R)
> > >> {
> > >>   return true
> > >>     iff (there exists r in R for which size(r) > MIN(type(r)))
> > >> }
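
For comparison, the predicate form quoted above might look like this in
Python. Again the minimums are assumed values, and treating a type without
a known minimum as having a minimum of 0 is an assumption of this sketch.

```python
from typing import Dict, Optional

# Assumed minimums per resource type (illustrative values only).
MINIMUMS: Dict[str, float] = {"cpus": 0.01, "mem": 32.0}


def is_allocatable(resources: Dict[str, float],
                   minimums: Optional[Dict[str, float]] = None) -> bool:
    """True iff at least one resource is strictly above the minimum of its type."""
    mins = MINIMUMS if minimums is None else minimums
    return any(amount > mins.get(kind, 0.0)
               for kind, amount in resources.items())
```

Here "cpus:0.001,gpus:1" is allocatable as a whole, so the 0.001 cpus would
be offered along with the GPU instead of being stripped.
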
> > >
> > > I think this is less about communication overhead and more a tool to
> > > help make sure that offered resources are actually useful to frameworks.
> >
> > I don't know whether there's a JIRA for this, but in the past we've
> > proposed the idea of schedulers suppressing or filtering offers with a
> > minimum resources specification, i.e. "don't bother me with offers that
> > aren't at least X"
> >
> > J
>



-- 
Cheers,

Zhitao Li