Posted to dev@yunikorn.apache.org by Adam Antal <ad...@cloudera.com.INVALID> on 2020/04/20 13:16:31 UTC

Resource request and limit for YuniKorn pods

Hi,

I am working on [YUNIKORN-86] (Set proper resource request and limit for
YuniKorn pods) and I would like to hear your feedback on this issue.
Link: https://issues.apache.org/jira/browse/YUNIKORN-86

Currently I am trying to find justified request and limit values for
the scheduler and shim pods in k8s. So far I have:
- Researched the k8s default scheduler (kube-scheduler): even
though it can be configured, there are no defaults for that pod. On AKS
there is a default that I could find on the internet:
>>>
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: 4
          memory: 2Gi
>>>
- Tried to obtain some values from a deployed k8s cluster, but had trouble
with the metrics-server (in some old versions heapster was deployed instead)
- Ran yunikorn-core/pkg/scheduler/tests/scheduler_perf_tests.go and
monitored the cpu/memory consumption. I don't have the run's full graph,
but these are the approximate values that I observed:
   - for most of the time, cpu usage was around 4 cores
   - at peak it was around 5.2 cores
   - memory consumption was moderate, around 4.3% of my laptop's 16Gb of
memory - roughly 700Mb by the time the perf test ended.
The memory depends on the stored/currently running applications, so I think
1Gb should be fine for general-purpose use.
Note that the cpu numbers reflect peak usage, when the scheduler
was under pressure; when only a small number of pods are being
scheduled, the number is much smaller.
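The memory figure above can be sanity-checked with a quick calculation:

```python
# Sanity check of the quoted memory figure: 4.3% of 16 GB of laptop memory.
total_gb = 16
fraction = 0.043
usage_mb = total_gb * fraction * 1000  # decimal GB -> MB for a rough figure
print(round(usage_mb))  # 688
```

688 MB is consistent with the "around 700Mb" observation.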

I suggest the following numbers (similar to AKS):
>>>
      resources:
        requests:
          cpu: 200m
          memory: 512Mi
        limits:
          cpu: 4
          memory: 2Gi
>>>
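For reference, such a resources block sits under the container spec of the
deployment; a minimal sketch of the placement (the deployment, container and
image names here are assumptions for illustration, not the actual YuniKorn
manifests):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: yunikorn-scheduler        # assumed name, for illustration only
spec:
  replicas: 1
  selector:
    matchLabels:
      app: yunikorn
  template:
    metadata:
      labels:
        app: yunikorn
    spec:
      containers:
        - name: yunikorn-scheduler        # assumed container name
          image: apache/yunikorn:scheduler-latest   # assumed image tag
          resources:
            requests:
              cpu: 200m
              memory: 512Mi
            limits:
              cpu: 4
              memory: 2Gi
```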

One additional thing: there were some measurements of YuniKorn in kubemark.
Could you please share your results if you have anything related to the
scheduler and shim pods' resource usage?

Regards,
Adam

Re: Resource request and limit for YuniKorn pods

Posted by Adam Antal <ad...@cloudera.com.INVALID>.
Hi,

Thanks for your input.

I have updated the patch with the defaults Weiwei suggested and will
also make the limits configurable in the helm charts as Tao explained. If
you have some time, please review the PR.

Thanks,
Adam

On Tue, Apr 21, 2020 at 11:31 AM Tao Yang <ta...@apache.org> wrote:
> [quoted text clipped]

Re: Resource request and limit for YuniKorn pods

Posted by Tao Yang <ta...@apache.org>.
Thanks, Adam, for these efforts!

I have done some performance tests on a kubemark cluster. If we improve
some key phases of the scheduling process, the scheduler pod can take 1~4
CPUs most of the time and 7~9 CPUs at peak times, and memory can easily
go beyond 1G after tens of thousands of pods have been scheduled in a cluster
with thousands of nodes.

Utilization of the scheduler pod depends on the scale of the cluster, the
number of running pods and other factors. The initial configuration
values Weiwei suggested make sense to me; I think they are suitable for
most usage scenarios. Moreover, I think it's better to make them
configurable in helm-charts/yunikorn/values.yaml, so that users can easily
update them on demand if they have a large-scale cluster or other concerns.
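Making the values configurable as described could look like this in the
chart (a sketch only; the exact values.yaml keys and template layout are
assumptions, not the actual chart):

```yaml
# helm-charts/yunikorn/values.yaml (assumed keys)
resources:
  requests:
    cpu: 200m
    memory: 1Gi
  limits:
    cpu: 4
    memory: 2Gi
```

with the deployment template picking it up via something like
`resources: {{ toYaml .Values.resources | nindent 12 }}`, so users can
override the defaults with `--set` or a custom values file.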

Thanks,
Tao

Weiwei Yang <ab...@gmail.com> 于2020年4月21日周二 下午1:22写道:
> [quoted text clipped]

Re: Resource request and limit for YuniKorn pods

Posted by Weiwei Yang <ab...@gmail.com>.
Hi Adam

Thanks for the investigation. These data are useful; what you
suggested looks good to me.

   - I think it makes sense to set a relatively small request, to
   ensure we don't get into trouble on small envs. But for memory, can
   we request at least 1G? Memory cannot be throttled, so 1G is safer.
   - For the CPU, it looks like we probably won't go beyond 4~5 CPUs, so
   setting a limit of 4 makes sense to me. I also checked an EKS cluster
   and saw usage around 1 CPU there.

So I think we can do

    resources:
        requests:
          cpu: 200m
          memory: 1Gi
        limits:
          cpu: 4
          memory: 2Gi
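One implication of these numbers, going by standard Kubernetes behavior
(not something stated in the thread): with requests below limits on both
resources, the pod gets the Burstable QoS class.

```yaml
# requests < limits on both resources => Burstable QoS class:
# - CPU may burst above the 200m request, throttled only at 4 cores
# - memory above the 2Gi limit gets the container OOM-killed
# - under node memory pressure, Burstable pods are evicted before
#   Guaranteed ones (where requests == limits)
resources:
  requests:
    cpu: 200m
    memory: 1Gi
  limits:
    cpu: 4
    memory: 2Gi
```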

@taoyang@apache.org <ta...@apache.org> could you help to review this? Let
us know if this makes sense to you.

Weiwei


On Mon, Apr 20, 2020 at 6:16 AM Adam Antal <ad...@cloudera.com.invalid>
wrote:
> [quoted text clipped]