Posted to commits@cloudstack.apache.org by GitBox <gi...@apache.org> on 2022/09/16 15:15:24 UTC

[GitHub] [cloudstack] mlsorensen commented on issue #6744: Problem with operating systems that use cgroup v2 related to cpu speed.

mlsorensen commented on issue #6744:
URL: https://github.com/apache/cloudstack/issues/6744#issuecomment-1249491365

   Thanks for the report - I think you're right that we need to do something here about the values potentially going out of range.
   
   To the second point - a VM with 10 shares and a VM with 80 shares: I think the problem is that we have to keep the values intact on the service offerings in order to make the allocation math work. In your scenario the allocator would think it has allocated only 90 MHz on a host that probably has 100 GHz or more to allocate.
   
   In your scenario I think the weighting itself still works correctly in the cgroups at the host. To make the numbers easier: if VM1 had 20 vCPUs and 20 shares and VM2 had 80 vCPUs and 80 shares, then when the scheduler breaks CPU scheduling down into runtime periods (assuming no other workloads are involved), VM1 gets 20% of each runtime period and VM2 gets 80% of each runtime period. On a (fictional) 100-core hypervisor host, this means VM1 gets ~20 cores' worth of the system's CPU time and VM2 gets ~80 cores' worth (not exactly, and not necessarily implying pinning to real cores; just in terms of the scheduler's view of CPU time per period across all cores).
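   
   As a back-of-the-envelope sketch of that weighting math (a standalone Java illustration; the class and variable names are mine, not CloudStack's):
   
      // Illustration only: how proportional cpu.shares weighting divides
      // scheduler runtime periods between two VMs, with no other workloads.
      public class ShareWeighting {
          public static void main(String[] args) {
              int vm1Shares = 20;
              int vm2Shares = 80;
              int totalShares = vm1Shares + vm2Shares;
              int hostCores = 100; // the fictional 100-core host
              System.out.printf("VM1: %.0f%% of each period (~%d cores' worth)%n",
                      100.0 * vm1Shares / totalShares, hostCores * vm1Shares / totalShares);
              System.out.printf("VM2: %.0f%% of each period (~%d cores' worth)%n",
                      100.0 * vm2Shares / totalShares, hostCores * vm2Shares / totalShares);
          }
      }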
   
   The bigger problem is really that this 100-core host has maybe 200 GHz worth of CPU to allocate, and with 1 MHz CPU offerings CloudStack calculates that you have scheduled only 100 MHz of it! The allocators will quickly overload the system with more VMs.
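   
   To make that accounting mismatch concrete (numbers taken from the example above; this is an illustration, not the actual allocator code):
   
      // Illustration only: the allocator's bookkeeping vs. reality.
      public class AllocatorMath {
          public static void main(String[] args) {
              long hostCapacityMhz = 100 * 2000L; // 100 cores x 2000 MHz = 200,000 MHz
              long perVcpuMhz = 1;                // a 1 MHz CPU offering
              long vcpusScheduled = 100;          // e.g. VM1 (20) + VM2 (80)
              long allocatedMhz = vcpusScheduled * perVcpuMhz; // 100 MHz
              double percentUsed = 100.0 * allocatedMhz / hostCapacityMhz;
              // Prints ~0.05% -- the host looks almost empty to the allocator,
              // so it keeps placing VMs until the real CPUs are oversubscribed.
              System.out.printf("Allocator sees %.2f%% of host CPU used%n", percentUsed);
          }
      }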
   
   My initial thought to fix this is simply to scale down the shares number that is applied at libvirt - though not so much that we can't offer different levels of performance.
   
   Simple example, scaling down by a factor of 100:
   
   2 vCPU x 2000MHz offering = 4000MHz = 40 shares
   4 vCPU x 500MHz offering = 2000MHz = 20 shares
   ...
   128 vCPU x 2000MHz offering = 256000MHz = 2560 shares
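   
   In code form it could look something like this (SHARES_SCALE and the method name are placeholders of mine, not the actual CloudStack/libvirt integration code):
   
      // Sketch only: scale the offering's total MHz down to a shares value.
      public class CpuSharesScaling {
          static final int SHARES_SCALE = 100; // assumed scale factor from above
   
          static int sharesFor(int vcpus, int speedMhz) {
              // Floor at 1 so tiny offerings don't round down to zero shares.
              return Math.max(1, (vcpus * speedMhz) / SHARES_SCALE);
          }
   
          public static void main(String[] args) {
              System.out.println(sharesFor(2, 2000));   // 4000 MHz   -> 40 shares
              System.out.println(sharesFor(4, 500));    // 2000 MHz   -> 20 shares
              System.out.println(sharesFor(128, 2000)); // 256000 MHz -> 2560 shares
          }
      }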
   
   This seems to give us reasonable enough resolution to maintain the share weighting and also to handle differing MHz speeds in the CPU offerings, which is important for service offerings that enforce these shares as a CPU cap (via CFS quota). That is, a 1 vCPU 500 MHz offering with the CPU cap enabled should get 1/4 of the runtime per period that a 1 vCPU 2000 MHz offering gets, and that doesn't work if we just map 1 CPU to 1 share.
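   
   For the cap case, a rough sketch of the quota math (the 100 ms period is the CFS default; the host-core speed and the formula here are assumptions for illustration, not the current implementation):
   
      // Sketch only: CFS quota per period for a capped offering.
      public class CfsQuotaSketch {
          static long quotaUs(int vcpus, long speedMhz, long hostCoreMhz, long periodUs) {
              return vcpus * speedMhz * periodUs / hostCoreMhz;
          }
   
          public static void main(String[] args) {
              long periodUs = 100_000; // 100 ms CFS period
              long hostCoreMhz = 2000; // assumed host core speed
              System.out.println(quotaUs(1, 500,  hostCoreMhz, periodUs)); // 25,000 us
              System.out.println(quotaUs(1, 2000, hostCoreMhz, periodUs)); // 100,000 us
              // The 500 MHz offering gets exactly 1/4 the runtime per period.
          }
      }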
   

