Posted to issues@mesos.apache.org by "Chris (JIRA)" <ji...@apache.org> on 2017/01/10 15:09:58 UTC

[jira] [Comment Edited] (MESOS-5342) CPU pinning/binding support for CgroupsCpushareIsolatorProcess

    [ https://issues.apache.org/jira/browse/MESOS-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15815215#comment-15815215 ] 

Chris edited comment on MESOS-5342 at 1/10/17 3:09 PM:
-------------------------------------------------------

Implemented a Mesos module and resource estimator; the source is posted at https://github.com/ct-clmsn/mesos-cpusets

Added performance-counter-enabled tools and a Mesos executor; the source is posted at https://github.com/ct-clmsn/mesos-papi



was (Author: ct.clmsn):
mesos modules for this feature set have been implemented:

https://github.com/ct-clmsn/mesos-cpusets

https://github.com/ct-clmsn/mesos-papi

> CPU pinning/binding support for CgroupsCpushareIsolatorProcess
> --------------------------------------------------------------
>
>                 Key: MESOS-5342
>                 URL: https://issues.apache.org/jira/browse/MESOS-5342
>             Project: Mesos
>          Issue Type: Improvement
>          Components: cgroups, containerization
>    Affects Versions: 0.28.1
>            Reporter: Chris
>              Labels: cgroups, cpu, cpu-usage, gpu, isolation, isolator, mentor, perfomance
>
> The cgroups isolator currently lacks support for binding (also called pinning) containers to a set of cores. The Linux kernel scheduler can make sub-optimal core assignments for processes and threads, and poor assignments hurt program performance, particularly cache locality. Applications requiring GPU resources can benefit from this feature by getting access to the cores closest to the GPU hardware, which reduces CPU-GPU copy latency.
> Most cluster management systems from the HPC community (for example, SLURM) provide both cgroup isolation and CPU binding. This feature would provide similar capabilities. The current interest in supporting Intel's Cache Allocation Technology, and the advent of Intel's Knights-series processors, will require making choices about where containers run on the mesos-agent's processor cores; this feature is a step toward developing a robust solution.
> The improvement in this JIRA ticket will handle hardware topology detection, track container-to-core utilization in a histogram, and use a mathematical optimization technique to select cores for container assignment based on latency and the container-to-core utilization histogram.
> For GPU tasks, the improvement will prioritize selection of cores based on latency between the GPU and cores in an effort to minimize copy latency.
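
The core-selection idea in the description above can be illustrated with a small sketch. The ticket does not say which optimization technique is intended, so the sketch substitutes a simple greedy heuristic: each core gets a cost equal to its latency plus a weighted count of containers already assigned to it, and the cheapest cores win. The function name `select_cores`, the `weight` parameter, and the latency values are hypothetical; topology detection (for example with a library such as hwloc) is assumed to have already produced the latency table.

```python
from collections import Counter


def select_cores(latency, utilization, n_cores, weight=1.0):
    """Greedily pick n_cores with the lowest combined cost.

    latency:     dict mapping core id -> latency estimate (hypothetical units,
                 e.g. distance from the core to a GPU or memory node)
    utilization: Counter mapping core id -> number of containers already
                 assigned to that core (the "histogram" from the ticket)
    weight:      assumed tuning knob trading latency against crowding
    """
    if n_cores > len(latency):
        raise ValueError("requested more cores than the topology provides")

    # Cost of a core = its latency plus a penalty for existing assignments.
    cost = {core: latency[core] + weight * utilization[core] for core in latency}

    # Sort by cost, breaking ties by core id so the result is deterministic.
    chosen = sorted(cost, key=lambda core: (cost[core], core))[:n_cores]

    # Record the new assignment in the histogram.
    for core in chosen:
        utilization[core] += 1
    return chosen


# Hypothetical 4-core agent: cores 0 and 1 sit closer to the GPU than 2 and 3.
latency = {0: 1.0, 1: 1.0, 2: 2.0, 3: 2.0}
histogram = Counter()

first = select_cores(latency, histogram, 2)
second = select_cores(latency, histogram, 2)
third = select_cores(latency, histogram, 2)
```

On Linux, the selected cores could then be applied to a process with `os.sched_setaffinity(pid, cores)`, or written to the `cpuset.cpus` file of the container's cpuset cgroup. Which mechanism the isolator would use is not stated in the ticket.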



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)