Posted to issues@mesos.apache.org by "Bartek Plotka (JIRA)" <ji...@apache.org> on 2016/02/05 13:16:39 UTC

[jira] [Updated] (MESOS-4595) Add support for newest pre-defined Perf events to PerfEventIsolator

     [ https://issues.apache.org/jira/browse/MESOS-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bartek Plotka updated MESOS-4595:
---------------------------------
    Description: 
Currently, the Perf Event Isolator can monitor every perf event specified in {{--perf_events=...}}, but it can map only some of them into {{ResourceUsage.proto}} (more precisely, into [PerfStatistics.proto | https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L862]).

Since {{PerfStatistics.proto}} was last updated, the list of events supported by perf has expanded considerably and keeps growing. I have put together a comparison table:

|| Event type || Events matched in PerfStatistics || Events available in perf 4.3.3 ||
| HW events | 8 | 8 |
| SW events | 9 | 10 |
| HW cache events | 20 | 20 |
| *Kernel PMU events* | *0* | *37* |
| Tracepoint events | 0 | billion (: |

For advanced analysis (e.g. during oversubscription in the QoS Controller), support for additional events is crucial. For instance, in [Serenity|https://github.com/mesosphere/serenity] we based some of our revocation algorithms on the new [CMT| https://01.org/packet-processing/cache-monitoring-technology-memory-bandwidth-monitoring-cache-allocation-technology-code-and-data] feature, which provides an additional, useful event called {{llc_occupancy}}.

I think we all agree that it would be great to support more (or even all) perf events in {{Mesos PerfEventIsolator}} (:
----
Let's start a discussion about the approach. Within this task we have three issues:
# What events do we want to support in Mesos?
## all?
## only add Kernel PMU Events?
---
I don't have a strong opinion on that, since I have never used {{Tracepoint events}}. We currently need PMU events.
# How to add new (or modify existing) events in {{mesos.proto}}?
We can distinguish 3 approaches here:
*# Add new events statically in {{PerfStatistics.proto}} as separate optional fields (as is done currently).
*# Instead of optional fields in the {{PerfStatistics.proto}} message, we could have a {{key-value}} map (something like {{labels}} in other messages) and fill it dynamically in {{PerfEventIsolator}}.
*# We could mix the above approaches and just add the mentioned map to the existing {{PerfStatistics.proto}} for additional events (:
---
IMO: Approach 1 is explicit: users can see which events to expect (although the names are transformed, e.g. {{"-"}} becomes {{"_"}}), but we would end up with a looong message and a lot of copy-paste work. And we would have to maintain that!
Approaches 2 & 3 are more flexible, and they avoid the problem mentioned in the issue below (: We would also *always* support *all* perf events in all kernel versions (:
IMO approaches 2 & 3 are the best.
# How to support different naming formats? For instance, {{intel_cqm/llc_occupancy/}} has {{"/"}} in its name and {{migrate:mm_migrate_pages}} has {{":"}}. I don't think these can be used as field names in {{.proto}} syntax.
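
To make the mixed approach (3) and the naming problem more concrete, here is a minimal Python sketch. It is only an illustration and not Mesos code. {{STATIC_FIELDS}}, {{to_field_name}} and {{split_samples}} are hypothetical names, the listed fields are just an illustrative subset, and the identifier check is a simplification of the real {{.proto}} grammar. Events whose names map to an existing field go there, and everything else lands in a key-value map under its original perf name.

```python
import re

# Hypothetical, illustrative subset of events that already have dedicated
# optional fields (approach 1). The real message defines many more.
STATIC_FIELDS = {"cycles", "instructions", "cache_misses", "context_switches"}

# Simplified assumption about .proto field names: letters, digits and
# underscores only, not starting with a digit.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def to_field_name(event):
    """Return a field-style name for a perf event, or None when the name
    cannot be written as an identifier (e.g. it contains "/" or ":")."""
    candidate = event.replace("-", "_")
    return candidate if _IDENTIFIER.match(candidate) else None


def split_samples(samples):
    """Split {perf event name: value} into (static fields, extra map)."""
    static, extra = {}, {}
    for event, value in samples.items():
        field = to_field_name(event)
        if field in STATIC_FIELDS:
            static[field] = value
        else:
            # The map key is the unmodified perf name, so no renaming is needed.
            extra[event] = value
    return static, extra


# Made-up sample values, for illustration only.
static, extra = split_samples({
    "cache-misses": 120,
    "intel_cqm/llc_occupancy/": 4096,
    "migrate:mm_migrate_pages": 3,
})
```

With these made-up inputs, {{cache-misses}} ends up in the static part as {{cache_misses}}, while the two events containing {{"/"}} and {{":"}} stay in the map under their original names. That is the main appeal of a map for issue 3: keys are plain strings, so they are not restricted the way field names are.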



> Add support for newest pre-defined Perf events to PerfEventIsolator
> -------------------------------------------------------------------
>
>                 Key: MESOS-4595
>                 URL: https://issues.apache.org/jira/browse/MESOS-4595
>             Project: Mesos
>          Issue Type: Task
>          Components: isolation
>            Reporter: Bartek Plotka
>            Assignee: Bartek Plotka


