Posted to issues@spark.apache.org by "Erik Krogen (Jira)" <ji...@apache.org> on 2021/04/28 18:38:00 UTC

[jira] [Commented] (SPARK-35259) ExternalBlockHandler metrics have misleading unit in the name

    [ https://issues.apache.org/jira/browse/SPARK-35259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334932#comment-17334932 ] 

Erik Krogen commented on SPARK-35259:
-------------------------------------

I have a PR for this, but it is based on the PR for SPARK-35258, so I will hold off on posting it for now.

While that goes through: [~rxin] or [~jlaskowski], I see you participated in the discussions on SPARK-16405 when these metrics were added. Do you have any comments here? Maybe I am missing something.

> ExternalBlockHandler metrics have misleading unit in the name
> -------------------------------------------------------------
>
>                 Key: SPARK-35259
>                 URL: https://issues.apache.org/jira/browse/SPARK-35259
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle
>    Affects Versions: 3.1.1
>            Reporter: Erik Krogen
>            Priority: Major
>
> Today {{ExternalBlockHandler}} exposes a few {{Timer}} metrics:
> {code}
>     // Time latency for open block request in ms
>     private final Timer openBlockRequestLatencyMillis = new Timer();
>     // Time latency for executor registration latency in ms
>     private final Timer registerExecutorRequestLatencyMillis = new Timer();
>     // Time latency for processing finalize shuffle merge request latency in ms
>     private final Timer finalizeShuffleMergeLatencyMillis = new Timer();
> {code}
> However, these Dropwizard Timers use nanoseconds by default ([documentation|https://metrics.dropwizard.io/3.2.3/getting-started.html#timers]). It is certainly possible to extract milliseconds from them, but it seems misleading to have "Millis" in the names here.
> {{YarnShuffleServiceMetrics}} currently doesn't expose any incorrect metrics, since it doesn't export any timing information from these timers (which I am trying to address in SPARK-35258). Even so, the current names result in misleading metric names like {{finalizeShuffleMergeLatencyMillis_count}}: a count doesn't have a unit. It should be up to the metrics exporter, like {{YarnShuffleServiceMetrics}}, to decide the unit and adjust the name accordingly.
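
To illustrate the last point of the description, here is a minimal sketch of an exporter that picks the unit and the name suffix itself. It assumes the raw timer values arrive in nanoseconds, as described above. The class, the helper methods, the naming scheme and the sample value are all hypothetical; none of this is the actual {{YarnShuffleServiceMetrics}} code or the proposed patch.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: the exporter, not the metric, decides the unit suffix.
class TimerExportSketch {

    // Hypothetical naming scheme: map a TimeUnit to a name suffix.
    static String unitSuffix(TimeUnit unit) {
        switch (unit) {
            case NANOSECONDS:  return "Nanos";
            case MICROSECONDS: return "Micros";
            case MILLISECONDS: return "Millis";
            case SECONDS:      return "Seconds";
            default:
                throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
    }

    // Name for a duration statistic: the unit appears because the value has one.
    static String durationName(String baseName, String stat, TimeUnit unit) {
        return baseName + unitSuffix(unit) + "_" + stat;
    }

    // Name for an event count: no unit suffix, since a count has no unit.
    static String countName(String baseName) {
        return baseName + "_count";
    }

    // Convert a raw nanosecond value into the unit chosen by the exporter.
    static double fromNanos(double nanos, TimeUnit unit) {
        return nanos / unit.toNanos(1);
    }

    public static void main(String[] args) {
        String base = "openBlockRequestLatency"; // unit-free base name (assumed)
        double meanNanos = 2_500_000.0;          // made-up sample value

        System.out.println(countName(base));
        System.out.println(durationName(base, "mean", TimeUnit.MILLISECONDS)
            + " = " + fromNanos(meanNanos, TimeUnit.MILLISECONDS));
    }
}
```

With a unit-free base name, the count is exported without a unit, and only the duration statistics carry the suffix for whichever unit the exporter converts to.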



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
