You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by LucaCanali <gi...@git.apache.org> on 2018/08/24 12:09:25 UTC

[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

GitHub user LucaCanali opened a pull request:

    https://github.com/apache/spark/pull/22218

    [SPARK-25228][CORE]Add executor CPU time metric.

    ## What changes were proposed in this pull request?
    
    Add a new metric to measure the executor's process (JVM) CPU time.
    
    ## How was this patch tested?
    
    Manually tested on a Spark cluster (see SPARK-25228 for an example screenshot).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/LucaCanali/spark AddExecutrCPUTimeMetric

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22218.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22218
    
----
commit c715096657c36beedd43c64104f56a78b2eb268d
Author: LucaCanali <lu...@...>
Date:   2018-08-24T12:05:59Z

    Add Executor CPU Time metric

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22218
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r212964193
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -17,11 +17,13 @@
     
     package org.apache.spark.executor
     
    +import java.lang.management.ManagementFactory
     import java.util.concurrent.ThreadPoolExecutor
     
     import scala.collection.JavaConverters._
     
     import com.codahale.metrics.{Gauge, MetricRegistry}
    +import com.sun.management.OperatingSystemMXBean
    --- End diff --
    
    I think it's safest to a little reflection here to make sure this doesn't cause the whole app to crash every time.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214489390
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +75,29 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  /** Dropwizard metrics gauge measuring the executor's process CPU time.
    +   *  This code will try to get JVM Process CPU time or return -1 otherwise.
    +   *  The CPU time value is returned in nanoseconds.
    +   *  It will use proprietary extensions as com.sun.management.OperatingSystemMXBean or
    +   *  com.ibm.lang.management.OperatingSystemMXBean if available
    +   */
    +  val mBean: MBeanServer = ManagementFactory.getPlatformMBeanServer
    +  val name = new ObjectName("java.lang", "type", "OperatingSystem")
    +  metricRegistry.register(MetricRegistry.name("executorCPUTime" ), new Gauge[Long] {
    +    override def getValue: Long = {
    +      try {
    +        val attribute = mBean.getAttribute(name, "ProcessCpuTime")
    +        if (attribute != null) {
    +          attribute.asInstanceOf[Long]
    +        } else {
    +          -1L
    +        }
    +      } catch {
    +        case _ : Exception => -1L
    --- End diff --
    
    `case NonFatal(_) => -1`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22218
  
    **[Test build #4330 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4330/testReport)** for PR 22218 at commit [`e72966e`](https://github.com/apache/spark/commit/e72966e38dc50be7b501387a9f719f85017a7aa8).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r212784518
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +75,13 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  // Dropwizard metrics gauge measuring the executor's process (JVM) CPU time.
    +  // The value is returned in nanoseconds, the method return -1 if this operation is not supported.
    +  val osMXBean = ManagementFactory.getOperatingSystemMXBean.asInstanceOf[OperatingSystemMXBean]
    +  metricRegistry.register(MetricRegistry.name("executorCPUTime" ), new Gauge[Long] {
    +    override def getValue: Long = osMXBean.getProcessCpuTime()
    --- End diff --
    
    This metric is useful for users?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22218
  
    **[Test build #4331 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4331/testReport)** for PR 22218 at commit [`e72966e`](https://github.com/apache/spark/commit/e72966e38dc50be7b501387a9f719f85017a7aa8).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214484633
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +75,29 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  /** Dropwizard metrics gauge measuring the executor's process CPU time.
    --- End diff --
    
    Nit: the comments should begin on the next line. But this is scaladoc syntax, and inside a code block, normally we just use `//` block comments.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22218


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214484739
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +75,29 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  /** Dropwizard metrics gauge measuring the executor's process CPU time.
    +   *  This code will try to get JVM Process CPU time or return -1 otherwise.
    +   *  The CPU time value is returned in nanoseconds.
    +   *  It will use proprietary extensions as com.sun.management.OperatingSystemMXBean or
    +   *  com.ibm.lang.management.OperatingSystemMXBean if available
    +   */
    +  val mBean: MBeanServer = ManagementFactory.getPlatformMBeanServer
    --- End diff --
    
    The problem here is that these become fields in the parent object. These should go inside the `new Gauge... {` I think?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22218
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/22218
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by LucaCanali <gi...@git.apache.org>.
Github user LucaCanali commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r212939411
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -17,11 +17,13 @@
     
     package org.apache.spark.executor
     
    +import java.lang.management.ManagementFactory
     import java.util.concurrent.ThreadPoolExecutor
     
     import scala.collection.JavaConverters._
     
     import com.codahale.metrics.{Gauge, MetricRegistry}
    +import com.sun.management.OperatingSystemMXBean
    --- End diff --
    
    Indeed this is a very good point that I had overlooked. I have now directly checked and this appears to work OK on OpenJDK (and on Oracle JVM of course). In addition, I tested manually with IBM JDK (IBM J9 VM, Java 1.8.0_181,  where one would indeed suspect incompatibilities and surprisingly this appears to work in that case too. I believe this may come from recent work by IBM to make `com.ibm.lang.management.OperatingSystemMXBean.getProcessCpuTime` compatible with `com.sun.management.OperatingSystemMXBean.getProcessCpuTime`? See also [this link](https://www.ibm.com/support/knowledgecenter/en/SSYKE2_8.0.0/com.ibm.java.vm.80.doc/docs/dcomibmlangmanagementosmxbeaniscputime100ns.html)
    
    I guess that if this is confirmed, we should be fine with a large fraction of the commonly used JDKs. In addition, we could handle the exception in case getProcessCpuTime is not available on a particular platform where the executor is running, for example returning the value -1 for this gauge in that case. Any thoughts/suggestions on this proposal?



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214522065
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +76,28 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  // Dropwizard metrics gauge measuring the executor's process CPU time.
    +  // This Gauge will try to get and return the JVM Process CPU time or return -1 otherwise.
    +  // The CPU time value is returned in nanoseconds.
    +  // It will use proprietary extensions such as com.sun.management.OperatingSystemMXBean or
    +  // com.ibm.lang.management.OperatingSystemMXBean, if available.
    +  metricRegistry.register(MetricRegistry.name("jvmCpuTime"), new Gauge[Long] {
    +    override def getValue: Long = {
    +      val mBean: MBeanServer = ManagementFactory.getPlatformMBeanServer
    --- End diff --
    
    Although I actually mean to put these inside the anonymous `Gauge` instance but outside the method, so as to compute them once, I doubt there is much overhead here. Getting the bean is just returning a field, although constructing the ObjectName is a little non-trivial. I suppose metrics are infrequently computed so this doesn't matter much.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214489886
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +75,29 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  /** Dropwizard metrics gauge measuring the executor's process CPU time.
    +   *  This code will try to get JVM Process CPU time or return -1 otherwise.
    +   *  The CPU time value is returned in nanoseconds.
    +   *  It will use proprietary extensions as com.sun.management.OperatingSystemMXBean or
    +   *  com.ibm.lang.management.OperatingSystemMXBean if available
    +   */
    +  val mBean: MBeanServer = ManagementFactory.getPlatformMBeanServer
    +  val name = new ObjectName("java.lang", "type", "OperatingSystem")
    +  metricRegistry.register(MetricRegistry.name("executorCPUTime" ), new Gauge[Long] {
    --- End diff --
    
    a little confused with the exsiting `cpuTime`. How about `jvmCpuTime`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22218
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by LucaCanali <gi...@git.apache.org>.
Github user LucaCanali commented on the issue:

    https://github.com/apache/spark/pull/22218
  
    I have implemented the changes as from the latest comments by @maropu and @srowen 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by LucaCanali <gi...@git.apache.org>.
Github user LucaCanali commented on the issue:

    https://github.com/apache/spark/pull/22218
  
    I have implemented the changes as from the latest comments, namely inlined the method.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214415690
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +75,31 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  // will try to get JVM Process CPU time or return -1 otherwise
    +  // will use proprietary extensions as com.sun.management.OperatingSystemMXBean or
    +  // com.ibm.lang.management.OperatingSystemMXBean if available
    +  def tryToGetJVMProcessPCUTime() : Long = {
    --- End diff --
    
    Can you just inline this method below?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r212833235
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -17,11 +17,13 @@
     
     package org.apache.spark.executor
     
    +import java.lang.management.ManagementFactory
     import java.util.concurrent.ThreadPoolExecutor
     
     import scala.collection.JavaConverters._
     
     import com.codahale.metrics.{Gauge, MetricRegistry}
    +import com.sun.management.OperatingSystemMXBean
    --- End diff --
    
    Good point.
    This class cannot be loaded at least on IBM JDK as reported [here](https://issues.apache.org/jira/browse/DRILL-6702).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by LucaCanali <gi...@git.apache.org>.
Github user LucaCanali commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214605458
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +76,28 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  // Dropwizard metrics gauge measuring the executor's process CPU time.
    +  // This Gauge will try to get and return the JVM Process CPU time or return -1 otherwise.
    +  // The CPU time value is returned in nanoseconds.
    +  // It will use proprietary extensions such as com.sun.management.OperatingSystemMXBean or
    +  // com.ibm.lang.management.OperatingSystemMXBean, if available.
    +  metricRegistry.register(MetricRegistry.name("jvmCpuTime"), new Gauge[Long] {
    +    val mBean: MBeanServer = ManagementFactory.getPlatformMBeanServer
    +    val name = new ObjectName("java.lang", "type", "OperatingSystem")
    +    override def getValue: Long = {
    +      try {
    +        val attribute = mBean.getAttribute(name, "ProcessCpuTime")
    +        if (attribute != null) {
    --- End diff --
    
    Indeed good point. I'll remove this additional check for null value.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214490061
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +75,29 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  /** Dropwizard metrics gauge measuring the executor's process CPU time.
    +   *  This code will try to get JVM Process CPU time or return -1 otherwise.
    +   *  The CPU time value is returned in nanoseconds.
    +   *  It will use proprietary extensions as com.sun.management.OperatingSystemMXBean or
    +   *  com.ibm.lang.management.OperatingSystemMXBean if available
    +   */
    +  val mBean: MBeanServer = ManagementFactory.getPlatformMBeanServer
    +  val name = new ObjectName("java.lang", "type", "OperatingSystem")
    +  metricRegistry.register(MetricRegistry.name("executorCPUTime" ), new Gauge[Long] {
    +    override def getValue: Long = {
    +      try {
    +        val attribute = mBean.getAttribute(name, "ProcessCpuTime")
    +        if (attribute != null) {
    +          attribute.asInstanceOf[Long]
    +        } else {
    +          -1L
    --- End diff --
    
    Any reason to return -1 instead of 0?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214505265
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +75,29 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  /** Dropwizard metrics gauge measuring the executor's process CPU time.
    +   *  This code will try to get JVM Process CPU time or return -1 otherwise.
    +   *  The CPU time value is returned in nanoseconds.
    +   *  It will use proprietary extensions as com.sun.management.OperatingSystemMXBean or
    +   *  com.ibm.lang.management.OperatingSystemMXBean if available
    +   */
    +  val mBean: MBeanServer = ManagementFactory.getPlatformMBeanServer
    +  val name = new ObjectName("java.lang", "type", "OperatingSystem")
    +  metricRegistry.register(MetricRegistry.name("executorCPUTime" ), new Gauge[Long] {
    --- End diff --
    
    nit: `name("executorCPUTime" )` -> `name("executorCPUTime")`



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214415816
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +75,31 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  // will try to get JVM Process CPU time or return -1 otherwise
    +  // will use proprietary extensions as com.sun.management.OperatingSystemMXBean or
    +  // com.ibm.lang.management.OperatingSystemMXBean if available
    +  def tryToGetJVMProcessPCUTime() : Long = {
    +    val mBean: MBeanServer = ManagementFactory.getPlatformMBeanServer
    +    try {
    +      val name = new ObjectName("java.lang", "type", "OperatingSystem")
    +      val attribute = mBean.getAttribute(name, "ProcessCpuTime")
    +      if (attribute != null) {
    +        attribute.asInstanceOf[Long]
    +      }
    +      else {
    --- End diff --
    
    Nit: pull onto previous line


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by LucaCanali <gi...@git.apache.org>.
Github user LucaCanali commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214521537
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +75,29 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  /** Dropwizard metrics gauge measuring the executor's process CPU time.
    +   *  This code will try to get JVM Process CPU time or return -1 otherwise.
    +   *  The CPU time value is returned in nanoseconds.
    +   *  It will use proprietary extensions as com.sun.management.OperatingSystemMXBean or
    +   *  com.ibm.lang.management.OperatingSystemMXBean if available
    +   */
    +  val mBean: MBeanServer = ManagementFactory.getPlatformMBeanServer
    +  val name = new ObjectName("java.lang", "type", "OperatingSystem")
    +  metricRegistry.register(MetricRegistry.name("executorCPUTime" ), new Gauge[Long] {
    +    override def getValue: Long = {
    +      try {
    +        val attribute = mBean.getAttribute(name, "ProcessCpuTime")
    +        if (attribute != null) {
    +          attribute.asInstanceOf[Long]
    +        } else {
    +          -1L
    --- End diff --
    
    I took the idea from com.sun.management.OperatingSystemMXBean.getProcessCpuTime, according to the documentation: "Returns: the CPU time used by the process in nanoseconds, or -1 if this operation is not supported."
    I guess it makes sense to return an invalid value as -1L for the CPU time if something goes wrong with gathering CPU Time values, so the error condition will appear evident to the end user of the metric. Returning 0 is also possible, of course.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22218
  
    **[Test build #4330 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4330/testReport)** for PR 22218 at commit [`e72966e`](https://github.com/apache/spark/commit/e72966e38dc50be7b501387a9f719f85017a7aa8).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22218
  
    **[Test build #4331 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4331/testReport)** for PR 22218 at commit [`e72966e`](https://github.com/apache/spark/commit/e72966e38dc50be7b501387a9f719f85017a7aa8).
     * This patch passes all tests.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214544279
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +76,28 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  // Dropwizard metrics gauge measuring the executor's process CPU time.
    +  // This Gauge will try to get and return the JVM Process CPU time or return -1 otherwise.
    +  // The CPU time value is returned in nanoseconds.
    +  // It will use proprietary extensions such as com.sun.management.OperatingSystemMXBean or
    +  // com.ibm.lang.management.OperatingSystemMXBean, if available.
    +  metricRegistry.register(MetricRegistry.name("jvmCpuTime"), new Gauge[Long] {
    --- End diff --
    
    So this isn't exposed except through dropwizard... not plumbed through to the driver too like some of the metrics below? just checking that this is all that needs to happen, that the metric can be used by external users but is not otherwise touched by Spark.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214561873
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +76,28 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  // Dropwizard metrics gauge measuring the executor's process CPU time.
    +  // This Gauge will try to get and return the JVM Process CPU time or return -1 otherwise.
    +  // The CPU time value is returned in nanoseconds.
    +  // It will use proprietary extensions such as com.sun.management.OperatingSystemMXBean or
    +  // com.ibm.lang.management.OperatingSystemMXBean, if available.
    +  metricRegistry.register(MetricRegistry.name("jvmCpuTime"), new Gauge[Long] {
    +    val mBean: MBeanServer = ManagementFactory.getPlatformMBeanServer
    +    val name = new ObjectName("java.lang", "type", "OperatingSystem")
    +    override def getValue: Long = {
    +      try {
    +        val attribute = mBean.getAttribute(name, "ProcessCpuTime")
    +        if (attribute != null) {
    --- End diff --
    
    I checked the doc for `getAttribute though, when does it return null?
    https://docs.oracle.com/javase/8/docs/api/javax/management/MBeanServerConnection.html#getAttribute-javax.management.ObjectName-java.lang.String-


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214561269
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +75,29 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  /** Dropwizard metrics gauge measuring the executor's process CPU time.
    +   *  This code will try to get JVM Process CPU time or return -1 otherwise.
    +   *  The CPU time value is returned in nanoseconds.
    +   *  It will use proprietary extensions as com.sun.management.OperatingSystemMXBean or
    +   *  com.ibm.lang.management.OperatingSystemMXBean if available
    +   */
    +  val mBean: MBeanServer = ManagementFactory.getPlatformMBeanServer
    +  val name = new ObjectName("java.lang", "type", "OperatingSystem")
    +  metricRegistry.register(MetricRegistry.name("executorCPUTime" ), new Gauge[Long] {
    +    override def getValue: Long = {
    +      try {
    +        val attribute = mBean.getAttribute(name, "ProcessCpuTime")
    +        if (attribute != null) {
    +          attribute.asInstanceOf[Long]
    +        } else {
    +          -1L
    --- End diff --
    
    ok, thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by LucaCanali <gi...@git.apache.org>.
Github user LucaCanali commented on the issue:

    https://github.com/apache/spark/pull/22218
  
    I have refactored the code now using the BeanServer which should address the comments about availability of com.sun.management.OperatingSystemMXBean across different JDKs.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214675924
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +76,28 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  // Dropwizard metrics gauge measuring the executor's process CPU time.
    +  // This Gauge will try to get and return the JVM Process CPU time or return -1 otherwise.
    +  // The CPU time value is returned in nanoseconds.
    +  // It will use proprietary extensions such as com.sun.management.OperatingSystemMXBean or
    +  // com.ibm.lang.management.OperatingSystemMXBean, if available.
    +  metricRegistry.register(MetricRegistry.name("jvmCpuTime"), new Gauge[Long] {
    +    val mBean: MBeanServer = ManagementFactory.getPlatformMBeanServer
    +    val name = new ObjectName("java.lang", "type", "OperatingSystem")
    +    override def getValue: Long = {
    +      try {
    +        val attribute = mBean.getAttribute(name, "ProcessCpuTime")
    +        if (attribute != null) {
    --- End diff --
    
    I personally don't mind the defensive checks, because who knows what to really expect from these implementations? but this is OK by me. In case of a bad impl this would still return -1.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22218
  
    Merged to master


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r212799646
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -17,11 +17,13 @@
     
     package org.apache.spark.executor
     
    +import java.lang.management.ManagementFactory
     import java.util.concurrent.ThreadPoolExecutor
     
     import scala.collection.JavaConverters._
     
     import com.codahale.metrics.{Gauge, MetricRegistry}
    +import com.sun.management.OperatingSystemMXBean
    --- End diff --
    
    Is this com.sun class going to be available in all JDKs? Thinking of OpenJDK and IBM JDKs


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by LucaCanali <gi...@git.apache.org>.
Github user LucaCanali commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214549896
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +76,28 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  // Dropwizard metrics gauge measuring the executor's process CPU time.
    +  // This Gauge will try to get and return the JVM Process CPU time or return -1 otherwise.
    +  // The CPU time value is returned in nanoseconds.
    +  // It will use proprietary extensions such as com.sun.management.OperatingSystemMXBean or
    +  // com.ibm.lang.management.OperatingSystemMXBean, if available.
    +  metricRegistry.register(MetricRegistry.name("jvmCpuTime"), new Gauge[Long] {
    --- End diff --
    
    Indeed, this is exposed only through dropwizard metrics system and not used otherwise in the Spark code. Another point worth mentioning is that currently executorSource is not registered when running in local mode.
    On a related topic (although maybe for a more general discussion than the scope of this PR) I was wondering if it would make sense to introduce a few SparkConf properties to switch on/off certain families of (dropwizard) metrics in the Spark, as the list of available metrics is mecoming long in recent versions.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by LucaCanali <gi...@git.apache.org>.
Github user LucaCanali commented on the issue:

    https://github.com/apache/spark/pull/22218
  
    Thanks @srowen 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by LucaCanali <gi...@git.apache.org>.
Github user LucaCanali commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r214393554
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -17,11 +17,13 @@
     
     package org.apache.spark.executor
     
    +import java.lang.management.ManagementFactory
     import java.util.concurrent.ThreadPoolExecutor
     
     import scala.collection.JavaConverters._
     
     import com.codahale.metrics.{Gauge, MetricRegistry}
    +import com.sun.management.OperatingSystemMXBean
    --- End diff --
    
    I have refactored the code with a different approach using the BeanServer which should address the comments about avialability of com.sun.management.OperatingSystemMXBean across different JDKs.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

Posted by LucaCanali <gi...@git.apache.org>.
Github user LucaCanali commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22218#discussion_r212933292
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
    @@ -73,6 +75,13 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, executorId: String) extends
         registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
       }
     
    +  // Dropwizard metrics gauge measuring the executor's process (JVM) CPU time.
    +  // The value is returned in nanoseconds, the method return -1 if this operation is not supported.
    +  val osMXBean = ManagementFactory.getOperatingSystemMXBean.asInstanceOf[OperatingSystemMXBean]
    +  metricRegistry.register(MetricRegistry.name("executorCPUTime" ), new Gauge[Long] {
    +    override def getValue: Long = osMXBean.getProcessCpuTime()
    --- End diff --
    
    I believe the proposed metric tracking the executor CPU time is useful and adds additional information and convenience on top of the task CPU metric, as implemented in SPARK-22190. A couple of considerations to support this argument from some of the recent findings and experimentation on this:
    - the process CPU time contains all the CPU consumed by the JVM, notably including the CPU consumed by garbage collection, which can be important in some cases and definitely something we want to measure and analyze
    - the CPU time collected from the tasks is "harder to consume" in a dashboard as the CPU value is only updated at the end of the successful execution of the task, which makes it harder to handle for a dashboard in case of long-running tasks. In contrast, the executor process CPU time "dropwizard gauge" gives an up-to-date value of the CPU consumed by the executor at any time as it takes it directly from the OS.



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org