You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by LucaCanali <gi...@git.apache.org> on 2017/10/04 07:39:44 UTC

[GitHub] spark pull request #19426: [SPARK-22190][CORE] Add Spark executor task metri...

GitHub user LucaCanali opened a pull request:

    https://github.com/apache/spark/pull/19426

    [SPARK-22190][CORE] Add Spark executor task metrics to Dropwizard metrics

    ## What changes were proposed in this pull request?
    
    This proposed patch is about making Spark executor task metrics available as Dropwizard metrics. This is intended to be of aid in monitoring Spark jobs and when drilling down on performance troubleshooting issues.
    
    ## How was this patch tested?
    
    Manually tested on a Spark cluster (see JIRA for an example screenshot).


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/LucaCanali/spark SparkTaskMetricsDropWizard

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19426.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19426
    
----
commit ca43764327e8796ea01d5053fc73ca7136b57835
Author: LucaCanali <lu...@cern.ch>
Date:   2017-10-03T13:48:44Z

    Add Spark executor task metrics to Dropwizard metrics

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19426: [SPARK-22190][CORE] Add Spark executor task metrics to D...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19426
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19426: [SPARK-22190][CORE] Add Spark executor task metrics to D...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/19426
  
    ok to test


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19426: [SPARK-22190][CORE] Add Spark executor task metrics to D...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19426
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19426: [SPARK-22190][CORE] Add Spark executor task metrics to D...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19426
  
    LGTM, merging to master!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19426: [SPARK-22190][CORE] Add Spark executor task metri...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19426


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19426: [SPARK-22190][CORE] Add Spark executor task metrics to D...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19426
  
    **[Test build #83268 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83268/testReport)** for PR 19426 at commit [`dd3b9a2`](https://github.com/apache/spark/commit/dd3b9a21eef37ecb216df5f21226bdc489ebdbd2).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19426: [SPARK-22190][CORE] Add Spark executor task metrics to D...

Posted by LucaCanali <gi...@git.apache.org>.
Github user LucaCanali commented on the issue:

    https://github.com/apache/spark/pull/19426
  
    Thanks @umehrot2 for the review and comments. Indeed well spotted that I had forgotten a couple of metrics and added one of them twice. This is hopefully fixed with the latest commit. As for your comment on the Spark Executor task metric JVMGCTime, my first reaction is that (1) I think it is good to have this type of info in the context of the executor metrics (2) It does not appear to me that JVMGCTime measured at the executor task level provides a direct duplication of info available in the JVM source (I am available to investigte this in more details in follow-up comments if you wish).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19426: [SPARK-22190][CORE] Add Spark executor task metrics to D...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19426
  
    **[Test build #83268 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83268/testReport)** for PR 19426 at commit [`dd3b9a2`](https://github.com/apache/spark/commit/dd3b9a21eef37ecb216df5f21226bdc489ebdbd2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19426: [SPARK-22190][CORE] Add Spark executor task metrics to D...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19426
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83041/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19426: [SPARK-22190][CORE] Add Spark executor task metrics to D...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19426
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83268/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19426: [SPARK-22190][CORE] Add Spark executor task metrics to D...

Posted by LucaCanali <gi...@git.apache.org>.
Github user LucaCanali commented on the issue:

    https://github.com/apache/spark/pull/19426
  
    Thanks @hvanhovell  for pushing the test. I see that the test build threw an error on Scala style tests: the logs report "/Executor.scala:443:0: Whitespace at end of line". However, I cannot reproduce the error when I run scalastyle locally, moreover it appears to me that line Executor.scala:443 is empty (so no extra space there). 



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19426: [SPARK-22190][CORE] Add Spark executor task metrics to D...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19426
  
    **[Test build #83041 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83041/testReport)** for PR 19426 at commit [`ca43764`](https://github.com/apache/spark/commit/ca43764327e8796ea01d5053fc73ca7136b57835).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19426: [SPARK-22190][CORE] Add Spark executor task metrics to D...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19426
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19426: [SPARK-22190][CORE] Add Spark executor task metrics to D...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19426
  
    **[Test build #83041 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83041/testReport)** for PR 19426 at commit [`ca43764`](https://github.com/apache/spark/commit/ca43764327e8796ea01d5053fc73ca7136b57835).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19426: [SPARK-22190][CORE] Add Spark executor task metri...

Posted by umehrot2 <gi...@git.apache.org>.
Github user umehrot2 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19426#discussion_r147537143
  
    --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
    @@ -406,6 +406,41 @@ private[spark] class Executor(
             task.metrics.setJvmGCTime(computeTotalGcTime() - startGCTime)
             task.metrics.setResultSerializationTime(afterSerialization - beforeSerialization)
     
    +        // Expose task metrics using the dropwizard metrics system.
    +        // Update task metrics counters
    +        executorSource.METRIC_CPU_TIME.inc(task.metrics.executorCpuTime)
    +        executorSource.METRIC_RUN_TIME.inc(task.metrics.executorRunTime)
    +        executorSource.METRIC_JVM_GC_TIME.inc(task.metrics.jvmGCTime)
    +        executorSource.METRIC_DESERIALIZE_TIME.inc(task.metrics.executorDeserializeTime)
    +        executorSource.METRIC_DESERIALIZE_CPU_TIME.inc(task.metrics.executorDeserializeCpuTime)
    +        executorSource.METRIC_RESULT_SERIALIZE_TIME.inc(task.metrics.resultSerializationTime)
    +        executorSource.METRIC_SHUFFLE_FETCH_WAIT_TIME
    +          .inc(task.metrics.shuffleReadMetrics.fetchWaitTime)
    +        executorSource.METRIC_SHUFFLE_WRITE_TIME.inc(task.metrics.shuffleWriteMetrics.writeTime)
    +        executorSource.METRIC_SHUFFLE_TOTAL_BYTES_READ
    +          .inc(task.metrics.shuffleReadMetrics.totalBytesRead)
    +        executorSource.METRIC_SHUFFLE_REMOTE_BYTES_READ
    +          .inc(task.metrics.shuffleReadMetrics.remoteBytesRead)
    +        executorSource.METRIC_SHUFFLE_LOCAL_BYTES_READ
    +          .inc(task.metrics.shuffleReadMetrics.localBytesRead)
    +        executorSource.METRIC_SHUFFLE_RECORDS_READ
    +          .inc(task.metrics.shuffleReadMetrics.recordsRead)
    +        executorSource.METRIC_SHUFFLE_BYTES_WRITTEN
    +          .inc(task.metrics.shuffleWriteMetrics.bytesWritten)
    +        executorSource.METRIC_SHUFFLE_RECORDS_WRITTEN
    +          .inc(task.metrics.shuffleWriteMetrics.recordsWritten)
    +        executorSource.METRIC_INPUT_BYTES_READ
    +          .inc(task.metrics.inputMetrics.bytesRead)
    +        executorSource.METRIC_INPUT_RECORDS_READ
    +          .inc(task.metrics.inputMetrics.recordsRead)
    +        executorSource.METRIC_OUTPUT_BYTES_WRITTEN
    +          .inc(task.metrics.outputMetrics.bytesWritten)
    +        executorSource.METRIC_INPUT_RECORDS_READ
    --- End diff --
    
    This has been defined twice in this file. I think what you meant instead is METRIC_OUTPUT_RECORDS_WRITTEN which is missing here.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org