You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by skonto <gi...@git.apache.org> on 2018/09/10 13:43:19 UTC

[GitHub] spark pull request #22381: [SPARK-25394][Core] Add application status metric...

GitHub user skonto opened a pull request:

    https://github.com/apache/spark/pull/22381

    [SPARK-25394][Core] Add application status metrics source

    ## What changes were proposed in this pull request?
    
    - Exposes several metrics regarding application status as a source. 
    - Metrics are gathered when a job ends, could be more fine-grained but most metrics like tasks completed are also counted by executors. More metrics could be exposed in the future to avoid scraping executors.
    
    ## How was this patch tested?
    
    This was manually tested with a jmx source and prometheus server:
    ![metrics](https://user-images.githubusercontent.com/7945591/45300945-63064d00-b518-11e8-812a-d9b4155ba0c0.png)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/skonto/spark add_app_status_metrics

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22381.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22381
    
----
commit b43190d5393217f751f38149839a060df88baf1f
Author: Stavros Kontopoulos <st...@...>
Date:   2018-09-10T13:37:33Z

    add application status source

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by aditanase <gi...@git.apache.org>.
Github user aditanase commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r223667852
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusSource.scala ---
    @@ -0,0 +1,81 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.status
    +
    +import java.util.concurrent.atomic.AtomicLong
    +
    +import com.codahale.metrics.{Gauge, MetricRegistry}
    +
    +import org.apache.spark.SparkConf
    +import org.apache.spark.internal.config.ConfigBuilder
    +import org.apache.spark.metrics.source.Source
    +
    +private [spark] class JobDuration(val value: AtomicLong) extends Gauge[Long] {
    +  override def getValue: Long = value.get()
    +}
    +
    +private[spark] class AppStatusSource extends Source {
    +
    +  override val metricRegistry = new MetricRegistry()
    +
    +  override val sourceName = "appStatus"
    +
    +  val jobDuration = new JobDuration(new AtomicLong(0L))
    +
    +  // Duration of each job in milliseconds
    +  val JOB_DURATION = metricRegistry
    +    .register(MetricRegistry.name("jobDuration"), jobDuration)
    +
    +  val FAILED_STAGES = metricRegistry.counter(MetricRegistry.name("failedStages"))
    +
    +  val SKIPPED_STAGES = metricRegistry.counter(MetricRegistry.name("skippedStages"))
    +
    +  val COMPLETED_STAGES = metricRegistry.counter(MetricRegistry.name("completedStages"))
    +
    +  val SUCCEEDED_JOBS = metricRegistry.counter(MetricRegistry.name("succeededJobs"))
    +
    +  val FAILED_JOBS = metricRegistry.counter(MetricRegistry.name("failedJobs"))
    +
    +  val COMPLETED_TASKS = metricRegistry.counter(MetricRegistry.name("completedTasks"))
    +
    +  val FAILED_TASKS = metricRegistry.counter(MetricRegistry.name("failedTasks"))
    +
    +  val KILLED_TASKS = metricRegistry.counter(MetricRegistry.name("killedTasks"))
    +
    +  val SKIPPED_TASKS = metricRegistry.counter(MetricRegistry.name("skippedTasks"))
    +
    +  val BLACKLISTED_EXECUTORS = metricRegistry.counter(MetricRegistry.name("blackListedExecutors"))
    --- End diff --
    
    Given that most metrics are counters, I would add a small inline function that reduces some of the repetition here, e.g.
    ```scala
    def getCounter(prefix: String, name: String): Counter =
      metricRegistry.counter(MetricRegistry.name(prefix, name))
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2978/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r225302668
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -382,11 +391,38 @@ private[spark] class AppStatusListener(
           }
     
           job.status = event.jobResult match {
    -        case JobSucceeded => JobExecutionStatus.SUCCEEDED
    -        case JobFailed(_) => JobExecutionStatus.FAILED
    +        case JobSucceeded =>
    +          appStatusSource.foreach{_.SUCCEEDED_JOBS.inc(1)}
    +          JobExecutionStatus.SUCCEEDED
    +        case JobFailed(_) =>
    +          appStatusSource.foreach{_.FAILED_JOBS.inc(1)}
    +          JobExecutionStatus.FAILED
           }
     
           job.completionTime = if (event.time > 0) Some(new Date(event.time)) else None
    +
    +      for {
    +        source <- appStatusSource
    +        submissionTime <- job.submissionTime
    +        completionTime <- job.completionTime
    +      } yield {
    +          val localSubmissionTime =
    +            LocalDateTime.ofInstant(submissionTime.toInstant, ZoneId.systemDefault)
    +          val localCompletionTime =
    +            LocalDateTime.ofInstant(completionTime.toInstant, ZoneId.systemDefault)
    +          val duration = Duration.between(localSubmissionTime, localCompletionTime)
    --- End diff --
    
    I'm a little confused about why all these calls are needed. Isn't duration `Duration.ofMillis(completionTime.getTime() - submissionTime.getTime())`?
    
    And since the metric is actually in milliseconds, isn't this whole block basically:
    
    ```
    source.JOB_DURATION.value.set(completionTime.getTime() - submissionTime.getTime())
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3211/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216430755
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala ---
    @@ -503,9 +503,12 @@ private[spark] object AppStatusStore {
       /**
        * Create an in-memory store for a live application.
        */
    -  def createLiveStore(conf: SparkConf): AppStatusStore = {
    +  def createLiveStore(
    +      conf: SparkConf,
    +      appStatusSource: Option[AppStatusSource] = None):
    --- End diff --
    
    appStatusSource needs to be passed here as it is created in the SparkContext, do you mean something else? 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r224070656
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusSource.scala ---
    @@ -0,0 +1,81 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.status
    +
    +import java.util.concurrent.atomic.AtomicLong
    +
    +import com.codahale.metrics.{Gauge, MetricRegistry}
    +
    +import org.apache.spark.SparkConf
    +import org.apache.spark.internal.config.ConfigBuilder
    +import org.apache.spark.metrics.source.Source
    +
    +private [spark] class JobDuration(val value: AtomicLong) extends Gauge[Long] {
    +  override def getValue: Long = value.get()
    +}
    +
    +private[spark] class AppStatusSource extends Source {
    +
    +  override val metricRegistry = new MetricRegistry()
    +
    +  override val sourceName = "appStatus"
    +
    +  val jobDuration = new JobDuration(new AtomicLong(0L))
    +
    +  // Duration of each job in milliseconds
    +  val JOB_DURATION = metricRegistry
    +    .register(MetricRegistry.name("jobDuration"), jobDuration)
    +
    +  val FAILED_STAGES = metricRegistry.counter(MetricRegistry.name("failedStages"))
    +
    +  val SKIPPED_STAGES = metricRegistry.counter(MetricRegistry.name("skippedStages"))
    +
    +  val COMPLETED_STAGES = metricRegistry.counter(MetricRegistry.name("completedStages"))
    +
    +  val SUCCEEDED_JOBS = metricRegistry.counter(MetricRegistry.name("succeededJobs"))
    +
    +  val FAILED_JOBS = metricRegistry.counter(MetricRegistry.name("failedJobs"))
    +
    +  val COMPLETED_TASKS = metricRegistry.counter(MetricRegistry.name("completedTasks"))
    +
    +  val FAILED_TASKS = metricRegistry.counter(MetricRegistry.name("failedTasks"))
    +
    +  val KILLED_TASKS = metricRegistry.counter(MetricRegistry.name("killedTasks"))
    +
    +  val SKIPPED_TASKS = metricRegistry.counter(MetricRegistry.name("skippedTasks"))
    +
    +  val BLACKLISTED_EXECUTORS = metricRegistry.counter(MetricRegistry.name("blackListedExecutors"))
    --- End diff --
    
    +1


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    @vanzin ready to merge.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][Core] Add application status metrics sourc...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    @felixcheung psl review.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #97324 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97324/testReport)** for PR 22381 at commit [`86bf7ec`](https://github.com/apache/spark/commit/86bf7ec4bbe1bca268f85464b73aa0cdcb0bf163).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r225721730
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -17,6 +17,7 @@
     
     package org.apache.spark.status
     
    +import java.time.Duration
    --- End diff --
    
    Unused?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3115/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r225300578
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -382,11 +392,37 @@ private[spark] class AppStatusListener(
           }
     
           job.status = event.jobResult match {
    -        case JobSucceeded => JobExecutionStatus.SUCCEEDED
    -        case JobFailed(_) => JobExecutionStatus.FAILED
    +        case JobSucceeded =>
    +          appStatusSource.foreach{_.SUCCEEDED_JOBS.inc(1)}
    +          JobExecutionStatus.SUCCEEDED
    +        case JobFailed(_) =>
    +          appStatusSource.foreach{_.FAILED_JOBS.inc(1)}
    +          JobExecutionStatus.FAILED
           }
     
           job.completionTime = if (event.time > 0) Some(new Date(event.time)) else None
    +
    +      for {
    +        submissionTime <- job.submissionTime
    +        completionTime <- job.completionTime
    +      } yield {
    --- End diff --
    
    This doesn't need `yield`, though, since you're not really using the result. It can just be `for { ... } { ... }`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96215/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216378185
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala ---
    @@ -503,9 +503,12 @@ private[spark] object AppStatusStore {
       /**
        * Create an in-memory store for a live application.
        */
    -  def createLiveStore(conf: SparkConf): AppStatusStore = {
    +  def createLiveStore(
    +      conf: SparkConf,
    +      appStatusSource: Option[AppStatusSource] = None):
    --- End diff --
    
    nit: `: AppStatusStore`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216391459
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -280,6 +284,12 @@ private[spark] class AppStatusListener(
       private def updateBlackListStatus(execId: String, blacklisted: Boolean): Unit = {
         liveExecutors.get(execId).foreach { exec =>
           exec.isBlacklisted = blacklisted
    +      if (blacklisted) {
    +        appStatusSource.foreach{_.BLACKLISTED_EXECUTORS.inc(1)}
    --- End diff --
    
    Nit: you don't need a block here, just parens. Same for several calls below.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216465915
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -382,11 +392,37 @@ private[spark] class AppStatusListener(
           }
     
           job.status = event.jobResult match {
    -        case JobSucceeded => JobExecutionStatus.SUCCEEDED
    -        case JobFailed(_) => JobExecutionStatus.FAILED
    +        case JobSucceeded =>
    +          appStatusSource.foreach{_.SUCCEEDED_JOBS.inc(1)}
    +          JobExecutionStatus.SUCCEEDED
    +        case JobFailed(_) =>
    +          appStatusSource.foreach{_.FAILED_JOBS.inc(1)}
    +          JobExecutionStatus.FAILED
           }
     
           job.completionTime = if (event.time > 0) Some(new Date(event.time)) else None
    +
    +      for {
    +        submissionTime <- job.submissionTime
    +        completionTime <- job.completionTime
    +      } yield {
    +        val localSubmissionTime =
    +          LocalDateTime.ofInstant(submissionTime.toInstant, ZoneId.systemDefault)
    +        val localCompletionTime =
    +          LocalDateTime.ofInstant(completionTime.toInstant, ZoneId.systemDefault)
    +        val duration = Duration.between(localCompletionTime, localSubmissionTime)
    --- End diff --
    
    need to reverse this.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3931/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by aditanase <gi...@git.apache.org>.
Github user aditanase commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r223667312
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusSource.scala ---
    @@ -0,0 +1,81 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.status
    +
    +import java.util.concurrent.atomic.AtomicLong
    +
    +import com.codahale.metrics.{Gauge, MetricRegistry}
    +
    +import org.apache.spark.SparkConf
    +import org.apache.spark.internal.config.ConfigBuilder
    +import org.apache.spark.metrics.source.Source
    +
    +private [spark] class JobDuration(val value: AtomicLong) extends Gauge[Long] {
    +  override def getValue: Long = value.get()
    +}
    +
    +private[spark] class AppStatusSource extends Source {
    +
    +  override val metricRegistry = new MetricRegistry()
    +
    +  override val sourceName = "appStatus"
    +
    +  val jobDuration = new JobDuration(new AtomicLong(0L))
    +
    +  // Duration of each job in milliseconds
    +  val JOB_DURATION = metricRegistry
    +    .register(MetricRegistry.name("jobDuration"), jobDuration)
    +
    +  val FAILED_STAGES = metricRegistry.counter(MetricRegistry.name("failedStages"))
    --- End diff --
    
    I would suggest creating groups of metrics by adding a prefix with the "context", e.g. `stages`, `jobs`, `tasks`.
    This is a best practice followed by other sources. See:
    - https://github.com/apache/spark/blob/9fed6abfdcb7afcf92be56e5ccbed6599fe66bc4/core/src/main/scala/org/apache/spark/scheduler/DAGSchedulerSource.scala#L29
    - https://github.com/apache/spark/blob/a4491626ed8169f0162a0dfb78736c9b9e7fb434/core/src/main/scala/org/apache/spark/storage/BlockManagerSource.scala#L38


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22381


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97324/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    I don't think I know enough about monitoring to merge this. I am concerned about the duplication here, although I understand your argument why it's valuable.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    @srowen I was checking git history and I see that you merged this one: https://github.com/apache/spark/pull/22218 again related to metrics. Could you call someone who can do the merge?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    test failure is irrelevant. jenkins, test this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    @erikerlandson I added an option to disable them, by default they are now disabled. Is it possible to move forward with this? @srowen ? Btw I removed any code triggered by this unless explicitly enabled.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96159/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216377621
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -280,6 +284,12 @@ private[spark] class AppStatusListener(
       private def updateBlackListStatus(execId: String, blacklisted: Boolean): Unit = {
         liveExecutors.get(execId).foreach { exec =>
           exec.isBlacklisted = blacklisted
    +      if (blacklisted) {
    +        appStatusSource.foreach{_.BLACKLISTED_EXECUTORS.inc(1)}
    +      }
    +      else {
    --- End diff --
    
    nit: } else {


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #96151 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96151/testReport)** for PR 22381 at commit [`677a7c3`](https://github.com/apache/spark/commit/677a7c3da10ae7c241a1b8fc5cb9a9768109c6b1).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216377526
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -560,6 +561,7 @@ class SparkContext(config: SparkConf) extends Logging {
     
         setupAndStartListenerBus()
         postEnvironmentUpdate()
    +    _env.metricsSystem.registerSource(appStatusSource)
    --- End diff --
    
    Better to put this line to +574


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r225721809
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusSource.scala ---
    @@ -0,0 +1,85 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.status
    +
    +import java.util.concurrent.atomic.AtomicLong
    +
    +import AppStatusSource.getCounter
    +import com.codahale.metrics.{Counter, Gauge, MetricRegistry}
    +
    +import org.apache.spark.SparkConf
    +import org.apache.spark.internal.config.ConfigBuilder
    +import org.apache.spark.metrics.source.Source
    +
    +private [spark] class JobDuration(val value: AtomicLong) extends Gauge[Long] {
    +  override def getValue: Long = value.get()
    +}
    +
    +private[spark] class AppStatusSource extends Source {
    +
    +  override implicit val metricRegistry = new MetricRegistry()
    +
    +  override val sourceName = "appStatus"
    +
    +  val jobDuration = new JobDuration(new AtomicLong(0L))
    +
    +  // Duration of each job in milliseconds
    +  val JOB_DURATION = metricRegistry
    +    .register(MetricRegistry.name("jobDuration"), jobDuration)
    +
    +  val FAILED_STAGES = getCounter("stages", "failedStages")
    +
    +  val SKIPPED_STAGES = getCounter("stages", "skippedStages")
    +
    +  val COMPLETED_STAGES = getCounter("stages", "completedStages")
    +
    +  val SUCCEEDED_JOBS = getCounter("jobs", "succeededJobs")
    +
    +  val FAILED_JOBS = getCounter("jobs", "failedJobs")
    +
    +  val COMPLETED_TASKS = getCounter("tasks", "completedTasks")
    +
    +  val FAILED_TASKS = getCounter("tasks", "failedTasks")
    +
    +  val KILLED_TASKS = getCounter("tasks", "killedTasks")
    +
    +  val SKIPPED_TASKS = getCounter("tasks", "skippedTasks")
    +
    +  val BLACKLISTED_EXECUTORS = getCounter("tasks", "blackListedExecutors")
    +
    +  val UNBLACKLISTED_EXECUTORS = getCounter("tasks", "unblackListedExecutors")
    +}
    +
    +private[spark] object AppStatusSource {
    +
    +  def getCounter(prefix: String, name: String)(implicit metricRegistry: MetricRegistry): Counter = {
    +    metricRegistry.counter (MetricRegistry.name (prefix, name) )
    --- End diff --
    
    nit: no space before `(`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96153/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216430819
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusSource.scala ---
    @@ -0,0 +1,55 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.status
    +
    +import com.codahale.metrics.{Gauge, MetricRegistry}
    +
    +import org.apache.spark.metrics.source.Source
    +
    +private[spark] class AppStatusSource extends Source{
    --- End diff --
    
    :+1: 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3173/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    @vanzin thanx for the review I think It looks ok now.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r218660634
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala ---
    @@ -503,9 +503,12 @@ private[spark] object AppStatusStore {
       /**
        * Create an in-memory store for a live application.
        */
    -  def createLiveStore(conf: SparkConf): AppStatusStore = {
    +  def createLiveStore(
    +      conf: SparkConf,
    +      appStatusSource: Option[AppStatusSource] = None):
    --- End diff --
    
    Yep, sorry for the late reply. :(


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216470277
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusSource.scala ---
    @@ -0,0 +1,55 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.status
    +
    +import com.codahale.metrics.{Gauge, MetricRegistry}
    +
    +import org.apache.spark.metrics.source.Source
    +
    +private[spark] class AppStatusSource extends Source{
    +
    +  override val metricRegistry = new MetricRegistry()
    +
    +  override val sourceName = "appStatus"
    +
    +  val JOB_DURATION = metricRegistry.histogram(MetricRegistry.name("jobDuration"))
    +
    +  val FAILED_STAGES = metricRegistry.counter(MetricRegistry.name("failedStages"))
    +
    +  val SKIPPED_STAGES = metricRegistry.counter(MetricRegistry.name("skippedStages"))
    +
    +  val COMPLETED_STAGES = metricRegistry.counter(MetricRegistry.name("completedStages"))
    +
    +  val COMPLETED_JOBS = metricRegistry.counter(MetricRegistry.name("completedJobs"))
    --- End diff --
    
    this is not used, need to remove.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #97415 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97415/testReport)** for PR 22381 at commit [`a08836c`](https://github.com/apache/spark/commit/a08836cc9ca91164cf5dd95a63929d0613779cf2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95880/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Yeah, this is bigger than adding one metric; raise it on dev@? I'm not sure who if anyone has an opinion on it. I'd rather hear more feedback on the history of these choices and what the right future state is. What's the use case, etc


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #96153 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96153/testReport)** for PR 22381 at commit [`3a2db16`](https://github.com/apache/spark/commit/3a2db16813a2bab3160f444fe0855855187a6178).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    please test this


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #96076 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96076/testReport)** for PR 22381 at commit [`635f5aa`](https://github.com/apache/spark/commit/635f5aae4c91890c62b18edcf2b71400861cc463).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Thanx @vanzin!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r225339134
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -382,11 +391,38 @@ private[spark] class AppStatusListener(
           }
     
           job.status = event.jobResult match {
    -        case JobSucceeded => JobExecutionStatus.SUCCEEDED
    -        case JobFailed(_) => JobExecutionStatus.FAILED
    +        case JobSucceeded =>
    +          appStatusSource.foreach{_.SUCCEEDED_JOBS.inc(1)}
    +          JobExecutionStatus.SUCCEEDED
    +        case JobFailed(_) =>
    +          appStatusSource.foreach{_.FAILED_JOBS.inc(1)}
    +          JobExecutionStatus.FAILED
           }
     
           job.completionTime = if (event.time > 0) Some(new Date(event.time)) else None
    +
    +      for {
    +        source <- appStatusSource
    +        submissionTime <- job.submissionTime
    +        completionTime <- job.completionTime
    +      } yield {
    +          val localSubmissionTime =
    +            LocalDateTime.ofInstant(submissionTime.toInstant, ZoneId.systemDefault)
    +          val localCompletionTime =
    +            LocalDateTime.ofInstant(completionTime.toInstant, ZoneId.systemDefault)
    +          val duration = Duration.between(localSubmissionTime, localCompletionTime)
    --- End diff --
    
    Yeah just did the Java 8 thing, guess too much, will fix. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2977/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95881/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #96215 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96215/testReport)** for PR 22381 at commit [`3a2db16`](https://github.com/apache/spark/commit/3a2db16813a2bab3160f444fe0855855187a6178).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #95880 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95880/testReport)** for PR 22381 at commit [`b43190d`](https://github.com/apache/spark/commit/b43190d5393217f751f38149839a060df88baf1f).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3169/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r225339164
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusSource.scala ---
    @@ -0,0 +1,85 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.status
    +
    +import java.util.concurrent.atomic.AtomicLong
    +
    +import AppStatusSource.getCounter
    +import com.codahale.metrics.{Counter, Gauge, MetricRegistry}
    +
    +import org.apache.spark.SparkConf
    +import org.apache.spark.internal.config.ConfigBuilder
    +import org.apache.spark.metrics.source.Source
    +
    +private [spark] class JobDuration(val value: AtomicLong) extends Gauge[Long] {
    +  override def getValue: Long = value.get()
    +}
    +
    +private[spark] class AppStatusSource extends Source {
    +
    +  override implicit val metricRegistry = new MetricRegistry()
    +
    +  override val sourceName = "appStatus"
    +
    +  val jobDuration = new JobDuration(new AtomicLong(0L))
    +
    +  // Duration of each job in milliseconds
    +  val JOB_DURATION = metricRegistry
    +    .register(MetricRegistry.name("jobDuration"), jobDuration)
    +
    +  val FAILED_STAGES = getCounter("stages", "failedStages")
    +
    +  val SKIPPED_STAGES = getCounter("stages", "skippedStages")
    +
    +  val COMPLETED_STAGES = getCounter("stages", "completedStages")
    +
    +  val SUCCEEDED_JOBS = getCounter("jobs", "succeededJobs")
    +
    +  val FAILED_JOBS = getCounter("jobs", "failedJobs")
    +
    +  val COMPLETED_TASKS = getCounter("tasks", "completedTasks")
    +
    +  val FAILED_TASKS = getCounter("tasks", "failedTasks")
    +
    +  val KILLED_TASKS = getCounter("tasks", "killedTasks")
    +
    +  val SKIPPED_TASKS = getCounter("tasks", "skippedTasks")
    +
    +  val BLACKLISTED_EXECUTORS = getCounter("tasks", "blackListedExecutors")
    +
    +  val UNBLACKLISTED_EXECUTORS = getCounter("tasks", "unblackListedExecutors")
    +}
    +
    +private[spark] object AppStatusSource {
    +
    +  def getCounter(prefix: String, name: String)(implicit metricRegistry: MetricRegistry): Counter = {
    +    metricRegistry.counter (MetricRegistry.name (prefix, name) )
    +  }
    +
    +  def createSource(conf: SparkConf): Option[AppStatusSource] = {
    +    Option(conf.get(AppStatusSource.APP_STATUS_METRICS_ENABLED))
    +      .filter(identity)
    +      .map(_ => new AppStatusSource())
    --- End diff --
    
    ok :)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #95905 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95905/testReport)** for PR 22381 at commit [`a8fc89d`](https://github.com/apache/spark/commit/a8fc89d51971e16c37783cf336daa07d18e6d3c1).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #97415 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97415/testReport)** for PR 22381 at commit [`a08836c`](https://github.com/apache/spark/commit/a08836cc9ca91164cf5dd95a63929d0613779cf2).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #96153 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96153/testReport)** for PR 22381 at commit [`3a2db16`](https://github.com/apache/spark/commit/3a2db16813a2bab3160f444fe0855855187a6178).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    jenkins, test this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by aditanase <gi...@git.apache.org>.
Github user aditanase commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r223664993
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -280,6 +284,11 @@ private[spark] class AppStatusListener(
       private def updateBlackListStatus(execId: String, blacklisted: Boolean): Unit = {
         liveExecutors.get(execId).foreach { exec =>
           exec.isBlacklisted = blacklisted
    +      if (blacklisted) {
    +        appStatusSource.foreach(_.BLACKLISTED_EXECUTORS.inc(1))
    --- End diff --
    
    You could just say `.inc()` - 1 is the default. Same everywhere else.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r217510789
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala ---
    @@ -503,9 +503,12 @@ private[spark] object AppStatusStore {
       /**
        * Create an in-memory store for a live application.
        */
    -  def createLiveStore(conf: SparkConf): AppStatusStore = {
    +  def createLiveStore(
    +      conf: SparkConf,
    +      appStatusSource: Option[AppStatusSource] = None):
    --- End diff --
    
    I think the comment may be that that you can move `AppStatusStore = {` on to to same line


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2991/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #96159 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96159/testReport)** for PR 22381 at commit [`3a2db16`](https://github.com/apache/spark/commit/3a2db16813a2bab3160f444fe0855855187a6178).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    @srowen thanks is there someone I could call? @vanzin ?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216433852
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -382,11 +392,37 @@ private[spark] class AppStatusListener(
           }
     
           job.status = event.jobResult match {
    -        case JobSucceeded => JobExecutionStatus.SUCCEEDED
    -        case JobFailed(_) => JobExecutionStatus.FAILED
    +        case JobSucceeded =>
    +          appStatusSource.foreach{_.SUCCEEDED_JOBS.inc(1)}
    +          JobExecutionStatus.SUCCEEDED
    +        case JobFailed(_) =>
    +          appStatusSource.foreach{_.FAILED_JOBS.inc(1)}
    +          JobExecutionStatus.FAILED
           }
     
           job.completionTime = if (event.time > 0) Some(new Date(event.time)) else None
    +
    +      for {
    +        submissionTime <- job.submissionTime
    +        completionTime <- job.completionTime
    +      } yield {
    --- End diff --
    
    `java.util.Date.getTime` gives you ms since the epoch directly, so I think you can subtract them and be done?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #96151 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96151/testReport)** for PR 22381 at commit [`677a7c3`](https://github.com/apache/spark/commit/677a7c3da10ae7c241a1b8fc5cb9a9768109c6b1).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r225339052
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -382,11 +391,38 @@ private[spark] class AppStatusListener(
           }
     
           job.status = event.jobResult match {
    -        case JobSucceeded => JobExecutionStatus.SUCCEEDED
    -        case JobFailed(_) => JobExecutionStatus.FAILED
    +        case JobSucceeded =>
    +          appStatusSource.foreach{_.SUCCEEDED_JOBS.inc(1)}
    +          JobExecutionStatus.SUCCEEDED
    +        case JobFailed(_) =>
    +          appStatusSource.foreach{_.FAILED_JOBS.inc(1)}
    +          JobExecutionStatus.FAILED
           }
     
           job.completionTime = if (event.time > 0) Some(new Date(event.time)) else None
    +
    +      for {
    +        source <- appStatusSource
    +        submissionTime <- job.submissionTime
    +        completionTime <- job.completionTime
    +      } yield {
    +          val localSubmissionTime =
    +            LocalDateTime.ofInstant(submissionTime.toInstant, ZoneId.systemDefault)
    +          val localCompletionTime =
    +            LocalDateTime.ofInstant(completionTime.toInstant, ZoneId.systemDefault)
    +          val duration = Duration.between(localSubmissionTime, localCompletionTime)
    +          source.JOB_DURATION.value.set(duration.toMillis)
    +      }
    +
    +      // update global app status counters
    +      appStatusSource.foreach(_.COMPLETED_STAGES.inc(job.completedStages.size))
    --- End diff --
    
    ok will fix.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #95880 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95880/testReport)** for PR 22381 at commit [`b43190d`](https://github.com/apache/spark/commit/b43190d5393217f751f38149839a060df88baf1f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    @vanzin AFAIK the operations just update metrics in the underlying dropwizard metrics library. I dont htink anything is shipped anywhere. How should I proceed? Is there anyone who is familiar with the internals?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    UT fixed by #22452.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216476831
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -382,11 +392,37 @@ private[spark] class AppStatusListener(
           }
     
           job.status = event.jobResult match {
    -        case JobSucceeded => JobExecutionStatus.SUCCEEDED
    -        case JobFailed(_) => JobExecutionStatus.FAILED
    +        case JobSucceeded =>
    +          appStatusSource.foreach{_.SUCCEEDED_JOBS.inc(1)}
    +          JobExecutionStatus.SUCCEEDED
    +        case JobFailed(_) =>
    +          appStatusSource.foreach{_.FAILED_JOBS.inc(1)}
    +          JobExecutionStatus.FAILED
           }
     
           job.completionTime = if (event.time > 0) Some(new Date(event.time)) else None
    +
    +      for {
    +        submissionTime <- job.submissionTime
    +        completionTime <- job.completionTime
    +      } yield {
    +        val localSubmissionTime =
    +          LocalDateTime.ofInstant(submissionTime.toInstant, ZoneId.systemDefault)
    +        val localCompletionTime =
    +          LocalDateTime.ofInstant(completionTime.toInstant, ZoneId.systemDefault)
    +        val duration = Duration.between(localCompletionTime, localSubmissionTime)
    +        appStatusSource.foreach{_.JOB_DURATION.update(duration.toMillis)}
    --- End diff --
    
    Actually I will transform it to gauge makes more sense.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216496620
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -382,11 +392,37 @@ private[spark] class AppStatusListener(
           }
     
           job.status = event.jobResult match {
    -        case JobSucceeded => JobExecutionStatus.SUCCEEDED
    -        case JobFailed(_) => JobExecutionStatus.FAILED
    +        case JobSucceeded =>
    +          appStatusSource.foreach{_.SUCCEEDED_JOBS.inc(1)}
    +          JobExecutionStatus.SUCCEEDED
    +        case JobFailed(_) =>
    +          appStatusSource.foreach{_.FAILED_JOBS.inc(1)}
    +          JobExecutionStatus.FAILED
           }
     
           job.completionTime = if (event.time > 0) Some(new Date(event.time)) else None
    +
    +      for {
    +        submissionTime <- job.submissionTime
    +        completionTime <- job.completionTime
    +      } yield {
    --- End diff --
    
    @srowen regarding multiple options its a valid technique: https://alvinalexander.com/scala/how-to-use-multiple-options-for-loop-comprehension
    Anyway I can keep it simple and  just use ifDefined and then get the values, it liked the former though, more idiomatic.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4005/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #95881 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95881/testReport)** for PR 22381 at commit [`b43190d`](https://github.com/apache/spark/commit/b43190d5393217f751f38149839a060df88baf1f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by aditanase <gi...@git.apache.org>.
Github user aditanase commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    @skonto @vanzin You can rest assured the metrics are not shipped anywhere. Collection is cheap/free and completely decoupled from reporting.
    The registry is only a data bag, shipping metrics is done via reporters that are configured to "pull" metrics from the registry at regular intervals.
    Also, you're mostly adding counters that are extremely simple. Histograms and timers are more involved but the cost associated is far outweighed by the benefits of having data points.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #95881 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95881/testReport)** for PR 22381 at commit [`b43190d`](https://github.com/apache/spark/commit/b43190d5393217f751f38149839a060df88baf1f).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r224071170
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -280,6 +284,11 @@ private[spark] class AppStatusListener(
       private def updateBlackListStatus(execId: String, blacklisted: Boolean): Unit = {
         liveExecutors.get(execId).foreach { exec =>
           exec.isBlacklisted = blacklisted
    +      if (blacklisted) {
    +        appStatusSource.foreach(_.BLACKLISTED_EXECUTORS.inc(1))
    --- End diff --
    
    ok


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #96215 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96215/testReport)** for PR 22381 at commit [`3a2db16`](https://github.com/apache/spark/commit/3a2db16813a2bab3160f444fe0855855187a6178).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    @srowen @xuanyuanking ready for another round.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #95905 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95905/testReport)** for PR 22381 at commit [`a8fc89d`](https://github.com/apache/spark/commit/a8fc89d51971e16c37783cf336daa07d18e6d3c1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    @srowen integration tests seem so flaky! Seen this before?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    You guys need to be a little patient...


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216472252
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -382,11 +392,37 @@ private[spark] class AppStatusListener(
           }
     
           job.status = event.jobResult match {
    -        case JobSucceeded => JobExecutionStatus.SUCCEEDED
    -        case JobFailed(_) => JobExecutionStatus.FAILED
    +        case JobSucceeded =>
    +          appStatusSource.foreach{_.SUCCEEDED_JOBS.inc(1)}
    +          JobExecutionStatus.SUCCEEDED
    +        case JobFailed(_) =>
    +          appStatusSource.foreach{_.FAILED_JOBS.inc(1)}
    +          JobExecutionStatus.FAILED
           }
     
           job.completionTime = if (event.time > 0) Some(new Date(event.time)) else None
    +
    +      for {
    +        submissionTime <- job.submissionTime
    +        completionTime <- job.completionTime
    +      } yield {
    --- End diff --
    
    It should be possible just tried the elegant java way: https://stackoverflow.com/questions/4927856/how-to-calculate-time-difference-in-java :)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    > that mirroring of metrics seems like a bigger question 
    @srowen thanks for the comments, rest api is not always available or a standard way to export stuff.
    This is also the experience coming from [Opsclarity](https://www.opsclarity.com/product/features/) work.
    It it much more standard to use jmx for many systems out there AFAIK. This way using one component only like jmx exporter I can have all metrics in one place and no need to scrape the rest api.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    I don't really have an opinion about the metrics or names or things like that; but not being familiar with the internals of this metric system, I'd just like to make sure that updating these metrics is cheap and won't do things like actually try to send data to the sinks, which would be a no-no in this particular listener.
    
    Also, if the metrics system can be disabled (isn't it disabled by default?), I'd avoid even updating these metrics in the listener at all.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #96076 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96076/testReport)** for PR 22381 at commit [`635f5aa`](https://github.com/apache/spark/commit/635f5aae4c91890c62b18edcf2b71400861cc463).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    ok will do.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r224071384
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusSource.scala ---
    @@ -0,0 +1,81 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.status
    +
    +import java.util.concurrent.atomic.AtomicLong
    +
    +import com.codahale.metrics.{Gauge, MetricRegistry}
    +
    +import org.apache.spark.SparkConf
    +import org.apache.spark.internal.config.ConfigBuilder
    +import org.apache.spark.metrics.source.Source
    +
    +private [spark] class JobDuration(val value: AtomicLong) extends Gauge[Long] {
    +  override def getValue: Long = value.get()
    +}
    +
    +private[spark] class AppStatusSource extends Source {
    +
    +  override val metricRegistry = new MetricRegistry()
    +
    +  override val sourceName = "appStatus"
    +
    +  val jobDuration = new JobDuration(new AtomicLong(0L))
    +
    +  // Duration of each job in milliseconds
    +  val JOB_DURATION = metricRegistry
    +    .register(MetricRegistry.name("jobDuration"), jobDuration)
    +
    +  val FAILED_STAGES = metricRegistry.counter(MetricRegistry.name("failedStages"))
    --- End diff --
    
    ok makes sense


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    @erikerlandson I dont any tests for sources AFAIK, they exist for the metric system itself. From the commit history this look like an old feature. Regarding enabling or disabling them no at the moment. But it is fairly easy to do via spark conf as there is one flag for structured streaming already `spark.sql.streaming.metricsEnabled`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216430860
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -280,6 +284,12 @@ private[spark] class AppStatusListener(
       private def updateBlackListStatus(execId: String, blacklisted: Boolean): Unit = {
         liveExecutors.get(execId).foreach { exec =>
           exec.isBlacklisted = blacklisted
    +      if (blacklisted) {
    +        appStatusSource.foreach{_.BLACKLISTED_EXECUTORS.inc(1)}
    +      }
    +      else {
    --- End diff --
    
    ok


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #97324 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97324/testReport)** for PR 22381 at commit [`86bf7ec`](https://github.com/apache/spark/commit/86bf7ec4bbe1bca268f85464b73aa0cdcb0bf163).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216377882
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusSource.scala ---
    @@ -0,0 +1,55 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.status
    +
    +import com.codahale.metrics.{Gauge, MetricRegistry}
    +
    +import org.apache.spark.metrics.source.Source
    +
    +private[spark] class AppStatusSource extends Source{
    --- End diff --
    
    nit: Source {


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by dbrinegar <gi...@git.apache.org>.
Github user dbrinegar commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    This is quite a nice improvement for monitoring.  The rest api is great for UI and consolidated analytics, but monitoring through it is not as straightforward as when the data emits directly from the source like this.  There is all kinds of nice context that we get when the data from this spark node is collected directly from the node itself, and not proxied through another collector / reporter.  It is easier to build a monitoring data model across the cluster when node, jmx, pod, resource manifests, and spark data all align by virtue of coming from the same collector.  Building a similar view of the cluster just from the rest api, as a comparison, is simply harder and quite challenging to do in general purpose terms.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    @vanzin could I get a merge pls?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r225303236
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusSource.scala ---
    @@ -0,0 +1,85 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.status
    +
    +import java.util.concurrent.atomic.AtomicLong
    +
    +import AppStatusSource.getCounter
    +import com.codahale.metrics.{Counter, Gauge, MetricRegistry}
    +
    +import org.apache.spark.SparkConf
    +import org.apache.spark.internal.config.ConfigBuilder
    +import org.apache.spark.metrics.source.Source
    +
    +private [spark] class JobDuration(val value: AtomicLong) extends Gauge[Long] {
    +  override def getValue: Long = value.get()
    +}
    +
    +private[spark] class AppStatusSource extends Source {
    +
    +  override implicit val metricRegistry = new MetricRegistry()
    +
    +  override val sourceName = "appStatus"
    +
    +  val jobDuration = new JobDuration(new AtomicLong(0L))
    +
    +  // Duration of each job in milliseconds
    +  val JOB_DURATION = metricRegistry
    +    .register(MetricRegistry.name("jobDuration"), jobDuration)
    +
    +  val FAILED_STAGES = getCounter("stages", "failedStages")
    +
    +  val SKIPPED_STAGES = getCounter("stages", "skippedStages")
    +
    +  val COMPLETED_STAGES = getCounter("stages", "completedStages")
    +
    +  val SUCCEEDED_JOBS = getCounter("jobs", "succeededJobs")
    +
    +  val FAILED_JOBS = getCounter("jobs", "failedJobs")
    +
    +  val COMPLETED_TASKS = getCounter("tasks", "completedTasks")
    +
    +  val FAILED_TASKS = getCounter("tasks", "failedTasks")
    +
    +  val KILLED_TASKS = getCounter("tasks", "killedTasks")
    +
    +  val SKIPPED_TASKS = getCounter("tasks", "skippedTasks")
    +
    +  val BLACKLISTED_EXECUTORS = getCounter("tasks", "blackListedExecutors")
    +
    +  val UNBLACKLISTED_EXECUTORS = getCounter("tasks", "unblackListedExecutors")
    +}
    +
    +private[spark] object AppStatusSource {
    +
    +  def getCounter(prefix: String, name: String)(implicit metricRegistry: MetricRegistry): Counter = {
    +    metricRegistry.counter (MetricRegistry.name (prefix, name) )
    +  }
    +
    +  def createSource(conf: SparkConf): Option[AppStatusSource] = {
    +    Option(conf.get(AppStatusSource.APP_STATUS_METRICS_ENABLED))
    +      .filter(identity)
    +      .map(_ => new AppStatusSource())
    --- End diff --
    
    `.map { foo => blah }`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95905/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216429755
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -382,11 +392,37 @@ private[spark] class AppStatusListener(
           }
     
           job.status = event.jobResult match {
    -        case JobSucceeded => JobExecutionStatus.SUCCEEDED
    -        case JobFailed(_) => JobExecutionStatus.FAILED
    +        case JobSucceeded =>
    +          appStatusSource.foreach{_.SUCCEEDED_JOBS.inc(1)}
    +          JobExecutionStatus.SUCCEEDED
    +        case JobFailed(_) =>
    +          appStatusSource.foreach{_.FAILED_JOBS.inc(1)}
    +          JobExecutionStatus.FAILED
           }
     
           job.completionTime = if (event.time > 0) Some(new Date(event.time)) else None
    +
    +      for {
    +        submissionTime <- job.submissionTime
    +        completionTime <- job.completionTime
    +      } yield {
    --- End diff --
    
    Yeah I wanted to do this properly (java 8 wise, didnt think much of it) so on one hand I do have the timestamp on the other hand I do have the date object. So it was straightforward to pick the date. I didnt extract a timestamp out of the date object.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r225301006
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -382,11 +391,38 @@ private[spark] class AppStatusListener(
           }
     
           job.status = event.jobResult match {
    -        case JobSucceeded => JobExecutionStatus.SUCCEEDED
    -        case JobFailed(_) => JobExecutionStatus.FAILED
    +        case JobSucceeded =>
    +          appStatusSource.foreach{_.SUCCEEDED_JOBS.inc(1)}
    +          JobExecutionStatus.SUCCEEDED
    +        case JobFailed(_) =>
    +          appStatusSource.foreach{_.FAILED_JOBS.inc(1)}
    +          JobExecutionStatus.FAILED
           }
     
           job.completionTime = if (event.time > 0) Some(new Date(event.time)) else None
    +
    +      for {
    +        source <- appStatusSource
    +        submissionTime <- job.submissionTime
    +        completionTime <- job.completionTime
    +      } yield {
    +          val localSubmissionTime =
    +            LocalDateTime.ofInstant(submissionTime.toInstant, ZoneId.systemDefault)
    +          val localCompletionTime =
    +            LocalDateTime.ofInstant(completionTime.toInstant, ZoneId.systemDefault)
    +          val duration = Duration.between(localSubmissionTime, localCompletionTime)
    +          source.JOB_DURATION.value.set(duration.toMillis)
    +      }
    +
    +      // update global app status counters
    +      appStatusSource.foreach(_.COMPLETED_STAGES.inc(job.completedStages.size))
    --- End diff --
    
    Better to call `foreach` only once.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    **[Test build #96159 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96159/testReport)** for PR 22381 at commit [`3a2db16`](https://github.com/apache/spark/commit/3a2db16813a2bab3160f444fe0855855187a6178).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96076/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by xuanyuanking <gi...@git.apache.org>.
Github user xuanyuanking commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    retest this please.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216469838
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -382,11 +392,37 @@ private[spark] class AppStatusListener(
           }
     
           job.status = event.jobResult match {
    -        case JobSucceeded => JobExecutionStatus.SUCCEEDED
    -        case JobFailed(_) => JobExecutionStatus.FAILED
    +        case JobSucceeded =>
    +          appStatusSource.foreach{_.SUCCEEDED_JOBS.inc(1)}
    +          JobExecutionStatus.SUCCEEDED
    +        case JobFailed(_) =>
    +          appStatusSource.foreach{_.FAILED_JOBS.inc(1)}
    +          JobExecutionStatus.FAILED
           }
     
           job.completionTime = if (event.time > 0) Some(new Date(event.time)) else None
    +
    +      for {
    +        submissionTime <- job.submissionTime
    +        completionTime <- job.completionTime
    +      } yield {
    +        val localSubmissionTime =
    +          LocalDateTime.ofInstant(submissionTime.toInstant, ZoneId.systemDefault)
    +        val localCompletionTime =
    +          LocalDateTime.ofInstant(completionTime.toInstant, ZoneId.systemDefault)
    +        val duration = Duration.between(localCompletionTime, localSubmissionTime)
    +        appStatusSource.foreach{_.JOB_DURATION.update(duration.toMillis)}
    --- End diff --
    
    Ideally this should be a gauge due to this: https://prometheus.io/docs/instrumenting/writing_exporters/#drop-less-useful-statistics



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    @vanzin didn't mean to press or anything :)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216430264
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -280,6 +284,12 @@ private[spark] class AppStatusListener(
       private def updateBlackListStatus(execId: String, blacklisted: Boolean): Unit = {
         liveExecutors.get(execId).foreach { exec =>
           exec.isBlacklisted = blacklisted
    +      if (blacklisted) {
    +        appStatusSource.foreach{_.BLACKLISTED_EXECUTORS.inc(1)}
    --- End diff --
    
    yeah I thought intellij complained, will fix.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97415/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Looks like different failures. I am not sure if these are known issues. I'd try again later today.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216392303
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -382,11 +392,37 @@ private[spark] class AppStatusListener(
           }
     
           job.status = event.jobResult match {
    -        case JobSucceeded => JobExecutionStatus.SUCCEEDED
    -        case JobFailed(_) => JobExecutionStatus.FAILED
    +        case JobSucceeded =>
    +          appStatusSource.foreach{_.SUCCEEDED_JOBS.inc(1)}
    +          JobExecutionStatus.SUCCEEDED
    +        case JobFailed(_) =>
    +          appStatusSource.foreach{_.FAILED_JOBS.inc(1)}
    +          JobExecutionStatus.FAILED
           }
     
           job.completionTime = if (event.time > 0) Some(new Date(event.time)) else None
    +
    +      for {
    +        submissionTime <- job.submissionTime
    +        completionTime <- job.completionTime
    +      } yield {
    --- End diff --
    
    Nit: don't you technically need `foreach` here? I think it might be clearer to just wrap this in `if (job.submissionTime.isDefined && ...)` rather than the `for yield` that doesn't really loop.
    Below -- isn't this just trying to take the difference in time in milliseconds between two dates? if so don't you just subtract their timestamps or is it more subtle here?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r216431046
  
    --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
    @@ -560,6 +561,7 @@ class SparkContext(config: SparkConf) extends Logging {
     
         setupAndStartListenerBus()
         postEnvironmentUpdate()
    +    _env.metricsSystem.registerSource(appStatusSource)
    --- End diff --
    
    Yeah I was wondering where that would be bet to put. So at the end makes sense on the other hand what if I want to have metrics as early as the app starts. Anyway, for now its ok.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96151/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3167/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22381#discussion_r217672239
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala ---
    @@ -503,9 +503,12 @@ private[spark] object AppStatusStore {
       /**
        * Create an in-memory store for a live application.
        */
    -  def createLiveStore(conf: SparkConf): AppStatusStore = {
    +  def createLiveStore(
    +      conf: SparkConf,
    +      appStatusSource: Option[AppStatusSource] = None):
    --- End diff --
    
    Ok will do thanx.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by erikerlandson <gi...@git.apache.org>.
Github user erikerlandson commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    These new metrics seem useful.  Is there a way to provide unit or integration testing for it?
    Do these have enable/disable via metrics.properties files?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Thanx a lot @aditanase ! I will update the PR.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22381
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org