You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by nickbp <gi...@git.apache.org> on 2018/06/08 23:45:02 UTC

[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

GitHub user nickbp opened a pull request:

    https://github.com/apache/spark/pull/21516

    [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

    Adds metrics for the Mesos Dispatcher:
    - Counters: The total number of times that submissions have entered states
    - Timers: The duration from submit or launch until a submission entered a given state
    - Histogram: The retry counts at time of retry
    
    Also adds metrics for the Mesos coarse-grained driver implementation. The deprecated fine-grained driver is left as-is.
    The driver metrics include e.g. frequency of RPCs to/from Mesos, the internal state tracking of tasks/agents, and durations for bringing up the tasks.
    
    ## What changes were proposed in this pull request?
    
    (Please fill in changes proposed in this fix)
    
    ## How was this patch tested?
    
    (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
    (If this patch involves UI changes, please attach a screenshot; otherwise, remove this)
    
    Please review http://spark.apache.org/contributing.html before opening a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nickbp/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21516.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21516
    
----
commit 17c4c755703019ebe835404a626f2298fd9e0171
Author: Nicholas Parker <ni...@...>
Date:   2018-05-04T17:11:23Z

    [SPARK-24501][MESOS] Add Dispatcher and Driver metrics
    
    Adds metrics for the Mesos Dispatcher:
    - Counters: The total number of times that submissions have entered states
    - Timers: The duration from submit or launch until a submission entered a given state
    - Histogram: The retry counts at time of retry
    
    Also adds metrics for the Mesos coarse-grained driver implementation. The deprecated fine-grained driver is left as-is.
    The driver metrics include e.g. frequency of RPCs to/from Mesos, the internal state tracking of tasks/agents, and durations for bringing up the tasks.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93657/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r194687233
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterSchedulerSource.scala ---
    @@ -17,25 +17,170 @@
     
     package org.apache.spark.scheduler.cluster.mesos
     
    -import com.codahale.metrics.{Gauge, MetricRegistry}
    +import java.util.concurrent.TimeUnit
    +import java.util.Date
     
    +import scala.collection.mutable.HashMap
    +
    +import com.codahale.metrics.{Counter, Gauge, MetricRegistry, Timer}
    --- End diff --
    
    Counter is unused import.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    jenkins, please test this.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    Jenkins, ok to test


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by nickbp <gi...@git.apache.org>.
Github user nickbp commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r200463853
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---
    @@ -122,8 +122,9 @@ private[spark] class MesosClusterScheduler(
         conf: SparkConf)
       extends Scheduler with MesosSchedulerUtils {
       var frameworkUrl: String = _
    +  private val metricsSource = new MesosClusterSchedulerSource(this)
    --- End diff --
    
    This is just moving the prior initialization from L308 slightly earlier. This is just being done to make the `sourceName` member available below (avoid duplicate `mesos_cluster` constants across different files).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by nickbp <gi...@git.apache.org>.
Github user nickbp commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r200470228
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterSchedulerSource.scala ---
    @@ -17,25 +17,170 @@
     
     package org.apache.spark.scheduler.cluster.mesos
     
    -import com.codahale.metrics.{Gauge, MetricRegistry}
    +import java.util.concurrent.TimeUnit
    +import java.util.Date
     
    +import scala.collection.mutable.HashMap
    +
    +import com.codahale.metrics.{Counter, Gauge, MetricRegistry, Timer}
    +import org.apache.mesos.Protos.{TaskState => MesosTaskState}
    +
    +import org.apache.spark.TaskState
    +import org.apache.spark.deploy.mesos.MesosDriverDescription
     import org.apache.spark.metrics.source.Source
     
     private[mesos] class MesosClusterSchedulerSource(scheduler: MesosClusterScheduler)
    -  extends Source {
    +  extends Source with MesosSchedulerUtils {
    +
    +  // Submission state transitions, to derive metrics from:
    +  // - submit():
    +  //     From: NULL
    +  //     To:   queuedDrivers
    +  // - offers/scheduleTasks():
    +  //     From: queuedDrivers and any pendingRetryDrivers scheduled for retry
    +  //     To:   launchedDrivers if success, or
    +  //           finishedDrivers(fail) if exception
    +  // - taskStatus/statusUpdate():
    +  //     From: launchedDrivers
    +  //     To:   finishedDrivers(success) if success (or fail and not eligible to retry), or
    +  //           pendingRetryDrivers if failed (and eligible to retry)
    +  // - pruning/retireDriver():
    +  //     From: finishedDrivers:
    +  //     To:   NULL
     
       override val sourceName: String = "mesos_cluster"
    -  override val metricRegistry: MetricRegistry = new MetricRegistry()
    +  override val metricRegistry: MetricRegistry = new MetricRegistry
     
    -  metricRegistry.register(MetricRegistry.name("waitingDrivers"), new Gauge[Int] {
    +  // PULL METRICS:
    +  // These gauge metrics are periodically polled/pulled by the metrics system
    +
    +  metricRegistry.register(MetricRegistry.name("driver", "waiting"), new Gauge[Int] {
         override def getValue: Int = scheduler.getQueuedDriversSize
       })
     
    -  metricRegistry.register(MetricRegistry.name("launchedDrivers"), new Gauge[Int] {
    +  metricRegistry.register(MetricRegistry.name("driver", "launched"), new Gauge[Int] {
         override def getValue: Int = scheduler.getLaunchedDriversSize
       })
     
    -  metricRegistry.register(MetricRegistry.name("retryDrivers"), new Gauge[Int] {
    +  metricRegistry.register(MetricRegistry.name("driver", "retry"), new Gauge[Int] {
         override def getValue: Int = scheduler.getPendingRetryDriversSize
       })
    +
    +  metricRegistry.register(MetricRegistry.name("driver", "finished"), new Gauge[Int] {
    +    override def getValue: Int = scheduler.getFinishedDriversSize
    +  })
    +
    +  // PUSH METRICS:
    +  // These metrics are updated directly as events occur
    +
    +  private val queuedCounter = metricRegistry.counter(MetricRegistry.name("driver", "waiting_count"))
    +  private val launchedCounter =
    +    metricRegistry.counter(MetricRegistry.name("driver", "launched_count"))
    +  private val retryCounter = metricRegistry.counter(MetricRegistry.name("driver", "retry_count"))
    +  private val exceptionCounter =
    +    metricRegistry.counter(MetricRegistry.name("driver", "exception_count"))
    +  private val finishedCounter =
    +    metricRegistry.counter(MetricRegistry.name("driver", "finished_count"))
    +
    +  // Same as finishedCounter above, except grouped by MesosTaskState.
    +  private val finishedMesosStateCounters = MesosTaskState.values
    +    // Avoid registering 'finished' metrics for states that aren't considered finished:
    +    .filter(state => TaskState.isFinished(mesosToTaskState(state)))
    +    .map(state => (state, metricRegistry.counter(
    +      MetricRegistry.name("driver", "finished_count_mesos_state", state.name.toLowerCase))))
    +    .toMap
    +  private val finishedMesosUnknownStateCounter =
    +    metricRegistry.counter(MetricRegistry.name("driver", "finished_count_mesos_state", "UNKNOWN"))
    +
    +  // Duration from submission to FIRST launch.
    +  // This omits retries since those would exaggerate the time since original submission.
    +  private val submitToFirstLaunch =
    +    metricRegistry.timer(MetricRegistry.name("driver", "submit_to_first_launch"))
    +  // Duration from initial submission to an exception.
    +  private val submitToException =
    +    metricRegistry.timer(MetricRegistry.name("driver", "submit_to_exception"))
    +
    +  // Duration from (most recent) launch to a retry.
    +  private val launchToRetry = metricRegistry.timer(MetricRegistry.name("driver", "launch_to_retry"))
    +
    +  // Duration from initial submission to finished.
    +  private val submitToFinish =
    +    metricRegistry.timer(MetricRegistry.name("driver", "submit_to_finish"))
    +  // Duration from (most recent) launch to finished.
    +  private val launchToFinish =
    +    metricRegistry.timer(MetricRegistry.name("driver", "launch_to_finish"))
    +
    +  // Same as submitToFinish and launchToFinish above, except grouped by Spark TaskState.
    +  class FinishStateTimers(state: String) {
    +    val submitToFinish =
    +      metricRegistry.timer(MetricRegistry.name("driver", "submit_to_finish_state", state))
    +    val launchToFinish =
    +      metricRegistry.timer(MetricRegistry.name("driver", "launch_to_finish_state", state))
    +  }
    +  private val finishSparkStateTimers = HashMap.empty[TaskState.TaskState, FinishStateTimers]
    +  for (state <- TaskState.values) {
    +    // Avoid registering 'finished' metrics for states that aren't considered finished:
    +    if (TaskState.isFinished(state)) {
    +      finishSparkStateTimers += (state -> new FinishStateTimers(state.toString.toLowerCase))
    +    }
    +  }
    +  private val submitToFinishUnknownState = metricRegistry.timer(
    +    MetricRegistry.name("driver", "submit_to_finish_state", "UNKNOWN"))
    +  private val launchToFinishUnknownState = metricRegistry.timer(
    +    MetricRegistry.name("driver", "launch_to_finish_state", "UNKNOWN"))
    +
    +  // Histogram of retry counts at retry scheduling
    +  private val retryCount = metricRegistry.histogram(MetricRegistry.name("driver", "retry_counts"))
    +
    +  // Records when a submission initially enters the launch queue.
    +  def recordQueuedDriver(): Unit = queuedCounter.inc
    +
    +  // Records when a submission has failed an attempt and is eligible to be retried
    +  def recordRetryingDriver(state: MesosClusterSubmissionState): Unit = {
    +    state.driverDescription.retryState.foreach(retryState => retryCount.update(retryState.retries))
    +    recordTimeSince(state.startDate, launchToRetry)
    +    retryCounter.inc
    +  }
    +
    +  // Records when a submission is launched.
    +  def recordLaunchedDriver(desc: MesosDriverDescription): Unit = {
    +    if (!desc.retryState.isDefined) {
    --- End diff --
    
    Switched to `isEmpty`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r194689746
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerSource.scala ---
    @@ -0,0 +1,250 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.scheduler.cluster.mesos
    +
    +import java.util.concurrent.TimeUnit
    +import java.util.concurrent.atomic.AtomicBoolean
    +import java.util.Date
    --- End diff --
    
    Same as previously. Scala style here: java.util.Date is in wrong order relative to java.util.concurrent.atomic.AtomicBoolean


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    **[Test build #93875 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93875/testReport)** for PR 21516 at commit [`50c1c1e`](https://github.com/apache/spark/commit/50c1c1e810fa27480ae7e72640cc8f67b44a60f1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r194677777
  
    --- Diff: core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala ---
    @@ -205,7 +208,7 @@ private[spark] class MetricsSystem private (
               }
             } catch {
               case e: Exception =>
    -            logError("Sink class " + classPath + " cannot be instantiated")
    +            logError(s"Sink class $classPath cannot be instantiated")
    --- End diff --
    
    +1 for all the log msgs. One question is should some of them be at debug level?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by nickbp <gi...@git.apache.org>.
Github user nickbp commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r200469780
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterSchedulerSource.scala ---
    @@ -17,25 +17,170 @@
     
     package org.apache.spark.scheduler.cluster.mesos
     
    -import com.codahale.metrics.{Gauge, MetricRegistry}
    +import java.util.concurrent.TimeUnit
    +import java.util.Date
     
    +import scala.collection.mutable.HashMap
    +
    +import com.codahale.metrics.{Counter, Gauge, MetricRegistry, Timer}
    --- End diff --
    
    Removed


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by nickbp <gi...@git.apache.org>.
Github user nickbp commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r200470719
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerSource.scala ---
    @@ -0,0 +1,250 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.scheduler.cluster.mesos
    +
    +import java.util.concurrent.TimeUnit
    +import java.util.concurrent.atomic.AtomicBoolean
    +import java.util.Date
    +
    +import scala.collection.mutable.HashMap
    +
    +import com.codahale.metrics.{Counter, Gauge, MetricRegistry, Timer}
    +import org.apache.mesos.Protos.{TaskState => MesosTaskState}
    +
    +import org.apache.spark.TaskState
    +import org.apache.spark.deploy.mesos.MesosDriverDescription
    --- End diff --
    
    Cleaned up


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r194690227
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerSource.scala ---
    @@ -0,0 +1,250 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.scheduler.cluster.mesos
    +
    +import java.util.concurrent.TimeUnit
    +import java.util.concurrent.atomic.AtomicBoolean
    +import java.util.Date
    +
    +import scala.collection.mutable.HashMap
    +
    +import com.codahale.metrics.{Counter, Gauge, MetricRegistry, Timer}
    --- End diff --
    
    Counter unused import.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by nickbp <gi...@git.apache.org>.
Github user nickbp commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r200470645
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerSource.scala ---
    @@ -0,0 +1,250 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.scheduler.cluster.mesos
    +
    +import java.util.concurrent.TimeUnit
    +import java.util.concurrent.atomic.AtomicBoolean
    +import java.util.Date
    --- End diff --
    
    Reordered


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    @tnachen @susanxhuynh @mgummelt @skonto 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by nickbp <gi...@git.apache.org>.
Github user nickbp commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r200469746
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterSchedulerSource.scala ---
    @@ -17,25 +17,170 @@
     
     package org.apache.spark.scheduler.cluster.mesos
     
    -import com.codahale.metrics.{Gauge, MetricRegistry}
    +import java.util.concurrent.TimeUnit
    +import java.util.Date
    --- End diff --
    
    Reordered


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r194687110
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterSchedulerSource.scala ---
    @@ -17,25 +17,170 @@
     
     package org.apache.spark.scheduler.cluster.mesos
     
    -import com.codahale.metrics.{Gauge, MetricRegistry}
    +import java.util.concurrent.TimeUnit
    +import java.util.Date
    --- End diff --
    
    This causes a scala style error:
    java.util.Date is in wrong order relative to java.util.concurrent.TimeUnit


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    **[Test build #93875 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93875/testReport)** for PR 21516 at commit [`50c1c1e`](https://github.com/apache/spark/commit/50c1c1e810fa27480ae7e72640cc8f67b44a60f1).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r194688618
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala ---
    @@ -796,6 +811,38 @@ private[spark] class MesosCoarseGrainedSchedulerBackend(
           None
         }
       }
    +
    +  // Calls used for metrics polling, see MesosCoarseGrainedSchedulerSource:
    +
    +  def getCoresUsed(): Double = totalCoresAcquired
    --- End diff --
    
    Are there any issues with concurrency here. When 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93875/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    @nickbp Any isntructions how to run this? Could you also provide a design doc?
    There are some isntructions [here](https://github.com/mesosphere/spark-build/blob/dd9d9891d87040825fd9e45195ebed7e5a47e027/monitoring/README.md) for example, but it would be good to have this in Spark on mesos, describe what you get as metrics for the both the driver and the dispatcher. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    **[Test build #93424 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93424/testReport)** for PR 21516 at commit [`50c1c1e`](https://github.com/apache/spark/commit/50c1c1e810fa27480ae7e72640cc8f67b44a60f1).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r194687575
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterSchedulerSource.scala ---
    @@ -17,25 +17,170 @@
     
     package org.apache.spark.scheduler.cluster.mesos
     
    -import com.codahale.metrics.{Gauge, MetricRegistry}
    +import java.util.concurrent.TimeUnit
    +import java.util.Date
     
    +import scala.collection.mutable.HashMap
    +
    +import com.codahale.metrics.{Counter, Gauge, MetricRegistry, Timer}
    +import org.apache.mesos.Protos.{TaskState => MesosTaskState}
    +
    +import org.apache.spark.TaskState
    +import org.apache.spark.deploy.mesos.MesosDriverDescription
     import org.apache.spark.metrics.source.Source
     
     private[mesos] class MesosClusterSchedulerSource(scheduler: MesosClusterScheduler)
    -  extends Source {
    +  extends Source with MesosSchedulerUtils {
    +
    +  // Submission state transitions, to derive metrics from:
    +  // - submit():
    +  //     From: NULL
    +  //     To:   queuedDrivers
    +  // - offers/scheduleTasks():
    +  //     From: queuedDrivers and any pendingRetryDrivers scheduled for retry
    +  //     To:   launchedDrivers if success, or
    +  //           finishedDrivers(fail) if exception
    +  // - taskStatus/statusUpdate():
    +  //     From: launchedDrivers
    +  //     To:   finishedDrivers(success) if success (or fail and not eligible to retry), or
    +  //           pendingRetryDrivers if failed (and eligible to retry)
    +  // - pruning/retireDriver():
    +  //     From: finishedDrivers:
    +  //     To:   NULL
     
       override val sourceName: String = "mesos_cluster"
    -  override val metricRegistry: MetricRegistry = new MetricRegistry()
    +  override val metricRegistry: MetricRegistry = new MetricRegistry
     
    -  metricRegistry.register(MetricRegistry.name("waitingDrivers"), new Gauge[Int] {
    +  // PULL METRICS:
    +  // These gauge metrics are periodically polled/pulled by the metrics system
    +
    +  metricRegistry.register(MetricRegistry.name("driver", "waiting"), new Gauge[Int] {
         override def getValue: Int = scheduler.getQueuedDriversSize
       })
     
    -  metricRegistry.register(MetricRegistry.name("launchedDrivers"), new Gauge[Int] {
    +  metricRegistry.register(MetricRegistry.name("driver", "launched"), new Gauge[Int] {
         override def getValue: Int = scheduler.getLaunchedDriversSize
       })
     
    -  metricRegistry.register(MetricRegistry.name("retryDrivers"), new Gauge[Int] {
    +  metricRegistry.register(MetricRegistry.name("driver", "retry"), new Gauge[Int] {
         override def getValue: Int = scheduler.getPendingRetryDriversSize
       })
    +
    +  metricRegistry.register(MetricRegistry.name("driver", "finished"), new Gauge[Int] {
    +    override def getValue: Int = scheduler.getFinishedDriversSize
    +  })
    +
    +  // PUSH METRICS:
    +  // These metrics are updated directly as events occur
    +
    +  private val queuedCounter = metricRegistry.counter(MetricRegistry.name("driver", "waiting_count"))
    +  private val launchedCounter =
    +    metricRegistry.counter(MetricRegistry.name("driver", "launched_count"))
    +  private val retryCounter = metricRegistry.counter(MetricRegistry.name("driver", "retry_count"))
    +  private val exceptionCounter =
    +    metricRegistry.counter(MetricRegistry.name("driver", "exception_count"))
    +  private val finishedCounter =
    +    metricRegistry.counter(MetricRegistry.name("driver", "finished_count"))
    +
    +  // Same as finishedCounter above, except grouped by MesosTaskState.
    +  private val finishedMesosStateCounters = MesosTaskState.values
    +    // Avoid registering 'finished' metrics for states that aren't considered finished:
    +    .filter(state => TaskState.isFinished(mesosToTaskState(state)))
    +    .map(state => (state, metricRegistry.counter(
    +      MetricRegistry.name("driver", "finished_count_mesos_state", state.name.toLowerCase))))
    +    .toMap
    +  private val finishedMesosUnknownStateCounter =
    +    metricRegistry.counter(MetricRegistry.name("driver", "finished_count_mesos_state", "UNKNOWN"))
    +
    +  // Duration from submission to FIRST launch.
    +  // This omits retries since those would exaggerate the time since original submission.
    +  private val submitToFirstLaunch =
    +    metricRegistry.timer(MetricRegistry.name("driver", "submit_to_first_launch"))
    +  // Duration from initial submission to an exception.
    +  private val submitToException =
    +    metricRegistry.timer(MetricRegistry.name("driver", "submit_to_exception"))
    +
    +  // Duration from (most recent) launch to a retry.
    +  private val launchToRetry = metricRegistry.timer(MetricRegistry.name("driver", "launch_to_retry"))
    +
    +  // Duration from initial submission to finished.
    +  private val submitToFinish =
    +    metricRegistry.timer(MetricRegistry.name("driver", "submit_to_finish"))
    +  // Duration from (most recent) launch to finished.
    +  private val launchToFinish =
    +    metricRegistry.timer(MetricRegistry.name("driver", "launch_to_finish"))
    +
    +  // Same as submitToFinish and launchToFinish above, except grouped by Spark TaskState.
    +  class FinishStateTimers(state: String) {
    +    val submitToFinish =
    +      metricRegistry.timer(MetricRegistry.name("driver", "submit_to_finish_state", state))
    +    val launchToFinish =
    +      metricRegistry.timer(MetricRegistry.name("driver", "launch_to_finish_state", state))
    +  }
    +  private val finishSparkStateTimers = HashMap.empty[TaskState.TaskState, FinishStateTimers]
    +  for (state <- TaskState.values) {
    +    // Avoid registering 'finished' metrics for states that aren't considered finished:
    +    if (TaskState.isFinished(state)) {
    +      finishSparkStateTimers += (state -> new FinishStateTimers(state.toString.toLowerCase))
    +    }
    +  }
    +  private val submitToFinishUnknownState = metricRegistry.timer(
    +    MetricRegistry.name("driver", "submit_to_finish_state", "UNKNOWN"))
    +  private val launchToFinishUnknownState = metricRegistry.timer(
    +    MetricRegistry.name("driver", "launch_to_finish_state", "UNKNOWN"))
    +
    +  // Histogram of retry counts at retry scheduling
    +  private val retryCount = metricRegistry.histogram(MetricRegistry.name("driver", "retry_counts"))
    +
    +  // Records when a submission initially enters the launch queue.
    +  def recordQueuedDriver(): Unit = queuedCounter.inc
    +
    +  // Records when a submission has failed an attempt and is eligible to be retried
    +  def recordRetryingDriver(state: MesosClusterSubmissionState): Unit = {
    +    state.driverDescription.retryState.foreach(retryState => retryCount.update(retryState.retries))
    +    recordTimeSince(state.startDate, launchToRetry)
    +    retryCounter.inc
    +  }
    +
    +  // Records when a submission is launched.
    +  def recordLaunchedDriver(desc: MesosDriverDescription): Unit = {
    +    if (!desc.retryState.isDefined) {
    --- End diff --
    
    How about desc.retryState.isEmpty()?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    **[Test build #93424 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93424/testReport)** for PR 21516 at commit [`50c1c1e`](https://github.com/apache/spark/commit/50c1c1e810fa27480ae7e72640cc8f67b44a60f1).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    **[Test build #93657 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93657/testReport)** for PR 21516 at commit [`50c1c1e`](https://github.com/apache/spark/commit/50c1c1e810fa27480ae7e72640cc8f67b44a60f1).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by nickbp <gi...@git.apache.org>.
Github user nickbp commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r200463540
  
    --- Diff: core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala ---
    @@ -205,7 +208,7 @@ private[spark] class MetricsSystem private (
               }
             } catch {
               case e: Exception =>
    -            logError("Sink class " + classPath + " cannot be instantiated")
    +            logError(s"Sink class $classPath cannot be instantiated")
    --- End diff --
    
    Switched the added info logs to debug both here and in `MetricsConfig`.
    
    I was originally thinking info level for these, just to make it easier to validate that metrics have been initialized. I could see someone wanting to determine, after the fact, why their metrics weren't forwarded as expected. But that said, they could always do that by relaunching with a debug log level.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r194690191
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerSource.scala ---
    @@ -0,0 +1,250 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.scheduler.cluster.mesos
    +
    +import java.util.concurrent.TimeUnit
    +import java.util.concurrent.atomic.AtomicBoolean
    +import java.util.Date
    +
    +import scala.collection.mutable.HashMap
    +
    +import com.codahale.metrics.{Counter, Gauge, MetricRegistry, Timer}
    +import org.apache.mesos.Protos.{TaskState => MesosTaskState}
    +
    +import org.apache.spark.TaskState
    +import org.apache.spark.deploy.mesos.MesosDriverDescription
    --- End diff --
    
    Unused import.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    **[Test build #93657 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93657/testReport)** for PR 21516 at commit [`50c1c1e`](https://github.com/apache/spark/commit/50c1c1e810fa27480ae7e72640cc8f67b44a60f1).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93424/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    Jenkins, retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by skonto <gi...@git.apache.org>.
Github user skonto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r194679672
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala ---
    @@ -122,8 +122,9 @@ private[spark] class MesosClusterScheduler(
         conf: SparkConf)
       extends Scheduler with MesosSchedulerUtils {
       var frameworkUrl: String = _
    +  private val metricsSource = new MesosClusterSchedulerSource(this)
    --- End diff --
    
    Shouldn't this be configurable? Is it ok to start the dispatcher by default with a metric system set up?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    Jenkins, retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver metrics

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21516
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by nickbp <gi...@git.apache.org>.
Github user nickbp commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r200470681
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerSource.scala ---
    @@ -0,0 +1,250 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.scheduler.cluster.mesos
    +
    +import java.util.concurrent.TimeUnit
    +import java.util.concurrent.atomic.AtomicBoolean
    +import java.util.Date
    +
    +import scala.collection.mutable.HashMap
    +
    +import com.codahale.metrics.{Counter, Gauge, MetricRegistry, Timer}
    --- End diff --
    
    Cleaned up


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21516: [SPARK-24501][MESOS] Add Dispatcher and Driver me...

Posted by nickbp <gi...@git.apache.org>.
Github user nickbp commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21516#discussion_r200472180
  
    --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala ---
    @@ -796,6 +811,38 @@ private[spark] class MesosCoarseGrainedSchedulerBackend(
           None
         }
       }
    +
    +  // Calls used for metrics polling, see MesosCoarseGrainedSchedulerSource:
    +
    +  def getCoresUsed(): Double = totalCoresAcquired
    --- End diff --
    
    In practice there shouldn't be. In the case of a race, values could be briefly out of date but they'd be fine on the following refresh. This structure mirrors the dispatcher in `MesosClusterScheduler.scala`, see e.g. `getQueuedDriversSize`, `getLaunchedDriversSize`, and `getPendingRetryDriversSize` in there.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org