Posted to issues@spark.apache.org by "Wing Yew Poon (JIRA)" <ji...@apache.org> on 2019/01/17 17:36:00 UTC

[jira] [Commented] (SPARK-26329) ExecutorMetrics should poll faster than heartbeats

    [ https://issues.apache.org/jira/browse/SPARK-26329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16745313#comment-16745313 ] 

Wing Yew Poon commented on SPARK-26329:
---------------------------------------

I am working on this. In the executor, we can track which stages have a positive number of running tasks (when a task starts, we get its stage id and attempt number). When polling the executor memory metrics, we attribute the memory to the active stage(s) and update the peaks. In a heartbeat, we send the per-stage peaks (for the stages active at that time), and then reset the peaks. The semantics would be that the per-stage peaks sent in each heartbeat are the peaks since the last heartbeat.
 In addition, we keep a map of task ids to metrics for the running tasks; the polling thread updates the peak metric values in this map as well. At task end, we send the peak values associated with the task in the task result. These are the peak values of the executor metrics during the lifetime of the task. (Of course, this does not mean that that task alone contributed to those peaks, only that those were the peak memory values seen while the task was running.)
 If a stage completes between heartbeats, so that it no longer has any running tasks, then no metrics are sent for that stage in the next heartbeat; however, the peaks would already have been sent in the task results as each of that stage's tasks ended, so we do not lose them.
 We continue to do the stage-level aggregation in the EventLoggingListener.
 For the driver, I do not plan to poll more frequently. I think this is ok since most memory issues are with executors rather than with the driver. We will still poll when the driver heartbeats. What the driver sends will be the current values of the metrics in the driver at the time of the heartbeat. This is semantically the same as before.
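 Roughly, the executor-side bookkeeping could look like the sketch below. This is only an illustrative sketch of the plan above, not the actual implementation: the names (ExecutorMetricsPollerSketch, MetricSnapshot, pollMetrics, etc.) are placeholders, and the real code would read the actual executor metric getters and hook into the existing task and heartbeat plumbing.

    import scala.collection.mutable

    object ExecutorMetricsPollerSketch {
      type StageKey = (Int, Int)          // (stageId, stageAttemptId)
      type MetricSnapshot = Array[Long]   // one slot per executor metric

      // stages that currently have at least one running task, with a task count
      private val runningTasksPerStage = mutable.Map.empty[StageKey, Int]
      // peak metric values per active stage since the last heartbeat
      private val stagePeaks = mutable.Map.empty[StageKey, MetricSnapshot]
      // peak metric values per running task, over the lifetime of the task
      private val taskPeaks = mutable.Map.empty[Long, MetricSnapshot]

      // placeholder for reading the real executor metrics (heap, off-heap, ...)
      private def pollMetrics(): MetricSnapshot = Array(Runtime.getRuntime.totalMemory())

      private def mergeMax(dst: MetricSnapshot, src: MetricSnapshot): Unit =
        for (i <- dst.indices) dst(i) = math.max(dst(i), src(i))

      def onTaskStart(taskId: Long, stageId: Int, attempt: Int): Unit = synchronized {
        val key = (stageId, attempt)
        runningTasksPerStage(key) = runningTasksPerStage.getOrElse(key, 0) + 1
        taskPeaks(taskId) = pollMetrics()
      }

      // called from a dedicated polling thread, more often than the heartbeat
      def poll(): Unit = synchronized {
        val snapshot = pollMetrics()
        runningTasksPerStage.keys.foreach { key =>
          stagePeaks.get(key) match {
            case Some(peaks) => mergeMax(peaks, snapshot)
            case None        => stagePeaks(key) = snapshot.clone()
          }
        }
        taskPeaks.values.foreach(mergeMax(_, snapshot))
      }

      // at heartbeat time: report the per-stage peaks seen since the last
      // heartbeat for the currently active stages, then reset them
      def peaksForHeartbeat(): Map[StageKey, MetricSnapshot] = synchronized {
        val result = stagePeaks.toMap
        stagePeaks.clear()
        result
      }

      // at task end: return the peaks observed while the task was running,
      // to be sent back in the task result, and drop the bookkeeping for it
      def onTaskEnd(taskId: Long, stageId: Int, attempt: Int): MetricSnapshot = synchronized {
        val key = (stageId, attempt)
        runningTasksPerStage.get(key).foreach { n =>
          if (n <= 1) runningTasksPerStage.remove(key) else runningTasksPerStage(key) = n - 1
        }
        taskPeaks.remove(taskId).getOrElse(pollMetrics())
      }
    }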

> ExecutorMetrics should poll faster than heartbeats
> --------------------------------------------------
>
>                 Key: SPARK-26329
>                 URL: https://issues.apache.org/jira/browse/SPARK-26329
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, Web UI
>    Affects Versions: 3.0.0
>            Reporter: Imran Rashid
>            Priority: Major
>
> We should allow faster polling of the executor memory metrics (SPARK-23429 / SPARK-23206) without requiring a faster heartbeat rate.  We've seen the memory usage of executors spike over 1 GB in less than a second, but heartbeats are only sent every 10 seconds (by default).  Spark needs to enable fast polling to capture these peaks, without causing too much strain on the system.
> In the current implementation, the metrics are polled along with the heartbeat, but this leads to a slow rate of polling metrics by default.  If users were to increase the rate of the heartbeat, they risk overloading the driver on a large cluster, with too many messages and too much work to aggregate the metrics.  But, the executor could poll the metrics more frequently, and still only send the *max* since the last heartbeat for each metric.  This keeps the load on the driver the same, and only introduces a small overhead on the executor to grab the metrics and keep the max.
> The downside of this approach is that we still need to wait for the next heartbeat for the driver to be aware of the new peak.   If the executor dies or is killed before then, then we won't find out.  A potential future enhancement would be to send an update *anytime* there is an increase by some percentage, but we'll leave that out for now.
> Another possibility would be to change the metrics themselves to track peaks for us, so we don't have to fine-tune the polling rate.  For example, some JVM metrics provide a usage threshold and notification: https://docs.oracle.com/javase/7/docs/api/java/lang/management/MemoryPoolMXBean.html#UsageThreshold
> But, that is not available on all metrics.  This proposal gives us a generic way to get a more accurate peak memory usage for *all* metrics.
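For reference, the usage-threshold mechanism mentioned in the description could be wired up roughly as in the following sketch. This is only an illustration of the java.lang.management API, not something Spark does today; the 90% threshold and the println are arbitrary placeholders.

    import java.lang.management.{ManagementFactory, MemoryNotificationInfo}
    import javax.management.openmbean.CompositeData
    import javax.management.{Notification, NotificationEmitter, NotificationListener}
    import scala.collection.JavaConverters._

    object UsageThresholdSketch {
      def install(): Unit = {
        // ask each pool that supports it to flag usage above ~90% of its max
        // (the 90% figure is an arbitrary choice for illustration)
        ManagementFactory.getMemoryPoolMXBeans.asScala.foreach { pool =>
          if (pool.isUsageThresholdSupported && pool.getUsage.getMax > 0) {
            pool.setUsageThreshold((pool.getUsage.getMax * 0.9).toLong)
          }
        }

        // the memory MXBean emits a notification when a pool crosses its threshold
        val emitter = ManagementFactory.getMemoryMXBean.asInstanceOf[NotificationEmitter]
        val listener = new NotificationListener {
          override def handleNotification(n: Notification, handback: AnyRef): Unit = {
            if (n.getType == MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED) {
              val info = MemoryNotificationInfo.from(n.getUserData.asInstanceOf[CompositeData])
              println(s"pool ${info.getPoolName} crossed its usage threshold: " +
                s"${info.getUsage.getUsed} bytes used")
            }
          }
        }
        emitter.addNotificationListener(listener, null, null)
      }
    }

As the description notes, not every metric of interest offers such a threshold, which is why the generic poll-and-keep-the-max approach is still needed.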



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org