Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/05/21 00:59:31 UTC

[GitHub] [spark] wypoon edited a comment on issue #23767: [SPARK-26329][CORE] Faster polling of executor memory metrics.

URL: https://github.com/apache/spark/pull/23767#issuecomment-494165380
 
 
   @squito I have implemented your suggestions.
   
   Also, @squito and @edwinalu, I ran some experiments. Previously, I'd run experiments with spark.executor.heartbeatInterval set to 1s. This time I did not set it, so it defaults to 10s. In this case, if I also do not set spark.executor.metrics.pollingInterval, so that polling only happens at heartbeats, we sometimes see reported metric peaks that are all zero. This happens when a task is very short, on the order of 10s: the executor heartbeat does not necessarily start when the executor starts, but at some random time up to 10s later, so a task can finish before the first heartbeat. Metric peaks of all zeros are seen both in the task metrics in SparkListenerTaskEnd events and in the executor metrics in SparkListenerStageExecutorMetrics events.
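
To make the timing argument concrete, here is a minimal sketch (not Spark code) that checks whether any heartbeat-driven poll lands inside a task's lifetime. The function names and the idea of passing the first heartbeat time as a parameter are assumptions for illustration only; the 10s default mirrors the default heartbeat interval mentioned above.

```python
def heartbeat_times(first_heartbeat_s, interval_s, horizon_s):
    """Heartbeat (and therefore poll) times, in seconds since executor start,
    up to and including horizon_s. Hypothetical helper, not a Spark API."""
    times = []
    t = first_heartbeat_s
    while t <= horizon_s:
        times.append(t)
        t += interval_s
    return times


def polled_during_task(task_start_s, task_end_s, first_heartbeat_s, interval_s=10.0):
    """True if at least one heartbeat falls within [task_start_s, task_end_s]."""
    beats = heartbeat_times(first_heartbeat_s, interval_s, task_end_s)
    return any(task_start_s <= t <= task_end_s for t in beats)


# A 3s task whose executor sends its first heartbeat 7.5s after start is never polled.
missed = polled_during_task(0.0, 3.0, first_heartbeat_s=7.5)
# The same task is polled if the first heartbeat happens to come after 2s.
caught = polled_during_task(0.0, 3.0, first_heartbeat_s=2.0)
# A 25s task always sees a heartbeat when the interval is 10s.
long_task = polled_during_task(0.0, 25.0, first_heartbeat_s=7.5)
```

In this toy model, whether a short task gets polled depends entirely on where the first heartbeat happens to fall, which matches the intermittent all-zero peaks described above.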
   
   When a task starts in an executor, an entry for it is created in a ConcurrentHashMap (CHM). The metrics associated with this entry are all zero to begin with and do not change until a poll happens. If no poll has happened in the executor by the time the task ends, the metrics are still all zero, and a SparkListenerTaskEnd event is written with zeros for the metrics. The EventLoggingListener keeps track of metric peaks per stage per executor; it updates the peaks on task end and on executor update (which happens on heartbeat). On stage end, a set of SparkListenerStageExecutorMetrics events (one per executor) is written to the event log. If a stage ends before any heartbeat, and thus any poll, has happened in the executor, we see a SparkListenerStageExecutorMetrics event for that executor with zeros for the metric peaks.
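
The mechanism described above can be sketched with a toy model. This is not the code in this PR: the class name, method names, and metric names ("heap", "off_heap") are hypothetical, and a plain dict stands in for the CHM. It only illustrates why a task that ends before any poll reports all-zero peaks.

```python
class ExecutorPeakTracker:
    """Toy model of per-task metric peaks inside one executor (not Spark's code)."""

    def __init__(self, metric_names):
        self.metric_names = tuple(metric_names)
        # task id -> list of peak values; a plain dict stands in for the CHM
        self.task_peaks = {}

    def task_start(self, task_id):
        # A new entry starts with all-zero peaks.
        self.task_peaks[task_id] = [0] * len(self.metric_names)

    def poll(self, current_values):
        # A poll raises each running task's peaks to at least the current values.
        for peaks in self.task_peaks.values():
            for i, value in enumerate(current_values):
                if value > peaks[i]:
                    peaks[i] = value

    def task_end(self, task_id):
        # Remove the entry and return the peaks that would go into the task-end event.
        return self.task_peaks.pop(task_id)


tracker = ExecutorPeakTracker(["heap", "off_heap"])

# Task 1 starts and ends with no poll in between: its peaks stay at zero.
tracker.task_start(1)
short_task_peaks = tracker.task_end(1)

# Task 2 is polled twice; each metric keeps its maximum across polls.
tracker.task_start(2)
tracker.poll([512, 64])
tracker.poll([256, 128])
long_task_peaks = tracker.task_end(2)
```

Here `short_task_peaks` stays at its initial zeros, while `long_task_peaks` holds the per-metric maximum over both polls.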
   
   We discussed this possibility, at least for task metrics, quite early on above, and the consensus was that it was ok to report zeros. I still think this is ok, but it would be helpful to document this behavior somewhere; I'm just not sure where the appropriate place would be.
