Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/10/12 14:13:28 UTC

[GitHub] [spark] tgravescs commented on a change in pull request #29977: [SPARK-33088][CORE] Enhance ExecutorPlugin API to include callbacks on task start and end events

tgravescs commented on a change in pull request #29977:
URL: https://github.com/apache/spark/pull/29977#discussion_r503323475



##########
File path: core/src/main/scala/org/apache/spark/executor/Executor.scala
##########
@@ -332,7 +332,8 @@ private[spark] class Executor(
 
   class TaskRunner(
       execBackend: ExecutorBackend,
-      private val taskDescription: TaskDescription)
+      private val taskDescription: TaskDescription,
+      private val plugins: Option[PluginContainer])

Review comment:
       @rshkv, what is the reason to make this default to None? This is an internal API and only called from here. It's already an Option, so callers can check it easily. In some ways it's nice to force it, so you make sure all uses of it have been updated.
   Are there cases where you know this is used outside Spark?
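
    For illustration, a minimal sketch of the call site inside Executor (method body abridged; assuming `plugins` is the executor's Option[PluginContainer] field). With no default value, the compiler flags every internal caller that was not updated:

        def launchTask(context: ExecutorBackend, taskDescription: TaskDescription): Unit = {
          // Passing the executor's plugins explicitly; a forgotten call site
          // becomes a compile error instead of silently defaulting to None.
          val tr = new TaskRunner(context, taskDescription, plugins)
          runningTasks.put(taskDescription.taskId, tr)
          threadPool.execute(tr)
        }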

##########
File path: core/src/main/java/org/apache/spark/api/plugin/ExecutorPlugin.java
##########
@@ -54,4 +54,39 @@ default void init(PluginContext ctx, Map<String, String> extraConf) {}
    */
   default void shutdown() {}
 
+  /**
+   * Perform any action before the task is run.
+   * <p>
+   * This method is invoked from the same thread in which the task will be executed.
+   * Task-specific information can be accessed via {@link org.apache.spark.TaskContext#get}.
+   * <p>
+   * Plugin authors should avoid expensive operations here, as this method will be called
+   * on every task, and doing something expensive can significantly slow down a job.
+   * It is not recommended for a user to call a remote service, for example.
+   * <p>
+   * Exceptions thrown from this method do not propagate: they are caught,
+   * logged, and suppressed. Therefore, an exception thrown while executing this
+   * method will not make the job fail.
+   */

Review comment:
       Add @since 3.1.0 to each of the new functions' docs.
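
    For context, a hypothetical Scala plugin written against this API (the class name and body are illustrative only, not part of the PR), showing how onTaskStart is meant to be used per the doc above: it runs on the task's own thread, so TaskContext.get() is available, and the work must stay cheap.

        import org.apache.spark.TaskContext
        import org.apache.spark.api.plugin.ExecutorPlugin

        // Hypothetical plugin: onTaskStart is called once per task, on the
        // thread that will run the task, so keep it cheap and local.
        class TaskLogPlugin extends ExecutorPlugin {
          override def onTaskStart(): Unit = {
            // Cheap, local work only; no remote calls from a per-task hook.
            val attemptId = TaskContext.get().taskAttemptId()
            println(s"Starting task attempt $attemptId")
          }
        }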

##########
File path: core/src/main/scala/org/apache/spark/scheduler/Task.scala
##########
@@ -123,8 +125,12 @@ private[spark] abstract class Task[T](
       Option(taskAttemptId),
       Option(attemptNumber)).setCurrentContext()
 
+    plugins.foreach(_.onTaskStart())

Review comment:
       Maybe I'm misunderstanding, but the documentation states "Exceptions thrown from this method do not propagate", yet there is nothing here preventing that. Perhaps you meant to say that the user needs to make sure they don't propagate?
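
    One way to make the documented behavior hold is to catch inside the plugin container rather than at this call site. A sketch, assuming an `executorPlugins` collection of (name, plugin) pairs and the Logging mixin available in that class (names are illustrative):

        import scala.util.control.NonFatal

        override def onTaskStart(): Unit = {
          executorPlugins.foreach { case (name, plugin) =>
            try {
              plugin.onTaskStart()
            } catch {
              case NonFatal(t) =>
                // Log and suppress: a misbehaving plugin must not fail the task.
                logWarning(s"Exception in plugin $name during onTaskStart.", t)
            }
          }
        }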



