You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/10/12 18:53:29 UTC

[GitHub] [spark] timarmstrong commented on a change in pull request #34245: [SPARK-33277][PYSPARK][SQL] Writer thread must not access input after task completion listener returns

timarmstrong commented on a change in pull request #34245:
URL: https://github.com/apache/spark/pull/34245#discussion_r727407075



##########
File path: core/src/main/scala/org/apache/spark/TaskContextImpl.scala
##########
@@ -82,23 +82,33 @@ private[spark] class TaskContextImpl(
   @volatile private var _fetchFailedException: Option[FetchFailedException] = None
 
   @GuardedBy("this")
-  override def addTaskCompletionListener(listener: TaskCompletionListener)
-      : this.type = synchronized {
-    if (completed) {
+  override def addTaskCompletionListener(listener: TaskCompletionListener): this.type = {
+    val needToCallListener = synchronized {
+      if (completed) {
+        true
+      } else {
+        onCompleteCallbacks += listener
+        false
+      }
+    }

Review comment:
       The API doesn't intend to guarantee any ordering of when the task completion listeners are called AFAICT. I think before this change the implementation ends up guaranteeing that the listeners are called sequentially.
   
   This might be overengineering it, but we could have a scheme that avoided the deadlock issues and guaranteed sequential execution of callbacks. You would have at most one single thread at any point in time responsible for invoking callbacks. If another thread needs to invoke a callback, it either delegates it to the current callback invocation thread, or it becomes the callback execution thread itself.  This means that the callback invocation thread needs to first invoke all of the current registered callbacks, but when it's done with those, check to see if any more callbacks have been queued.
   
   I think we could do that by having the callback invocation thread taking ownership of the current callbacks list, but after invoking those callbacks checking to see if any more have been queued. We'd also need a variable to track if there's a current callback execution thread.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org