Posted to issues@spark.apache.org by "Sandeep Pal (Jira)" <ji...@apache.org> on 2022/05/25 03:06:00 UTC

[jira] [Created] (SPARK-39283) Spark tasks stuck forever due to deadlock between TaskMemoryManager and UnsafeExternalSorter

Sandeep Pal created SPARK-39283:
-----------------------------------

             Summary: Spark tasks stuck forever due to deadlock between TaskMemoryManager and UnsafeExternalSorter
                 Key: SPARK-39283
                 URL: https://issues.apache.org/jira/browse/SPARK-39283
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 3.0.0
            Reporter: Sandeep Pal
         Attachments: DeadlockSparkTasks.png

We are seeing this deadlock between {{TaskMemoryManager}} and {{UnsafeExternalSorter}} quite often on our workload. Sometimes the retry succeeds, but at other times we have to resort to hacks to break the deadlock, such as explicitly taking down the worker machines.

Below is the thread dump from the Spark UI showing the deadlock:
!image-2022-05-24-20-03-35-287.png!

I believe there was a related Jira about a similar deadlock between the same threads, and it was resolved:
https://issues.apache.org/jira/browse/SPARK-27338

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
