Posted to mapreduce-issues@hadoop.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2009/10/07 09:42:31 UTC
[jira] Updated: (MAPREDUCE-1070) Deadlock in FairSchedulerServlet
[ https://issues.apache.org/jira/browse/MAPREDUCE-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon updated MAPREDUCE-1070:
-----------------------------------
Attachment: deadlock.png
See the attached diagram, which shows the inconsistent lock order found by dynamic analysis.
Here's a stack trace from an instance of this deadlock that we saw in production:
{noformat}
Thread 60324 (1823988020@qtp0-4064):
State: BLOCKED
Blocked count: 52
Waited count: 32
Blocked on org.apache.hadoop.mapred.JobInProgress@5d2044dd
Blocked by 113 (IPC Server handler 9 on 7277)
Stack:
org.apache.hadoop.mapred.JobInProgress.finishedMaps(JobInProgress.java:560)
org.apache.hadoop.mapred.FairSchedulerServlet.showJobs(FairSchedulerServlet.java:235)
org.apache.hadoop.mapred.FairSchedulerServlet.doGet(FairSchedulerServlet.java:136)
javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
...
Thread 113 (IPC Server handler 9 on 7277):
State: BLOCKED
Blocked count: 540572
Waited count: 2658131
Blocked on org.apache.hadoop.mapred.FairScheduler@a12d500
Blocked by 60324 (1823988020@qtp0-4064)
Stack:
org.apache.hadoop.mapred.JobTracker.finalizeJob(JobTracker.java:2069)
org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:2538)
org.apache.hadoop.mapred.JobInProgress.jobComplete(JobInProgress.java:2181)
org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:2125)
org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:892)
org.apache.hadoop.mapred.JobTracker.updateTaskStatuses(JobTracker.java:3415)
org.apache.hadoop.mapred.JobTracker.processHeartbeat(JobTracker.java:2712)
org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2507)
{noformat}
The fix is for the servlet to synchronize on the JobTracker before synchronizing on any jobs, so that it takes its locks in the same order as the heartbeat path in the trace.
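To illustrate the ordering, here is a minimal sketch, not the actual patch. The three lock objects are hypothetical stand-ins for the JobTracker, FairScheduler and JobInProgress monitors, and the sketch assumes the heartbeat path already holds the JobTracker monitor when it reaches the job. Once the servlet path also takes the JobTracker monitor first, the two paths are serialized on it, so the differing inner order can no longer deadlock.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LockOrderSketch {
    // Hypothetical stand-ins for the real monitors (not Hadoop classes).
    static final Object jobTracker = new Object();
    static final Object scheduler = new Object();
    static final Object job = new Object();

    // Guarded by the job monitor.
    static int finishedMaps = 0;

    // Models the heartbeat path: JobTracker -> job -> scheduler.
    static void heartbeat() {
        synchronized (jobTracker) {
            synchronized (job) {
                finishedMaps++;
                synchronized (scheduler) {
                    // e.g. tell the scheduler that the job has finished
                }
            }
        }
    }

    // Models the servlet after the proposed fix: JobTracker first,
    // then scheduler, then job. Without the outer JobTracker lock,
    // scheduler -> job here would conflict with job -> scheduler above.
    static int showJobs() {
        synchronized (jobTracker) {
            synchronized (scheduler) {
                synchronized (job) {
                    return finishedMaps;
                }
            }
        }
    }

    // Runs both paths concurrently; returns true if both finish in time.
    static boolean runBoth(int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> { for (int i = 0; i < iterations; i++) heartbeat(); });
        pool.submit(() -> { for (int i = 0; i < iterations; i++) showJobs(); });
        pool.shutdown();
        try {
            return pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(runBoth(100_000) ? "completed" : "timed out");
    }
}
```

Removing the outer synchronized (jobTracker) block from showJobs() recreates the cycle from the trace: one thread holds the scheduler and waits for the job, while the other holds the job and waits for the scheduler.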
> Deadlock in FairSchedulerServlet
> --------------------------------
>
> Key: MAPREDUCE-1070
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1070
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.20.1, 0.21.0, 0.22.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Attachments: deadlock.png
>
>
> FairSchedulerServlet can cause a deadlock with the JobTracker.