Posted to mapreduce-issues@hadoop.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2009/10/07 09:42:31 UTC

[jira] Updated: (MAPREDUCE-1070) Deadlock in FairSchedulerServlet

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated MAPREDUCE-1070:
-----------------------------------

    Attachment: deadlock.png

See attached diagram displaying inconsistent lock order based on dynamic analysis.

Here's a stack trace from an instance of this deadlock that we saw in production:

{noformat}
Thread 60324 (1823988020@qtp0-4064):
  State: BLOCKED
  Blocked count: 52
  Waited count: 32
  Blocked on org.apache.hadoop.mapred.JobInProgress@5d2044dd
  Blocked by 113 (IPC Server handler 9 on 7277)

  Stack:
    org.apache.hadoop.mapred.JobInProgress.finishedMaps(JobInProgress.java:560)
    org.apache.hadoop.mapred.FairSchedulerServlet.showJobs(FairSchedulerServlet.java:235)
    org.apache.hadoop.mapred.FairSchedulerServlet.doGet(FairSchedulerServlet.java:136)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
 ...
Thread 113 (IPC Server handler 9 on 7277):
  State: BLOCKED
  Blocked count: 540572
  Waited count: 2658131
  Blocked on org.apache.hadoop.mapred.FairScheduler@a12d500
  Blocked by 60324 (1823988020@qtp0-4064)

  Stack:
    org.apache.hadoop.mapred.JobTracker.finalizeJob(JobTracker.java:2069)
    org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:2538)
    org.apache.hadoop.mapred.JobInProgress.jobComplete(JobInProgress.java:2181)
    org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:2125)
    org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:892)
    org.apache.hadoop.mapred.JobTracker.updateTaskStatuses(JobTracker.java:3415)
    org.apache.hadoop.mapred.JobTracker.processHeartbeat(JobTracker.java:2712)
    org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2507)
{noformat}

The fix is for the servlet to synchronize on the JobTracker before synchronizing on any jobs.
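
As a rough illustration of that lock ordering, here is a minimal, self-contained sketch. It does not use the real Hadoop classes: LockOrderSketch, heartbeatPath, servletPath, and runBoth are hypothetical names, and the three plain Object monitors only stand in for the JobTracker, the FairScheduler, and one JobInProgress. It also assumes the heartbeat path already holds the JobTracker monitor when it reaches the job, which the trace above does not show directly.

```java
// Hypothetical stand-in for the locking pattern only; not Hadoop code.
class LockOrderSketch {
    // Plain monitors standing in for JobTracker, FairScheduler, and one JobInProgress.
    private final Object jobTracker = new Object();
    private final Object fairScheduler = new Object();
    private final Object job = new Object();
    private int finishedMaps = 0;

    // Heartbeat-like path: job lock first, then scheduler lock (as in Thread 113),
    // with the JobTracker lock assumed to be held around both.
    void heartbeatPath() {
        synchronized (jobTracker) {
            synchronized (job) {
                finishedMaps++;
                synchronized (fairScheduler) {
                    // job-completion bookkeeping would happen here
                }
            }
        }
    }

    // Servlet-like path with the proposed fix: take the JobTracker lock first,
    // then the scheduler lock, then the job lock (the inner order seen in Thread 60324).
    int servletPath() {
        synchronized (jobTracker) {
            synchronized (fairScheduler) {
                synchronized (job) {
                    return finishedMaps;
                }
            }
        }
    }

    // Runs both paths concurrently; returns true if both threads finish in time.
    static boolean runBoth(int iterations) {
        LockOrderSketch sketch = new LockOrderSketch();
        Thread heartbeat = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                sketch.heartbeatPath();
            }
        });
        Thread servlet = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                sketch.servletPath();
            }
        });
        // Daemon threads so a deadlock would not keep the JVM alive.
        heartbeat.setDaemon(true);
        servlet.setDaemon(true);
        heartbeat.start();
        servlet.start();
        try {
            heartbeat.join(10_000);
            servlet.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !heartbeat.isAlive() && !servlet.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both paths finished: " + runBoth(100_000));
    }
}
```

The two paths still take the scheduler and job monitors in opposite orders, but because each takes the outer JobTracker monitor first, they can never hold one inner lock while waiting for the other. Removing the outer synchronized block from servletPath would bring back the cycle shown in the trace.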

> Deadlock in FairSchedulerServlet
> --------------------------------
>
>                 Key: MAPREDUCE-1070
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1070
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: deadlock.png
>
>
> FairSchedulerServlet can cause a deadlock with the JobTracker
