Posted to common-dev@hadoop.apache.org by "Michael Bieniosek (JIRA)" <ji...@apache.org> on 2007/10/06 01:32:50 UTC

[jira] Created: (HADOOP-2001) Deadlock in jobtracker

Deadlock in jobtracker
----------------------

                 Key: HADOOP-2001
                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
             Project: Hadoop
          Issue Type: Bug
    Affects Versions: 0.14.0
            Reporter: Michael Bieniosek
            Priority: Critical


My jobtracker deadlocked; the output from kill -QUIT is:

Found one Java-level deadlock:
=============================
"IPC Server handler 2 on 10001":
  waiting to lock monitor 0x0813724c (object 0xd5175488, a org.apache.hadoop.mapred.JobInProgress),
  which is held by "SocketListener0-1"
"SocketListener0-1":
  waiting to lock monitor 0x081146d4 (object 0xd24d9c50, a org.apache.hadoop.mapred.JobTracker),
  which is held by "IPC Server handler 2 on 10001"

Java stack information for the threads listed above:
===================================================
"IPC Server handler 2 on 10001":
        at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:367)
        - waiting to lock <0xd5175488> (a org.apache.hadoop.mapred.JobInProgress)
        at org.apache.hadoop.mapred.JobTracker.updateTaskStatuses(JobTracker.java:1719)
        at org.apache.hadoop.mapred.JobTracker.processHeartbeat(JobTracker.java:1240)
        - locked <0xd24d9c50> (a org.apache.hadoop.mapred.JobTracker)
        at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:1116)
        - locked <0xd24d9c50> (a org.apache.hadoop.mapred.JobTracker)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:340)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:566)
"SocketListener0-1":
        at org.apache.hadoop.mapred.JobTracker.finalizeJob(JobTracker.java:907)
        - waiting to lock <0xd24d9c50> (a org.apache.hadoop.mapred.JobTracker)
        at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:1059)
        - locked <0xd5175488> (a org.apache.hadoop.mapred.JobInProgress)
        at org.apache.hadoop.mapred.JobInProgress.kill(JobInProgress.java:891)
        - locked <0xd5175488> (a org.apache.hadoop.mapred.JobInProgress)
        at org.apache.hadoop.mapred.jobdetails_jsp._jspService(jobdetails_jsp.java:158)
        at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:94)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
        at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
        at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
        at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
        at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
        at org.mortbay.http.HttpServer.service(HttpServer.java:954)
        at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
        at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
        at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
        at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
        at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
        at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)

Found 1 deadlock.
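
The dump above is what the JVM prints on SIGQUIT. As a rough illustration only, the same kind of monitor cycle can also be looked for from inside a process with the standard java.lang.management.ThreadMXBean API. The class name DeadlockProbe, the thread names, and the two plain Object locks are hypothetical stand-ins, not Hadoop code; the sketch deliberately creates a two-thread cycle shaped like the one reported and then asks the JVM to find it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockProbe {
    // Starts two daemon threads that take the two monitors in opposite order.
    // Daemon threads are used so the stuck threads do not keep the JVM alive.
    static void startInvertedLockers(Object a, Object b) {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread t1 = new Thread(() -> lockInOrder(a, b, bothHoldFirst), "tracker-then-job");
        Thread t2 = new Thread(() -> lockInOrder(b, a, bothHoldFirst), "job-then-tracker");
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
    }

    static void lockInOrder(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                // Wait until the other thread also holds its first monitor.
                bothHoldFirst.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // Not reached: each thread blocks on the monitor the other holds.
            }
        }
    }

    // Polls for up to timeoutMillis; returns how many threads are in a monitor cycle.
    static int countDeadlockedThreads(long timeoutMillis) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = bean.findMonitorDeadlockedThreads();
            if (ids != null) {
                for (ThreadInfo info : bean.getThreadInfo(ids)) {
                    if (info != null) {
                        System.out.println(info.getThreadName() + " waits for "
                                + info.getLockName() + " held by " + info.getLockOwnerName());
                    }
                }
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        startInvertedLockers(new Object(), new Object());
        System.out.println("deadlocked threads: " + countDeadlockedThreads(5000));
    }
}
```

findMonitorDeadlockedThreads only reports cycles on object monitors (synchronized), which is the kind shown in this report.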

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-2001) Deadlock in jobtracker

Posted by "Michael Bieniosek (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532930 ] 

Michael Bieniosek commented on HADOOP-2001:
-------------------------------------------

That seems wrong (though, like I say, I don't really know this code).  If you have to hold the jobtracker lock in order to hold the per-job lock, then why bother acquiring the per-job lock at all?
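
To make the lock-ordering question concrete, here is a minimal sketch of the discipline being discussed: every path that needs both locks takes the tracker lock before the per-job lock. Tracker, Job, heartbeat and kill are hypothetical stand-ins for illustration; this is not the actual JobTracker/JobInProgress code and not necessarily what any attached patch does.

```java
public class LockOrderSketch {
    // Hypothetical stand-in for the object guarded by the global (tracker) lock.
    static final class Tracker {
        int finalizedJobs = 0;
    }

    // Hypothetical stand-in for the object guarded by the per-job lock.
    static final class Job {
        int statusUpdates = 0;
        boolean killed = false;
    }

    // Heartbeat path: tracker lock first, then job lock (same order as in the trace).
    static void heartbeat(Tracker tracker, Job job) {
        synchronized (tracker) {
            synchronized (job) {
                job.statusUpdates++;
            }
        }
    }

    // Kill path: also tracker first, then job, instead of job -> tracker.
    static void kill(Tracker tracker, Job job) {
        synchronized (tracker) {
            synchronized (job) {
                if (!job.killed) {
                    job.killed = true;
                    tracker.finalizedJobs++;
                }
            }
        }
    }

    // Runs both paths concurrently; returns true if both threads finished in time.
    static boolean runConcurrently(int iterations) {
        Tracker tracker = new Tracker();
        Job job = new Job();
        Thread ipc = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                heartbeat(tracker, job);
            }
        }, "ipc-handler");
        Thread web = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                kill(tracker, job);
            }
        }, "web-listener");
        ipc.setDaemon(true);
        web.setDaemon(true);
        ipc.start();
        web.start();
        try {
            ipc.join(5000);
            web.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !ipc.isAlive() && !web.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both paths completed: " + runConcurrently(10000));
    }
}
```

Because both paths acquire the two monitors in the same order, no thread can hold the job lock while waiting for the tracker lock, so the cycle in the reported dump cannot form in this sketch.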

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch, 2001.patch


[jira] Assigned: (HADOOP-2001) Deadlock in jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das reassigned HADOOP-2001:
-----------------------------------

    Assignee: Devaraj Das

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch


[jira] Assigned: (HADOOP-2001) Deadlock in jobtracker

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy reassigned HADOOP-2001:
-------------------------------------

    Assignee: Arun C Murthy  (was: Devaraj Das)

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch, 2001.patch, HADOOP-2001_2_20071006.patch


[jira] Updated: (HADOOP-2001) Deadlock in jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-2001:
--------------------------------

    Attachment: 2001.patch

Sorry, I should have checked for compilation problems before submitting the patch. Here is the corrected patch.

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch, 2001.patch


[jira] Updated: (HADOOP-2001) Deadlock in jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-2001:
--------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Arun!

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch, 2001.patch, HADOOP-2001_2_20071006.patch


[jira] Updated: (HADOOP-2001) Deadlock in jobtracker

Posted by "Michael Bieniosek (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Bieniosek updated HADOOP-2001:
--------------------------------------

    Fix Version/s: 0.15.0

It would be nice to get this fixed for 0.15.0, since it does deadlock the jobtracker.  I can submit the patch I describe if someone thinks that's a good idea.

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Priority: Critical
>             Fix For: 0.15.0


[jira] Commented: (HADOOP-2001) Deadlock in jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533151 ] 

Devaraj Das commented on HADOOP-2001:
-------------------------------------

Arun, could you please upload a patch as per your comments, if you have it ready? Thanks.

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch, 2001.patch
>
>


[jira] Commented: (HADOOP-2001) Deadlock in jobtracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532831 ] 

Hadoop QA commented on HADOOP-2001:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12367181/2001.patch
against trunk revision r582443.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/897/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/897/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/897/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/897/console

This message is automatically generated.

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch, 2001.patch
>
>


[jira] Updated: (HADOOP-2001) Deadlock in jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-2001:
--------------------------------

    Status: Open  (was: Patch Available)

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch, 2001.patch
>
>


[jira] Updated: (HADOOP-2001) Deadlock in jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-2001:
--------------------------------

    Status: Patch Available  (was: Open)

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch
>
>


[jira] Updated: (HADOOP-2001) Deadlock in jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-2001:
--------------------------------

    Attachment: 2001.patch

This should take care of the deadlock. Essentially, this patch locks the JobTracker before killing a job or changing its priority.
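
To illustrate the lock-ordering idea in the comment above, here is a minimal sketch; it is not the attached patch. Tracker, Job, killOrdered and runDemo are hypothetical stand-ins invented for this example (for JobTracker, JobInProgress and the kill path from the web UI), and the real classes differ. The point it tries to show: if every code path takes the tracker monitor before the job monitor, the cycle reported in the thread dump cannot form.

```java
import java.util.ArrayList;
import java.util.List;

public class LockOrderSketch {
    // Hypothetical stand-in for JobTracker (assumption, not the Hadoop class).
    static class Tracker {
        final List<String> finalized = new ArrayList<>();

        // Heartbeat path: holds the tracker monitor, then takes the job monitor.
        synchronized void heartbeat(Job job) {
            job.updateStatus();
        }

        synchronized void finalizeJob(Job job) {
            finalized.add(job.name);
        }
    }

    // Hypothetical stand-in for JobInProgress (assumption, not the Hadoop class).
    static class Job {
        final String name;
        final Tracker tracker;
        int updates = 0;
        boolean killed = false;

        Job(String name, Tracker tracker) {
            this.name = name;
            this.tracker = tracker;
        }

        synchronized void updateStatus() {
            updates++;
        }

        // Called on its own, this takes the job monitor first and the tracker
        // monitor second, which is the opposite order to heartbeat().
        synchronized void kill() {
            killed = true;
            tracker.finalizeJob(this);
        }
    }

    // Ordered variant: take the tracker monitor first, then kill the job.
    // Java monitors are reentrant, so finalizeJob() re-enters the tracker lock.
    static void killOrdered(Tracker tracker, Job job) {
        synchronized (tracker) {
            job.kill();
        }
    }

    // Runs a heartbeat thread and a kill thread concurrently; both use the
    // tracker-then-job order, so neither can block the other forever.
    static int runDemo() throws InterruptedException {
        Tracker tracker = new Tracker();
        Job job = new Job("job_0001", tracker);
        Thread heartbeats = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                tracker.heartbeat(job);
            }
        });
        Thread killer = new Thread(() -> killOrdered(tracker, job));
        heartbeats.start();
        killer.start();
        heartbeats.join();
        killer.join();
        return job.updates;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("status updates applied: " + runDemo());
    }
}
```

The trade-off in this sketch is coarser locking: the kill path now serializes behind heartbeats on the tracker monitor, in exchange for a single global acquisition order.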

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch
>
>


[jira] Updated: (HADOOP-2001) Deadlock in jobtracker

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-2001:
--------------------------------

    Priority: Blocker  (was: Critical)

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Priority: Blocker
>             Fix For: 0.15.0
>
>


[jira] Updated: (HADOOP-2001) Deadlock in jobtracker

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-2001:
--------------------------------

    Fix Version/s:     (was: 0.15.0)
                   0.14.4

Need to merge to 0.14.4

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.14.4
>
>         Attachments: 2001.patch, 2001.patch, HADOOP-2001_2_20071006.patch
>
>


[jira] Commented: (HADOOP-2001) Deadlock in jobtracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533279 ] 

Hadoop QA commented on HADOOP-2001:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12367306/HADOOP-2001_2_20071006.patch
against trunk revision r583037.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/907/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/907/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/907/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/907/console

This message is automatically generated.

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch, 2001.patch, HADOOP-2001_2_20071006.patch
>
>
> My jobtracker deadlocked; the output from kill -QUIT is:
> Found one Java-level deadlock:
> =============================
> "IPC Server handler 2 on 10001":
>   waiting to lock monitor 0x0813724c (object 0xd5175488, a org.apache.hadoop.mapred.JobInProgress),
>   which is held by "SocketListener0-1"
> "SocketListener0-1":
>   waiting to lock monitor 0x081146d4 (object 0xd24d9c50, a org.apache.hadoop.mapred.JobTracker),
>   which is held by "IPC Server handler 2 on 10001"
> Java stack information for the threads listed above:
> ===================================================
> "IPC Server handler 2 on 10001":
>         at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:367)
>         - waiting to lock <0xd5175488> (a org.apache.hadoop.mapred.JobInProgress)
>         at org.apache.hadoop.mapred.JobTracker.updateTaskStatuses(JobTracker.java:1719)
>         at org.apache.hadoop.mapred.JobTracker.processHeartbeat(JobTracker.java:1240)
>         - locked <0xd24d9c50> (a org.apache.hadoop.mapred.JobTracker)
>         at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:1116)
>         - locked <0xd24d9c50> (a org.apache.hadoop.mapred.JobTracker)
>         at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:340)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:566)
> "SocketListener0-1":
>         at org.apache.hadoop.mapred.JobTracker.finalizeJob(JobTracker.java:907)
>         - waiting to lock <0xd24d9c50> (a org.apache.hadoop.mapred.JobTracker)
>         at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:1059)
>         - locked <0xd5175488> (a org.apache.hadoop.mapred.JobInProgress)
>         at org.apache.hadoop.mapred.JobInProgress.kill(JobInProgress.java:891)
>         - locked <0xd5175488> (a org.apache.hadoop.mapred.JobInProgress)
>         at org.apache.hadoop.mapred.jobdetails_jsp._jspService(jobdetails_jsp.java:158)
>         at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:94)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
>         at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
>         at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
>         at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
>         at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
>         at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
>         at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
>         at org.mortbay.http.HttpServer.service(HttpServer.java:954)
>         at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
>         at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
>         at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
>         at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
>         at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
>         at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
> Found 1 deadlock.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-2001) Deadlock in jobtracker

Posted by "Michael Bieniosek (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532798 ] 

Michael Bieniosek commented on HADOOP-2001:
-------------------------------------------

I could submit a quick-fix patch that removes the synchronized modifier from JobTracker.finalizeJob, but I don't know whether that would break other things or leave other deadlock paths open.

Anybody else know more about this code?
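
For readers following along, the lock-order inversion reported in the issue's thread dump can be sketched roughly as below. This is only an illustration under assumptions: `LockOrderSketch`, `TRACKER`, `JOB`, `heartbeatPath` and `killPath` are hypothetical names, the two plain objects stand in for the JobTracker and JobInProgress monitors, and the paths run one after the other so the demo cannot hang.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; these are not the real Hadoop classes.
class LockOrderSketch {
    // Stand-ins for the JobTracker and JobInProgress monitors in the dump.
    static final Object TRACKER = new Object();
    static final Object JOB = new Object();

    // Mirrors heartbeat -> updateTaskStatus: tracker lock first, then job lock.
    static List<String> heartbeatPath() {
        List<String> order = new ArrayList<String>();
        synchronized (TRACKER) {
            order.add("JobTracker");
            synchronized (JOB) {
                order.add("JobInProgress");
            }
        }
        return order;
    }

    // Mirrors kill -> garbageCollect -> finalizeJob: job lock first, then tracker lock.
    static List<String> killPath() {
        List<String> order = new ArrayList<String>();
        synchronized (JOB) {
            order.add("JobInProgress");
            synchronized (TRACKER) {
                order.add("JobTracker");
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // Run sequentially, so this demo itself cannot deadlock.
        System.out.println("heartbeat path: " + heartbeatPath());
        System.out.println("kill path:      " + killPath());
    }
}
```

If the two paths ran on separate threads, each could take its first lock and then wait forever for the other's. Dropping the inner tracker lock from the second path would break this particular cycle, but the sketch says nothing about other paths through the real code.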


> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Priority: Critical
>



[jira] Commented: (HADOOP-2001) Deadlock in jobtracker

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533719 ] 

Hudson commented on HADOOP-2001:
--------------------------------

Integrated in Hadoop-Nightly #267 (See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/267/])

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch, 2001.patch, HADOOP-2001_2_20071006.patch
>
>



[jira] Commented: (HADOOP-2001) Deadlock in jobtracker

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532825 ] 

Hadoop QA commented on HADOOP-2001:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12367179/2001.patch
against trunk revision r582443.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests -1.  The patch failed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/896/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/896/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/896/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/896/console

This message is automatically generated.

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch
>
>



[jira] Updated: (HADOOP-2001) Deadlock in jobtracker

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-2001:
----------------------------------

    Status: Open  (was: Patch Available)

bq. If you have to hold the jobtracker lock in order to hold the per-job lock, then why bother acquiring the per-job lock at all?

It's a limitation of the current code base: one does have to lock the {{JobTracker}} to manipulate a {{JobInProgress}}. See HADOOP-869 for more details.

-*-*-

The culprit code lies in two features:
a) Change job priority
b) Kill a job

IMHO a better fix, short-to-mid term, is to remove {{JobTracker#getJob(String)}} (of course, deprecating it for 0.15.0) so as to prevent similar synchronization issues in the future. Relying on every caller to lock the {{JobTracker}} before calling methods on {{JobInProgress}} is fairly brittle.

For the current issue, we could provide the above features by:
a) Introducing a synchronized, package-private {{JobTracker#setJobPriority(String, Priority)}} method, called by jobdetails.jsp. This would change the job's priority and call {{JobTracker.resortPriority}}, which could be made a synchronized method too.
b) Using {{JobTracker#killJob(String)}} rather than {{JobInProgress.kill}} from jobdetails.jsp.
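
The two proposals above could be sketched roughly as follows. This is a simplified illustration, not the attached patch: {{SketchJob}}, {{SketchJobTracker}}, the {{Priority}} enum, {{addJob}} and all method bodies are hypothetical stand-ins. The only idea taken from this comment is that the web UI goes through synchronized tracker methods, so the tracker lock is always taken before the per-job lock.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for JobInProgress; not the real Hadoop class.
class SketchJob {
    enum Priority { LOW, NORMAL, HIGH }

    private Priority priority = Priority.NORMAL;
    private boolean killed = false;

    // Per-job lock; in this sketch callers already hold the tracker lock.
    synchronized void setPriority(Priority p) { priority = p; }
    synchronized Priority getPriority() { return priority; }
    synchronized void kill() { killed = true; }
    synchronized boolean isKilled() { return killed; }
}

// Hypothetical stand-in for JobTracker; not the real Hadoop class.
class SketchJobTracker {
    private final Map<String, SketchJob> jobs = new HashMap<String, SketchJob>();

    synchronized void addJob(String jobId, SketchJob job) {
        jobs.put(jobId, job);
    }

    // Entry point for the web UI: tracker lock first, then the job lock.
    synchronized boolean setJobPriority(String jobId, SketchJob.Priority p) {
        SketchJob job = jobs.get(jobId);
        if (job == null) {
            return false;
        }
        job.setPriority(p);
        resortPriority();
        return true;
    }

    // Entry point for the web UI: same lock order as setJobPriority.
    synchronized boolean killJob(String jobId) {
        SketchJob job = jobs.get(jobId);
        if (job == null) {
            return false;
        }
        job.kill();
        return true;
    }

    // Placeholder: the real re-sorting of jobs by priority is not modelled here.
    synchronized void resortPriority() { }
}
```

With no public accessor handing out job objects, a caller such as the JSP cannot take the per-job lock on its own, which is the property the proposal relies on.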

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch, 2001.patch
>
>



[jira] Updated: (HADOOP-2001) Deadlock in jobtracker

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-2001:
----------------------------------

    Status: Patch Available  (was: Open)

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch, 2001.patch, HADOOP-2001_2_20071006.patch
>
>



[jira] Updated: (HADOOP-2001) Deadlock in jobtracker

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-2001:
----------------------------------

    Attachment: HADOOP-2001_2_20071006.patch

Here is a patch implementing what I described in my previous comment...

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch, 2001.patch, HADOOP-2001_2_20071006.patch
>
>



[jira] Updated: (HADOOP-2001) Deadlock in jobtracker

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-2001:
--------------------------------

    Status: Patch Available  (was: Open)

> Deadlock in jobtracker
> ----------------------
>
>                 Key: HADOOP-2001
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2001
>             Project: Hadoop
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Michael Bieniosek
>            Assignee: Devaraj Das
>            Priority: Blocker
>             Fix For: 0.15.0
>
>         Attachments: 2001.patch, 2001.patch
>
>
> My jobtracker deadlocked; the output from kill -QUIT is:
> Found one Java-level deadlock:
> =============================
> "IPC Server handler 2 on 10001":
>   waiting to lock monitor 0x0813724c (object 0xd5175488, a org.apache.hadoop.mapred.JobInProgress),
>   which is held by "SocketListener0-1"
> "SocketListener0-1":
>   waiting to lock monitor 0x081146d4 (object 0xd24d9c50, a org.apache.hadoop.mapred.JobTracker),
>   which is held by "IPC Server handler 2 on 10001"
>
> Java stack information for the threads listed above:
> ===================================================
> "IPC Server handler 2 on 10001":
>         at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:367)
>         - waiting to lock <0xd5175488> (a org.apache.hadoop.mapred.JobInProgress)
>         at org.apache.hadoop.mapred.JobTracker.updateTaskStatuses(JobTracker.java:1719)
>         at org.apache.hadoop.mapred.JobTracker.processHeartbeat(JobTracker.java:1240)
>         - locked <0xd24d9c50> (a org.apache.hadoop.mapred.JobTracker)
>         at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:1116)
>         - locked <0xd24d9c50> (a org.apache.hadoop.mapred.JobTracker)
>         at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:340)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:566)
> "SocketListener0-1":
>         at org.apache.hadoop.mapred.JobTracker.finalizeJob(JobTracker.java:907)
>         - waiting to lock <0xd24d9c50> (a org.apache.hadoop.mapred.JobTracker)
>         at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:1059)
>         - locked <0xd5175488> (a org.apache.hadoop.mapred.JobInProgress)
>         at org.apache.hadoop.mapred.JobInProgress.kill(JobInProgress.java:891)
>         - locked <0xd5175488> (a org.apache.hadoop.mapred.JobInProgress)
>         at org.apache.hadoop.mapred.jobdetails_jsp._jspService(jobdetails_jsp.java:158)
>         at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:94)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
>         at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
>         at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
>         at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
>         at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
>         at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
>         at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
>         at org.mortbay.http.HttpServer.service(HttpServer.java:954)
>         at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
>         at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
>         at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
>         at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
>         at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
>         at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
> Found 1 deadlock.
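
The dump above shows a lock-order inversion: the heartbeat path takes the JobTracker monitor and then wants the JobInProgress monitor, while the kill path from jobdetails.jsp takes JobInProgress first and then wants JobTracker. A common remedy for this pattern is to acquire the two monitors in the same order on every path. The sketch below illustrates only that general idea; it is not the attached 2001.patch, and Tracker, Job, and runDemo are hypothetical stand-ins rather than Hadoop classes.

```java
public class LockOrderSketch {
    // Hypothetical stand-in for the JobTracker monitor.
    static final class Tracker {
        int finalizedJobs = 0;
    }

    // Hypothetical stand-in for the JobInProgress monitor.
    static final class Job {
        private final Tracker tracker;
        int statusUpdates = 0;

        Job(Tracker tracker) {
            this.tracker = tracker;
        }

        // Heartbeat path: tracker lock first, then job lock (same order as in the dump).
        void updateStatus() {
            synchronized (tracker) {
                synchronized (this) {
                    statusUpdates++;
                }
            }
        }

        // Kill path, reordered for this sketch: tracker lock first, then job lock.
        // In the dump this path took the job lock first, which allowed the cycle.
        void kill() {
            synchronized (tracker) {
                synchronized (this) {
                    tracker.finalizedJobs++;
                }
            }
        }
    }

    // Runs both paths concurrently; returns true if both threads finish
    // and every call was counted.
    static boolean runDemo(int iterations) {
        Tracker tracker = new Tracker();
        Job job = new Job(tracker);
        Thread ipc = new Thread(() -> {
            for (int i = 0; i < iterations; i++) job.updateStatus();
        });
        Thread web = new Thread(() -> {
            for (int i = 0; i < iterations; i++) job.kill();
        });
        ipc.start();
        web.start();
        try {
            ipc.join(5000);
            web.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        if (ipc.isAlive() || web.isAlive()) {
            return false; // a thread is still blocked after the timeout
        }
        return job.statusUpdates == iterations && tracker.finalizedJobs == iterations;
    }

    public static void main(String[] args) {
        System.out.println("both paths completed: " + runDemo(10000));
    }
}
```

Because both paths take the tracker monitor before the job monitor, neither thread can hold one lock while waiting for the other in the opposite order, so the cycle reported in the dump cannot form in this sketch.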
