Posted to common-dev@hadoop.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2007/03/30 20:42:25 UTC
[jira] Commented: (HADOOP-1186) deadlock in Abstract Metrics Context
[ https://issues.apache.org/jira/browse/HADOOP-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12485604 ]
Hadoop QA commented on HADOOP-1186:
-----------------------------------
-1, because the patch command could not apply the latest attachment http://issues.apache.org/jira/secure/attachment/12354627/metrics-deadlock.patch as a patch to trunk revision http://svn.apache.org/repos/asf/lucene/hadoop/trunk/524205. Please note that this message is automatically generated and may represent a problem with the automation system and not the patch. Results are at http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch
> deadlock in Abstract Metrics Context
> ------------------------------------
>
> Key: HADOOP-1186
> URL: https://issues.apache.org/jira/browse/HADOOP-1186
> Project: Hadoop
> Issue Type: Bug
> Components: metrics
> Affects Versions: 0.12.1
> Environment: using ganglia metrics
> Reporter: Michael Bieniosek
> Priority: Critical
> Attachments: metrics-deadlock.patch
>
>
> There appears to be a lock-inversion deadlock in AbstractMetricsContext.
> When using ganglia metrics, sometimes the jobtracker will start timing out requests. The logs then reveal:
> 2007-03-30 13:59:50,942 WARN org.apache.hadoop.ipc.Server: Call queue overflow discarding oldest call heartbeat(org.apache.hadoop.mapred.TaskTrackerStatus@1c19919, false, true, 407) from 10.255.62.129:50215
> A kill -QUIT dump shows:
> "IPC Server handler 6 on 10001" daemon prio=1 tid=0x08515c08 nid=0x526a waiting for monitor entry [0x4e6f4000..0x4e6f4f40]
> at org.apache.hadoop.metrics.spi.AbstractMetricsContext.createRecord(AbstractMetricsContext.java:192)
> - waiting to lock <0x5a562c98> (a org.apache.hadoop.metrics.ganglia.GangliaContext)
> at org.apache.hadoop.mapred.JobInProgress.<init>(JobInProgress.java:130)
> at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:1384)
> - locked <0x5a446330> (a org.apache.hadoop.mapred.JobTracker)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> at java.lang.reflect.Method.invoke(Unknown Source)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:336)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:559)
> ...
> "Timer-0" prio=1 tid=0x08664040 nid=0x5274 waiting for monitor entry [0x4e36d000..0x4e36df40]
> at org.apache.hadoop.mapred.JobTracker.getRunningJobs(JobTracker.java:944)
> - waiting to lock <0x5a446330> (a org.apache.hadoop.mapred.JobTracker)
> at org.apache.hadoop.mapred.JobTracker$JobTrackerMetrics.doUpdates(JobTracker.java:429)
> at org.apache.hadoop.metrics.spi.AbstractMetricsContext.timerEvent(AbstractMetricsContext.java:275)
> - locked <0x5a562c98> (a org.apache.hadoop.metrics.ganglia.GangliaContext)
> at org.apache.hadoop.metrics.spi.AbstractMetricsContext.access$000(AbstractMetricsContext.java:48)
> at org.apache.hadoop.metrics.spi.AbstractMetricsContext$1.run(AbstractMetricsContext.java:242)
> at java.util.TimerThread.mainLoop(Unknown Source)
> at java.util.TimerThread.run(Unknown Source)
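
In the dump above, the IPC handler holds the JobTracker monitor and waits for the GangliaContext monitor, while Timer-0 holds the GangliaContext monitor and waits for the JobTracker monitor. One common way to break such a cycle is to avoid calling updaters while the context lock is held. The sketch below illustrates that idea only; it is not the content of metrics-deadlock.patch, and the classes Context, Tracker, and Updater are hypothetical stand-ins rather than Hadoop code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LockOrderSketch {
    // Hypothetical callback type, standing in for a metrics updater.
    interface Updater { void doUpdates(); }

    // Hypothetical stand-in for the metrics context.
    static class Context {
        private final List<Updater> updaters = new ArrayList<>();
        private int records = 0;

        synchronized void registerUpdater(Updater u) { updaters.add(u); }
        synchronized void createRecord() { records++; }
        synchronized int recordCount() { return records; }

        // Copy the updater list under the lock, then call the updaters
        // after releasing it, so this thread never holds the context
        // lock while waiting for the tracker lock.
        void timerEvent() {
            List<Updater> snapshot;
            synchronized (this) {
                snapshot = new ArrayList<>(updaters);
            }
            for (Updater u : snapshot) {
                u.doUpdates();
            }
        }
    }

    // Hypothetical stand-in for the job tracker.
    static class Tracker {
        private final Context context;
        private int running = 0;

        Tracker(Context context) { this.context = context; }

        // Takes the tracker lock first, then the context lock.
        synchronized void submitJob() {
            running++;
            context.createRecord();
        }

        synchronized int getRunningJobs() { return running; }
    }

    // Runs a "handler" thread and a "timer" thread concurrently and
    // reports whether both finished within the timeout.
    static boolean runOnce(int iterations) {
        Context ctx = new Context();
        Tracker tracker = new Tracker(ctx);
        ctx.registerUpdater(() -> tracker.getRunningJobs());

        CountDownLatch done = new CountDownLatch(2);
        Thread handler = new Thread(() -> {
            for (int i = 0; i < iterations; i++) tracker.submitJob();
            done.countDown();
        });
        Thread timer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) ctx.timerEvent();
            done.countDown();
        });
        handler.setDaemon(true);
        timer.setDaemon(true);
        handler.start();
        timer.start();
        try {
            return done.await(10, TimeUnit.SECONDS)
                    && ctx.recordCount() == iterations;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnce(10000) ? "completed" : "stalled");
    }
}
```

If timerEvent were instead declared synchronized and invoked the updaters while holding the context lock, the two threads would acquire the same pair of monitors in opposite orders, which is the pattern visible in the dump.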
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.