Posted to common-dev@hadoop.apache.org by "dhruba borthakur (JIRA)" <ji...@apache.org> on 2007/05/02 01:22:15 UTC

[jira] Commented: (HADOOP-1312) heartbeat monitor thread goes away

    [ https://issues.apache.org/jira/browse/HADOOP-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492980 ] 

dhruba borthakur commented on HADOOP-1312:
------------------------------------------

Stack trace from the namenode .out file:
Exception in thread "org.apache.hadoop.dfs.FSNamesystem$HeartbeatMonitor@5b9d2de4" java.util.ConcurrentModificationException
  at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
  at java.util.AbstractList$Itr.next(AbstractList.java:343)
  at org.apache.hadoop.dfs.FSNamesystem.heartbeatCheck(FSNamesystem.java:1933)
  at org.apache.hadoop.dfs.FSNamesystem$HeartbeatMonitor.run(FSNamesystem.java:1697)
  at java.lang.Thread.run(Thread.java:619)

> heartbeat monitor thread goes away
> ----------------------------------
>
>                 Key: HADOOP-1312
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1312
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>
> The heartbeat monitor thread encounters a ConcurrentModificationException while iterating over the "heartbeats" data structure. This occurred while the namenode was being restarted. There are actually two bugs here:
> 1. The HeartbeatMonitor thread needs to catch exceptions and continue, instead of exiting.
> 2. The heartbeats data structure is protected by the heartbeats lock, but the registerDatanode() method invokes removeDatanode() without acquiring that lock. This causes the ConcurrentModificationException.
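
The two fixes described in the issue can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual FSNamesystem code or the committed patch: the class HeartbeatSketch, the String node names, and the runMonitor() helper are hypothetical, and the real monitor loops until shutdown with a sleep between checks rather than for a fixed number of rounds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the relevant parts of FSNamesystem.
final class HeartbeatSketch {
    // Plays the role of the "heartbeats" structure; it is also its own lock.
    private final List<String> heartbeats = new ArrayList<>();

    // Fix 2: re-registration removes the stale entry while holding the
    // heartbeats lock, so it cannot race with an iterating monitor.
    void registerDatanode(String node) {
        synchronized (heartbeats) {
            removeDatanode(node);
            heartbeats.add(node);
        }
    }

    // Caller must hold the heartbeats lock.
    private void removeDatanode(String node) {
        heartbeats.remove(node);
    }

    // Iterates under the same lock that writers take.
    // Returns the number of entries visited.
    int heartbeatCheck() {
        synchronized (heartbeats) {
            int visited = 0;
            for (String node : heartbeats) {
                visited++;
            }
            return visited;
        }
    }

    // Fix 1: a failed check is logged and the loop continues, so one
    // exception does not end the monitor. Returns the number of failed rounds.
    // (Bounded rounds are a simplification to keep the sketch testable.)
    static int runMonitor(Runnable check, int rounds) {
        int failures = 0;
        for (int i = 0; i < rounds; i++) {
            try {
                check.run();
            } catch (RuntimeException e) {
                failures++;
                System.err.println("heartbeat check failed, continuing: " + e);
            }
        }
        return failures;
    }
}
```

With both changes in place, a check that throws once (for example a ConcurrentModificationException) costs one round instead of the whole thread, and the locking in registerDatanode() removes the race that triggered the exception in the first place.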
