Posted to common-dev@hadoop.apache.org by "Konstantin Shvachko (JIRA)" <ji...@apache.org> on 2009/04/04 01:41:14 UTC

[jira] Commented: (HADOOP-5573) TestBackupNode sometimes fails

    [ https://issues.apache.org/jira/browse/HADOOP-5573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695642#action_12695642 ] 

Konstantin Shvachko commented on HADOOP-5573:
---------------------------------------------

The first two bugs (NPE) are fixed by HADOOP-5119.
The story here is that {{testBackupRegistration()}} starts two backup nodes one after another. The first one keeps making checkpoints, while the second is still initializing. During initialization the second node creates a new {{FSNamesystem}} instance, whose constructor begins by setting the static variable {{fsNamesystemObject}} to null. Initializing the BackupNode takes time, and only at the end does it set {{fsNamesystemObject = this}}.
In the meantime the first backup node starts a checkpoint, which accesses {{FSNamesystem}} via {{fsNamesystemObject}}. Since the variable is static, it holds the value the second node assigned to it, which is null at that moment. Hence the different NPEs, depending on the timing of the checkpoint.
We should not see that again, since HADOOP-5119 eliminated {{fsNamesystemObject}}.
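
To illustrate the interleaving described above, here is a minimal single-threaded sketch. It is not the Hadoop code: {{Namesystem}}, {{current}} and the {{duringInit}} hook are hypothetical stand-ins for {{FSNamesystem}}, {{fsNamesystemObject}} and the slow initialization window, and the hook simply forces the "checkpoint" to run at the unlucky moment that the real test only hits sometimes.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class StaticFieldRace {
    /** Hypothetical stand-in for FSNamesystem; not the real Hadoop class. */
    static class Namesystem {
        // One slot shared by every instance in the JVM, like the old fsNamesystemObject.
        static Namesystem current;

        Namesystem(Runnable duringInit) {
            current = null;    // reset at the start of construction
            duringInit.run();  // stands in for the slow initialization window
            current = this;    // published only at the very end
        }
    }

    /** Returns true if a checkpoint that runs inside the second node's init window sees null. */
    static boolean checkpointSeesNull() {
        Namesystem.current = null;
        new Namesystem(() -> {});  // first backup node, fully initialized
        AtomicBoolean sawNull = new AtomicBoolean(false);
        // Second backup node: the first node's checkpoint fires while it is still initializing.
        new Namesystem(() -> sawNull.set(Namesystem.current == null));
        return sawNull.get();
    }

    public static void main(String[] args) {
        System.out.println("checkpoint saw null: " + checkpointSeesNull());
    }
}
```

In the real test the two nodes run on separate threads, so whether the checkpoint lands inside that window depends on timing, which is consistent with the failure being intermittent.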

The third error is also gone, because {{processIOError()}} was recently changed by HADOOP-4045.
But I am still looking at it. I am getting some strange asserts there.

> TestBackupNode sometimes fails
> ------------------------------
>
>                 Key: HADOOP-5573
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5573
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Tsz Wo (Nicholas), SZE
>
> TestBackupNode may fail with different reasons:
> - Unable to open edit log file .\build\test\data\dfs\name-backup1\current\edits (FSEditLog.java:open(371))
> - NullPointerException at org.apache.hadoop.hdfs.server.namenode.EditLogBackupOutputStream.flushAndSync(EditLogBackupOutputStream.java:163)
> - Fatal Error : All storage directories are inaccessible.
> Will provide more information in the comments.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.