Posted to common-dev@hadoop.apache.org by "Chris Douglas (JIRA)" <ji...@apache.org> on 2007/09/12 22:09:33 UTC

[jira] Updated: (HADOOP-1885) Race condition in MiniDFSCluster shutdown

     [ https://issues.apache.org/jira/browse/HADOOP-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-1885:
----------------------------------

    Attachment: 1885.patch

> Race condition in MiniDFSCluster shutdown
> -----------------------------------------
>
>                 Key: HADOOP-1885
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1885
>             Project: Hadoop
>          Issue Type: Bug
>          Components: test
>            Reporter: Chris Douglas
>            Assignee: Chris Douglas
>         Attachments: 1885.patch
>
>
> Hudson has been sporadically failing tests that start, or that follow tests that start, multiple datanodes in MiniDFSCluster, particularly on Solaris and Windows. The following appears to be at least partially responsible (much credit to Nigel for helping to diagnose this).
> A common error:
> {noformat}
> java.io.IOException: Cannot remove data directory: /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/dfs/data
> 	at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:126)
> 	at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:80)
> 	at org.apache.hadoop.dfs.TestFsck.testFsckNonExistent(TestFsck.java:96)
> {noformat}
> MiniDFSCluster starts multiple DataNodes by calling DataNode::createDataNode, which creates and starts a DataNode thread, assigns the instance to a static member, and returns the Runnable. Of course, each call from MiniDFSCluster overwrites this instance. Since DataNode::shutdown() calls join() on the same Thread, each subsequent join is essentially a no-op once the first DataNode finishes. When MiniDFSCluster::shutdown() returns, the remaining DataNodes may not have released their resources, so the next MiniDFSCluster may fail to start.
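
The pattern described above can be reproduced in isolation. What follows is only a minimal sketch, not the Hadoop code and not the attached patch: NodeSketch, sharedThread, shutdownBuggy, shutdownFixed and survivorsAfterShutdown are all hypothetical names, the 500 ms sleep merely stands in for a DataNode releasing its files and ports, and shutting the nodes down newest-first is an assumption made so that the first join lands on a live thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

/** Hypothetical stand-in for a DataNode; none of these names are Hadoop APIs. */
class NodeSketch implements Runnable {
    /** Buggy pattern: one static slot, overwritten by every create() call. */
    static Thread sharedThread;

    private final CountDownLatch stopSignal = new CountDownLatch(1);
    private final long cleanupMillis;
    private Thread ownThread; // fixed pattern: each node remembers its own thread

    private NodeSketch(long cleanupMillis) {
        this.cleanupMillis = cleanupMillis;
    }

    static NodeSketch create(long cleanupMillis) {
        NodeSketch node = new NodeSketch(cleanupMillis);
        Thread t = new Thread(node);
        t.setDaemon(true); // leftover threads must not keep the JVM alive
        node.ownThread = t;
        sharedThread = t;  // drops the reference to any earlier node's thread
        t.start();
        return node;
    }

    @Override
    public void run() {
        try {
            stopSignal.await();          // "serve" until asked to stop
            Thread.sleep(cleanupMillis); // simulated resource release
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Joins whichever thread was created last, not necessarily this node's. */
    void shutdownBuggy() {
        stopSignal.countDown();
        joinQuietly(sharedThread);
    }

    /** Joins this node's own thread, so it returns only after cleanup. */
    void shutdownFixed() {
        stopSignal.countDown();
        joinQuietly(ownThread);
    }

    boolean isRunning() {
        return ownThread.isAlive();
    }

    /** Starts n nodes, shuts them down newest-first, counts threads still alive. */
    static int survivorsAfterShutdown(int n, boolean fixed) {
        List<NodeSketch> nodes = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            nodes.add(create(500));
        }
        for (int i = n - 1; i >= 0; i--) {
            if (fixed) {
                nodes.get(i).shutdownFixed();
            } else {
                nodes.get(i).shutdownBuggy();
            }
        }
        int alive = 0;
        for (NodeSketch node : nodes) {
            if (node.isRunning()) {
                alive++;
            }
        }
        return alive;
    }
}
```

Calling NodeSketch.survivorsAfterShutdown(3, false) exercises the static-member variant, where only the newest thread is actually waited on and the older ones are still in their simulated cleanup when shutdown returns. Passing true joins each node's own thread instead, so nothing is left running for a subsequent cluster to trip over.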

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.