Posted to common-dev@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2008/10/28 14:06:44 UTC

[jira] Commented: (HADOOP-4532) Interrupting the namenode thread triggers System.exit()

    [ https://issues.apache.org/jira/browse/HADOOP-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12643194#action_12643194 ] 

Steve Loughran commented on HADOOP-4532:
----------------------------------------

Stack trace below. FSImage does not like to be interrupted.

[sf-startdaemon-debug] 08/10/28 12:50:22 [Thread-305] ERROR common.Storage : Cannot write file /tmp/hadoop/dfs/name
[sf-startdaemon-debug] java.nio.channels.ClosedByInterruptException
[sf-startdaemon-debug]  at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
[sf-startdaemon-debug]  at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:271)
[sf-startdaemon-debug]  at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.write(Storage.java:268)
[sf-startdaemon-debug]  at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.write(Storage.java:244)
[sf-startdaemon-debug]  at org.apache.hadoop.hdfs.server.namenode.FSImage.rollFSImage(FSImage.java:1316)
[sf-startdaemon-debug]  at org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:1034)
[sf-startdaemon-debug]  at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:88)
[sf-startdaemon-debug]  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
[sf-startdaemon-debug]  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:290)
[sf-startdaemon-debug]  at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:165)
[sf-startdaemon-debug]  at org.apache.hadoop.hdfs.server.namenode.NameNode.innerStart(NameNode.java:226)
[sf-startdaemon-debug]  at org.apache.hadoop.util.Service.start(Service.java:188)
[sf-startdaemon-debug]  at org.smartfrog.services.hadoop.components.cluster.HadoopServiceImpl.innerDeploy(HadoopServiceImpl.java:479)
[sf-startdaemon-debug]  at org.smartfrog.services.hadoop.components.cluster.HadoopServiceImpl.access$000(HadoopServiceImpl.java:46)
[sf-startdaemon-debug]  at org.smartfrog.services.hadoop.components.cluster.HadoopServiceImpl$ServiceDeployerThread.execute(HadoopServiceImpl.java:628)

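The exception at the top of this trace can be reproduced in isolation. The following is a minimal sketch, not Hadoop code: the class and method names are made up for illustration, and it uses FileChannel.write() rather than the position() call seen in the trace, on the assumption that both go through the same interruptible-channel machinery. It relies on the documented java.nio behaviour that a blocking channel operation invoked while the thread's interrupt status is already set closes the channel and throws ClosedByInterruptException.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; nothing here is taken from the Hadoop source.
class InterruptedWriteDemo {

    /**
     * Writes to a temporary file while the current thread's interrupt
     * status is set. Returns true if the write failed with
     * ClosedByInterruptException and left the channel closed.
     */
    static boolean writeWhileInterrupted() throws IOException {
        Path tmp = Files.createTempFile("interrupt-demo", ".bin");
        try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
            // Simulate someone interrupting the thread that is doing the I/O.
            Thread.currentThread().interrupt();
            try {
                channel.write(ByteBuffer.wrap(new byte[] {1, 2, 3}));
                return false; // the write unexpectedly succeeded
            } catch (ClosedByInterruptException e) {
                // The channel has been closed on our behalf.
                return !channel.isOpen();
            } finally {
                // Clear the interrupt status so the cleanup below is unaffected.
                Thread.interrupted();
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("closed by interrupt: " + writeWhileInterrupted());
    }
}
```

The point of the sketch is that the failure says nothing about the health of the storage directory: the file is writable, and the only cause of the error is the interrupt.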

> Interrupting the namenode thread triggers System.exit()
> -------------------------------------------------------
>
>                 Key: HADOOP-4532
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4532
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.20.0
>            Reporter: Steve Loughran
>            Priority: Minor
>
> My service setup/teardown tests are managing to trigger system exits in the namenode, which seems overkill.
> 1. Interrupting the thread that is starting the namesystem up raises a java.nio.channels.ClosedByInterruptException.
> 2. This is caught in FSImage.rollFSImage, and handed off to processIOError
> 3. This triggers a call to Runtime.getRuntime().exit(-1), with the message "All storage directories are inaccessible."
> Stack trace to follow. Exiting the JVM is somewhat overkill; if someone has interrupted the thread, it is (presumably) because they want to stop the namenode, which does not necessarily mean they want to kill the JVM at the same time. Certainly JUnit does not expect it.
> Some possibilities:
>  - ClosedByInterruptException gets handled differently, as some form of shutdown request.
>  - Calls to system exit are factored out into something whose behaviour can be changed by policy options, so that it throws a RuntimeException instead.
> Hosting a Namenode under a security manager that blocks System.exit() is the simplest workaround, but it means that what would have been a straight exit now gets turned into an exception, so callers may be surprised by what happens.

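The two possibilities listed in the description could be combined roughly as follows. This is only a sketch under stated assumptions: ExitPolicy, ExitException and handleStorageError are hypothetical names invented for illustration, and handleStorageError is a stand-in for the error handling in FSImage rather than a copy of it.

```java
import java.io.IOException;
import java.nio.channels.ClosedByInterruptException;

// Hypothetical sketch; none of these types exist in Hadoop.
class ExitPolicyDemo {

    /** Unchecked exception raised in place of a JVM exit. */
    static final class ExitException extends RuntimeException {
        final int status;

        ExitException(int status, String message) {
            super(message);
            this.status = status;
        }
    }

    /** Strategy deciding what a fatal condition does to the process. */
    interface ExitPolicy {
        void exit(int status, String message);
    }

    /** Mirrors the behaviour described in the issue: terminate the JVM. */
    static final ExitPolicy KILL_JVM = (status, message) -> {
        System.err.println(message);
        Runtime.getRuntime().exit(status);
    };

    /** Alternative for tests and embedded use: throw instead of exiting. */
    static final ExitPolicy THROW = (status, message) -> {
        throw new ExitException(status, message);
    };

    /**
     * Stand-in for the storage error handler. Returns true when the error
     * is interpreted as a shutdown request; otherwise defers to the policy.
     */
    static boolean handleStorageError(IOException e, ExitPolicy policy) {
        if (e instanceof ClosedByInterruptException) {
            // The thread was interrupted; the storage itself is not at fault.
            return true;
        }
        policy.exit(-1, "All storage directories are inaccessible.");
        return false; // reached only if the policy returns normally
    }
}
```

With the THROW policy a test runner such as JUnit sees an ordinary exception it can report, while a production deployment could keep KILL_JVM as the default.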