Posted to common-dev@hadoop.apache.org by "Konstantin Shvachko (JIRA)" <ji...@apache.org> on 2009/03/10 18:32:50 UTC

[jira] Commented: (HADOOP-5453) Could FSEditLog report problems more elegantly than with System.exit(-1)

    [ https://issues.apache.org/jira/browse/HADOOP-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680545#action_12680545 ] 

Konstantin Shvachko commented on HADOOP-5453:
---------------------------------------------

FSEditLog calls {{System.exit(-1)}} when there are no more edit streams to write the name-space modifications to. No streams means the name-space state is no longer persistent and may not be current when the name-node restarts.
So this is not about reporting problems but about the consistency of the system: if the system cannot persist changes, it dies.
I agree, though, that dying might not be the most elegant solution. Now that we have the "saveNamespace" command, the loss of all edit streams can be treated as simply switching to safe mode. Once the local disks are restored, the administrator can save the namespace. Alternatively, a secondary node can be started to perform an emergency checkpoint.
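
A rough sketch of the safe-mode idea, to make the proposal concrete. This is not the actual FSEditLog code: {{EditLogSketch}}, {{EditStream}}, {{SafeModeException}} and {{restore()}} are hypothetical names invented for illustration, and the real name-node logic would be considerably more involved. The sketch drops failed streams as they fail and, once none are left, flips into safe mode and throws instead of exiting the JVM.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch only: none of these types are Hadoop's real classes.
final class EditLogSketch {

    /** Hypothetical stand-in for one edit stream (e.g. one storage directory). */
    interface EditStream {
        void write(String op) throws IOException;
    }

    /** Raised instead of calling System.exit(-1) when edits cannot be persisted. */
    static final class SafeModeException extends IOException {
        SafeModeException(String msg) {
            super(msg);
        }
    }

    private final List<EditStream> streams;
    private boolean safeMode = false;

    EditLogSketch(List<EditStream> streams) {
        this.streams = new ArrayList<>(streams);
    }

    synchronized boolean isInSafeMode() {
        return safeMode;
    }

    synchronized int activeStreams() {
        return streams.size();
    }

    /** Writes the edit to every live stream, dropping streams that fail. */
    synchronized void logEdit(String op) throws SafeModeException {
        if (safeMode) {
            throw new SafeModeException("In safe mode, rejected: " + op);
        }
        Iterator<EditStream> it = streams.iterator();
        while (it.hasNext()) {
            try {
                it.next().write(op);
            } catch (IOException e) {
                it.remove(); // this stream is gone; keep going with the rest
            }
        }
        if (streams.isEmpty()) {
            // Nothing persisted this edit: stop accepting changes, but stay alive.
            safeMode = true;
            throw new SafeModeException("No edit streams left, entering safe mode: " + op);
        }
    }

    /** Models the administrator restoring storage and saving the namespace. */
    synchronized void restore(List<EditStream> fresh) {
        if (fresh.isEmpty()) {
            throw new IllegalArgumentException("need at least one edit stream");
        }
        streams.clear();
        streams.addAll(fresh);
        safeMode = false;
    }

    public static void main(String[] args) {
        EditStream broken = op -> { throw new IOException("disk failed"); };
        EditLogSketch log = new EditLogSketch(List.of(broken));
        try {
            log.logEdit("mkdir /a");
        } catch (SafeModeException e) {
            System.out.println("still running, safe mode = " + log.isInSafeMode());
        }
    }
}
```

The point of throwing rather than exiting is that the caller, whether a test or an embedding application, decides what to do next, while the safe-mode flag keeps the name-space from drifting away from what is on disk.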


> Could FSEditLog report problems more elegantly than with System.exit(-1)
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-5453
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5453
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.21.0
>            Reporter: Steve Loughran
>            Priority: Minor
>
> When FSEdit encounters problems, it prints something and then exits.
> It would be better for any in-JVM deployments of FSEdit if these problems were raised in some other way (such as throwing an exception) rather than taking down the whole JVM. Such deployments could be JUnit tests or other applications. Test runners and the like can intercept those System.exit() calls with their own SecurityManager, often turning the System.exit() operation into an exception there and then. If FSEdit did that itself, it might be easier to stay in control.
> The current approach has some benefits: it can exit regardless of which thread has encountered problems. But it is tricky to test.
