Posted to issues@ozone.apache.org by "YiSheng Lien (Jira)" <ji...@apache.org> on 2020/10/15 09:09:00 UTC

[jira] [Commented] (HDDS-4269) Ozone DataNode thinks a volume is failed if an unexpected file is in the HDDS root directory

    [ https://issues.apache.org/jira/browse/HDDS-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17214539#comment-17214539 ] 

YiSheng Lien commented on HDDS-4269:
------------------------------------

Thanks [~weichiu] for reporting the issue.
And thanks [~flirmnave] for fixing it.

> Ozone DataNode thinks a volume is failed if an unexpected file is in the HDDS root directory
> --------------------------------------------------------------------------------------------
>
>                 Key: HDDS-4269
>                 URL: https://issues.apache.org/jira/browse/HDDS-4269
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>    Affects Versions: 1.1.0
>            Reporter: Wei-Chiu Chuang
>            Assignee: Zheng Huang-Mu
>            Priority: Major
>              Labels: newbie, pull-request-available
>
> It took me some time to debug a trivial bug.
> The DataNode crashes after this mysterious error, with no explanation:
> {noformat}
> 10:11:44.382 PM	INFO	MutableVolumeSet	Moving Volume : /var/lib/hadoop-ozone/fake_datanode/data/hdds to failed Volumes
> 10:11:46.287 PM	ERROR	StateContext	Critical error occurred in StateMachine, setting shutDownMachine
> 10:11:46.287 PM	ERROR	DatanodeStateMachine	DatanodeStateMachine Shutdown due to an critical error
> {noformat}
> It turns out that if there are unexpected files under the hdds directory ($hdds.datanode.dir/hdds), the DN thinks the volume is bad and moves it to the failed volume list without logging any explanation. I was editing the VERSION file, and vim created a temp file under the directory. This is impossible to debug without reading the code.
> {code:java|title=HddsVolumeUtil#checkVolume()}
> } else if(hddsFiles.length == 2) {
>       // The files should be Version and SCM directory
>       if (scmDir.exists()) {
>         return true;
>       } else {
>         logger.error("Volume {} is in Inconsistent state, expected scm " +
>                 "directory {} does not exist", volumeRoot, scmDir
>             .getAbsolutePath());
>         return false;
>       }
>     } else {
>       // The hdds root dir should always have 2 files. One is Version file
>       // and other is SCM directory.
>       // <---- HERE! Nothing is logged before returning false.
>       return false;
>     }
> {code}
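
To illustrate the kind of diagnostic the silent branch above is missing, here is a minimal sketch that names the unexpected entries instead of returning false without a message. It is not the actual HddsVolumeUtil code and not the merged fix: the class VolumeRootCheckSketch, its methods, and the parameters versionFileName and scmDirName are all hypothetical, and a real change would go through the existing logger rather than build a string.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch only; names here are not part of Ozone. */
final class VolumeRootCheckSketch {

  private VolumeRootCheckSketch() { }

  /**
   * Returns the names of entries under hddsRoot other than the two
   * expected ones (the version file and the SCM directory).
   * Both expected names are passed in as assumptions by the caller.
   */
  static List<String> unexpectedEntries(File hddsRoot,
      String versionFileName, String scmDirName) {
    List<String> unexpected = new ArrayList<>();
    String[] names = hddsRoot.list();
    if (names == null) {
      // Not a directory, or an I/O error; left for the caller to handle.
      return unexpected;
    }
    Arrays.sort(names); // deterministic order for the message
    for (String name : names) {
      if (!name.equals(versionFileName) && !name.equals(scmDirName)) {
        unexpected.add(name);
      }
    }
    return unexpected;
  }

  /** Builds the message an operator would want to see in the log. */
  static String describe(File hddsRoot, List<String> unexpected) {
    return "Volume " + hddsRoot.getAbsolutePath()
        + " is in an inconsistent state: expected only the version file"
        + " and the SCM directory, but also found " + unexpected;
  }
}
```

With a message like this, the stray vim temp file from the report would have shown up by name in the DataNode log right next to the "Moving Volume ... to failed Volumes" line.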


