Posted to hdfs-dev@hadoop.apache.org by "Eli Collins (JIRA)" <ji...@apache.org> on 2011/04/20 02:06:05 UTC

[jira] [Created] (HDFS-1849) Respect failed.volumes.tolerated on startup

Respect failed.volumes.tolerated on startup
-------------------------------------------

                 Key: HDFS-1849
                 URL: https://issues.apache.org/jira/browse/HDFS-1849
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: data-node
            Reporter: Eli Collins
             Fix For: 0.23.0


The current failed.volumes.tolerated behavior is not user friendly: datanodes can be configured to tolerate N volume failures and still offer service, but if the cluster is restarted, any datanode with a failed volume will not start unless the failed volumes have been removed from the HDFS configuration files on the respective hosts.
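
For reference, the tolerance is set per datanode in hdfs-site.xml via dfs.datanode.failed.volumes.tolerated; the value of 2 below is only an example (the default is 0, i.e. no failed volumes are tolerated):

  <property>
    <name>dfs.datanode.failed.volumes.tolerated</name>
    <value>2</value>
  </property>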

The failed.volumes.tolerated configuration option should be respected on startup. The datanode should refuse to start up only if more than failed.volumes.tolerated volumes (HDFS-1161) have failed, or if a configured critical volume (HDFS-1848) has failed (which is probably not an issue in practice, since datanode startup will likely fail anyway if, e.g., the root volume has gone read-only).
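
A minimal sketch of the proposed startup check, assuming plain Configuration lookups; this is not the actual DataNode code, and the data-directory key and volume health check shown are illustrative only:

import java.io.File;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;

public class FailedVolumeStartupCheck {

  // Sketch of the proposed behavior: tolerate up to the configured number of
  // failed volumes at startup instead of refusing to start when any
  // configured volume is unusable.
  static List<File> selectUsableVolumes(Configuration conf) {
    // dfs.datanode.failed.volumes.tolerated is the property from HDFS-1161;
    // the default of 0 means any failed volume is fatal.
    int tolerated = conf.getInt("dfs.datanode.failed.volumes.tolerated", 0);
    // The data-directory key is illustrative; the exact key differs between
    // releases (dfs.data.dir vs. dfs.datanode.data.dir).
    String[] dataDirs = conf.getStrings("dfs.data.dir", new String[0]);

    List<File> usable = new ArrayList<File>();
    int failed = 0;
    for (String dir : dataDirs) {
      File d = new File(dir);
      // Hypothetical health check: treat a volume as failed if it is not a
      // readable, writable directory.
      if (d.isDirectory() && d.canRead() && d.canWrite()) {
        usable.add(d);
      } else {
        failed++;
      }
    }

    // Refuse to start only when the number of failed volumes exceeds the
    // configured tolerance.
    if (failed > tolerated) {
      throw new IllegalStateException("Startup aborted: " + failed
          + " volume(s) failed, " + tolerated + " tolerated");
    }
    return usable;
  }
}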

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (HDFS-1849) Respect failed.volumes.tolerated on startup

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HDFS-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins resolved HDFS-1849.
-------------------------------

    Resolution: Duplicate

This is a dupe of HDFS-1592.

> Respect failed.volumes.tolerated on startup
> -------------------------------------------
>
>                 Key: HDFS-1849
>                 URL: https://issues.apache.org/jira/browse/HDFS-1849
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Eli Collins
>             Fix For: 0.23.0
>
>
> The current failed.volumes.tolerated behavior is not user friendly: datanodes can be configured to tolerate N volume failures and still offer service, but if the cluster is restarted, any datanode with a failed volume will not start unless the failed volumes have been removed from the HDFS configuration files on the respective hosts.
> The failed.volumes.tolerated configuration option should be respected on startup. The datanode should refuse to start up only if more than failed.volumes.tolerated volumes (HDFS-1161) have failed, or if a configured critical volume (HDFS-1848) has failed (which is probably not an issue in practice, since datanode startup will likely fail anyway if, e.g., the root volume has gone read-only).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira