Posted to dev@ambari.apache.org by "Alejandro Fernandez (JIRA)" <ji...@apache.org> on 2015/09/23 00:03:04 UTC

[jira] [Created] (AMBARI-13194) Alert definition when DataNode data dirs are likely to become unmounted

Alejandro Fernandez created AMBARI-13194:
--------------------------------------------

             Summary: Alert definition when DataNode data dirs are likely to become unmounted
                 Key: AMBARI-13194
                 URL: https://issues.apache.org/jira/browse/AMBARI-13194
             Project: Ambari
          Issue Type: Bug
          Components: ambari-agent, ambari-server
    Affects Versions: 2.1.2
            Reporter: Alejandro Fernandez
            Assignee: Alejandro Fernandez
             Fix For: 2.2.0, 2.1.2


Ambari uses the HDFS property dfs.datanode.data.dir.mount.file, whose value is typically /etc/hadoop/conf/dfs_data_dir_mount.hist,
to track the mount point of each data dir.

E.g.,
{code}
/hadoop01/data,/device1
/hadoop02/data,/device2
/hadoop03/data,/     # this one is on root, the others are all on mount points.
{code}
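For illustration only, a minimal Python sketch of how a file in the format above could be read into a mapping from data dir to mount point. The function name parse_mount_history is hypothetical, and treating everything after a "#" as a comment is an assumption based on the annotated example, not a confirmed part of the file format.

```python
def parse_mount_history(text):
    """Return a dict mapping each data dir to the mount point recorded for it.

    Assumes one "data_dir,mount_point" pair per line, as in the example above.
    Stripping "#" comments is an assumption made for this sketch.
    """
    mounts = {}
    for raw_line in text.splitlines():
        # Drop any trailing comment and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        data_dir, sep, mount_point = line.partition(",")
        if not sep:
            # No comma on this line, so it is not a valid pair; skip it.
            continue
        mounts[data_dir.strip()] = mount_point.strip()
    return mounts
```

Applied to the three example lines, this would yield one entry per data dir, with /hadoop03/data mapped to "/".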

When a drive becomes unmounted, Ambari detects that the data dir was previously on a mount and will not create that data dir; HDFS can still tolerate the failure if dfs.datanode.failed.volumes.tolerated is greater than 0.
However, if the /etc/hadoop/conf/dfs_data_dir_mount.hist file is deleted, Ambari loses this knowledge and will create the data dir, even if it ends up on the root partition.
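The behavior described above can be sketched as a small decision function. This is an illustrative reading of the description, not Ambari's actual code: the name should_create_data_dir and its arguments are hypothetical, and the caller is assumed to have already determined the mount point the data dir currently resolves to.

```python
def should_create_data_dir(data_dir, history, current_mount):
    """Decide whether a data dir should be created.

    history:       dict of data dir -> mount point recorded earlier
                   (empty if the history file was deleted).
    current_mount: mount point the data dir resolves to right now.
    All names here are assumptions made for this sketch.
    """
    previous_mount = history.get(data_dir)
    if previous_mount is None:
        # No recorded history: nothing to compare against, so the
        # directory gets created, even if that is on the root partition.
        return True
    if previous_mount != "/" and current_mount == "/":
        # Was on a dedicated mount and now resolves to root:
        # the drive was probably unmounted, so do not create it.
        return False
    return True
```

With an intact history, a data dir that has fallen back to "/" is skipped; with an empty history, the same call returns True, which is the case the proposed alert is meant to surface.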

To improve tracking, create an alert definition that checks the following:
* Warning status if the /etc/hadoop/conf/dfs_data_dir_mount.hist file has been deleted
* Critical status if at least one data dir is on the root partition while at least one other data dir is on a mount
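The two checks above could be evaluated roughly as follows. This is a sketch under stated assumptions: the function name evaluate_mount_alert, the string status values, and the (status, message) return shape are all hypothetical, and the issue does not say which status wins when both conditions hold, so the sketch assumes the more severe one does.

```python
OK, WARNING, CRITICAL = "OK", "WARNING", "CRITICAL"


def evaluate_mount_alert(history_file_exists, current_mounts):
    """Return a (status, message) pair for the proposed alert.

    history_file_exists: whether the mount history file is present.
    current_mounts:      dict of data dir -> mount point it is on now.
    Names and return shape are assumptions made for this sketch.
    """
    on_root = sorted(d for d, m in current_mounts.items() if m == "/")
    on_mount = sorted(d for d, m in current_mounts.items() if m != "/")

    # Assumption: the critical condition takes precedence over the warning.
    if on_root and on_mount:
        return CRITICAL, (
            "Data dirs on the root partition while others are on mounts: "
            + ", ".join(on_root)
        )
    if not history_file_exists:
        return WARNING, "Mount history file is missing"
    return OK, "Mount history file present and data dirs are consistent"
```

A mix of root and non-root data dirs is treated as critical because it suggests some drives have dropped off; if every data dir is on root, or every one is on a mount, nothing looks unmounted.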



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)