Posted to hdfs-dev@hadoop.apache.org by "Stephen O'Donnell (Jira)" <ji...@apache.org> on 2019/09/11 17:27:00 UTC
[jira] [Created] (HDDS-2113) Update JMX metrics in SCMNodeMetrics for Decommission and Maintenance
Stephen O'Donnell created HDDS-2113:
---------------------------------------
Summary: Update JMX metrics in SCMNodeMetrics for Decommission and Maintenance
Key: HDDS-2113
URL: https://issues.apache.org/jira/browse/HDDS-2113
Project: Hadoop Distributed Data Store
Issue Type: Sub-task
Components: SCM
Affects Versions: 0.5.0
Reporter: Stephen O'Donnell
Assignee: Stephen O'Donnell
Currently the class SCMNodeMetrics exposes JMX metrics for the number of HEALTHY, STALE and DEAD nodes.
It also exposes the disk capacity of the cluster and the amount of space used and available.
We need to decide how JMX should report nodes that are entering maintenance, in maintenance, decommissioning or decommissioned.
We now have 15 states rather than the previous 3, because each node is in one of five operational states:
* IN_SERVICE
* ENTERING_MAINTENANCE
* IN_MAINTENANCE
* DECOMMISSIONING
* DECOMMISSIONED
And in each of these states, nodes can be:
* HEALTHY
* STALE
* DEAD
The simplest option would be to expose all 15 states directly in JMX, as that gives the complete picture, but I wonder whether we need any summary JMX metrics too.
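As a rough illustration of what "15 states" means, the sketch below enumerates every combination of operational state and health state. The enum values are the states listed in this mail; the class name, the method name and the OPSTATE_HEALTH naming scheme are hypothetical and are not taken from the existing SCMNodeMetrics code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: not the actual SCMNodeMetrics implementation.
class NodeStateMetricNames {

  // Operational states as listed in this mail.
  enum OpState {
    IN_SERVICE, ENTERING_MAINTENANCE, IN_MAINTENANCE,
    DECOMMISSIONING, DECOMMISSIONED
  }

  // Health states as listed in this mail.
  enum Health { HEALTHY, STALE, DEAD }

  // Builds one candidate metric name per (operational state, health) pair.
  // The "OPSTATE_HEALTH" naming scheme is an assumption for illustration.
  static List<String> metricNames() {
    List<String> names = new ArrayList<>();
    for (OpState op : OpState.values()) {
      for (Health health : Health.values()) {
        names.add(op.name() + "_" + health.name());
      }
    }
    return names;
  }
}
```

Calling NodeStateMetricNames.metricNames() yields 5 x 3 = 15 names, one per combination. Any summary metric would be a roll-up over some subset of these pairs.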
We also need to consider how to count disk capacity and usage. For example:
# Do we count capacity and usage on a DECOMMISSIONING node? There is no clear-cut answer, as a decommissioning node does not provide any capacity for writers in the cluster, but it does use capacity.
# For a DECOMMISSIONED node, we probably should not count capacity or usage.
# For an ENTERING_MAINTENANCE node, do we count capacity and usage? I suspect we should include the capacity and usage in the totals, although a node in this state will not be available for writes.
# For an IN_MAINTENANCE node that is healthy, do we count capacity and usage?
# For an IN_MAINTENANCE node that is dead, do we count capacity and usage?
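One way to keep these questions open while the discussion continues would be to make the counting rule pluggable. The sketch below is only that: every name in it (CapacityTotals, Node, EXAMPLE_POLICY) is hypothetical, and the example policy is one possible choice for illustration, not a proposed answer to the questions above.

```java
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical sketch only: none of these names exist in the SCM code.
class CapacityTotals {

  enum OpState {
    IN_SERVICE, ENTERING_MAINTENANCE, IN_MAINTENANCE,
    DECOMMISSIONING, DECOMMISSIONED
  }

  enum Health { HEALTHY, STALE, DEAD }

  // Minimal stand-in for a datanode's reported state and disk figures (bytes).
  static final class Node {
    final OpState op;
    final Health health;
    final long capacity;
    final long used;

    Node(OpState op, Health health, long capacity, long used) {
      this.op = op;
      this.health = health;
      this.capacity = capacity;
      this.used = used;
    }
  }

  // Sums capacity over the nodes that the given policy chooses to include.
  static long totalCapacity(List<Node> nodes,
      BiPredicate<OpState, Health> include) {
    long total = 0;
    for (Node n : nodes) {
      if (include.test(n.op, n.health)) {
        total += n.capacity;
      }
    }
    return total;
  }

  // Sums used space over the nodes that the given policy chooses to include.
  static long totalUsed(List<Node> nodes,
      BiPredicate<OpState, Health> include) {
    long total = 0;
    for (Node n : nodes) {
      if (include.test(n.op, n.health)) {
        total += n.used;
      }
    }
    return total;
  }

  // Illustrative policy (an assumption, not a decision): skip DECOMMISSIONED
  // nodes and DEAD nodes, and count everything else.
  static final BiPredicate<OpState, Health> EXAMPLE_POLICY =
      (op, health) -> op != OpState.DECOMMISSIONED && health != Health.DEAD;
}
```

With this shape, each of the questions above becomes a one-line change to the predicate, and capacity and usage could even use different predicates if, for example, a DECOMMISSIONING node should count towards usage but not towards capacity.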
I would welcome any thoughts on this before changing the code.
--
This message was sent by Atlassian Jira
(v8.3.2#803003)