Posted to issues@ozone.apache.org by "Stephen O'Donnell (Jira)" <ji...@apache.org> on 2020/01/09 13:16:00 UTC
[jira] [Updated] (HDDS-2113) Update JMX metrics for node count in SCMNodeMetrics for Decommission and Maintenance
[ https://issues.apache.org/jira/browse/HDDS-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephen O'Donnell updated HDDS-2113:
------------------------------------
Summary: Update JMX metrics for node count in SCMNodeMetrics for Decommission and Maintenance (was: Update JMX metrics in SCMNodeMetrics for Decommission and Maintenance)
> Update JMX metrics for node count in SCMNodeMetrics for Decommission and Maintenance
> ------------------------------------------------------------------------------------
>
> Key: HDDS-2113
> URL: https://issues.apache.org/jira/browse/HDDS-2113
> Project: Hadoop Distributed Data Store
> Issue Type: Sub-task
> Components: SCM
> Affects Versions: 0.5.0
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
>
> Currently the class SCMNodeMetrics exposes JMX metrics for the number of HEALTHY, STALE and DEAD nodes.
> It also exposes the disk capacity of the cluster and the amount of space used and available.
> We need to decide how to present these metrics in JMX when nodes are entering or in maintenance, decommissioning, or decommissioned.
> We now have 15 states rather than the previous 3, as a node can be in any of these operational states:
> * IN_SERVICE
> * ENTERING_MAINTENANCE
> * IN_MAINTENANCE
> * DECOMMISSIONING
> * DECOMMISSIONED
> And in each of these states, nodes can be:
> * HEALTHY
> * STALE
> * DEAD
> The simplest approach would be to expose these 15 states directly in JMX, as that gives the complete picture, but I wonder whether we need any summary JMX metrics too.
>
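As an illustration of the "expose all 15 states" option described above, here is a minimal Java sketch that enumerates every combination of the five operational states and three health states and derives one counter name per combination. The class name, the enum names, and the metric naming scheme (for example "InServiceHealthyNodes") are assumptions made for this sketch; they are not taken from SCMNodeMetrics or any existing Ozone code. A per-health-state total is included as one possible form of summary metric.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: not the actual SCMNodeMetrics implementation.
public class NodeStateCountSketch {

  // Operational states as listed in the issue description.
  enum OperationalState {
    IN_SERVICE, ENTERING_MAINTENANCE, IN_MAINTENANCE, DECOMMISSIONING, DECOMMISSIONED
  }

  // Health states as listed in the issue description.
  enum HealthState { HEALTHY, STALE, DEAD }

  // Converts UPPER_SNAKE_CASE to CamelCase, e.g. IN_SERVICE -> InService.
  static String camel(String upperSnake) {
    StringBuilder sb = new StringBuilder();
    for (String part : upperSnake.split("_")) {
      sb.append(part.charAt(0))
        .append(part.substring(1).toLowerCase(Locale.ROOT));
    }
    return sb.toString();
  }

  // Assumed naming scheme for one combined state, e.g. "InServiceHealthyNodes".
  static String metricName(OperationalState op, HealthState health) {
    return camel(op.name()) + camel(health.name()) + "Nodes";
  }

  // Builds one zeroed counter per (operational, health) combination.
  static Map<String, Integer> emptyCounters() {
    Map<String, Integer> counters = new LinkedHashMap<>();
    for (OperationalState op : OperationalState.values()) {
      for (HealthState health : HealthState.values()) {
        counters.put(metricName(op, health), 0);
      }
    }
    return counters;
  }

  // Possible summary metric: all nodes in a health state, across operational states.
  static int totalForHealth(Map<String, Integer> counters, HealthState health) {
    int total = 0;
    for (OperationalState op : OperationalState.values()) {
      total += counters.get(metricName(op, health));
    }
    return total;
  }

  public static void main(String[] args) {
    Map<String, Integer> counters = emptyCounters();
    counters.merge(metricName(OperationalState.IN_SERVICE, HealthState.HEALTHY), 3, Integer::sum);
    counters.merge(metricName(OperationalState.DECOMMISSIONING, HealthState.HEALTHY), 1, Integer::sum);
    System.out.println("Counters: " + counters.size());
    System.out.println("Healthy total: " + totalForHealth(counters, HealthState.HEALTHY));
  }
}
```

Keeping the 15 detailed counters and computing summaries from them means the summary can never disagree with the detailed view, at the cost of a larger JMX bean.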
> We also need to consider how to count disk capacity and usage. For example:
> # Do we count capacity and usage on a DECOMMISSIONING node? There is no clear-cut answer: a decommissioning node does not provide any capacity for writers in the cluster, but it does use capacity.
> # For a DECOMMISSIONED node, we probably should not count capacity or usage.
> # For an ENTERING_MAINTENANCE node, do we count capacity and usage? I suspect we should include the capacity and usage in the totals, although a node in this state will not be available for writes.
> # For an IN_MAINTENANCE node that is healthy, do we count capacity and usage?
> # For an IN_MAINTENANCE node that is dead, do we count capacity and usage?
> I would welcome any thoughts on this before changing the code.
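Since the capacity questions above are still open, one way to frame them is to make the counting rule an explicit, swappable policy. The following Java sketch sums capacity and usage over only those nodes that a predicate accepts. The NodeReport class, its fields, and the example policy are all hypothetical and exist only for this illustration; the policy shown (exclude DECOMMISSIONED nodes, count everything else) is one possible answer, not a decision recorded in the issue.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: not the actual SCM capacity accounting.
public class CapacityTotalsSketch {

  enum OperationalState {
    IN_SERVICE, ENTERING_MAINTENANCE, IN_MAINTENANCE, DECOMMISSIONING, DECOMMISSIONED
  }

  enum HealthState { HEALTHY, STALE, DEAD }

  // Assumed per-node summary; the field names are illustrative only.
  static final class NodeReport {
    final OperationalState op;
    final HealthState health;
    final long capacityBytes;
    final long usedBytes;

    NodeReport(OperationalState op, HealthState health, long capacityBytes, long usedBytes) {
      this.op = op;
      this.health = health;
      this.capacityBytes = capacityBytes;
      this.usedBytes = usedBytes;
    }
  }

  // Example policy (an assumption): count every node except DECOMMISSIONED ones.
  static final Predicate<NodeReport> EXCLUDE_DECOMMISSIONED =
      n -> n.op != OperationalState.DECOMMISSIONED;

  // Returns {totalCapacity, totalUsed} over the nodes the policy accepts.
  static long[] totals(List<NodeReport> nodes, Predicate<NodeReport> countNode) {
    long capacity = 0;
    long used = 0;
    for (NodeReport n : nodes) {
      if (countNode.test(n)) {
        capacity += n.capacityBytes;
        used += n.usedBytes;
      }
    }
    return new long[] {capacity, used};
  }

  public static void main(String[] args) {
    List<NodeReport> nodes = Arrays.asList(
        new NodeReport(OperationalState.IN_SERVICE, HealthState.HEALTHY, 100L, 40L),
        new NodeReport(OperationalState.DECOMMISSIONING, HealthState.HEALTHY, 100L, 70L),
        new NodeReport(OperationalState.DECOMMISSIONED, HealthState.HEALTHY, 100L, 90L));
    long[] t = totals(nodes, EXCLUDE_DECOMMISSIONED);
    System.out.println("capacity=" + t[0] + " used=" + t[1]);
  }
}
```

A stricter policy, for example one that also skips DEAD nodes in IN_MAINTENANCE, can be passed in as a different predicate without changing the summation.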