Posted to issues@ozone.apache.org by "Jie Yao (Jira)" <ji...@apache.org> on 2022/05/06 03:24:00 UTC

[jira] [Commented] (HDDS-6697) EC: ReplicationManager - create class to detect container health issues

    [ https://issues.apache.org/jira/browse/HDDS-6697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532602#comment-17532602 ] 

Jie Yao commented on HDDS-6697:
-------------------------------

Thanks [~sodonnell] for opening this Jira. I think that before this, we need to refactor RM further.

In [HDDS-6572|https://issues.apache.org/jira/browse/HDDS-6572], I extracted the move scheduler into a standalone class. I also added an inflightActionsManager to manage all the inflight actions.

After this, we can create a new Jira to extract all the command-sending functions (e.g. sendDeleteCommand, sendReplicateCommand) into a standalone class or function. It would receive a list of commands (possibly returned by the container health detection) and then fire the events to send them.
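
As a rough illustration of that split, a standalone sender might look like the sketch below. All names here (PendingCommand, CommandDispatcher, the eventSink callback) are hypothetical and are not the actual Ozone classes or event API; the sketch only shows a component that takes a list of commands and fires one event per command, with no replication logic of its own.

```java
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical stand-in for a datanode command; not a real Ozone type.
final class PendingCommand {
  final String type;           // e.g. "DELETE" or "REPLICATE"
  final String targetDatanode; // datanode that should execute the command
  final long containerId;

  PendingCommand(String type, String targetDatanode, long containerId) {
    this.type = type;
    this.targetDatanode = targetDatanode;
    this.containerId = containerId;
  }
}

// Hypothetical standalone sender: it only dispatches commands it is given.
final class CommandDispatcher {
  // Assumed callback standing in for "fire the event": (datanode, command).
  private final BiConsumer<String, PendingCommand> eventSink;

  CommandDispatcher(BiConsumer<String, PendingCommand> eventSink) {
    this.eventSink = eventSink;
  }

  // Fires one event per command, in list order, and returns how many were sent.
  int sendAll(List<PendingCommand> commands) {
    int sent = 0;
    for (PendingCommand command : commands) {
      eventSink.accept(command.targetDatanode, command);
      sent++;
    }
    return sent;
  }
}
```

Because the sink is injected, a unit test can pass a lambda that records the commands instead of sending them.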

Also, I have a doc on detecting EC container health, and have uploaded it.

> EC: ReplicationManager - create class to detect container health issues
> -----------------------------------------------------------------------
>
>                 Key: HDDS-6697
>                 URL: https://issues.apache.org/jira/browse/HDDS-6697
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>
> Define an interface to allow a single container's health to be checked. The check health method should receive as parameters everything it needs to check the container health (e.g. the ContainerInfo, ContainerReplica list, ... ?) and return a status indicating the health of the container, e.g. HEALTHY, UNDER_REPLICATED, OVER_REPLICATED ...
> The status object could also contain some commands that need to be sent to the command queue, e.g. to close the container, delete a replica, force close, etc.
> The idea here is to create a standalone health check class for EC with few dependencies so it can be tested in isolation easily.
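
The interface described in the quoted issue could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the design that was adopted: ContainerSummary and ReplicaSummary are hypothetical, simplified stand-ins for ContainerInfo and ContainerReplica, the commands are plain strings, and the check only counts replica indexes (it assumes 1-based indexes and ignores replica states, pending operations, and so on).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

enum HealthState { HEALTHY, UNDER_REPLICATED, OVER_REPLICATED }

// Hypothetical stand-in for ContainerInfo: an EC layout of data + parity indexes.
final class ContainerSummary {
  final long containerId;
  final int dataNum;
  final int parityNum;

  ContainerSummary(long containerId, int dataNum, int parityNum) {
    this.containerId = containerId;
    this.dataNum = dataNum;
    this.parityNum = parityNum;
  }
}

// Hypothetical stand-in for ContainerReplica; replicaIndex is assumed 1-based.
final class ReplicaSummary {
  final String datanode;
  final int replicaIndex;

  ReplicaSummary(String datanode, int replicaIndex) {
    this.datanode = datanode;
    this.replicaIndex = replicaIndex;
  }
}

// Status object: the health state plus any commands the caller should queue.
final class HealthResult {
  final HealthState state;
  final List<String> commands;

  HealthResult(HealthState state, List<String> commands) {
    this.state = state;
    this.commands = Collections.unmodifiableList(new ArrayList<>(commands));
  }
}

interface ContainerHealthCheck {
  HealthResult checkHealth(ContainerSummary container, List<ReplicaSummary> replicas);
}

final class EcContainerHealthCheck implements ContainerHealthCheck {
  @Override
  public HealthResult checkHealth(ContainerSummary container, List<ReplicaSummary> replicas) {
    int expected = container.dataNum + container.parityNum;
    int[] copies = new int[expected + 1]; // copies[i] = replicas seen for index i
    for (ReplicaSummary replica : replicas) {
      if (replica.replicaIndex >= 1 && replica.replicaIndex <= expected) {
        copies[replica.replicaIndex]++;
      }
    }
    List<String> commands = new ArrayList<>();
    boolean missing = false;
    boolean extra = false;
    for (int index = 1; index <= expected; index++) {
      if (copies[index] == 0) {
        missing = true;
        commands.add("RECONSTRUCT index " + index);
      } else if (copies[index] > 1) {
        extra = true;
        commands.add("DELETE extra copy of index " + index);
      }
    }
    // A missing index takes priority over an extra copy when both occur.
    HealthState state = missing ? HealthState.UNDER_REPLICATED
        : extra ? HealthState.OVER_REPLICATED : HealthState.HEALTHY;
    return new HealthResult(state, commands);
  }
}
```

Since the check takes everything it needs as parameters and returns a plain result object, it can be exercised in a unit test without a running ReplicationManager, which is the isolation the issue asks for.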



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org