Posted to issues@ozone.apache.org by "Jie Yao (Jira)" <ji...@apache.org> on 2022/05/06 03:29:00 UTC

[jira] [Comment Edited] (HDDS-6697) EC: ReplicationManager - create class to detect container health issues

    [ https://issues.apache.org/jira/browse/HDDS-6697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17532602#comment-17532602 ] 

Jie Yao edited comment on HDDS-6697 at 5/6/22 3:28 AM:
-------------------------------------------------------

Thanks [~sodonnell] for opening this Jira. I think we need to refactor RM further before doing this.

In HDDS-6572, I extracted the move scheduler into a standalone class. I also added an inflightActionsManager to manage all the inflight actions.

After that, we can create a new Jira to extract all the command-sending functions (e.g. sendDeleteCommand, sendReplicateCommand) into a standalone class or function. It will receive a list of commands (possibly returned by the container health detection) and then fire the events to send them.
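
To make the proposal above concrete, here is a minimal sketch of a standalone command sender. All names in it (DatanodeCommandSketch, CommandSenderSketch, sendAll) are hypothetical and are not taken from the Ozone code base; the Consumer stands in for whatever event publisher RM would actually use.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical command type; a stand-in for the real delete/replicate commands.
final class DatanodeCommandSketch {
    final String type;        // e.g. "DELETE" or "REPLICATE"
    final long containerId;
    final String datanode;

    DatanodeCommandSketch(String type, long containerId, String datanode) {
        this.type = type;
        this.containerId = containerId;
        this.datanode = datanode;
    }
}

// Hypothetical sender: receives a list of commands and fires one event per command.
final class CommandSenderSketch {
    // Assumption: the event publisher is modelled as a plain Consumer.
    private final Consumer<DatanodeCommandSketch> eventPublisher;

    CommandSenderSketch(Consumer<DatanodeCommandSketch> eventPublisher) {
        this.eventPublisher = eventPublisher;
    }

    // Fires every command in order and returns how many were fired.
    int sendAll(List<DatanodeCommandSketch> commands) {
        int sent = 0;
        for (DatanodeCommandSketch command : commands) {
            eventPublisher.accept(command);
            sent++;
        }
        return sent;
    }
}
```

Because the sender depends only on the publisher it is given, a test can pass in a list's add method and inspect which commands were fired.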

Also, I have written a doc on detecting EC container health and have uploaded it. Please take a look.


was (Author: jacksonyao):
Thanks [~sodonnell] for opening this Jira. I think we need to refactor RM further before doing this.

In [HDDS-6572|https://issues.apache.org/jira/browse/HDDS-6572], I extracted the move scheduler into a standalone class. I also added an inflightActionsManager to manage all the inflight actions.

After that, we can create a new Jira to extract all the command-sending functions (e.g. sendDeleteCommand, sendReplicateCommand) into a standalone class or function. It will receive a list of commands (possibly returned by the container health detection) and then fire the events to send them.

Also, I have written a doc on detecting EC container health and have uploaded it.

> EC: ReplicationManager - create class to detect container health issues
> -----------------------------------------------------------------------
>
>                 Key: HDDS-6697
>                 URL: https://issues.apache.org/jira/browse/HDDS-6697
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>         Attachments: EC Container Group Health Check.pdf
>
>
> Define an interface to allow a single container's health to be checked. The check health method should receive as parameters everything it needs to check the container health (e.g. the ContainerInfo, ContainerReplica list, ... ?) and return a status indicating the health of the container, e.g. HEALTHY, UNDER_REPLICATED, OVER_REPLICATED ...
> The status object could also contain some commands that need to be sent to the command queue, e.g. to close the container, delete a replica, force close etc.
> The idea here is to create a standalone health check class for EC with few dependencies so it can be tested in isolation easily.
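
One possible shape for the interface described above is sketched here, under heavy simplifying assumptions. The type names (HealthStateSketch, HealthResultSketch, HealthCheckSketch, SimpleEcHealthCheckSketch) are hypothetical, the ContainerInfo and ContainerReplica list are reduced to an expected replica count and a list of replica indexes, and commands are plain strings. The rule used (a missing index means under-replicated, a duplicated index means over-replicated) is only an illustration, not Ozone's actual EC logic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// States named in the issue description; the real set may be larger.
enum HealthStateSketch { HEALTHY, UNDER_REPLICATED, OVER_REPLICATED }

// Hypothetical status object: a state plus commands the caller should queue.
final class HealthResultSketch {
    final HealthStateSketch state;
    final List<String> commands;

    HealthResultSketch(HealthStateSketch state, List<String> commands) {
        this.state = state;
        this.commands = Collections.unmodifiableList(new ArrayList<>(commands));
    }
}

// Hypothetical interface: everything the check needs arrives as parameters.
interface HealthCheckSketch {
    HealthResultSketch checkHealth(int expectedReplicaCount, List<Integer> replicaIndexes);
}

// Illustrative EC-style check with no external dependencies.
final class SimpleEcHealthCheckSketch implements HealthCheckSketch {
    @Override
    public HealthResultSketch checkHealth(int expectedReplicaCount, List<Integer> replicaIndexes) {
        // Count how many replicas report each index in 1..expectedReplicaCount.
        int[] seen = new int[expectedReplicaCount + 1];
        for (int index : replicaIndexes) {
            if (index >= 1 && index <= expectedReplicaCount) {
                seen[index]++;
            }
        }
        List<String> deletes = new ArrayList<>();
        for (int i = 1; i <= expectedReplicaCount; i++) {
            if (seen[i] == 0) {
                // Any missing index takes precedence over duplicates.
                return new HealthResultSketch(
                    HealthStateSketch.UNDER_REPLICATED, Collections.<String>emptyList());
            }
            // One delete command per surplus copy of this index.
            for (int extra = 1; extra < seen[i]; extra++) {
                deletes.add("DELETE_REPLICA index=" + i);
            }
        }
        if (!deletes.isEmpty()) {
            return new HealthResultSketch(HealthStateSketch.OVER_REPLICATED, deletes);
        }
        return new HealthResultSketch(HealthStateSketch.HEALTHY, deletes);
    }
}
```

Since the check takes only plain values and returns a plain result, it can be unit tested without a running SCM, which is the isolation the description asks for.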



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org