Posted to issues@ozone.apache.org by "Siyao Meng (Jira)" <ji...@apache.org> on 2022/09/21 22:18:00 UTC

[jira] [Commented] (HDDS-7098) Provide a way for admin to identify all unhealthy container replicas

    [ https://issues.apache.org/jira/browse/HDDS-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17607988#comment-17607988 ] 

Siyao Meng commented on HDDS-7098:
----------------------------------

> Recon's API and UI do not expose replica information either.

Right now Recon only exposes missing container info via {{/api/v1/containers/missing}}, which is displayed on the {{/#/MissingContainers}} page:

 !MissingContainers.png! 
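
For anyone who wants to check the endpoint without the UI, here is a minimal sketch that issues a GET against {{/api/v1/containers/missing}} and prints the raw response body. The class name, method names and the example address {{http://localhost:9888}} are assumptions made for illustration (use whatever address your Recon HTTP server listens on), and the sketch deliberately does not parse the JSON, since the response schema is not shown in this thread.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper, not part of Ozone or Recon.
class ReconMissingContainersSketch {

  // Endpoint path quoted in the comment above.
  static final String MISSING_PATH = "/api/v1/containers/missing";

  /** Joins a Recon base address and the endpoint path, tolerating one trailing slash. */
  static URI missingContainersUri(String reconBase) {
    String base = reconBase.endsWith("/")
        ? reconBase.substring(0, reconBase.length() - 1)
        : reconBase;
    return URI.create(base + MISSING_PATH);
  }

  /** Sends the GET request and returns the unparsed response body. */
  static String fetchMissingContainers(String reconBase)
      throws IOException, InterruptedException {
    HttpClient client = HttpClient.newBuilder()
        .connectTimeout(Duration.ofSeconds(5))
        .build();
    HttpRequest request = HttpRequest.newBuilder(missingContainersUri(reconBase))
        .header("Accept", "application/json")
        .GET()
        .build();
    HttpResponse<String> response =
        client.send(request, HttpResponse.BodyHandlers.ofString());
    if (response.statusCode() != 200) {
      throw new IOException("Unexpected HTTP status " + response.statusCode());
    }
    return response.body();
  }

  public static void main(String[] args) throws Exception {
    // Assumption: the Recon address is passed as the first argument,
    // for example http://localhost:9888
    if (args.length == 0) {
      System.out.println("usage: ReconMissingContainersSketch <recon-base-url>");
      return;
    }
    System.out.println(fetchMissingContainers(args[0]));
  }
}
```

Run it with the Recon address as the only argument. A similar call could be pointed at endpoints for the other unhealthy states once they exist.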

But _MISSING_ is only one of the five unhealthy container states that Recon is aware of:

{code:java|title=https://github.com/apache/ozone/blob/e86119f2d029ec0f7e6042be364079077cd1c88f/hadoop-ozone/recon-codegen/src/main/java/org/hadoop/ozone/recon/schema/ContainerSchemaDefinition.java#L44-L54}
  /**
   * ENUM describing the allowed container states which can be stored in the
   * unhealthy containers table.
   */
  public enum UnHealthyContainerStates {
    MISSING,
    UNDER_REPLICATED,
    OVER_REPLICATED,
    MIS_REPLICATED,
    ALL_REPLICAS_UNHEALTHY
  }
{code}

Of the remaining four unhealthy container states, MIS_REPLICATED, UNDER_REPLICATED and OVER_REPLICATED are also [tracked|https://github.com/apache/ozone/blob/a02c6df497b2a636c7a81c26e53f39daa7958841/hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/fsck/ContainerHealthTask.java#L264-L275] by Recon (done in HDDS-3082, thanks [~sodonnell]). However, none of those three states is exposed through the API or UI yet. We need to build new tabs or pages in the Recon UI to display them, similar to the missing container page above.
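
To make the gap concrete, the sketch below restates the situation as set arithmetic: the states Recon tracks, minus the states it exposes, gives the states that still need an API and UI. The enum is copied from the definition quoted above. The {{TRACKED}} and {{EXPOSED}} sets and the class name are hypothetical and only summarize this comment; they are not Recon code. ALL_REPLICAS_UNHEALTHY is left out of {{TRACKED}} because this comment does not list it as tracked.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical illustration; only the enum mirrors the Recon schema definition.
class UnexposedStatesSketch {

  enum UnHealthyContainerStates {
    MISSING,
    UNDER_REPLICATED,
    OVER_REPLICATED,
    MIS_REPLICATED,
    ALL_REPLICAS_UNHEALTHY
  }

  // Assumption: states this comment describes as tracked by Recon.
  static final Set<UnHealthyContainerStates> TRACKED = EnumSet.of(
      UnHealthyContainerStates.MISSING,
      UnHealthyContainerStates.UNDER_REPLICATED,
      UnHealthyContainerStates.OVER_REPLICATED,
      UnHealthyContainerStates.MIS_REPLICATED);

  // Assumption: states this comment describes as exposed via the API/UI.
  static final Set<UnHealthyContainerStates> EXPOSED =
      EnumSet.of(UnHealthyContainerStates.MISSING);

  /** Returns the tracked states that have no API/UI exposure yet. */
  static Set<UnHealthyContainerStates> trackedButNotExposed() {
    EnumSet<UnHealthyContainerStates> result = EnumSet.copyOf(TRACKED);
    result.removeAll(EXPOSED);
    return result;
  }

  public static void main(String[] args) {
    // Prints [UNDER_REPLICATED, OVER_REPLICATED, MIS_REPLICATED]
    System.out.println(trackedButNotExposed());
  }
}
```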

cc [~zitadombi]

> Provide a way for admin to identify all unhealthy container replicas
> --------------------------------------------------------------------
>
>                 Key: HDDS-7098
>                 URL: https://issues.apache.org/jira/browse/HDDS-7098
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Ethan Rose
>            Priority: Major
>         Attachments: MissingContainers.png
>
>
> Currently, UNHEALTHY is a state that a container replica can be in (ContainerReplicaProto#State), but not a state that the container can be in overall (LifeCycleState). This means {{ozone admin container list}} has no info about unhealthy containers, because it currently does not print replica information. [Recon's API|https://ozone.apache.org/docs/current/interface/reconapi.html] and UI do not expose replica information either. The only way to find unhealthy containers is to run {{ozone admin container info <ID>}} for a container that is already suspected to have unhealthy replicas. This Jira aims to provide a way to identify and filter container replica states through Recon's UI, Recon's REST API, or the client CLI.


