Posted to issues@ozone.apache.org by "Siddhant Sangwan (Jira)" <ji...@apache.org> on 2023/09/26 19:11:00 UTC

[jira] [Created] (HDDS-9354) LegacyReplicationManager: Unhealthy replicas of a sufficiently replicated container can block decommissioning

Siddhant Sangwan created HDDS-9354:
--------------------------------------

             Summary: LegacyReplicationManager: Unhealthy replicas of a sufficiently replicated container can block decommissioning
                 Key: HDDS-9354
                 URL: https://issues.apache.org/jira/browse/HDDS-9354
             Project: Apache Ozone
          Issue Type: Sub-task
          Components: SCM
            Reporter: Siddhant Sangwan
            Assignee: Siddhant Sangwan


A container with a mix of QUASI_CLOSED and UNHEALTHY replicas blocks decommissioning even when the container is sufficiently replicated.
a. This happens when only some of the replicas hit an error during a write.
b. It can be fixed by removing this check:
{code}
if (!replicaSet.isHealthy()) {
  if (LOG.isDebugEnabled()) {
    unhealthyIDs.add(cid);
  }
  if (unhealthy < CONTAINER_DETAILS_LOGGING_LIMIT
{code}

However, simply removing that check is not a complete solution. We also need to try to preserve any UNHEALTHY replicas that have the greatest sequence ID. HDDS-9321 (https://issues.apache.org/jira/browse/HDDS-9321) handles preserving such UNHEALTHY replicas on the Legacy Replication Manager side. This Jira focuses on the decommissioning side.
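
To make the intended behaviour concrete, here is a minimal sketch of one possible decommissioning-side check. It is not the actual Ozone code: the class DecommissionCheckSketch, the Replica type, the method blocksDecommission, and the exact rule it applies are all hypothetical and only illustrate the idea. The assumption is that a container should block decommissioning only if too few non-UNHEALTHY replicas remain on other nodes, or if the departing node holds an UNHEALTHY replica whose sequence ID is greater than that of every replica elsewhere.

```java
import java.util.List;

// Hypothetical, simplified model for illustration only.
// These are NOT the real Ozone classes or the real fix.
final class DecommissionCheckSketch {

  enum State { CLOSED, QUASI_CLOSED, UNHEALTHY }

  static final class Replica {
    final State state;
    final long sequenceId;
    final boolean onDecommissioningNode;

    Replica(State state, long sequenceId, boolean onDecommissioningNode) {
      this.state = state;
      this.sequenceId = sequenceId;
      this.onDecommissioningNode = onDecommissioningNode;
    }
  }

  /**
   * Returns true if this container should keep the node from finishing
   * decommissioning, under the assumptions stated in the lead-in.
   */
  static boolean blocksDecommission(List<Replica> replicas, int requiredReplicas) {
    long usableElsewhere = 0;
    long maxSeqElsewhere = Long.MIN_VALUE;

    // Look only at replicas that will remain after the node is gone.
    for (Replica r : replicas) {
      if (r.onDecommissioningNode) {
        continue;
      }
      maxSeqElsewhere = Math.max(maxSeqElsewhere, r.sequenceId);
      if (r.state != State.UNHEALTHY) {
        usableElsewhere++;
      }
    }

    // Not enough non-UNHEALTHY replicas on other nodes: must wait.
    if (usableElsewhere < requiredReplicas) {
      return true;
    }

    // Sufficiently replicated, but an UNHEALTHY replica on the departing
    // node carries the greatest sequence ID, so it should be preserved first.
    for (Replica r : replicas) {
      if (r.onDecommissioningNode
          && r.state == State.UNHEALTHY
          && r.sequenceId > maxSeqElsewhere) {
        return true;
      }
    }

    // UNHEALTHY replicas elsewhere do not, by themselves, block decommissioning.
    return false;
  }
}
```

With this rule, a container that has three QUASI_CLOSED replicas plus one UNHEALTHY replica on other nodes no longer blocks decommissioning, while a departing UNHEALTHY replica with the greatest sequence ID still does.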



