Posted to issues@ozone.apache.org by "Stephen O'Donnell (Jira)" <ji...@apache.org> on 2022/07/01 09:28:00 UTC

[jira] [Created] (HDDS-6975) EC: Define the value of Maintenance Redundancy for EC containers

Stephen O'Donnell created HDDS-6975:
---------------------------------------

             Summary: EC: Define the value of Maintenance Redundancy for EC containers
                 Key: HDDS-6975
                 URL: https://issues.apache.org/jira/browse/HDDS-6975
             Project: Apache Ozone
          Issue Type: Sub-task
          Components: SCM
            Reporter: Stephen O'Donnell


For Ratis containers, we have a setting hdds.scm.replication.maintenance.replica.minimum, which defaults to 2. It specifies how many replicas must remain online for a node to be allowed to enter maintenance.

For EC, we need to decide whether to reuse this setting and, if so, what it means.

For example, for Ratis containers, with 3 replicas and the default of 2, you should be able to take any single node offline without any replication.

With EC, if you had 3-2 containers, and must have a remaining redundancy of 2, then you must replicate if any node goes offline.

However, the other EC schemes, 6-3 and 10-4, are at least as good as Ratis with the default of 2.
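
To make the comparison concrete, here is a minimal sketch of the arithmetic. It assumes the setting is read as "replicas that must stay online" for Ratis and as "remaining redundancy" for EC, which is only one of the interpretations discussed in this issue. The function names are hypothetical and are not part of Ozone.

```python
def ratis_nodes_allowed_offline(replica_count: int, min_online_replicas: int) -> int:
    """Replicas that may be offline before replication is needed (Ratis)."""
    return max(replica_count - min_online_replicas, 0)


def ec_nodes_allowed_offline(parity_num: int, min_remaining_redundancy: int) -> int:
    """Replicas that may be offline while the required redundancy remains (EC).

    Assumption: redundancy is counted as the number of further replica
    losses the container can still tolerate, i.e. parity minus offline.
    """
    return max(parity_num - min_remaining_redundancy, 0)


# Ratis with 3 replicas and the default minimum of 2.
RATIS_THREE = ratis_nodes_allowed_offline(3, 2)

# EC schemes as (dataNum, parityNum), with a required remaining redundancy of 2.
EC_SCHEMES = {"3-2": (3, 2), "6-3": (6, 3), "10-4": (10, 4)}
EC_OFFLINE = {
    name: ec_nodes_allowed_offline(parity, 2)
    for name, (_data, parity) in EC_SCHEMES.items()
}

print("Ratis x3:", RATIS_THREE)
for name, allowed in EC_OFFLINE.items():
    print("EC", name, allowed)
```

Under these assumptions, 3-2 allows no replica to be offline without replication, while 6-3 and 10-4 allow at least as many as Ratis.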

A better setting for EC might be parityNum / 2, using integer division (rounding down), which gives the following remaining redundancy:

3-2 = 1
6-3 = 1
10-4 = 2
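
The parityNum / 2 values can be checked with a short sketch. The function name is hypothetical, and the "offline allowed" column is my own derivation (parity minus remaining redundancy), not something Ozone defines.

```python
def half_parity_redundancy(parity_num: int) -> int:
    """Remaining redundancy under the parityNum / 2 proposal (integer division)."""
    return parity_num // 2


# EC schemes as (dataNum, parityNum).
SCHEMES = [(3, 2), (6, 3), (10, 4)]

for data_num, parity_num in SCHEMES:
    remaining = half_parity_redundancy(parity_num)
    # Assumption: replicas that may be offline = parity - remaining redundancy.
    offline_allowed = parity_num - remaining
    print(f"{data_num}-{parity_num}: remaining redundancy {remaining}, "
          f"offline allowed {offline_allowed}")
```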

Or perhaps parityNum - X, where X is the number of replicas allowed to be offline in an EC group.
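
A sketch of the parityNum - X alternative, assuming X is a single cluster-wide value that must not exceed the parity count of the scheme. The function name and the validation rule are hypothetical.

```python
def redundancy_with_offline_budget(parity_num: int, max_offline: int) -> int:
    """Remaining redundancy when up to max_offline (X) replicas may be offline."""
    if not 0 <= max_offline <= parity_num:
        # Assumption: an X larger than parityNum is treated as a config error,
        # since the group could not be reconstructed with that many offline.
        raise ValueError(
            f"max_offline={max_offline} must be between 0 and parity_num={parity_num}"
        )
    return parity_num - max_offline


# With X = 1, each scheme keeps all but one unit of its parity as redundancy.
for data_num, parity_num in [(3, 2), (6, 3), (10, 4)]:
    print(f"{data_num}-{parity_num}:", redundancy_with_offline_budget(parity_num, 1))
```

Unlike parityNum / 2, this keeps the number of offline replicas fixed across schemes and lets the remaining redundancy grow with the parity count.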

It's also worth noting that for EC, maintenance mode makes reads much more expensive. Potentially all reads will turn into online reconstruction. For Ratis, it just reduces the number of nodes available to read from.

With that in mind, another option for EC is to keep all data containers online during maintenance, with only parity + redundancy allowed to be offline. I feel that would be a trickier feature, and something we may consider in the future.


