Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2021/11/29 18:59:57 UTC

[GitHub] [ozone] avijayanhwx opened a new pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

avijayanhwx opened a new pull request #2874:
URL: https://github.com/apache/ozone/pull/2874


   ## What changes were proposed in this pull request?
   
   The ContainerHealthTask in Recon currently classifies a container whose 3 replicas are all UNHEALTHY as 'MISSING'. Although having all 3 replicas UNHEALTHY is also potentially bad, Recon should be able to distinguish it from a container that has no replicas at all.
   
   The all-UNHEALTHY case can instead be classified as UNDER_REPLICATED, with the metadata in the Recon DB carrying the actual vs. desired replica count for further analysis.
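
A minimal sketch of the distinction described above, assuming the "actual" replica count means the number of healthy replicas. The class, enum, and method names are hypothetical and are not taken from Recon's ContainerHealthTask.

```java
import java.util.List;

// Hypothetical illustration only; not Recon's ContainerHealthTask.
final class ContainerHealthSketch {
  enum ReplicaState { HEALTHY, UNHEALTHY }

  enum Health { HEALTHY, MISSING, UNDER_REPLICATED, OVER_REPLICATED }

  // Classifies a container from the replica states reported for it.
  // Assumption: only HEALTHY replicas count toward the actual replica count.
  static Health classify(List<ReplicaState> replicas, int desiredCount) {
    if (replicas.isEmpty()) {
      // No datanode reported any replica at all.
      return Health.MISSING;
    }
    long healthy = replicas.stream()
        .filter(r -> r == ReplicaState.HEALTHY)
        .count();
    if (healthy < desiredCount) {
      // Includes the case where every reported replica is UNHEALTHY
      // (healthy == 0), which is no longer reported as MISSING.
      return Health.UNDER_REPLICATED;
    }
    if (healthy > desiredCount) {
      return Health.OVER_REPLICATED;
    }
    return Health.HEALTHY;
  }
}
```

With this shape, a container with three UNHEALTHY replicas and a desired count of 3 comes back as UNDER_REPLICATED with an actual count of 0, while a container with no reported replicas stays MISSING.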
   
   This patch also adds support for tracking the bcsId reported by datanodes to Recon. This information is kept in the container cache and flushed to disk only periodically.
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-5965
   https://issues.apache.org/jira/browse/HDDS-6050
   
   ## How was this patch tested?
   Manually tested.
   - [ ] Unit tests pending.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
smengcl commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r762208410



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ContainerReplicaHistory.java
##########
@@ -37,11 +39,22 @@
   // Last reported time of the replica
   private Long lastSeenTime;
 
+  public void setBcsId(long bcsId) {
+    this.bcsId = bcsId;
+  }
+

Review comment:
       nit: ordering. move the setter next to its corresponding getter below.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/persistence/ContainerHealthSchemaManager.java
##########
@@ -69,8 +71,14 @@ public ContainerHealthSchemaManager(
     SelectQuery<Record> query = dslContext.selectQuery();
     query.addFrom(UNHEALTHY_CONTAINERS);
     if (state != null) {
-      query.addConditions(
-          UNHEALTHY_CONTAINERS.CONTAINER_STATE.eq(state.toString()));
+      if (state.equals(ALL_REPLICAS_UNHEALTHY)) {
+        query.addConditions(UNHEALTHY_CONTAINERS.CONTAINER_STATE
+            .eq(UNDER_REPLICATED.toString()));
+        query.addConditions(UNHEALTHY_CONTAINERS.ACTUAL_REPLICA_COUNT.eq(0));

Review comment:
       Q: In which other cases are `ACTUAL_REPLICA_COUNT` set? Apart from this `ALL_REPLICAS_UNHEALTHY` case.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/spi/impl/ReconDBDefinition.java
##########
@@ -74,6 +74,16 @@ public ReconDBDefinition(String dbName) {
           NSSummary.class,
           new NSSummaryCodec());
 
+  // Container Replica History with bcsId tracking.
+  public static final DBColumnFamilyDefinition
+      <Long, ContainerReplicaHistoryList> REPLICA_HISTORY_V2 =
+      new DBColumnFamilyDefinition<Long, ContainerReplicaHistoryList>(
+          "replica_history_v2",

Review comment:
       So in a future Recon upgrade process we should probably drop the `REPLICA_HISTORY` (V1) table (or upgrade its layout), right? That would avoid wasted storage space.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/persistence/ContainerHistory.java
##########
@@ -30,14 +30,21 @@
   private String datanodeHost;
   private long firstSeenTime;
   private long lastSeenTime;
+  private long bcsId;

Review comment:
       Note to myself: `bcsId` stands for `blockCommitSequenceId`






[GitHub] [ozone] avijayanhwx commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r759631843



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/codec/ContainerReplicaHistoryListCodec.java
##########
@@ -39,7 +39,7 @@
     implements Codec<ContainerReplicaHistoryList> {
 
   // UUID takes 2 long to store. Each timestamp takes 1 long to store.
-  static final int SIZE_PER_ENTRY = 4 * Long.BYTES;
+  static final int SIZE_PER_ENTRY = 5 * Long.BYTES;

Review comment:
       I am planning to do it in a follow-up patch in this same PR.






[GitHub] [ozone] avijayanhwx commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r759584953



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/spi/impl/ReconDBDefinition.java
##########
@@ -74,6 +74,16 @@ public ReconDBDefinition(String dbName) {
           NSSummary.class,
           new NSSummaryCodec());
 
+  // Container Replica History with bcsId tracking.
+  public static final DBColumnFamilyDefinition
+      <Long, ContainerReplicaHistoryList> REPLICA_HISTORY_V2 =

Review comment:
       ContainerReplicaHistory is not must-have data, so the plan is to skip migrating data from the old column family to the new one. If there is a need for that, we can write a one-time migration on startup.






[GitHub] [ozone] avijayanhwx commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r761371507



##########
File path: hadoop-ozone/recon-codegen/src/main/java/org/hadoop/ozone/recon/schema/ContainerSchemaDefinition.java
##########
@@ -49,7 +49,8 @@
     MISSING,
     UNDER_REPLICATED,
     OVER_REPLICATED,
-    MIS_REPLICATED
+    MIS_REPLICATED,

Review comment:
       Oh, that is not part of this patch. MIS_REPLICATED is the name SCM (& Recon) uses to label containers that may or may not have sufficient replicas, but definitely do not conform to the topology requirements.
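
To illustrate the comment above, here is a rough sketch of a topology check that is independent of the replica count. The rule used (replicas must span at least a given number of racks) and all names are assumptions for illustration; the actual placement policy in SCM may differ.

```java
import java.util.HashSet;
import java.util.List;

// Hypothetical illustration only; not SCM's placement policy code.
final class MisReplicationSketch {
  // replicaRacks holds the rack name of each replica's datanode.
  // requiredRacks is an assumed policy parameter: the minimum number of
  // distinct racks the replicas must span.
  static boolean isMisReplicated(List<String> replicaRacks, int requiredRacks) {
    int distinctRacks = new HashSet<>(replicaRacks).size();
    // The replica count is deliberately not consulted here: a container can
    // have enough replicas and still fail the topology requirement.
    return distinctRacks < requiredRacks;
  }
}
```

Under this assumed rule, three replicas that all sit on one rack are mis-replicated when two racks are required, even though the replica count itself is sufficient.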






[GitHub] [ozone] swagle commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
swagle commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r761364586



##########
File path: hadoop-ozone/recon-codegen/src/main/java/org/hadoop/ozone/recon/schema/ContainerSchemaDefinition.java
##########
@@ -49,7 +49,8 @@
     MISSING,
     UNDER_REPLICATED,
     OVER_REPLICATED,
-    MIS_REPLICATED
+    MIS_REPLICATED,

Review comment:
       Actually, my concerns were about the MIS_REPLICATED state; sorry for not being specific in the earlier comment. Since we already have unhealthy, mis_replicated seems a bit incomprehensible to a user.






[GitHub] [ozone] swagle commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
swagle commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r759493615



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/spi/impl/ReconDBDefinition.java
##########
@@ -74,6 +74,16 @@ public ReconDBDefinition(String dbName) {
           NSSummary.class,
           new NSSummaryCodec());
 
+  // Container Replica History with bcsId tracking.
+  public static final DBColumnFamilyDefinition
+      <Long, ContainerReplicaHistoryList> REPLICA_HISTORY_V2 =

Review comment:
       So this means the existing replica history table is left as is. Do we not have to worry about lost history? If not, should there be a step to clean up that state on upgrade?






[GitHub] [ozone] avijayanhwx merged pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
avijayanhwx merged pull request #2874:
URL: https://github.com/apache/ozone/pull/2874


   




[GitHub] [ozone] avijayanhwx commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r763222852



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/persistence/ContainerHealthSchemaManager.java
##########
@@ -69,8 +71,14 @@ public ContainerHealthSchemaManager(
     SelectQuery<Record> query = dslContext.selectQuery();
     query.addFrom(UNHEALTHY_CONTAINERS);
     if (state != null) {
-      query.addConditions(
-          UNHEALTHY_CONTAINERS.CONTAINER_STATE.eq(state.toString()));
+      if (state.equals(ALL_REPLICAS_UNHEALTHY)) {
+        query.addConditions(UNHEALTHY_CONTAINERS.CONTAINER_STATE
+            .eq(UNDER_REPLICATED.toString()));
+        query.addConditions(UNHEALTHY_CONTAINERS.ACTUAL_REPLICA_COUNT.eq(0));

Review comment:
       ACTUAL_REPLICA_COUNT can be 0 / 1 / 2. All of them count as UNDER_REPLICATED.






[GitHub] [ozone] avijayanhwx commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r761371507



##########
File path: hadoop-ozone/recon-codegen/src/main/java/org/hadoop/ozone/recon/schema/ContainerSchemaDefinition.java
##########
@@ -49,7 +49,8 @@
     MISSING,
     UNDER_REPLICATED,
     OVER_REPLICATED,
-    MIS_REPLICATED
+    MIS_REPLICATED,

Review comment:
       Oh, that is not part of this patch. MIS_REPLICATED is the name SCM (& Recon) uses to label containers that have sufficient replicas, but do not conform to the topology requirements.






[GitHub] [ozone] swagle commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
swagle commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r759491112



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/codec/ContainerReplicaHistoryListCodec.java
##########
@@ -39,7 +39,7 @@
     implements Codec<ContainerReplicaHistoryList> {
 
   // UUID takes 2 long to store. Each timestamp takes 1 long to store.
-  static final int SIZE_PER_ENTRY = 4 * Long.BYTES;
+  static final int SIZE_PER_ENTRY = 5 * Long.BYTES;

Review comment:
       Where is this used? Is it in a map? Do we need to account for reference overheads?






[GitHub] [ozone] swagle commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
swagle commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r759595472



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/spi/impl/ReconDBDefinition.java
##########
@@ -74,6 +74,16 @@ public ReconDBDefinition(String dbName) {
           NSSummary.class,
           new NSSummaryCodec());
 
+  // Container Replica History with bcsId tracking.
+  public static final DBColumnFamilyDefinition
+      <Long, ContainerReplicaHistoryList> REPLICA_HISTORY_V2 =

Review comment:
       Ok, can you file a jira for this, thanks.






[GitHub] [ozone] swagle commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
swagle commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r759595772



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/codec/ContainerReplicaHistoryListCodec.java
##########
@@ -39,7 +39,7 @@
     implements Codec<ContainerReplicaHistoryList> {
 
   // UUID takes 2 long to store. Each timestamp takes 1 long to store.
-  static final int SIZE_PER_ENTRY = 4 * Long.BYTES;
+  static final int SIZE_PER_ENTRY = 5 * Long.BYTES;

Review comment:
       Ok, can you file a jira for this, thanks.






[GitHub] [ozone] swagle commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
swagle commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r761380407



##########
File path: hadoop-ozone/recon-codegen/src/main/java/org/hadoop/ozone/recon/schema/ContainerSchemaDefinition.java
##########
@@ -49,7 +49,8 @@
     MISSING,
     UNDER_REPLICATED,
     OVER_REPLICATED,
-    MIS_REPLICATED
+    MIS_REPLICATED,

Review comment:
       Got it. Can we make sure this state is documented in the code? It will be confusing for users.






[GitHub] [ozone] avijayanhwx commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r763222852



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/persistence/ContainerHealthSchemaManager.java
##########
@@ -69,8 +71,14 @@ public ContainerHealthSchemaManager(
     SelectQuery<Record> query = dslContext.selectQuery();
     query.addFrom(UNHEALTHY_CONTAINERS);
     if (state != null) {
-      query.addConditions(
-          UNHEALTHY_CONTAINERS.CONTAINER_STATE.eq(state.toString()));
+      if (state.equals(ALL_REPLICAS_UNHEALTHY)) {
+        query.addConditions(UNHEALTHY_CONTAINERS.CONTAINER_STATE
+            .eq(UNDER_REPLICATED.toString()));
+        query.addConditions(UNHEALTHY_CONTAINERS.ACTUAL_REPLICA_COUNT.eq(0));

Review comment:
       ACTUAL_REPLICA_COUNT can be 0 / 1 / 2. All of them count as UNDER_REPLICATED.

##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ContainerReplicaHistory.java
##########
@@ -37,11 +39,22 @@
   // Last reported time of the replica
   private Long lastSeenTime;
 
+  public void setBcsId(long bcsId) {
+    this.bcsId = bcsId;
+  }
+

Review comment:
       Fixed






[GitHub] [ozone] avijayanhwx commented on pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#issuecomment-987128964


   Thanks for the reviews @swagle & @smengcl. 




[GitHub] [ozone] swagle commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
swagle commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r760744832



##########
File path: hadoop-ozone/recon-codegen/src/main/java/org/hadoop/ozone/recon/schema/ContainerSchemaDefinition.java
##########
@@ -49,7 +49,8 @@
     MISSING,
     UNDER_REPLICATED,
     OVER_REPLICATED,
-    MIS_REPLICATED
+    MIS_REPLICATED,

Review comment:
       Why do we need this? It is still ambiguous and I see the value is not set right now. Unhealthy is already a catch-all.






[GitHub] [ozone] avijayanhwx commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r763223536



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/spi/impl/ReconDBDefinition.java
##########
@@ -74,6 +74,16 @@ public ReconDBDefinition(String dbName) {
           NSSummary.class,
           new NSSummaryCodec());
 
+  // Container Replica History with bcsId tracking.
+  public static final DBColumnFamilyDefinition
+      <Long, ContainerReplicaHistoryList> REPLICA_HISTORY_V2 =
+      new DBColumnFamilyDefinition<Long, ContainerReplicaHistoryList>(
+          "replica_history_v2",

Review comment:
       Yes, we can clean up in the future.






[GitHub] [ozone] avijayanhwx commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r759586028



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/codec/ContainerReplicaHistoryListCodec.java
##########
@@ -39,7 +39,7 @@
     implements Codec<ContainerReplicaHistoryList> {
 
   // UUID takes 2 long to store. Each timestamp takes 1 long to store.
-  static final int SIZE_PER_ENTRY = 4 * Long.BYTES;
+  static final int SIZE_PER_ENTRY = 5 * Long.BYTES;

Review comment:
       I am planning on changing the serialize/deserialize logic to be proto-based, to allow us to add more fields in the future without worrying about compatibility.






[GitHub] [ozone] avijayanhwx commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r759584410



##########
File path: hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/codec/ContainerReplicaHistoryListCodec.java
##########
@@ -39,7 +39,7 @@
     implements Codec<ContainerReplicaHistoryList> {
 
   // UUID takes 2 long to store. Each timestamp takes 1 long to store.
-  static final int SIZE_PER_ENTRY = 4 * Long.BYTES;
+  static final int SIZE_PER_ENTRY = 5 * Long.BYTES;

Review comment:
       We don't need to, since we are storing as pure Longs. This is used in the Codec.
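
As a rough illustration of why no reference overhead is involved, the sketch below packs each entry into five raw longs in a `ByteBuffer`: two for the datanode UUID and one each for the first-seen time, last-seen time, and bcsId. The class and field names are hypothetical, and the field order is an assumption; this is not the actual ContainerReplicaHistoryListCodec.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical illustration only; not the real Recon codec.
final class ReplicaHistoryCodecSketch {
  // UUID takes 2 longs; firstSeen, lastSeen, and bcsId take 1 long each.
  static final int SIZE_PER_ENTRY = 5 * Long.BYTES;

  static final class Entry {
    final UUID datanode;
    final long firstSeen;
    final long lastSeen;
    final long bcsId;

    Entry(UUID datanode, long firstSeen, long lastSeen, long bcsId) {
      this.datanode = datanode;
      this.firstSeen = firstSeen;
      this.lastSeen = lastSeen;
      this.bcsId = bcsId;
    }
  }

  // Serializes the entries as a flat sequence of fixed-width longs.
  static byte[] encode(List<Entry> entries) {
    ByteBuffer buf = ByteBuffer.allocate(entries.size() * SIZE_PER_ENTRY);
    for (Entry e : entries) {
      buf.putLong(e.datanode.getMostSignificantBits());
      buf.putLong(e.datanode.getLeastSignificantBits());
      buf.putLong(e.firstSeen);
      buf.putLong(e.lastSeen);
      buf.putLong(e.bcsId);
    }
    return buf.array();
  }

  // Reads entries back; rejects input that is not a whole number of entries.
  static List<Entry> decode(byte[] bytes) {
    if (bytes.length % SIZE_PER_ENTRY != 0) {
      throw new IllegalArgumentException(
          "Length " + bytes.length + " is not a multiple of " + SIZE_PER_ENTRY);
    }
    ByteBuffer buf = ByteBuffer.wrap(bytes);
    List<Entry> entries = new ArrayList<>();
    while (buf.hasRemaining()) {
      long msb = buf.getLong();
      long lsb = buf.getLong();
      long firstSeen = buf.getLong();
      long lastSeen = buf.getLong();
      long bcsId = buf.getLong();
      entries.add(new Entry(new UUID(msb, lsb), firstSeen, lastSeen, bcsId));
    }
    return entries;
  }
}
```

In a fixed-width layout like this, a value written with four longs per entry cannot be read back with five, which is presumably part of why the patch writes to a new column family and why a proto-based format is being considered.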






[GitHub] [ozone] avijayanhwx commented on a change in pull request #2874: HDDS-5965. Recon should be able to distinguish between containers that have no replicas and those have all replicas as UNHEALTHY.

Posted by GitBox <gi...@apache.org>.
avijayanhwx commented on a change in pull request #2874:
URL: https://github.com/apache/ozone/pull/2874#discussion_r761334629



##########
File path: hadoop-ozone/recon-codegen/src/main/java/org/hadoop/ozone/recon/schema/ContainerSchemaDefinition.java
##########
@@ -49,7 +49,8 @@
     MISSING,
     UNDER_REPLICATED,
     OVER_REPLICATED,
-    MIS_REPLICATED
+    MIS_REPLICATED,

Review comment:
       I wanted to introduce a new query state that has Recon return containers that have all replicas as UNHEALTHY, as opposed to the more generic 'UNDER_REPLICATED' state that typically denotes a transient state. I have modified the enum field to reflect it better. This can be queried as follows:
   
   /containers/unhealthy/UNDER_REPLICATED
   /containers/unhealthy/ALL_REPLICAS_UNHEALTHY
   /containers/unhealthy --> will return containers in any bad state.
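
The three queries above can be sketched as a filter over in-memory rows, without jOOQ. The `Row` type and method names are hypothetical; the ALL_REPLICAS_UNHEALTHY branch follows the diff in ContainerHealthSchemaManager, and the fallback to a plain state match is an assumption based on the removed lines of that diff.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; the real code builds a jOOQ SelectQuery.
final class UnhealthyQuerySketch {
  static final class Row {
    final long containerId;
    final String state;
    final int actualReplicaCount;

    Row(long containerId, String state, int actualReplicaCount) {
      this.containerId = containerId;
      this.state = state;
      this.actualReplicaCount = actualReplicaCount;
    }
  }

  // A null state mirrors /containers/unhealthy (no filter on state).
  static List<Row> query(List<Row> rows, String state) {
    return rows.stream()
        .filter(r -> matches(r, state))
        .collect(Collectors.toList());
  }

  static boolean matches(Row row, String state) {
    if (state == null) {
      return true;
    }
    if (state.equals("ALL_REPLICAS_UNHEALTHY")) {
      // Stored as UNDER_REPLICATED; told apart by an actual count of 0.
      return row.state.equals("UNDER_REPLICATED")
          && row.actualReplicaCount == 0;
    }
    return row.state.equals(state);
  }
}
```

Note that under this mapping a container with all replicas UNHEALTHY also shows up in the plain UNDER_REPLICATED query, since ALL_REPLICAS_UNHEALTHY is a narrower view of the same stored state.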



