You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/07/04 09:24:25 UTC

[GitHub] [hadoop-ozone] ChenSammi opened a new pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

ChenSammi opened a new pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162


   https://issues.apache.org/jira/browse/HDDS-3921


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] ChenSammi commented on a change in pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162#discussion_r452035232



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
##########
@@ -543,11 +543,15 @@ private void handleUnderReplicatedContainer(final ContainerInfo container,
         List<DatanodeDetails> targetReplicas = new ArrayList<>(source);
         // Then add any pending additions
         targetReplicas.addAll(replicationInFlight);
-
-        int delta = replicationFactor - getReplicaCount(id, replicas);
         final ContainerPlacementStatus placementStatus =
             containerPlacement.validateContainerPlacement(
                 targetReplicas, replicationFactor);
+        int delta = replicationFactor - getReplicaCount(id, replicas);

Review comment:
       This delta calculation and check in line 550 is required.  Currenlty placementStatus.isPolicySatisfied() only checks whether topology requirement is satisfied or not. It doesn't check whether it has enough replicas. 
   
   
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

Posted by GitBox <gi...@apache.org>.
xiaoyuyao commented on a change in pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162#discussion_r452541315



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
##########
@@ -543,11 +543,15 @@ private void handleUnderReplicatedContainer(final ContainerInfo container,
         List<DatanodeDetails> targetReplicas = new ArrayList<>(source);
         // Then add any pending additions
         targetReplicas.addAll(replicationInFlight);
-
-        int delta = replicationFactor - getReplicaCount(id, replicas);
         final ContainerPlacementStatus placementStatus =
             containerPlacement.validateContainerPlacement(
                 targetReplicas, replicationFactor);
+        int delta = replicationFactor - getReplicaCount(id, replicas);

Review comment:
       Got it. I agree delta calculation is needed. I mean move delta calculation after placementStatus is not necessary. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] xiaoyuyao commented on pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

Posted by GitBox <gi...@apache.org>.
xiaoyuyao commented on pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162#issuecomment-657697063


   LGTM, +1. Thanks @ChenSammi for the contribution and all for the reviews. I will merge the PR shortly. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] sodonnel commented on pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

Posted by GitBox <gi...@apache.org>.
sodonnel commented on pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162#issuecomment-654823654


   Thanks for this change. I also wonder if there is a bug in the method `isContainerUnderReplicated(...)` which leads to this problem, and results in more processing that necessary for mis-replicated containers with inFlight Additions.
   
   In `isContainerUnderReplicated` is uses only the live replicas to check for mis-replication, but then it considers inflight add and delete for under replication. Should we also include the inflightAdds when considering mis-replication in that method?
   
   For `isContainerOverReplicated` it also uses the inflightAdds and deletes to check for over replication.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] sodonnel commented on a change in pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

Posted by GitBox <gi...@apache.org>.
sodonnel commented on a change in pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162#discussion_r450820397



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
##########
@@ -543,11 +543,15 @@ private void handleUnderReplicatedContainer(final ContainerInfo container,
         List<DatanodeDetails> targetReplicas = new ArrayList<>(source);
         // Then add any pending additions
         targetReplicas.addAll(replicationInFlight);
-
-        int delta = replicationFactor - getReplicaCount(id, replicas);
         final ContainerPlacementStatus placementStatus =
             containerPlacement.validateContainerPlacement(
                 targetReplicas, replicationFactor);
+        int delta = replicationFactor - getReplicaCount(id, replicas);
+        if (placementStatus.isPolicySatisfied() && delta <= 0) {

Review comment:
       Rather than this new IF block here, would it make sense to simply add:
   
   ```
   if (replicasNeeded <= 0) {
     LOG.debug(...);
     return;
   }
   ```
   
   At line 554 / 558, just after the line:
   
   ```
           final int replicasNeeded
               = delta < misRepDelta ? misRepDelta : delta;
   ```
   
   That would avoid needing to check call `placementStatus.isPolicySatisfied()` and then `placementStatus.misReplicationCount()`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] ChenSammi merged pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

Posted by GitBox <gi...@apache.org>.
ChenSammi merged pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

Posted by GitBox <gi...@apache.org>.
xiaoyuyao commented on a change in pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162#discussion_r450514295



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
##########
@@ -543,11 +543,15 @@ private void handleUnderReplicatedContainer(final ContainerInfo container,
         List<DatanodeDetails> targetReplicas = new ArrayList<>(source);
         // Then add any pending additions
         targetReplicas.addAll(replicationInFlight);
-
-        int delta = replicationFactor - getReplicaCount(id, replicas);
         final ContainerPlacementStatus placementStatus =
             containerPlacement.validateContainerPlacement(
                 targetReplicas, replicationFactor);
+        int delta = replicationFactor - getReplicaCount(id, replicas);

Review comment:
       NIT: this line to move the line to calculate delta from 547->549 is not needed.  Only the line 550 is the core change that breaks out when we've already handle the under replicate container via inflight replication. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] sodonnel commented on a change in pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

Posted by GitBox <gi...@apache.org>.
sodonnel commented on a change in pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162#discussion_r450820397



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
##########
@@ -543,11 +543,15 @@ private void handleUnderReplicatedContainer(final ContainerInfo container,
         List<DatanodeDetails> targetReplicas = new ArrayList<>(source);
         // Then add any pending additions
         targetReplicas.addAll(replicationInFlight);
-
-        int delta = replicationFactor - getReplicaCount(id, replicas);
         final ContainerPlacementStatus placementStatus =
             containerPlacement.validateContainerPlacement(
                 targetReplicas, replicationFactor);
+        int delta = replicationFactor - getReplicaCount(id, replicas);
+        if (placementStatus.isPolicySatisfied() && delta <= 0) {

Review comment:
       Rather than this new IF block here, would it make sense to simply add:
   
   ```
   if (replicasNeeded <= 0) {
     LOG.debug(...);
     return;
   }
   ```
   
   At line 554 / 558, just after the line:
   
   ```
           final int replicasNeeded
               = delta < misRepDelta ? misRepDelta : delta;
   ```
   
   That would avoid needing to check call `placementStatus.isPolicySatisfied()` and then `placementStatus.misReplicationCount()` afterwards, as `misReplicationCount()` calls `isPolicySatisfied()` anyway.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] ChenSammi commented on a change in pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162#discussion_r452052352



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
##########
@@ -543,11 +543,15 @@ private void handleUnderReplicatedContainer(final ContainerInfo container,
         List<DatanodeDetails> targetReplicas = new ArrayList<>(source);
         // Then add any pending additions
         targetReplicas.addAll(replicationInFlight);
-
-        int delta = replicationFactor - getReplicaCount(id, replicas);
         final ContainerPlacementStatus placementStatus =
             containerPlacement.validateContainerPlacement(
                 targetReplicas, replicationFactor);
+        int delta = replicationFactor - getReplicaCount(id, replicas);
+        if (placementStatus.isPolicySatisfied() && delta <= 0) {

Review comment:
       Make sense. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] ChenSammi commented on pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162#issuecomment-657387772


   Thanks @sodonnel  and @xiaoyuyao for the review. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] sodonnel commented on a change in pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

Posted by GitBox <gi...@apache.org>.
sodonnel commented on a change in pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162#discussion_r450820397



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
##########
@@ -543,11 +543,15 @@ private void handleUnderReplicatedContainer(final ContainerInfo container,
         List<DatanodeDetails> targetReplicas = new ArrayList<>(source);
         // Then add any pending additions
         targetReplicas.addAll(replicationInFlight);
-
-        int delta = replicationFactor - getReplicaCount(id, replicas);
         final ContainerPlacementStatus placementStatus =
             containerPlacement.validateContainerPlacement(
                 targetReplicas, replicationFactor);
+        int delta = replicationFactor - getReplicaCount(id, replicas);
+        if (placementStatus.isPolicySatisfied() && delta <= 0) {

Review comment:
       Rather than this new IF block here, would it make sense to simply add:
   
   ```
   if (replicasNeeded <= 0) {
     LOG.debug(...);
     return;
   }
   
   At line 554 / 558, just after the line:
   
   ```
           final int replicasNeeded
               = delta < misRepDelta ? misRepDelta : delta;
   ```
   
   That would avoid needing to check call `placementStatus.isPolicySatisfied()` and then `placementStatus.misReplicationCount()`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] ChenSammi commented on pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162#issuecomment-655969692


   
   
   
   > Thanks @ChenSammi for reporting the issue and propose the fix. The change LGTM.
   > 
   > I have a question wrt the excess calculation in handleOverReplicationContainer where we only consider inflightDeletion without inflightReplication at around line 615. Could this contribute the over replication issue as we will break out when a smaller excess value reaches 0 but the inflighreplication is still going?
   
   @xiaoyuyao , I think inflightReplication not considered here is safer since replication has the change to fail.  Imaging we have 2 healthy replicas and 2 inflight replications, this case, send the command to delete the extra 1 replica until we are sure that we have 4 healthy replicas in hand. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] ChenSammi commented on pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162#issuecomment-655983803


   > Thanks for this change. I also wonder if there is a bug in the method `isContainerUnderReplicated(...)` which leads to this problem, and results in more processing that necessary for mis-replicated containers with inFlight Additions.
   > 
   > In `isContainerUnderReplicated` is uses only the live replicas to check for mis-replication, but then it considers inflight add and delete for under replication. Should we also include the inflightAdds when considering mis-replication in that method?
   > 
   > For `isContainerOverReplicated` it also uses the inflightAdds and deletes to check for over replication.
   
   It's a good point. I don't think it's a real bug, but we can improve it.  HDDS-3942 to track it. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org


[GitHub] [hadoop-ozone] ChenSammi commented on pull request #1162: HDDS-3921. IllegalArgumentException triggered in SCMContainerPlacemen…

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on pull request #1162:
URL: https://github.com/apache/hadoop-ozone/pull/1162#issuecomment-656443918


   testDeleteKeyWithSlowFollower failed at leader membership check step. The test passed locally. It seems a timing issue, not relevant to this patch. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org