You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/06/24 16:10:03 UTC

[GitHub] [ozone] siddhantsangwan opened a new pull request, #3547: HDDS-6940. EC: Skip the EC container for balancer

siddhantsangwan opened a new pull request, #3547:
URL: https://github.com/apache/ozone/pull/3547

   ## What changes were proposed in this pull request?
   
   Excluding EC containers from balancing by removing them from the set of Candidate containers selected in `ContainerBalancerSelectionCriteria`.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-6940
   
   ## How was this patch tested?
   
   Existing UTs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] myskov commented on a diff in pull request #3547: HDDS-6940. EC: Skip the EC container for balancer

Posted by GitBox <gi...@apache.org>.
myskov commented on code in PR #3547:
URL: https://github.com/apache/ozone/pull/3547#discussion_r906315309


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerSelectionCriteria.java:
##########
@@ -136,6 +140,19 @@ public NavigableSet<ContainerID> getCandidateContainers(
     });
 
     containerIDSet.removeIf(this::isContainerReplicatingOrDeleting);
+
+    // remove EC containers
+    containerIDSet.removeIf(containerID -> {
+      try {

Review Comment:
   Wouldn't it be cleaner to move the try-catch block from lambda to the isECContainer() method?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3547: HDDS-6940. EC: Skip the EC container for balancer

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on PR #3547:
URL: https://github.com/apache/ozone/pull/3547#issuecomment-1166793790

   @siddhantsangwan thanks for this patch. can you explain why we should skip EC container for now, and when can we take EC container into account?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3547: HDDS-6940. EC: Skip the EC container for balancer

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on PR #3547:
URL: https://github.com/apache/ozone/pull/3547#issuecomment-1180120651

   @siddhantsangwan i am ok with your proposal. lets skip EC container at both containerBalancer side and RM side until we complete refactoring RM


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] umamaheswararao commented on pull request #3547: HDDS-6940. EC: Skip the EC container for balancer

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on PR #3547:
URL: https://github.com/apache/ozone/pull/3547#issuecomment-1182198806

   Thank you @siddhantsangwan and @JacksonYao287 for the discussion and coming to a conclusion.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on pull request #3547: HDDS-6940. EC: Skip the EC container for balancer

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on PR #3547:
URL: https://github.com/apache/ozone/pull/3547#issuecomment-1183203145

   @myskov @JacksonYao287 Can you also review the latest changes please?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on a diff in pull request #3547: HDDS-6940. EC: Skip the EC container for balancer

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on code in PR #3547:
URL: https://github.com/apache/ozone/pull/3547#discussion_r908403439


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerSelectionCriteria.java:
##########
@@ -136,6 +140,19 @@ public NavigableSet<ContainerID> getCandidateContainers(
     });
 
     containerIDSet.removeIf(this::isContainerReplicatingOrDeleting);
+
+    // remove EC containers
+    containerIDSet.removeIf(containerID -> {
+      try {

Review Comment:
   I agree with @myskov.  What's more, we could merge these conditions that require a container lookup.  Something like this:
   
   ```
       // single removeIf
       containerIDSet.removeIf(this::shouldBeExcluded);
       ...
     }
   
     boolean shouldBeExcluded(ContainerID id) {
       ContainerInfo container;
       try {
         container = containerManager.getContainer(id);
       } catch (ContainerNotFoundException e) {
         LOG.warn(...);
         return true;
       }
   
       return isClosed(container) || ... || isECContainer(container);
     }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] umamaheswararao commented on pull request #3547: HDDS-6940. EC: Skip the EC container for balancer

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on PR #3547:
URL: https://github.com/apache/ozone/pull/3547#issuecomment-1183526147

    I have just merged this as we have received approval. I have overlooked the above question from @siddhantsangwan sorry. Feel free to file a followup if you have any points to cover.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] umamaheswararao merged pull request #3547: HDDS-6940. EC: Skip the EC container for balancer

Posted by GitBox <gi...@apache.org>.
umamaheswararao merged PR #3547:
URL: https://github.com/apache/ozone/pull/3547


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #3547: HDDS-6940. EC: Skip the EC container for balancer

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on code in PR #3547:
URL: https://github.com/apache/ozone/pull/3547#discussion_r919007652


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerSelectionCriteria.java:
##########
@@ -136,6 +140,19 @@ public NavigableSet<ContainerID> getCandidateContainers(
     });
 
     containerIDSet.removeIf(this::isContainerReplicatingOrDeleting);
+
+    // remove EC containers
+    containerIDSet.removeIf(containerID -> {
+      try {

Review Comment:
   Great suggestions. Added.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #3547: HDDS-6940. EC: Skip the EC container for balancer

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on PR #3547:
URL: https://github.com/apache/ozone/pull/3547#issuecomment-1170662211

   @umamaheswararao there are two steps when balancing container:
   1 containerBalancer will select a candidate container from a source over-utilized Dn, and find an under-utilized Dn as a target
   2 call legacyRm#move to take this move action.
   
   the only difference for EC container in step 1 is the placement policy, this is what is done in [HDDS-6533](https://github.com/apache/ozone/pull/3455), and i think it has nothing to do with RM. we can merge is first.
   
   for EC container, we can just skip move in legacy in RM until we complete the RM related work.
   
   @siddhantsangwan 
   
   if container balancer will select an EC container but the move will fail since we skip it now. so theoretically,there might be a scenario that an EC container might always be selected for move ,but move always fail and the utilization of the datanode will never change, which will lead to Infinite loop。 please notice this. we can fix this after RM related work is completed
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] umamaheswararao commented on pull request #3547: HDDS-6940. EC: Skip the EC container for balancer

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on PR #3547:
URL: https://github.com/apache/ozone/pull/3547#issuecomment-1169555890

   @JacksonYao287 We have fully done the actual recovery work. We also relaxed placement checks in replication flows currently. Unless we have ready analysis balancer can work with current version EC RM, we can switch off EC container from balancer just to avoid, not to go in wrong way. ( similar to how we were returning from RM for EC containers until before we started offline recovery work)
   
   I see you have another patch where you are working to make Balancer work with EC. If you tested and works with balancer, we can remove the flag and enable it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on pull request #3547: HDDS-6940. EC: Skip the EC container for balancer

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on PR #3547:
URL: https://github.com/apache/ozone/pull/3547#issuecomment-1180039478

   Thanks for the reviews. 
   
   > if container balancer will select an EC container but the move will fail since we skip it now. so theoretically,there might be a scenario that an EC container might always be selected for move ,but move always fail and the utilization of the datanode will never change, which will lead to Infinite loop。 please notice this. we can fix this after RM related work is completed
   
   To avoid this scenario, we can merge this PR to exclude EC containers from balancing for now. We can then proceed with https://github.com/apache/ozone/pull/3455 and finally include EC containers (by undoing the code in this PR) once steps 1 and 2 are complete. What do you all think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org