You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2021/11/10 18:28:45 UTC

[GitHub] [hadoop] goiri commented on a change in pull request #3642: YARN-10760. Number of allocated OPPORTUNISTIC containers can dip below 0

goiri commented on a change in pull request #3642:
URL: https://github.com/apache/hadoop/pull/3642#discussion_r746869941



##########
File path: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
##########
@@ -720,7 +726,10 @@ public void completedContainer(RMContainer rmContainer,
       SchedulerApplicationAttempt schedulerAttempt =
           getCurrentAttemptForContainer(containerId);
       if (schedulerAttempt != null) {
-        schedulerAttempt.removeRMContainer(containerId);
+        if (schedulerAttempt.removeRMContainer(containerId)) {

Review comment:
       Is this an OPPORTUNISTIC container for sure?

##########
File path: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestOpportunisticContainerAllocatorAMService.java
##########
@@ -817,6 +825,56 @@ public void testOpportunisticSchedulerMetrics() throws Exception {
         metrics.getAggregatedReleasedContainers());
   }
 
+  /**
+   * Tests that, if a node has running opportunistic containers when the RM
+   * is down, RM is able to reflect the opportunistic containers
+   * in its metrics upon RM recovery.
+   */
+  @Test
+  public void testMetricsRetainsAllocatedOpportunisticAfterRMRestart()
+      throws Exception {
+    HashMap<NodeId, MockNM> nodes = new HashMap<>();
+    MockNM nm1 = new MockNM("h1:1234", 4096, rm.getResourceTrackerService());
+    nodes.put(nm1.getNodeId(), nm1);
+    final RMApp app = MockRMAppSubmitter.submit(rm,
+        MockRMAppSubmissionData.Builder.createWithMemory(GB, rm)

Review comment:
       Extracting this is more readable.

##########
File path: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestOpportunisticContainerAllocatorAMService.java
##########
@@ -817,6 +825,56 @@ public void testOpportunisticSchedulerMetrics() throws Exception {
         metrics.getAggregatedReleasedContainers());
   }
 
+  /**
+   * Tests that, if a node has running opportunistic containers when the RM
+   * is down, RM is able to reflect the opportunistic containers
+   * in its metrics upon RM recovery.
+   */
+  @Test
+  public void testMetricsRetainsAllocatedOpportunisticAfterRMRestart()
+      throws Exception {
+    HashMap<NodeId, MockNM> nodes = new HashMap<>();
+    MockNM nm1 = new MockNM("h1:1234", 4096, rm.getResourceTrackerService());
+    nodes.put(nm1.getNodeId(), nm1);
+    final RMApp app = MockRMAppSubmitter.submit(rm,
+        MockRMAppSubmissionData.Builder.createWithMemory(GB, rm)
+            .withAppName("app")
+            .withUser("user")
+            .withAcls(null)
+            .withQueue("default")
+            .build());
+
+    final ApplicationAttemptId appAttemptId =
+        app.getCurrentAppAttempt().getAppAttemptId();
+
+    final ContainerId recoverContainerId = ContainerId.newContainerId(
+        appAttemptId, 2);
+
+    final Resource fakeResource = Resource.newInstance(1024, 1);
+    final String fakeDiagnostics = "recover container";
+    final Priority fakePriority = Priority.newInstance(0);
+
+    final NMContainerStatus recoverContainerReport =
+        NMContainerStatus.newInstance(
+            recoverContainerId, 0, ContainerState.RUNNING,
+            fakeResource, fakeDiagnostics, 0,
+            fakePriority, 0, null,
+            ExecutionType.OPPORTUNISTIC, -1);
+
+    rm.registerNode(
+        "h1:1234", 4096, 1,
+        Collections.singletonList(
+            appAttemptId.getApplicationId()),
+        Collections.singletonList(recoverContainerReport));
+
+    OpportunisticSchedulerMetrics metrics =
+        OpportunisticSchedulerMetrics.getMetrics();
+    Assert.assertEquals(1,

Review comment:
       We probably want to assert before this too with a 0 and maybe one of the cases where we do not increment.
   I have the feeling this test passes even without the code changes.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org