Posted to yarn-issues@hadoop.apache.org by "Wangda Tan (JIRA)" <ji...@apache.org> on 2016/04/07 23:42:25 UTC

[jira] [Commented] (YARN-4865) Track Reserved resources in ResourceUsage and QueueCapacities

    [ https://issues.apache.org/jira/browse/YARN-4865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231126#comment-15231126 ] 

Wangda Tan commented on YARN-4865:
----------------------------------

[~sunilg],

It seems this patch needs one more fix:

{code}
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
index 9a74c22..df57787 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
@@ -1322,14 +1322,6 @@ public void completedContainer(Resource clusterResource,

         // Book-keeping
         if (removed) {
-
-          // track reserved resource for metrics, for normal container
-          // getReservedResource will be null.
-          Resource reservedRes = rmContainer.getReservedResource();
-          if (reservedRes != null && !reservedRes.equals(Resources.none())) {
-            decReservedResource(node.getPartition(), reservedRes);
-          }
-
           // Inform the ordering policy
           orderingPolicy.containerReleased(application, rmContainer);

diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
index cf1b3e0..558fc53 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
@@ -247,6 +247,8 @@ public synchronized boolean unreserve(Priority priority,
       // Update reserved metrics
       queue.getMetrics().unreserveResource(getUser(),
           rmContainer.getReservedResource());
+
+      queue.decReservedResource(node.getPartition(), rmContainer.getReservedResource());
       return true;
     }
     return false;
{code}

We need the above change to make sure that allocation from a reserved container correctly deducts the reserved resource. [~sunilg], could you also add a few tests?
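
To illustrate the intent of the change, here is a minimal sketch of the bookkeeping, assuming (as the diff suggests) that both allocation from a reserved container and release of a reserved container go through unreserve. The classes {{Queue}} and {{App}} and the per-node map are hypothetical stand-ins, not the real LeafQueue / FiCaSchedulerApp; resources are reduced to a memory value in MB.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model; not the real CapacityScheduler classes. */
class ReservedUsageSketch {

  /** Stand-in for a queue tracking reserved memory (MB) per partition. */
  static class Queue {
    private final Map<String, Long> reservedByPartition = new HashMap<>();

    void incReservedResource(String partition, long mb) {
      reservedByPartition.merge(partition, mb, Long::sum);
    }

    void decReservedResource(String partition, long mb) {
      reservedByPartition.merge(partition, -mb, Long::sum);
    }

    long getReserved(String partition) {
      return reservedByPartition.getOrDefault(partition, 0L);
    }
  }

  /** Stand-in for an application attempt; at most one reservation per node. */
  static class App {
    private final Queue queue;
    private final Map<String, Long> reservedByNode = new HashMap<>();

    App(Queue queue) {
      this.queue = queue;
    }

    void reserve(String node, String partition, long mb) {
      if (reservedByNode.containsKey(node)) {
        throw new IllegalStateException("already reserved on " + node);
      }
      reservedByNode.put(node, mb);
      queue.incReservedResource(partition, mb);
    }

    /** The single place where the queue's reserved usage is decremented. */
    boolean unreserve(String node, String partition) {
      Long mb = reservedByNode.remove(node);
      if (mb == null) {
        return false; // nothing reserved on this node, so nothing to deduct
      }
      queue.decReservedResource(partition, mb);
      return true;
    }

    /** Allocation from a reservation: the reservation is dropped first. */
    boolean allocateFromReserved(String node, String partition) {
      return unreserve(node, partition);
    }

    /** Release of a reserved container: same path, so no double deduction. */
    boolean completedReservedContainer(String node, String partition) {
      return unreserve(node, partition);
    }
  }

  public static void main(String[] args) {
    Queue queue = new Queue();
    App app = new App(queue);

    app.reserve("node1", "", 1024L);
    System.out.println(queue.getReserved("")); // 1024

    app.allocateFromReserved("node1", "");
    System.out.println(queue.getReserved("")); // 0

    // A later release of the same reservation is a no-op.
    app.completedReservedContainer("node1", "");
    System.out.println(queue.getReserved("")); // 0
  }
}
```

Keeping the decrement in one place means every path that drops a reservation deducts it exactly once, which is the property the requested tests would need to check.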

There are some other cases I have in mind that we need to consider:
- Nodes lost / disconnected: we need to deduct the reserved resources on such nodes. (I think this should be covered by the completedContainer code path.)

The above can be addressed in a separate JIRA.
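
For the node lost case, the check could look roughly like the sketch below. It is a hypothetical model, not scheduler code: {{NodeLostSketch}}, {{removeNode}} and {{releaseReservation}} are illustrative names, and it assumes that removing a node releases that node's reservation through the same path as a completed container.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of node removal; names are illustrative only. */
class NodeLostSketch {
  // Reserved memory (MB) tracked at queue level.
  private long queueReservedMb = 0;
  // Node id -> memory (MB) reserved on that node (at most one reservation).
  private final Map<String, Long> reservedOnNode = new HashMap<>();

  void reserve(String nodeId, long mb) {
    // Drop any earlier reservation on this node before recording the new one.
    releaseReservation(nodeId);
    reservedOnNode.put(nodeId, mb);
    queueReservedMb += mb;
  }

  /** Models the completed-container path for a reserved container. */
  boolean releaseReservation(String nodeId) {
    Long mb = reservedOnNode.remove(nodeId);
    if (mb == null) {
      return false; // no reservation on this node
    }
    queueReservedMb -= mb;
    return true;
  }

  /** Models a node being lost or disconnected. */
  void removeNode(String nodeId) {
    releaseReservation(nodeId);
  }

  long getQueueReservedMb() {
    return queueReservedMb;
  }

  public static void main(String[] args) {
    NodeLostSketch scheduler = new NodeLostSketch();
    scheduler.reserve("node1", 2048L);
    scheduler.reserve("node2", 1024L);

    scheduler.removeNode("node1");
    // Only node2's reservation should remain in the queue-level usage.
    System.out.println(scheduler.getQueueReservedMb()); // 1024
  }
}
```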

> Track Reserved resources in ResourceUsage and QueueCapacities 
> --------------------------------------------------------------
>
>                 Key: YARN-4865
>                 URL: https://issues.apache.org/jira/browse/YARN-4865
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.7.2
>            Reporter: Sunil G
>            Assignee: Sunil G
>             Fix For: 2.9.0
>
>         Attachments: 0001-YARN-4865.patch, 0002-YARN-4865.patch, 0003-YARN-4865.patch
>
>
> As discussed in YARN-4678, capture reserved capacity separately in QueueCapacities for better tracking.


