Posted to yarn-issues@hadoop.apache.org by "zhihai xu (JIRA)" <ji...@apache.org> on 2015/09/09 18:31:46 UTC

[jira] [Updated] (YARN-4133) Containers to be preempted leak in FairScheduler preemption logic.

     [ https://issues.apache.org/jira/browse/YARN-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhihai xu updated YARN-4133:
----------------------------
    Description: 
Containers to be preempted leak in FairScheduler preemption logic. This can cause missed preemption, because containers in {{warnedContainers}} are wrongly removed. The problem is in {{preemptResources}}.
Two issues can cause containers to be wrongly removed from {{warnedContainers}}:
First, the container state {{RMContainerState.ACQUIRED}} is missing from the condition check:
{code}
(container.getState() == RMContainerState.RUNNING ||
              container.getState() == RMContainerState.ALLOCATED)
{code}
Second, if {{isResourceGreaterThanNone(toPreempt)}} returns false, we shouldn't remove the container from {{warnedContainers}}. We should remove a container from {{warnedContainers}} only if it is not in state {{RMContainerState.RUNNING}}, {{RMContainerState.ALLOCATED}}, or {{RMContainerState.ACQUIRED}}.
{code}
      if ((container.getState() == RMContainerState.RUNNING ||
              container.getState() == RMContainerState.ALLOCATED) &&
              isResourceGreaterThanNone(toPreempt)) {
        warnOrKillContainer(container);
        Resources.subtractFrom(toPreempt, container.getContainer().getResource());
      } else {
        warnedIter.remove();
      }
{code}
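A minimal sketch of the intended behaviour, not the attached patch. It uses simplified stand-in types instead of the real YARN classes: {{State}}, {{Container}}, {{isPreemptable}} and {{processWarned}} are hypothetical names, and the resource is reduced to a single memory figure. The point is only that a container leaves the warned list when it is no longer in a preemptable state, and stays when there is simply nothing left to preempt in this pass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class WarnedContainersSketch {
  // Simplified stand-in for RMContainerState; only the states needed here.
  enum State { ALLOCATED, ACQUIRED, RUNNING, COMPLETED }

  // Hypothetical stand-in for RMContainer: an id, a state and a memory size.
  static final class Container {
    final String id;
    final State state;
    final int memoryMb;

    Container(String id, State state, int memoryMb) {
      this.id = id;
      this.state = state;
      this.memoryMb = memoryMb;
    }
  }

  // ACQUIRED is treated like RUNNING and ALLOCATED.
  static boolean isPreemptable(State s) {
    return s == State.RUNNING || s == State.ALLOCATED || s == State.ACQUIRED;
  }

  /**
   * Walks the warned list once and returns the ids acted on in this pass.
   * toPreemptMb stands in for the toPreempt resource.
   */
  static List<String> processWarned(List<Container> warned, int toPreemptMb) {
    List<String> acted = new ArrayList<>();
    Iterator<Container> warnedIter = warned.iterator();
    while (warnedIter.hasNext()) {
      Container c = warnedIter.next();
      if (!isPreemptable(c.state)) {
        // Only containers that are no longer preemptable are dropped.
        warnedIter.remove();
      } else if (toPreemptMb > 0) {
        acted.add(c.id);            // stands in for warnOrKillContainer(container)
        toPreemptMb -= c.memoryMb;  // stands in for Resources.subtractFrom(...)
      }
      // Otherwise the container is still preemptable but nothing is left to
      // preempt in this pass, so it stays in the warned list.
    }
    return acted;
  }

  public static void main(String[] args) {
    List<Container> warned = new ArrayList<>(Arrays.asList(
        new Container("a", State.RUNNING, 1024),
        new Container("b", State.ACQUIRED, 1024),
        new Container("c", State.COMPLETED, 1024),
        new Container("d", State.RUNNING, 1024)));
    System.out.println("acted on: " + processWarned(warned, 1536));
    System.out.println("still warned: " + warned.size());
  }
}
```

With the condition quoted above, containers "b" and "d" in this example would both have been removed from the warned list: "b" because of its state, "d" because nothing was left to preempt.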
Also, once a container is wrongly removed from {{warnedContainers}}, it will never be preempted, because it is already in {{FSAppAttempt#preemptionMap}} and {{FSAppAttempt#preemptContainer}} won't return containers that are in {{FSAppAttempt#preemptionMap}}.
{code}
  public RMContainer preemptContainer() {
    if (LOG.isDebugEnabled()) {
      LOG.debug("App " + getName() + " is going to preempt a running " +
          "container");
    }

    RMContainer toBePreempted = null;
    for (RMContainer container : getLiveContainers()) {
      if (!getPreemptionContainers().contains(container) &&
          (toBePreempted == null ||
              comparator.compare(toBePreempted, container) > 0)) {
        toBePreempted = container;
      }
    }
    return toBePreempted;
  }
{code}
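To illustrate why such a container is never picked again, here is a small sketch with stand-in types. The {{App}} class, the string container ids and the natural string ordering used in place of the real comparator are all assumptions for illustration. Only the selection rule follows the method quoted above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PreemptionLeakSketch {
  // Hypothetical stand-in for FSAppAttempt, holding container ids only.
  static final class App {
    final List<String> liveContainers = new ArrayList<>();
    // Stand-in for the containers tracked in FSAppAttempt#preemptionMap.
    final Set<String> preemptionContainers = new HashSet<>();

    // Same selection rule as the quoted method: skip anything already
    // marked for preemption, then keep the candidate that compares lowest.
    String preemptContainer() {
      String toBePreempted = null;
      for (String container : liveContainers) {
        if (!preemptionContainers.contains(container)
            && (toBePreempted == null
                || toBePreempted.compareTo(container) > 0)) {
          toBePreempted = container;
        }
      }
      return toBePreempted;
    }
  }

  public static void main(String[] args) {
    App app = new App();
    app.liveContainers.add("c1");
    app.liveContainers.add("c2");

    // c1 was warned earlier, so it is marked for preemption. Assume the
    // scheduler has since dropped it from its own warned list by mistake.
    app.preemptionContainers.add("c1");

    // c1 is skipped here, and nothing else will kill it either.
    System.out.println("next candidate: " + app.preemptContainer());
  }
}
```

In this sketch "c1" stays alive indefinitely: the app never offers it again, and the scheduler no longer tracks it.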

  was:
Containers to be preempted leaks in FairScheduler preemption logic. It may cause missing preemption due to containers in {{warnedContainers}} wrongly removed. The problem is in {{preemptResources}}:
There are two issues which can cause containers  wrongly removed from {{warnedContainers}}:
Firstly missing the container state {{RMContainerState.ACQUIRED}} in the condition check:
{code}
(container.getState() == RMContainerState.RUNNING ||
              container.getState() == RMContainerState.ALLOCATED)
{code}
Secondly if  {{isResourceGreaterThanNone(toPreempt)}} return false, we shouldn't remove container from {{warnedContainers}}. We should only remove container from {{warnedContainers}}, if container is not in state {{RMContainerState.RUNNING}}, {{RMContainerState.ALLOCATED}} and {{RMContainerState.ACQUIRED}}.
{code}
      if ((container.getState() == RMContainerState.RUNNING ||
              container.getState() == RMContainerState.ALLOCATED) &&
              isResourceGreaterThanNone(toPreempt)) {
        warnOrKillContainer(container);
        Resources.subtractFrom(toPreempt, container.getContainer().getResource());
      } else {
        warnedIter.remove();
      }
{code}
Also once the containers in {{warnedContainers}} are wrongly removed, it will never be preempted. Because these containers are already in {{FSAppAttempt#preemptionMap}} and {{FSAppAttempt#preemptContainer}} won't return the containers in {{FSAppAttempt#preemptionMap}}.
{code}
  public RMContainer preemptContainer() {
    if (LOG.isDebugEnabled()) {
      LOG.debug("App " + getName() + " is going to preempt a running " +
          "container");
    }

    RMContainer toBePreempted = null;
    for (RMContainer container : getLiveContainers()) {
      if (!getPreemptionContainers().contains(container) &&
          (toBePreempted == null ||
              comparator.compare(toBePreempted, container) > 0)) {
        toBePreempted = container;
      }
    }
    return toBePreempted;
  }
{code}


> Containers to be preempted leak in FairScheduler preemption logic.
> ------------------------------------------------------------------
>
>                 Key: YARN-4133
>                 URL: https://issues.apache.org/jira/browse/YARN-4133
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.7.1
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>         Attachments: YARN-4133.000.patch
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)