Posted to yarn-dev@hadoop.apache.org by "Jonathan Hung (Jira)" <ji...@apache.org> on 2019/11/26 23:43:00 UTC

[jira] [Created] (YARN-9992) Max allocation per queue is zero for custom resource types on RM startup

Jonathan Hung created YARN-9992:
-----------------------------------

             Summary: Max allocation per queue is zero for custom resource types on RM startup
                 Key: YARN-9992
                 URL: https://issues.apache.org/jira/browse/YARN-9992
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Jonathan Hung


Found an issue where GPU requests submitted to a newly started RM cannot be scheduled. The request fails with the exception thrown in SchedulerUtils#throwInvalidResourceException:
{noformat}
throw new InvalidResourceRequestException(
    "Invalid resource request, requested resource type=[" + reqResourceName
        + "] < 0 or greater than maximum allowed allocation. Requested "
        + "resource=" + reqResource + ", maximum allowed allocation="
        + availableResource
        + ", please note that maximum allowed allocation is calculated "
        + "by scheduler based on maximum resource of registered "
        + "NodeManagers, which might be less than configured "
        + "maximum allocation="
        + ResourceUtils.getResourceTypesMaximumAllocation());
{noformat}
After refreshing the scheduler (e.g. via {{yarn rmadmin -refreshQueues}}), GPU scheduling works.

I think the root cause is that on scheduler refresh, resource-types.xml is loaded in CapacitySchedulerConfiguration (as part of YARN-7738), so when CapacitySchedulerConfiguration#getMaximumAllocationPerQueue calls ResourceUtils#fetchMaximumAllocationFromConfig, it can read the {{yarn.resource-types}} config. In CapacityScheduler#initScheduler, however, resource-types.xml is not loaded into the conf, so the custom resource is not found when computing max allocations, and the custom resource's max allocation ends up as 0.
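
To make the suspected root cause concrete, here is a minimal, self-contained sketch of the behavior. It does not use any Hadoop classes: the class MaxAllocationSketch, its two helper methods, and the configured maximum of 8 are hypothetical and exist only for illustration. A plain map stands in for the scheduler conf, and {{yarn.io/gpu}} is used as the GPU resource name. The only thing it is meant to show is that a conf without {{yarn.resource-types}} yields a max allocation of 0 for the custom resource, which then fails the check described in the exception message above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Toy model of the reported behavior. None of these names are Hadoop APIs. */
public class MaxAllocationSketch {

    // Property named in the report; it lists the custom resource types.
    static final String RESOURCE_TYPES_KEY = "yarn.resource-types";

    /**
     * Hypothetical helper: the max allocation for one resource type as seen
     * through the given conf. A type that is not listed under
     * yarn.resource-types is unknown to the conf, so its maximum is 0.
     */
    public static long maxAllocationFor(Map<String, String> conf,
                                        String resourceName,
                                        long configuredMax) {
        String types = conf.getOrDefault(RESOURCE_TYPES_KEY, "");
        boolean known = Arrays.stream(types.split(","))
            .map(String::trim)
            .anyMatch(resourceName::equals);
        return known ? configuredMax : 0L;
    }

    /** Hypothetical check mirroring the condition in the exception message. */
    public static boolean isValidRequest(long requested, long maxAllowed) {
        return requested >= 0 && requested <= maxAllowed;
    }

    public static void main(String[] args) {
        // Stand-in for the conf at RM startup: resource-types.xml not loaded.
        Map<String, String> startupConf = new HashMap<>();

        // Stand-in for the conf after a refresh: resource-types.xml loaded.
        Map<String, String> refreshedConf = new HashMap<>();
        refreshedConf.put(RESOURCE_TYPES_KEY, "yarn.io/gpu");

        long atStartup = maxAllocationFor(startupConf, "yarn.io/gpu", 8L);
        long afterRefresh = maxAllocationFor(refreshedConf, "yarn.io/gpu", 8L);

        System.out.println("startup: max=" + atStartup
            + ", request of 1 GPU valid=" + isValidRequest(1L, atStartup));
        System.out.println("refresh: max=" + afterRefresh
            + ", request of 1 GPU valid=" + isValidRequest(1L, afterRefresh));
    }
}
```

In this model, a request for a single GPU is rejected against the startup conf and accepted against the refreshed one, which matches the observed symptom. If the analysis above is right, the real fix would be for CapacityScheduler#initScheduler to see the same resource-types configuration that the refresh path sees.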



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
