Posted to yarn-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2021/09/13 19:40:00 UTC

[jira] [Resolved] (YARN-10941) Wrong Yarn node label mapping with AWS EMR machine types

     [ https://issues.apache.org/jira/browse/YARN-10941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved YARN-10941.
-----------------------------------
    Resolution: Invalid

Going to have to close this as invalid, sorry.

# JIRA isn't the place for discussions like this; it's not a bug in the code.
# And it's an EMR deployment issue, so not even Apache code.

> Wrong Yarn node label mapping with AWS EMR machine types
> --------------------------------------------------------
>
>                 Key: YARN-10941
>                 URL: https://issues.apache.org/jira/browse/YARN-10941
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 2.10.1
>            Reporter: Agam
>            Priority: Major
>
> Does anyone have experience with YARN node labels on AWS EMR? If so, please share your thoughts. We want to run all the Spark executors on Task (Spot) machines and all the Spark ApplicationMasters/drivers on Core (On-Demand) machines. Previously we were running both the Spark executors and the Spark driver on the Core (On-Demand) machines.
> To achieve this, we create the "TASK" YARN node label as part of a custom AWS EMR bootstrap action, and we map that same "TASK" label onto any Spot instance when it registers with AWS EMR, in a separate bootstrap action. Since "CORE" is the default YARN node label expression, we simply map it onto On-Demand instances upon node registration in the bootstrap action.
> We use the `"spark.yarn.executor.nodeLabelExpression": "TASK"` Spark conf to launch Spark executors on Task nodes.
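> Concretely, the submission looks roughly like this (the spark-submit shape, the AM-side label, and the jar name are illustrative; the executor conf is the exact setting quoted above):
> {code}
> # Illustrative spark-submit invocation; 'our-spark-job.jar' is a placeholder.
> # The AM-side conf is optional here, since CORE is already the default label.
> spark-submit \
>   --master yarn \
>   --deploy-mode cluster \
>   --conf spark.yarn.am.nodeLabelExpression=CORE \
>   --conf spark.yarn.executor.nodeLabelExpression=TASK \
>   our-spark-job.jar
> {code}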
> The problem we are facing is a wrong mapping of YARN node labels to machine types: for a short period (around 1-2 minutes) the "TASK" label is mapped to On-Demand instances and the "CORE" label is mapped to Spot instances. During this short window of wrong labeling, YARN launches Spark executors on On-Demand instances and Spark drivers on Spot instances.
> This wrong mapping of labels to machine types persists until the bootstrap actions are complete; after that, the mapping automatically resolves to its correct state.
> The script we run as part of the bootstrap action works as follows:
> It runs on every new machine to assign the label to that machine, and it is launched as a background process because the yarn CLI only becomes available after all custom bootstrap actions have completed.
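> In outline, the labeling step looks like the following simplified sketch (illustrative only, not the exact script; the readiness check and the hard-coded label are stand-ins):
> {code}
> #!/bin/bash
> # Simplified sketch of the node-labeling bootstrap step (not the exact script).
> # It backgrounds itself because the yarn CLI only works once YARN is up,
> # i.e. after all custom bootstrap actions have finished.
> (
>   # Wait until the yarn CLI can reach the ResourceManager.
>   while ! yarn node -list >/dev/null 2>&1; do
>     sleep 10
>   done
>   # The real script derives the label from the instance market type
>   # (Spot -> TASK, On-Demand -> CORE); hard-coded here as a placeholder.
>   NODE_LABEL="TASK"
>   # Attach the label to this NodeManager (needs YARN admin rights).
>   yarn rmadmin -replaceLabelsOnNode "$(hostname -f)=${NODE_LABEL}"
> ) &
> {code}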
> This command is run on the Master instance to create the new TASK YARN node label at cluster creation time.
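> In essence it is the standard label-creation call, along these lines (the exclusivity flag shown is an illustrative choice):
> {code}
> # Register the TASK label with the ResourceManager at cluster creation.
> # exclusive=false is illustrative; the actual exclusivity setting may differ.
> yarn rmadmin -addToClusterNodeLabels "TASK(exclusive=false)"
> {code}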
> Does anyone have a clue how to prevent this wrong mapping of labels?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org