Posted to commits@samza.apache.org by "Lakshmi Manasa Gaduputi (Jira)" <ji...@apache.org> on 2021/04/09 19:00:00 UTC

[jira] [Commented] (SAMZA-2619) yarn.am.container.label is applied to entire samza job (all its containers) instead of only the AM.

    [ https://issues.apache.org/jira/browse/SAMZA-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17318213#comment-17318213 ] 

Lakshmi Manasa Gaduputi commented on SAMZA-2619:
------------------------------------------------

*Problem:* The patch [PR#1480|https://github.com/apache/samza/pull/1480] does not work. It throws an NPE because appCtx.getAMContainerResourceRequest() returns null, and it stays null through the end of the ClientHelper.submitApplication method. In other words, AMContainerResourceRequest is never set on the ApplicationSubmissionContext while control is in ClientHelper.

*NPE reason:* The YARN code shows that AMContainerResourceRequest, if null, is created inside [RMAppManager.validateAndCreateResourceRequest|https://github.com/apache/hadoop/blob/release-2.7.1/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java#L372], that is, on the ResourceManager side, from the memory and CPU set via ApplicationSubmissionContext.setResource.

*YARN code into Samza?* The only way to set a node label for the AM alone at application submission is to use [setAMContainerResourceRequest.setNodeLabelExpression in ApplicationSubmissionContext|https://hadoop.apache.org/docs/r2.7.3/hadoop-yarn/hadoop-yarn-site/NodeLabel.html]. However, creating the AMContainerResourceRequest within Samza's ClientHelper requires the AM_CONTAINER_PRIORITY constant, which is part of the yarn-server package, NOT the api, common, or client packages that Samza uses. Pulling the yarn-server package into Samza seems plain WRONG! It would also mean, in some sense, replicating what RMAppManager.validateAndCreateResourceRequest does to create the AMContainerResourceRequest.

*Workaround (tested and works):* In the Samza job, set yarn.am.container.label=desired-label AND yarn.container.label="". This relies on the javadoc in YARN for [ApplicationSubmissionContext|https://github.com/apache/hadoop/blob/release-2.7.1/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.java#L437] and this comment from YARN-2493. The workaround places the AM on a desired-label node while the containers go to general nodes without a label.
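
As a minimal sketch, the workaround could look like the fragment below in a job's config. The two keys and the desired-label placeholder come from the comment above. The properties-file layout is an assumption, as is the exact way an empty string is written, which may depend on how the job config is supplied.

```properties
# Label for the AM. Per this issue, ClientHelper currently applies it to the
# whole application, which is why the second line is needed.
yarn.am.container.label=desired-label

# Explicit empty label for the job's containers, written exactly as in the
# comment above. Assumption: your config format may express an empty string
# differently.
yarn.container.label=""
```

With both keys set, the expected placement is the one described above: the AM on a desired-label node and the remaining containers on unlabeled nodes.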

> yarn.am.container.label is applied to entire samza job (all its containers) instead of only the AM.
> ---------------------------------------------------------------------------------------------------
>
>                 Key: SAMZA-2619
>                 URL: https://issues.apache.org/jira/browse/SAMZA-2619
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Lakshmi Manasa Gaduputi
>            Assignee: Lakshmi Manasa Gaduputi
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> yarn.am.container.label is applied to entire samza job (all its containers) instead of only the AM.
> Yarn supports 3 forms of applying a node label to containers – AM only, container only, and entire job. [https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeLabel.html#Specifying_node_label_for_application]
> Samza's ClientHelper.scala finds the AM container label and applies it to the application submission context, which applies the label to the entire job instead of only the AM.
> This restricts jobs that prefer to have a label only for the AM, or different labels for the AM and the containers.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)