Posted to yarn-issues@hadoop.apache.org by "Manikandan R (JIRA)" <ji...@apache.org> on 2018/12/02 17:32:00 UTC

[jira] [Commented] (YARN-6523) Newly retrieved security Tokens are sent as part of each heartbeat to each node from RM which is not desirable in large cluster

    [ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16706418#comment-16706418 ] 

Manikandan R commented on YARN-6523:
------------------------------------

Thanks for your guidance.

I made the changes in the NodeHeartbeatResponsePBImpl class so that it contains only the systemCredentialsForAppsProto list, and introduced methods in a utility class for "Map<ApplicationId,ByteBuffer> to/from collection" conversions. With this approach, I can see a minor issue: the ApplicationId proto and ByteBuffer conversion now happens on the caller side and can repeat for the same heartbeat response, which was not the case before. I hope this is not a critical issue.
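
For readers who have not opened the patch, here is a rough sketch of what a "Map to/from collection" conversion of this kind could look like. It is not the code from the patch: AppCredentialsEntry is a hypothetical stand-in for SystemCredentialsForAppsProto, a plain String stands in for ApplicationId, and the class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for SystemCredentialsForAppsProto:
// one application id plus its serialized credentials.
final class AppCredentialsEntry {
  final String appId;            // stand-in for ApplicationId
  final ByteBuffer credentials;  // serialized tokens for that app

  AppCredentialsEntry(String appId, ByteBuffer credentials) {
    this.appId = appId;
    this.credentials = credentials;
  }
}

// Hypothetical utility class; the names are illustrative, not taken from the patch.
final class CredentialsConversionSketch {
  private CredentialsConversionSketch() {}

  // Map -> collection: one entry per application.
  static List<AppCredentialsEntry> toCollection(Map<String, ByteBuffer> credentialsByApp) {
    List<AppCredentialsEntry> entries = new ArrayList<>(credentialsByApp.size());
    for (Map.Entry<String, ByteBuffer> e : credentialsByApp.entrySet()) {
      // duplicate() shares the underlying bytes but keeps an independent position/limit.
      entries.add(new AppCredentialsEntry(e.getKey(), e.getValue().duplicate()));
    }
    return entries;
  }

  // Collection -> map: the inverse conversion, keyed by application id.
  static Map<String, ByteBuffer> toMap(Collection<AppCredentialsEntry> entries) {
    Map<String, ByteBuffer> credentialsByApp = new HashMap<>();
    for (AppCredentialsEntry entry : entries) {
      credentialsByApp.put(entry.appId, entry.credentials.duplicate());
    }
    return credentialsByApp;
  }
}
```

In this sketch each call walks every entry again, which illustrates the repeated-conversion concern: a caller that converts per heartbeat response pays that cost each time unless it caches the result.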

Also created 2 separate test cases: one for the token renewal -> sequence number increment flow, and one for the sequence number change -> response contains credentials flow.
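
The two flows covered by those tests can be modelled roughly as below. This is only a simplified sketch under my own assumptions, not the RM code: TokenSequenceSketch and its methods are hypothetical names, a String stands in for ApplicationId, and the rule "send credentials only when the node's last seen sequence number differs from the current one" is my reading of the flow described above.

```java
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the RM side; not the actual ResourceManager code.
final class TokenSequenceSketch {
  private long tokenSequenceNo = 0;
  private final Map<String, ByteBuffer> systemCredentials = new HashMap<>();

  // Flow 1: a token renewal stores the new credentials and bumps the sequence number.
  synchronized void onTokenRenewed(String appId, ByteBuffer credentials) {
    systemCredentials.put(appId, credentials);
    tokenSequenceNo++;
  }

  synchronized long currentSequenceNo() {
    return tokenSequenceNo;
  }

  // Flow 2: credentials go into the heartbeat response only when the sequence
  // number reported by the node differs from the current one (assumed rule).
  synchronized Map<String, ByteBuffer> credentialsForHeartbeat(long nodeSequenceNo) {
    if (nodeSequenceNo == tokenSequenceNo) {
      return Collections.emptyMap();
    }
    return new HashMap<>(systemCredentials);
  }
}
```

A node that already reports the current sequence number gets an empty map back, so no per-app credential objects need to be built for that heartbeat.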

> Newly retrieved security Tokens are sent as part of each heartbeat to each node from RM which is not desirable in large cluster
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-6523
>                 URL: https://issues.apache.org/jira/browse/YARN-6523
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: RM
>    Affects Versions: 2.8.0, 2.7.3
>            Reporter: Naganarasimha G R
>            Assignee: Manikandan R
>            Priority: Major
>         Attachments: YARN-6523.001.patch, YARN-6523.002.patch, YARN-6523.003.patch, YARN-6523.004.patch, YARN-6523.005.patch, YARN-6523.006.patch, YARN-6523.007.patch, YARN-6523.008.patch, YARN-6523.009.patch
>
>
> Currently, as part of the heartbeat response, RM sets all applications' tokens even though not all applications may be active on the node. On top of that, NodeHeartbeatResponsePBImpl converts the tokens for each app into SystemCredentialsForAppsProto. Hence, for each node and each heartbeat, too many SystemCredentialsForAppsProto objects were being created.
> We hit an OOM while testing 2000 concurrent apps on a 500-node cluster with 8GB RAM configured for RM.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
