Posted to yarn-issues@hadoop.apache.org by "Manikandan R (JIRA)" <ji...@apache.org> on 2018/11/12 17:12:00 UTC

[jira] [Commented] (YARN-6523) Newly retrieved security Tokens are sent as part of each heartbeat to each node from RM which is not desirable in large cluster

    [ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16684104#comment-16684104 ] 

Manikandan R commented on YARN-6523:
------------------------------------

{quote}Does the registration request and response really need a token sequence number field? {quote}

I had added the token sequence number only to the registration response, thinking it would be cleaner to have the sequence number upfront and pass it in the first node heartbeat itself. Anyway, I have removed it now, so the NM's StatusUpdaterImpl passes 0 in the first heartbeat request; from then on, the value is set from the one received in the node heartbeat response from the RM.
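
To illustrate the exchange described above, here is a minimal sketch, not the patch itself. All class and method names (RmSketch, NmSketch, updateCredentials, nodeHeartbeat) are hypothetical, and it assumes the RM bumps its sequence number whenever app credentials change and attaches credentials only when the NM's number differs from its own.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: every class and method here is hypothetical,
// not the actual YARN ResourceManager/NodeManager code.
public class TokenSequenceSketch {

    /** What the RM returns for one node heartbeat. */
    public static final class HeartbeatResponse {
        public final long tokenSequenceNo;
        public final Map<String, String> systemCredentials;

        HeartbeatResponse(long tokenSequenceNo, Map<String, String> systemCredentials) {
            this.tokenSequenceNo = tokenSequenceNo;
            this.systemCredentials = systemCredentials;
        }
    }

    /** RM side: owns the sequence number and the per-app credentials. */
    public static final class RmSketch {
        private long tokenSequenceNo = 1; // starts at 1, as described in the comment
        private final Map<String, String> credentialsByApp = new HashMap<>();

        // Assumption: the counter is bumped whenever app credentials change.
        public void updateCredentials(String appId, String token) {
            credentialsByApp.put(appId, token);
            tokenSequenceNo++;
        }

        // Credentials are attached only when the NM's number differs from the RM's.
        public HeartbeatResponse nodeHeartbeat(long nmTokenSequenceNo) {
            Map<String, String> payload;
            if (nmTokenSequenceNo == tokenSequenceNo) {
                payload = Collections.emptyMap();
            } else {
                payload = new HashMap<>(credentialsByApp);
            }
            return new HeartbeatResponse(tokenSequenceNo, payload);
        }
    }

    /** NM side: sends 0 in the first heartbeat, then echoes the RM's last value. */
    public static final class NmSketch {
        private long tokenSequenceNo = 0;
        private int credentialPayloadsReceived = 0;

        public void heartbeat(RmSketch rm) {
            HeartbeatResponse response = rm.nodeHeartbeat(tokenSequenceNo);
            if (!response.systemCredentials.isEmpty()) {
                credentialPayloadsReceived++;
            }
            tokenSequenceNo = response.tokenSequenceNo;
        }

        public long getTokenSequenceNo() {
            return tokenSequenceNo;
        }

        public int getCredentialPayloadsReceived() {
            return credentialPayloadsReceived;
        }
    }

    public static void main(String[] args) {
        RmSketch rm = new RmSketch();
        NmSketch nm = new NmSketch();
        rm.updateCredentials("app_1", "token-a");
        nm.heartbeat(rm); // numbers differ, credentials are attached
        nm.heartbeat(rm); // numbers match, nothing is attached
        rm.updateCredentials("app_2", "token-b");
        nm.heartbeat(rm); // numbers differ again
        System.out.println(nm.getCredentialPayloadsReceived());
    }
}
```

In this sketch a heartbeat carries credentials only when something changed since the NM last synced, which is the saving the sequence number is meant to provide.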

{quote}Has the RM failover scenario been considered?{quote}

RMContext holds tokenSequenceNo, which is initialised to 1 on start. After any restart it is initialised to 1 again, and once all NMs have re-registered, each NM's first node heartbeat response is sure to carry the credentials because the two values differ.
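
The restart case can be sketched as follows. This is a simplified model under stated assumptions, not the actual RM or NM code: Rm, Nm, register and shouldSendCredentials are hypothetical names, a fresh Rm instance stands in for a restarted RM, and the NM is assumed to go back to sending 0 after it re-registers.

```java
// Illustrative sketch only: hypothetical classes, not the real YARN code.
public class RmRestartSketch {

    /** RM-side counter; constructing a new instance models an RM restart. */
    public static final class Rm {
        public long tokenSequenceNo = 1; // initialised to 1 on every start

        public boolean shouldSendCredentials(long nmTokenSequenceNo) {
            return nmTokenSequenceNo != tokenSequenceNo;
        }
    }

    /** NM-side counter. */
    public static final class Nm {
        public long tokenSequenceNo = 0;

        // Assumption: (re-)registration makes the next heartbeat send 0 again.
        public void register() {
            tokenSequenceNo = 0;
        }

        // Returns true when the heartbeat response would carry credentials.
        public boolean heartbeat(Rm rm) {
            boolean credentialsSent = rm.shouldSendCredentials(tokenSequenceNo);
            tokenSequenceNo = rm.tokenSequenceNo;
            return credentialsSent;
        }
    }

    public static void main(String[] args) {
        Rm rm = new Rm();
        Nm nm = new Nm();
        nm.register();
        System.out.println(nm.heartbeat(rm)); // true: 0 differs from 1
        System.out.println(nm.heartbeat(rm)); // false: both are 1

        rm = new Rm();   // RM restarts, counter is 1 again
        nm.register();   // NM re-registers, next heartbeat sends 0
        System.out.println(nm.heartbeat(rm)); // true: 0 differs from 1
    }
}
```

Note that in this model the guarantee depends on the NM sending 0 after re-registration; if it kept its old value of 1, it would match the restarted RM's counter and no credentials would be sent.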

Taken care of all other comments. Attaching patch for review.



> Newly retrieved security Tokens are sent as part of each heartbeat to each node from RM which is not desirable in large cluster
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-6523
>                 URL: https://issues.apache.org/jira/browse/YARN-6523
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: RM
>    Affects Versions: 2.8.0, 2.7.3
>            Reporter: Naganarasimha G R
>            Assignee: Manikandan R
>            Priority: Major
>         Attachments: YARN-6523.001.patch, YARN-6523.002.patch, YARN-6523.003.patch, YARN-6523.004.patch, YARN-6523.005.patch
>
>
> Currently, as part of the heartbeat response, the RM sets all applications' tokens even though not all applications may be active on the node. On top of that, NodeHeartbeatResponsePBImpl converts the tokens for each app into SystemCredentialsForAppsProto. Hence, for each node and each heartbeat, too many SystemCredentialsForAppsProto objects were getting created.
> We hit an OOM while testing 2000 concurrent apps on a 500-node cluster with 8GB RAM configured for the RM.


