Posted to yarn-issues@hadoop.apache.org by "Robert Kanter (JIRA)" <ji...@apache.org> on 2017/10/05 00:33:00 UTC

[jira] [Updated] (YARN-7262) Add a hierarchy into the ZKRMStateStore for delegation token znodes to prevent jute buffer overflow

     [ https://issues.apache.org/jira/browse/YARN-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated YARN-7262:
--------------------------------
    Attachment: YARN-7262.002.patch

Thanks for the feedback [~templedf].  
I've had a chance to actually use it in a real cluster and everything looks good.

{quote}The new property and default should have javadocs{quote}
It is documented in yarn-default.xml and most of the other properties in {{YarnConfiguration}} don't have Javadocs.

The 002 patch:
- I changed my {{null !=}} - that's what I get for copy-pasting existing code.
- Replaced all {{Assert.assertX}} with simply {{assertX}}
- Added messages to some assert statements
- Added tests for split index 2, 3, and 4.
- No longer stores {{token3}}
- {{initInternal}} now considers 0 a valid value.  I also fixed that for the app split index config.
- Made the "Unknown child node with name" message more descriptive, moved it to the debug level, and updated it to not erroneously complain about the "1", "2", "3", and "4" znodes.  I also made the same improvements to the corresponding code used for app splitting.
- Updated {{loadDelegationTokenFromNode}} to use {{else}} instead of early {{return}}
- Introduced a new variable in {{getLeafZnodePath}} instead of reusing {{splitIdx}}
- Split the long line in {{RMStateStore}}
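
For anyone following along who hasn't looked at the split index before, here is a minimal sketch of the idea. It is not the code from the patch: the class name, the {{leafPath}} helper, the {{/tokens}} root, and the exact path layout are all assumptions made for illustration. With a split index of N, the last N characters of the znode name become the leaf and the rest becomes an intermediate parent under a znode named after the index; 0 keeps the flat layout.

```java
// Illustrative sketch only; names and path layout are hypothetical, not taken from the patch.
final class SplitIndexSketch {

    /**
     * Builds the full path for a znode name under the given root.
     * A split index of 0 keeps the flat layout; 1 through 4 move the last
     * N characters of the name into a separate leaf znode.
     */
    public static String leafPath(String root, String nodeName, int splitIndex) {
        if (splitIndex < 0 || splitIndex > 4) {
            throw new IllegalArgumentException(
                "split index must be between 0 and 4, got " + splitIndex);
        }
        // 0 is a valid value meaning "no hierarchy". Names too short to
        // split are also stored flat (an assumption of this sketch).
        if (splitIndex == 0 || nodeName.length() <= splitIndex) {
            return root + "/" + nodeName;
        }
        // Use a separate variable for the cut position instead of reusing splitIndex.
        int cut = nodeName.length() - splitIndex;
        String parent = nodeName.substring(0, cut);
        String leaf = nodeName.substring(cut);
        return root + "/" + splitIndex + "/" + parent + "/" + leaf;
    }

    public static void main(String[] args) {
        System.out.println(leafPath("/tokens", "RMDelegationToken_12345", 0));
        System.out.println(leafPath("/tokens", "RMDelegationToken_12345", 2));
    }
}
```

In this sketch, names that end in decimal digits give each intermediate parent at most 10^N children, so a single {{getChildren}} call on any one znode returns a much shorter listing than it would with the flat layout.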

> Add a hierarchy into the ZKRMStateStore for delegation token znodes to prevent jute buffer overflow
> ---------------------------------------------------------------------------------------------------
>
>                 Key: YARN-7262
>                 URL: https://issues.apache.org/jira/browse/YARN-7262
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 2.6.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: YARN-7262.001.patch, YARN-7262.002.patch
>
>
> We've seen users who are running into a problem where the RM is storing so many delegation tokens in the {{ZKRMStateStore}} that the _listing_ of those znodes is larger than the jute buffer. This is fine during normal operation, but becomes a problem on a failover because the RM will try to read in all of the token znodes (i.e. call {{getChildren}} on the parent znode).  This is particularly bad because everything appears to be okay, but then if a failover occurs you end up with no active RMs.
> There was a similar problem with the YARN application data that was fixed in YARN-2962 by adding a (configurable) hierarchy of znodes so the RM could pull subchildren without overflowing the jute buffer (though it's off by default).
> We should add a hierarchy similar to that of YARN-2962, but for the delegation token znodes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
