Posted to yarn-dev@hadoop.apache.org by "Clay B. (JIRA)" <ji...@apache.org> on 2017/08/26 01:45:01 UTC

[jira] [Created] (YARN-7106) YARN RM can be crashed requesting too many delegation tokens

Clay B. created YARN-7106:
-----------------------------

             Summary: YARN RM can be crashed requesting too many delegation tokens
                 Key: YARN-7106
                 URL: https://issues.apache.org/jira/browse/YARN-7106
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
            Reporter: Clay B.


My team has seen an interesting issue where a cluster can suffer an
accidental denial of service when someone (or something) requests a very
large number of delegation tokens, leaving the RM unresponsive.
Specifically, one sees the symptoms of YARN-2368 - "ResourceManager failed
when ZKRMStateStore tries to update znode data larger than 1MB" - but in
our case the failure occurs when the RM goes to enumerate the znode
/rmstore/ZKRMStateRoot/RMDTSecretManagerRoot/RMDelegationTokensRoot.
(Note that this delegation-token znode path is not tied to application IDs.)

We seem to have some users who are good at causing this via Oozie. One can
also trigger this by "errantly" running the following in a tight loop:

  curl -H "Content-Type: application/json" -X POST \
       -d '{ "renewer" : "hdfsdu" }' -u : --negotiate \
       http://f-bcpc-vm2.example.bloomberg.com:8088/ws/v1/cluster/delegation-token

However, I can't find any limit on the number of delegation tokens per
user, nor can I find a way to see where the requests are coming from.
(I.e., I would like to be able to get the IP of clients requesting tokens.
The znodes do record the user, so one can at least track down the who, but
not the where, which is often something one must do when a cluster user has
an errant job.)
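
Since the token znodes record the owning user, a simple tally per owner is
enough to find the "who". A minimal sketch, assuming the owner of each
outstanding token has already been extracted from the state store into a
plain list; that extraction step, the function name and the sample user
names are all hypothetical and not part of YARN:

```python
from collections import Counter


def top_token_owners(owners, limit=3):
    """Return up to `limit` (owner, token_count) pairs, largest count first."""
    return Counter(owners).most_common(limit)


# Hypothetical sample data: one entry per outstanding delegation token.
sample_owners = ["oozie", "alice", "oozie", "bob", "oozie", "alice"]

for owner, count in top_token_owners(sample_owners, limit=2):
    print(f"{owner}: {count} outstanding tokens")
```

This only answers who holds the tokens; it does nothing for the "where",
since no client address is recorded.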

In the spirit of YARN-2962, perhaps one remediation could be a per-user
namespace in ZK, so that a user can only deny service to themselves; or
YARN could even raise an exception if more tokens than some threshold are
outstanding.
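
The threshold idea could look roughly like the sketch below. It is an
illustration only, not YARN code: the class, the exception and the
parameter names are hypothetical, and a real implementation would have to
live in the RM's delegation-token secret manager and be thread-safe.

```python
from collections import defaultdict


class TokenLimitExceeded(Exception):
    """Raised when a user already holds the maximum number of tokens."""


class PerUserTokenLimiter:
    """Tracks outstanding tokens per user and refuses requests over a cap."""

    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding  # hypothetical per-user cap
        self._outstanding = defaultdict(int)

    def issue(self, user):
        """Record a newly issued token, or raise if the user is at the cap."""
        if self._outstanding[user] >= self.max_outstanding:
            raise TokenLimitExceeded(
                f"{user} already holds {self.max_outstanding} tokens"
            )
        self._outstanding[user] += 1

    def release(self, user):
        """Record that one of the user's tokens expired or was cancelled."""
        if self._outstanding[user] > 0:
            self._outstanding[user] -= 1

    def outstanding(self, user):
        """Return how many tokens the user currently holds."""
        return self._outstanding.get(user, 0)


limiter = PerUserTokenLimiter(max_outstanding=2)
limiter.issue("hdfsdu")
limiter.issue("hdfsdu")
try:
    limiter.issue("hdfsdu")  # third request exceeds the cap
except TokenLimitExceeded as err:
    print("rejected:", err)
```

With a cap like this, an errant loop fails fast for the one user involved
while other users keep getting tokens.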



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
