Posted to dev@oozie.apache.org by "Rohini Palaniswamy (JIRA)" <ji...@apache.org> on 2017/02/23 18:36:44 UTC

[jira] [Created] (OOZIE-2807) Oozie gets RM delegation token even for checking job status

Rohini Palaniswamy created OOZIE-2807:
-----------------------------------------

             Summary: Oozie gets RM delegation token even for checking job status
                 Key: OOZIE-2807
                 URL: https://issues.apache.org/jira/browse/OOZIE-2807
             Project: Oozie
          Issue Type: Bug
            Reporter: Rohini Palaniswamy
            Assignee: Satish Subhashrao Saley
             Fix For: 5.0.0


We had one user submitting far too many workflows, each with a single Hive query: about 3600 workflows running concurrently. Surprisingly, Oozie held up well without issues.
But [~daryn] from our Hadoop team saw that the number of delegation tokens fetched by Oozie was very high compared to the actual number of jobs submitted, which was stressing the RM with the calls and pushing it close to its memory limits. This is because we fetch a delegation token every time we create a JobClient instead of only during job submission.

https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java#L503-L519

So for one job we fetch
1) 1 token during submission
2) 1 token every 5 minutes when we check status of job
3) 1 token after the job ends to retrieve status.
4) 1 token if we are killing the job.

So for a job running for 11 minutes, we would have fetched the token 4 times (once at submission, twice for status checks, and once after the job ends). Maybe more in other cases like MapReduce, where we check for the end of both the launcher and the child job.
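
The count above can be reproduced with a small calculation. This is only an illustrative sketch: the class and method names are made up for this example, and it assumes status checks fire at fixed intervals counted from submission, which may not match Oozie's actual scheduling exactly.

```java
public final class TokenFetchEstimate {
    private TokenFetchEstimate() {}

    /**
     * Rough estimate of the delegation tokens fetched for one job, following
     * the list in the issue: one at submission, one per periodic status check,
     * one after the job ends, and one more if the job is killed.
     * Assumption: status checks happen every checkIntervalMinutes, counted
     * from the time of submission.
     */
    public static long tokensFetched(long runtimeMinutes, long checkIntervalMinutes, boolean killed) {
        if (runtimeMinutes < 0 || checkIntervalMinutes <= 0) {
            throw new IllegalArgumentException("runtime must be >= 0 and interval must be > 0");
        }
        long statusChecks = runtimeMinutes / checkIntervalMinutes; // integer division
        long total = 1 + statusChecks + 1; // submission + status checks + final status
        if (killed) {
            total += 1; // extra token fetched for the kill call
        }
        return total;
    }

    public static void main(String[] args) {
        // The 11-minute job from the issue, checked every 5 minutes.
        System.out.println(tokensFetched(11, 5, false));
    }
}
```

With a 5-minute interval, an 11-minute job gets two status checks, which together with the submission and final-status fetches gives the 4 tokens mentioned above.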

Only one of these tokens (the one used in the job submission) is cancelled after the job completes. The other tokens are effectively leaked and will only be cleaned up by the RM after the expiry period (the default is 24 hours). This can make the RM run out of memory.
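
To make the direction implied by this report concrete, here is a minimal sketch of fetching a token only when the client is created for submission. It is not Oozie's or Hadoop's actual code: `TokenIssuer`, `Client`, `Purpose`, and `createClient` are hypothetical stand-ins, and it assumes that status checks and kills can run without a fresh delegation token.

```java
import java.util.concurrent.atomic.AtomicInteger;

public final class SubmissionOnlyTokenSketch {
    private SubmissionOnlyTokenSketch() {}

    /** Hypothetical stand-in for the RM call that issues a delegation token. */
    interface TokenIssuer {
        String issueToken(String user);
    }

    /** Why a client is being created. */
    enum Purpose { SUBMIT, STATUS, KILL }

    /** Hypothetical stand-in for a job client; this is not Hadoop's JobClient. */
    static final class Client {
        final String user;
        final String token; // null when no delegation token was requested

        Client(String user, String token) {
            this.user = user;
            this.token = token;
        }
    }

    /** Requests a token only when the client will be used to submit a job. */
    static Client createClient(String user, Purpose purpose, TokenIssuer issuer) {
        String token = (purpose == Purpose.SUBMIT) ? issuer.issueToken(user) : null;
        return new Client(user, token);
    }

    public static void main(String[] args) {
        AtomicInteger issued = new AtomicInteger();
        TokenIssuer issuer = user -> "token-" + issued.incrementAndGet();

        // Same sequence as the 11-minute job: submit, two status checks, final status.
        createClient("alice", Purpose.SUBMIT, issuer);
        createClient("alice", Purpose.STATUS, issuer);
        createClient("alice", Purpose.STATUS, issuer);
        createClient("alice", Purpose.STATUS, issuer);

        System.out.println("tokens issued: " + issued.get());
    }
}
```

In this sketch only the submission path touches the issuer, so the one token that is fetched is also the one that gets cancelled when the job completes, and nothing is left for the RM to expire.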



