Posted to dev@oozie.apache.org by "Rentao Wu (Jira)" <ji...@apache.org> on 2021/06/04 23:25:00 UTC

[jira] [Created] (OOZIE-3624) Oozie scheduled workflows fail when yarn/hdfs cluster changes

Rentao Wu created OOZIE-3624:
--------------------------------

             Summary: Oozie scheduled workflows fail when yarn/hdfs cluster changes
                 Key: OOZIE-3624
                 URL: https://issues.apache.org/jira/browse/OOZIE-3624
             Project: Oozie
          Issue Type: Improvement
          Components: coordinator, workflow
    Affects Versions: 5.2.0
            Reporter: Rentao Wu


When the yarn cluster used by an Oozie scheduled workflow is replaced with a new cluster, future runs of the scheduled workflow break, because they depend on the workflow/job.properties files that were deployed on hdfs.
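
For illustration, a coordinator's job.properties typically pins the job to one cluster along these lines (a sketch only; the host names and paths are hypothetical and not taken from the affected deployment):

{noformat}
# Hypothetical values; both endpoints belong to the cluster that gets replaced
nameNode=hdfs://old-cluster-master:8020
jobTracker=old-cluster-master:8032
# The application definition lives on that cluster's hdfs
oozie.coord.application.path=${nameNode}/user/${user.name}/apps/my-coordinator
{noformat}

Once that cluster is gone, both the addresses and the files behind oozie.coord.application.path go with it.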

The yarn jobtracker will also no longer work due to:

{noformat}
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1622844783178_0004_000002 not found in AMRMTokenSecretManager.
 
{noformat}
 

It seems some tokens are stored in yarn, so when the yarn cluster is terminated and replaced with a new one, the Oozie launcher hits this error.

Recreating the yarn cluster is common in cloud environments. Is there a way for Oozie to be resilient to the underlying yarn cluster being ephemeral?

Is it supported to deploy the workflow, coordinator, and job.properties files on s3 instead of hdfs?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)