Posted to user@oozie.apache.org by Daniel Zhang <ja...@hotmail.com> on 2020/06/26 13:27:47 UTC

Oozie Action Max Retries

Hi,

The Oozie documentation (https://oozie.apache.org/docs/5.1.0/WorkflowFunctionalSpec.html#a18_User-Retry_for_Workflow_Actions_since_Oozie_3.1) describes the maximum number of action retries as follows: "User-Retry allows user to give certain number of reties (must not exceed system max retries)". I assume the system max retries is the "oozie.action.retries.max" property, with a default value of 3, as defined in the document https://oozie.apache.org/docs/5.1.0/WorkflowFunctionalSpec.html#a18_User-Retry_for_Workflow_Actions_since_Oozie_3.1

However, I changed that value to 12 on AWS EMR 5.28.1, as shown below:
[hadoop@ip-10-51-51-37 ~]$ oozie admin -version
Oozie server build version: {"build.version":"5.1.0","vc.url":"https:\/\/git-wip-us.apache.org\/repos\/asf\/oozie.git","vc.revision":"branch-5.1@352b76eb","build.time":"2019.12.14-10:37:29GMT","build.user":"ec2-user"}
[hadoop@ip-10-51-51-37 ~]$ oozie admin -configuration | grep retries
oozie.action.retries.max : 12
oozie.action.ssh.check.retries.max : 3
oozie.service.CallbackService.early.requeue.max.retries : 5
oozie.service.JPAService.retry.max-retries : 10
oozie.zookeeper.max.retries : 10
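
For reference, an override like the one reported above would typically look like this in oozie-site.xml. This is only a sketch: that the value was set in oozie-site.xml is an assumption, since the output above shows only the effective value and not where it was configured.

```xml
<!-- Sketch of an oozie-site.xml override (assumed location of the change). -->
<!-- The property name and the value 12 come from the admin output above. -->
<property>
    <name>oozie.action.retries.max</name>
    <value>12</value>
</property>
```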

In our test of action retries with "<action name="SparkAction" retry-max="10" retry-interval="1">", we observed that the retries still happened ONLY 3 times, at a 1-minute interval, and then the Oozie workflow went to the failure step.
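
For context, the action sits in a workflow roughly like the sketch below. Only the action name and the retry-max and retry-interval attributes are taken from our test; the workflow name, schema versions, Spark settings, parameter names, and transition targets are hypothetical placeholders.

```xml
<!-- Minimal sketch; everything except the <action> attributes is a placeholder. -->
<workflow-app xmlns="uri:oozie:workflow:0.5" name="retry-test-wf">
    <start to="SparkAction"/>

    <!-- retry-max: requested number of user retries; retry-interval: minutes between retries -->
    <action name="SparkAction" retry-max="10" retry-interval="1">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn</master>
            <name>RetryTestJob</name>
            <class>com.example.RetryTestJob</class>
            <jar>${nameNode}/apps/retry-test/lib/retry-test.jar</jar>
        </spark>
        <ok to="end"/>
        <!-- The "failure step" reached once retries are exhausted -->
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>SparkAction failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
```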

Any idea why? In one of our business cases, we want to retry more than 3 times, at the interval we define, before going to the failure step.

Thanks