Posted to user@oozie.apache.org by Michael Tarley <mt...@247-inc.com> on 2013/12/27 05:03:04 UTC

How to remove stale actions?

Hello all,

We run a Hadoop cluster of 60 datanodes (0.20.2, CDH3u4). Almost all jobs are scheduled via Oozie. Recently I’ve noticed that we get ~200 oozie.log entries per minute for actions of killed workflows:

[26/12/2013:16:00:55 PST] pool-2-thread-30  WARN org.apache.oozie.command.wf.SignalCommand: USER[oozie] GROUP[users] TOKEN[] APP[sv1_prod_client-20131001T1303-kec-W] JOB[0063764-131209015949959-oozie-tell-W] ACTION[0063764-131209015949959-oozie-tell-W@check-work-directory] Workflow not RUNNING, current status [KILLED]

These workflows were in fact killed using the 'oozie job -kill' command, but these action-related updates continue forever. Each new oozie kill results in one or even two more zombie actions.
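To see which actions produce most of these warnings, a tally along the following lines may help. It is only a rough sketch: the regular expression is inferred from the single log line quoted above, the function name count_stale_actions is made up for illustration, and the log path in the usage section is an assumption that will differ per installation.

```python
import re
import sys
from collections import Counter

# Pattern inferred from the one sample line in this post (an assumption);
# other Oozie versions or log4j layouts may format these entries differently.
STALE_RE = re.compile(
    r"SignalCommand: .*?JOB\[(?P<job>[^\]]+)\] ACTION\[(?P<action>[^\]]+)\] "
    r"Workflow not RUNNING, current status \[KILLED\]"
)

# The log entry quoted above, used here as a self-check.
SAMPLE = (
    "[26/12/2013:16:00:55 PST] pool-2-thread-30  WARN "
    "org.apache.oozie.command.wf.SignalCommand: USER[oozie] GROUP[users] TOKEN[] "
    "APP[sv1_prod_client-20131001T1303-kec-W] "
    "JOB[0063764-131209015949959-oozie-tell-W] "
    "ACTION[0063764-131209015949959-oozie-tell-W@check-work-directory] "
    "Workflow not RUNNING, current status [KILLED]"
)


def count_stale_actions(lines):
    """Count matching warnings per action ID (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = STALE_RE.search(line)
        if match:
            counts[match.group("action")] += 1
    return counts


if __name__ == "__main__":
    # Usage (log path is an assumption): python count_stale.py /path/to/oozie.log
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as log_file:
            for action, n in count_stale_actions(log_file).most_common(20):
                print(n, action)
```

The output lists the 20 noisiest action IDs, which should show whether the ~200 entries per minute come from a handful of zombies or from many.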

Where do these zombies come from? How do we stop them?

Thanks,
mtjr