Posted to user@oozie.apache.org by David Morel <da...@amakuru.net> on 2014/10/07 23:55:21 UTC

Re: Best way to trigger oozie workflow?

On 9 Sep 2014, at 21:04, Paul Chavez wrote:

> If a workflow takes longer than the coordination interval to execute 
> then the new workflow will be created and put into 'Waiting' state by 
> default. There are concurrency settings that can allow more than one 
> workflow to execute at a time. Since you will be moving data into a 
> processing directory, you can run with concurrency greater than one if
> the processing directory is unique per workflow instance.

This is what I did when I had to do something similar:
- Used dated directories in HDFS (created through WebHDFS from the
   script pushing the data).
- Set the action to time out, so that if the expected file (the file
   name is always the same) didn't show up after a while, the action
   would not stay in WAITING state forever.
- Set concurrency to a few instances so there's no pileup.
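
For illustration, the first point (creating a dated directory through
WebHDFS from the pushing script) could look roughly like the Python
sketch below. This is not the original script: the host name, port,
user, base path and the YYYY/MM/DD layout are all assumptions made up
for the example. Only the MKDIRS operation and the /webhdfs/v1 URL
prefix come from the WebHDFS REST API itself.

```python
import json
from datetime import date
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen


def dated_dir(base: str, day: date) -> str:
    """Return a per-day directory such as /data/incoming/2014/10/07.

    The YYYY/MM/DD layout is an assumption; any layout works as long as
    the coordinator dataset uses the same one.
    """
    return f"{base.rstrip('/')}/{day:%Y/%m/%d}"


def mkdirs_url(namenode: str, path: str, user: str) -> str:
    """Build the WebHDFS URL for a MKDIRS call on the given HDFS path."""
    query = urlencode({"op": "MKDIRS", "user.name": user})
    return f"http://{namenode}/webhdfs/v1{quote(path)}?{query}"


def create_dir(namenode: str, path: str, user: str) -> bool:
    """Send the MKDIRS request (HTTP PUT) and return the 'boolean' result."""
    request = Request(mkdirs_url(namenode, path, user), method="PUT")
    with urlopen(request) as response:
        return json.load(response)["boolean"]


if __name__ == "__main__":
    # Hypothetical values: adjust the namenode address, user and base path.
    target = dated_dir("/data/incoming", date.today())
    print(mkdirs_url("namenode.example.com:50070", target, "oozie"))
```

Calling create_dir() with the same arguments would actually create the
directory; the script then uploads the data file into it under its
fixed name, which is what the coordinator waits for.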

To be honest, I don't remember how I'm doing cleanup of the created
dirs, but you get the idea :)

D.Morel