Posted to user@oozie.apache.org by "Raja.Aravapalli" <Ra...@target.com> on 2014/05/21 16:11:28 UTC

oozie coordinator

Hi

Can anyone tell me whether it is possible to configure a coordinator application so that, rather than triggering the workflow on a time-based frequency, it triggers the workflow a specified number of times?

Let me elaborate on what I need:

Suppose I have four files in my input directory that I need to process.
Now, I want my coordinator to trigger the workflow only 4 times, since each run processes one file from the input directory.

By default, the coordinator triggers the workflow based on the time frequency we give it, but I want it to trigger based on the number of input files rather than the frequency.

Is it possible to configure this in the Oozie coordinator.xml file?

Can anyone please help me?



Regards,
Raja.

Re: oozie coordinator

Posted by Mona Chitnis <ch...@yahoo-inc.com>.
This can be achieved slightly indirectly. You can set your <dataset> <uri-template> to point to your input files’ parent directory, and set an arbitrary frequency to execute the coordinator. Set a timeout for the coordinator action, so that it times out after a finite interval when no input file is present. Use some custom logic in a Java action in the workflow to look for the ‘next unprocessed’ input file, since I’m guessing your input files do not follow a synchronous filename pattern.
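
One possible shape for the workflow side of this suggestion is sketched below: a single Java action that receives the input directory and handles one file per run. This is only an illustration, not a tested configuration. The class name com.example.NextUnprocessedFile, the inputDir property, the workflow and action names, and the schema version are all assumptions; the logic that decides which file is "next" and records which files are already done (for example, by moving each finished file elsewhere) would have to be written in that class.

```xml
<!-- Sketch only: names marked "hypothetical" are not from the thread. -->
<workflow-app name="process-next-file" xmlns="uri:oozie:workflow:0.4">
  <start to="pick-and-process"/>

  <action name="pick-and-process">
    <java>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- Hypothetical class: lists the input directory, picks the next
           unprocessed file, processes it, and marks it as done. -->
      <main-class>com.example.NextUnprocessedFile</main-class>
      <!-- inputDir is a hypothetical property supplied by the coordinator. -->
      <arg>${inputDir}</arg>
    </java>
    <ok to="end"/>
    <error to="fail"/>
  </action>

  <kill name="fail">
    <message>Java action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
```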

Re: oozie coordinator

Posted by siva kumar <si...@gmail.com>.
Hi Raja,

Can you please post the workflow that you have designed to take only one file at a time?

Thanks and regards

On Sat, May 24, 2014 at 3:00 PM, David Morel <da...@amakuru.net> wrote:

> Then create one directory per file. If, for instance, you can make sure you
> will not have more than one file per minute, include a yyyyMMddHHmm
> timestamp in the directory name, use that as your dataset for a minutely
> Oozie job, and set the concurrency to 5 and a timeout of 5 minutes (or a bit
> more) in the coordinator XML.
> David

RE: oozie coordinator

Posted by David Morel <da...@amakuru.net>.
Then create one directory per file. If, for instance, you can make sure you will not have more than one file per minute, include a yyyyMMddHHmm timestamp in the directory name, use that as your dataset for a minutely Oozie job, and set the concurrency to 5 and a timeout of 5 minutes (or a bit more) in the coordinator XML.
David
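
As a rough illustration of the setup described above, a coordinator along these lines might work. It is an untested sketch: the application name, the ${inputRoot}, ${startTime}, ${endTime} and ${workflowAppPath} properties, the inputDir property name, and the schema version are all assumptions. The empty done-flag is meant to make the directory itself, rather than a _SUCCESS file inside it, the availability signal.

```xml
<!-- Sketch only: property names and paths are hypothetical. -->
<coordinator-app name="one-file-per-minute"
                 frequency="${coord:minutes(1)}"
                 start="${startTime}" end="${endTime}" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
  <controls>
    <!-- Give up on an action if its directory has not appeared after 5 minutes. -->
    <timeout>5</timeout>
    <!-- Allow up to 5 actions to run at the same time. -->
    <concurrency>5</concurrency>
  </controls>

  <datasets>
    <!-- One directory per minute, named with a yyyyMMddHHmm timestamp. -->
    <dataset name="input" frequency="${coord:minutes(1)}"
             initial-instance="${startTime}" timezone="UTC">
      <uri-template>${inputRoot}/${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}</uri-template>
      <!-- Empty done-flag: the directory's existence marks the data as ready. -->
      <done-flag></done-flag>
    </dataset>
  </datasets>

  <input-events>
    <data-in name="in" dataset="input">
      <instance>${coord:current(0)}</instance>
    </data-in>
  </input-events>

  <action>
    <workflow>
      <app-path>${workflowAppPath}</app-path>
      <configuration>
        <property>
          <!-- Hypothetical property read by the workflow. -->
          <name>inputDir</name>
          <value>${coord:dataIn('in')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
```

With a layout like this, each minute that has a matching directory should produce one workflow run, while minutes without a directory simply time out, so the number of runs follows the number of input files.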

On 24 May 2014 08:11:57 CEST, "Raja.Aravapalli" <Ra...@target.com> wrote:
>Hi Siva,
>
>What you said is correct. But my workflow is designed in such a way that
>it takes only one file at a time when it runs... :)
>
>Regards,
>Raja.

RE: oozie coordinator

Posted by "Raja.Aravapalli" <Ra...@target.com>.
Hi Siva,

What you said is correct. But my workflow is designed in such a way that it takes only one file at a time when it runs... :)

Regards,
Raja.

-----Original Message-----
From: siva kumar [mailto:siva165755@gmail.com] 
Sent: Friday, May 23, 2014 6:18 PM
To: user@oozie.apache.org
Subject: Re: oozie coordinator

Hi Raja,

You have specified a condition that only one file should be processed each time the workflow runs. But when you specify an input directory path containing four files in the workflow, how can Oozie know which file to fetch? You need to specify a particular file name if your condition is to be satisfied.

When you specify a directory containing many files, I think the Oozie workflow processes all the files in a single run. I hope you can see what I am trying to say. So, in this context, I think the above requirement may not be possible.


Re: oozie coordinator

Posted by siva kumar <si...@gmail.com>.
Hi Raja,

You have specified a condition that only one file should be processed each time the workflow runs. But when you specify an input directory path containing four files in the workflow, how can Oozie know which file to fetch? You need to specify a particular file name if your condition is to be satisfied.

When you specify a directory containing many files, I think the Oozie workflow processes all the files in a single run. I hope you can see what I am trying to say. So, in this context, I think the above requirement may not be possible.

On Fri, May 23, 2014 at 4:23 PM, Raja.Aravapalli <Raja.Aravapalli@target.com> wrote:

> OK, thanks for your reply, Siva. Please do let me know if you find a
> solution for this, as this type of customization would help us a lot
> compared with killing the coordinator manually.
>
> Regards,
> Raja.

RE: oozie coordinator

Posted by "Raja.Aravapalli" <Ra...@target.com>.
OK, thanks for your reply, Siva. Please do let me know if you find a solution for this, as this type of customization would help us a lot compared with killing the coordinator manually.

Regards,
Raja.

-----Original Message-----
From: siva kumar [mailto:siva165755@gmail.com] 
Sent: Friday, May 23, 2014 2:37 PM
To: user@oozie.apache.org
Subject: Re: oozie coordinator

Hi Raja,

I have worked on the same requirement. But as far as I know, that is not how Oozie works. Oozie is used as a scheduler of jobs based on some condition (frequency, time, data availability). I could not find a way to meet the requirement you have posted.

Thanks


Re: oozie coordinator

Posted by siva kumar <si...@gmail.com>.
Hi Raja,

I have worked on the same requirement. But as far as I know, that is not how Oozie works. Oozie is used as a scheduler of jobs based on some condition (frequency, time, data availability). I could not find a way to meet the requirement you have posted.

Thanks
