Posted to user@oozie.apache.org by Kenneth Kavaliauskas <ke...@ericsson.com> on 2014/10/23 19:53:54 UTC

Problem with java action getting restarted.

I'm using an oozie java action step to start a java main.  This java application does some calculations and then runs another map-reduce job based on that data.
Since the oozie java action runs as a map-only job through oozie it is also seen in job tracker.

One of our nodes was low on memory, so the task tracker killed the oozie map-only job and restarted it on another node.
However, before it was killed, the java application had already spawned its own map-reduce job.
When the oozie map-only job was restarted on the other node, it spawned yet another map-reduce job with the same data as the former one.
Job tracker now shows duplicate map-reduce jobs running against the same data.

How do you prevent/manage/alter settings such that the java program that oozie initiates in the map-only process only gets run once?

Any help would be appreciated,
Ken

Ken Kavaliauskas
Software Engineer

Ericsson
kenneth.kavaliauskas@ericsson.com


Re: Problem with java action getting restarted.

Posted by Robert Kanter <rk...@cloudera.com>.
Hi Kenneth,

There currently isn't a great way to do this, especially on Hadoop 1.  You
can give your spawned job a unique name, and have your Java action check
for any running jobs with that name before running a new instance of the
job.  In later versions of Hadoop 2, we can actually set "tags" on jobs, so
we've added a similar check to most actions, but using a tag instead of the
job name.

- Robert
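
A minimal sketch of the name-check idea above, not taken from Oozie or Hadoop code. The class SpawnOnce, the JobCluster interface and both of its methods are hypothetical names introduced here so the example stays self-contained; a real implementation would back JobCluster with the Hadoop client API for your version (on Hadoop 1, possibly JobClient.jobsToComplete() together with RunningJob.getJobName(), but verify that against your cluster). The sketch also assumes the workflow passes something stable across restarts, such as ${wf:id()}, to the Java main as an argument, so that the restarted launcher computes the same job name.

```java
import java.util.List;

public class SpawnOnce {

    /** Hypothetical abstraction over the cluster; not a Hadoop or Oozie type. */
    public interface JobCluster {
        /** Names of jobs that are currently queued or running. */
        List<String> runningJobNames();

        /** Submits the spawned map-reduce job under the given name. */
        void submit(String jobName);
    }

    /**
     * Builds a name that is identical on every attempt of the same action,
     * so a restarted launcher can recognise the job it spawned earlier.
     */
    public static String uniqueJobName(String workflowId, String actionName) {
        return "spawned-" + workflowId + "-" + actionName;
    }

    /**
     * Submits the job only if no job with that name is already running.
     * Returns true if a new job was submitted, false if one was found.
     */
    public static boolean submitIfAbsent(JobCluster cluster, String jobName) {
        if (cluster.runningJobNames().contains(jobName)) {
            return false; // an earlier attempt already spawned this job
        }
        cluster.submit(jobName);
        return true;
    }
}
```

Note that the check and the submit are two separate steps, so this narrows the window for duplicates rather than closing it, and it only sees jobs that are still running. If the first spawned job may already have finished before the restart, the lookup would need to cover completed jobs as well.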
