Posted to mapreduce-user@hadoop.apache.org by Something Something <ma...@gmail.com> on 2014/03/17 19:43:00 UTC

Is Hadoop's ToolRunner thread-safe?

I would like to trigger a few Hadoop jobs simultaneously.  I've created a
pool of threads using Executors.newFixedThreadPool.  The idea is that if the
pool size is 2, my code will trigger 2 Hadoop jobs at the same time
using 'ToolRunner.run'.  In my testing, I noticed that these 2 threads keep
stepping on each other.

When I looked under the hood, I noticed that ToolRunner creates a
GenericOptionsParser, which in turn calls a static method,
'buildGeneralOptions'.  This method uses 'OptionBuilder.withArgName', which
stores its value in a static variable called 'argName'.  This doesn't look
thread-safe to me, and I believe it is the root cause of the issues I am
running into.
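
To make the suspected race concrete, here is a minimal sketch that does not
use Hadoop or Commons CLI at all.  'StaticBuilder' is a hypothetical stand-in
for a builder that keeps its pending state in a static field, and
'PARSE_LOCK', 'buildUnsafe', 'buildGuarded' and 'countMismatches' are names
made up for this illustration.  It only shows why a two-step build over
shared static state can go wrong under a fixed thread pool, and how one
shared lock around the build avoids that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class StaticBuilderRace {

    /** Hypothetical stand-in for a builder whose pending state lives in a static field. */
    static final class StaticBuilder {
        private static String argName;   // shared by every thread in the JVM

        static void withArgName(String name) { argName = name; }

        static String create() {
            String result = argName;
            argName = null;              // reset for the next caller
            return result;
        }
    }

    private static final Object PARSE_LOCK = new Object();

    /** No locking: another thread may overwrite or reset argName between the two calls. */
    static String buildUnsafe(String name) {
        StaticBuilder.withArgName(name);
        Thread.yield();                  // widen the window for interleaving
        return StaticBuilder.create();
    }

    /** Guarded: the two-step build runs as a single critical section. */
    static String buildGuarded(String name) {
        synchronized (PARSE_LOCK) {
            StaticBuilder.withArgName(name);
            Thread.yield();
            return StaticBuilder.create();
        }
    }

    /** Runs 'threads' workers, each doing 'rounds' builds; returns how many builds came back wrong. */
    static int countMismatches(int threads, int rounds, boolean guarded) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final String name = "arg-" + t;
                Callable<Integer> task = () -> {
                    int bad = 0;
                    for (int i = 0; i < rounds; i++) {
                        String built = guarded ? buildGuarded(name) : buildUnsafe(name);
                        if (!name.equals(built)) {
                            bad++;
                        }
                    }
                    return bad;
                };
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // The unguarded count varies from run to run; the guarded count stays at zero.
        System.out.println("unguarded mismatches: " + countMismatches(2, 100_000, false));
        System.out.println("guarded mismatches:   " + countMismatches(2, 100_000, true));
    }
}
```

Applied to the real code, the same idea would presumably mean holding one
shared lock only while constructing the GenericOptionsParser, then calling
'tool.setConf(conf)' and 'tool.run(parser.getRemainingArgs())' outside the
lock so the jobs themselves still run in parallel.  That is an untested
assumption, not a confirmed fix.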

Any thoughts?

Re: Is Hadoop's ToolRunner thread-safe?

Posted by Something Something <ma...@gmail.com>.
Any thoughts on this?  Can anyone confirm or deny that it's an issue?

