Posted to common-user@hadoop.apache.org by Mike Kendall <mk...@justin.tv> on 2009/11/14 20:22:37 UTC

common reasons a map task would fail on a distributed cluster but not locally?

So if I run my task as:

cat input | ./map.py | ./sum.py > output

it works just fine. However, running it on my cluster as:

hadoop jar /usr/local/hadoop/contrib/streaming/hadoop-*-streaming.jar -file
map.py -mapper map.py -file cat.py -reducer cat.py -input input -output
output

it fails. I'm really confused as to why this script would fail while others
written with the same methodology work.

Is there a "common reasons map tasks fail" list somewhere? Any ideas?
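[One difference between the shell pipeline above and a real streaming job is that Hadoop sorts the map output by key before the reducer sees it, so a bare `map | reduce` pipe can behave differently from the cluster. A sketch of a closer local simulation, using toy stand-ins for map.py and sum.py since their real contents aren't shown in the thread:]

```shell
# Toy stand-ins for the poster's scripts (hypothetical; the real map.py
# and sum.py are not in the thread). map.py emits "word<TAB>1" per word;
# sum.py totals the counts per key.
cat > map.py <<'EOF'
#!/usr/bin/env python3
import sys
for line in sys.stdin:
    for word in line.split():
        print(word + "\t1")
EOF
cat > sum.py <<'EOF'
#!/usr/bin/env python3
import sys
counts = {}
for line in sys.stdin:
    key, val = line.rstrip("\n").split("\t")
    counts[key] = counts.get(key, 0) + int(val)
for key in sorted(counts):
    print("%s\t%d" % (key, counts[key]))
EOF
chmod +x map.py sum.py
printf 'b a\na b b\n' > input

# The `sort` stands in for the shuffle: streaming reducers see their
# input grouped and sorted by key, which "cat input | ./map.py | ./sum.py"
# alone does not reproduce.
cat input | ./map.py | sort | ./sum.py > output
cat output
```

If a reducer assumes sorted, grouped input, it can pass the unsorted local pipe and still fail (or succeed spuriously) on the cluster.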

Re: common reasons a map task would fail on a distributed cluster but not locally?

Posted by Todd Lipcon <to...@cloudera.com>.
On Mon, Nov 16, 2009 at 10:43 PM, Jason Venner <ja...@gmail.com>wrote:

> The common reasons I have failures with streaming jobs are
>
> 1) the script exits with a non-zero exit status, which is considered a task
> failure by the task tracker
> 2) at least in 0.18 and 0.19, if the script writes to stdout before reading
> the first input record, the streaming code will NPE because the readers are
> not fully set up
>

A similar bug still exists in 0.20.1.

https://issues.apache.org/jira/browse/MAPREDUCE-576


> 3) an incorrect expectation about the way the input records are formatted
> 4) an issue with the runtime environment
>

Re: common reasons a map task would fail on a distributed cluster but not locally?

Posted by Jason Venner <ja...@gmail.com>.
The common reasons I have failures with streaming jobs are

1) the script exits with a non-zero exit status, which is considered a task
failure by the task tracker
2) at least in 0.18 and 0.19, if the script writes to stdout before reading
the first input record, the streaming code will NPE because the readers are
not fully set up
3) an incorrect expectation about the way the input records are formatted
4) an issue with the runtime environment
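[A defensive mapper along these lines sidesteps reasons 1 and 2 above — it reads input before producing any output and controls its exit status explicitly. This is a hypothetical sketch, not a script from the thread:]

```python
#!/usr/bin/env python3
# Hypothetical defensive streaming mapper (illustration only).
import sys

def map_records(lines):
    """Yield key<TAB>value pairs; tolerates blank records, since
    surprises in input format are failure reason 3 above."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Emit the first token as key, count 1 as value — the
        # tab-separated format streaming expects by default.
        yield "%s\t%s" % (line.split()[0], 1)

def main():
    # Read before writing: per reason 2, on 0.18/0.19 writing to stdout
    # before the first input record is consumed can NPE in the streaming code.
    for record in map_records(sys.stdin):
        print(record)
    return 0  # per reason 1, a non-zero exit status is treated as task failure

if __name__ == "__main__" and not sys.stdin.isatty():
    # Only run when input is actually piped in, as streaming does.
    sys.exit(main())
```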


-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals

Re: common reasons a map task would fail on a distributed cluster but not locally?

Posted by Siddu <si...@gmail.com>.

If the failure is accompanied by any error output, please paste it here!



-- 
Regards,
~Sid~
I have never met a man so ignorant that i couldn't learn something from him

Re: common reasons a map task would fail on a distributed cluster but not locally?

Posted by Mike Kendall <mk...@justin.tv>.
For some reason I never tried lowering my number of map and reduce tasks
until now. Looks like I need to reconfigure my cluster, since it runs fine
with only 3 map tasks and 3 reduce tasks.

:X
