Posted to common-dev@hadoop.apache.org by Michael Hu <me...@gmail.com> on 2011/07/11 05:14:25 UTC

"java.lang.Throwable: Child Error " And " Task process exit with nonzero status of 1."

Hi all,
    Hadoop is set up, but whenever I run a job I get the same error.
The error is:

    micah29@nc2:/usr/local/hadoop/hadoop$ ./bin/hadoop jar
hadoop-mapred-examples-0.21.0.jar wordcount test testout

*11/07/11 10:48:59 INFO mapreduce.Job: Running job: job_201107111031_0003
11/07/11 10:49:00 INFO mapreduce.Job:  map 0% reduce 0%
11/07/11 10:49:11 INFO mapreduce.Job: Task Id :
attempt_201107111031_0003_m_000002_0, Status : FAILED
java.lang.Throwable: Child Error
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:249)
Caused by: java.io.IOException: Task process exit with nonzero status of 1.
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:236)

11/07/11 10:49:11 WARN mapreduce.Job: Error reading task
outputhttp://nc2:50060/tasklog?plaintext=true&attemptid=attempt_201107111031_0003_m_000002_0&filter=stdout
11/07/11 10:49:11 WARN mapreduce.Job: Error reading task
outputhttp://nc2:50060/tasklog?plaintext=true&attemptid=attempt_201107111031_0003_m_000002_0&filter=stderr
*

    I googled "Task process exit with nonzero status of 1." People say
it's an OS limit on the number of sub-directories that can be created in
another directory, but I can still create sub-directories under that
directory without any problem. A couple of quick checks are sketched below.
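
In case it helps, the checks I plan to try look roughly like this (paths
assume my install under /usr/local/hadoop/hadoop; the tasklog URL is the
one reported above):

    # Pull the failed attempt's stderr straight from the TaskTracker
    curl -s "http://nc2:50060/tasklog?plaintext=true&attemptid=attempt_201107111031_0003_m_000002_0&filter=stderr"

    # Count the entries under the task log directory (ext3 caps a single
    # directory at roughly 32000 sub-directories)
    ls /usr/local/hadoop/hadoop/logs/userlogs | wc -l

    # Limits for the user that runs the TaskTracker
    ulimit -a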

    Could anybody please help me solve this problem? Thanks
-- 
Yours sincerely
Hu Shengqiu

Re: "java.lang.Throwable: Child Error " And " Task process exit with nonzero status of 1."

Posted by Harsh J <ha...@cloudera.com>.
Right, it may be hitting this one:
https://issues.apache.org/jira/browse/MAPREDUCE-2592

But exit code 1 is perhaps too ambiguous to point to just that; it
could be caused by something else entirely.

On Mon, Jul 11, 2011 at 8:48 PM, Steve Lewis <lo...@gmail.com> wrote:
> We have seen that limit reached when we ran a large number of jobs (3000
> strikes me as the figure but the real number may be hire) It has to do with
> the number of files created in a 24 hour period, My colleague who had the
> issue created a chron job to clear out the logs every hour.
> How many jobs are you running in a 24 hour period?
>
> On Mon, Jul 11, 2011 at 1:17 AM, Sudharsan Sampath <su...@gmail.com>
> wrote:
>>
>> Hi,
>>
>> The issue could be attributed to many causes. Few of which are
>>
>> 1) Unable to create logs due to insufficient space in the logs directory,
>> permissions issue.
>> 2) ulimit threshold that causes insuffucient allocation of memory.
>> 3) OOM on the child or unable to allocate the configured memory while
>> spawning the child
>> 4) Bug in the child args configuration in the mapred-site
>> 5) Unable to write the temp outputs (due to space or permission issue)
>>
>> The log that u mentioned in a limit on the file system spec and usually
>> occurs in a complex environment. Highly rare it could be the issue in
>> running the wordcount example.
>>
>> Thanks
>> Sudhan S
>>
>> On Mon, Jul 11, 2011 at 11:50 AM, Devaraj Das <dd...@hortonworks.com>
>> wrote:
>>>
>>> Moving this to mapreduce-user (this is the right list)..
>>>
>>> Could you please look at the TaskTracker logs around the time when you
>>> see the task failure. That might have something more useful for debugging..
>>>
>>>
>>> On Jul 10, 2011, at 8:14 PM, Michael Hu wrote:
>>>
>>> > Hi,all,
>>> >    The hadoop is set up. Whenever I run a job, I always got the same
>>> > error.
>>> > Error is:
>>> >
>>> >    micah29@nc2:/usr/local/hadoop/hadoop$ ./bin/hadoop jar
>>> > hadoop-mapred-examples-0.21.0.jar wordcount test testout
>>> >
>>> > *11/07/11 10:48:59 INFO mapreduce.Job: Running job:
>>> > job_201107111031_0003
>>> > 11/07/11 10:49:00 INFO mapreduce.Job:  map 0% reduce 0%
>>> > 11/07/11 10:49:11 INFO mapreduce.Job: Task Id :
>>> > attempt_201107111031_0003_m_000002_0, Status : FAILED
>>> > java.lang.Throwable: Child Error
>>> >        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:249)
>>> > Caused by: java.io.IOException: Task process exit with nonzero status
>>> > of 1.
>>> >        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:236)
>>> >
>>> > 11/07/11 10:49:11 WARN mapreduce.Job: Error reading task
>>> >
>>> > outputhttp://nc2:50060/tasklog?plaintext=true&attemptid=attempt_201107111031_0003_m_000002_0&filter=stdout
>>> > 11/07/11 10:49:11 WARN mapreduce.Job: Error reading task
>>> >
>>> > outputhttp://nc2:50060/tasklog?plaintext=true&attemptid=attempt_201107111031_0003_m_000002_0&filter=stderr
>>> > *
>>> >
>>> >    I google the " Task process exit with nonzero status of 1." They say
>>> > 'it's an OS limit on the number of sub-directories that can be related
>>> > in
>>> > another directory.' But I can create any sub-directories related in
>>> > another
>>> > directory.
>>> >
>>> >    Please, could anybody help me to solve this problem? Thanks
>>> > --
>>> > Yours sincerely
>>> > Hu Shengqiu
>>>
>>
>
>
>
> --
> Steven M. Lewis PhD
> 4221 105th Ave NE
> Kirkland, WA 98033
> 206-384-1340 (cell)
> Skype lordjoe_com
>
>
>



-- 
Harsh J

Re: "java.lang.Throwable: Child Error " And " Task process exit with nonzero status of 1."

Posted by Steve Lewis <lo...@gmail.com>.
We have seen that limit reached when we ran a large number of jobs (3000
strikes me as the figure, but the real number may be higher). It has to do
with the number of task log files created in a 24-hour period. My colleague
who hit the issue set up a cron job to clear out the logs every hour; a
sketch of such a job follows.
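
Roughly the shape such a cleanup could take (the path and the retention
window here are placeholders; adjust them for your layout):

    # Hourly cron entry: remove per-attempt log directories older than a day
    # from the TaskTracker's userlogs directory
    0 * * * *  find /usr/local/hadoop/hadoop/logs/userlogs -mindepth 1 -maxdepth 1 -type d -mmin +1440 -exec rm -rf {} +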

How many jobs are you running in a 24-hour period?

On Mon, Jul 11, 2011 at 1:17 AM, Sudharsan Sampath <su...@gmail.com>wrote:

> Hi,
>
> The issue could be attributed to many causes. Few of which are
>
> 1) Unable to create logs due to insufficient space in the logs directory,
> permissions issue.
> 2) ulimit threshold that causes insuffucient allocation of memory.
> 3) OOM on the child or unable to allocate the configured memory while
> spawning the child
> 4) Bug in the child args configuration in the mapred-site
> 5) Unable to write the temp outputs (due to space or permission issue)
>
> The log that u mentioned in a limit on the file system spec and usually
> occurs in a complex environment. Highly rare it could be the issue in
> running the wordcount example.
>
> Thanks
> Sudhan S
>
>
> On Mon, Jul 11, 2011 at 11:50 AM, Devaraj Das <dd...@hortonworks.com>wrote:
>
>> Moving this to mapreduce-user (this is the right list)..
>>
>> Could you please look at the TaskTracker logs around the time when you see
>> the task failure. That might have something more useful for debugging..
>>
>>
>> On Jul 10, 2011, at 8:14 PM, Michael Hu wrote:
>>
>> > Hi,all,
>> >    The hadoop is set up. Whenever I run a job, I always got the same
>> error.
>> > Error is:
>> >
>> >    micah29@nc2:/usr/local/hadoop/hadoop$ ./bin/hadoop jar
>> > hadoop-mapred-examples-0.21.0.jar wordcount test testout
>> >
>> > *11/07/11 10:48:59 INFO mapreduce.Job: Running job:
>> job_201107111031_0003
>> > 11/07/11 10:49:00 INFO mapreduce.Job:  map 0% reduce 0%
>> > 11/07/11 10:49:11 INFO mapreduce.Job: Task Id :
>> > attempt_201107111031_0003_m_000002_0, Status : FAILED
>> > java.lang.Throwable: Child Error
>> >        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:249)
>> > Caused by: java.io.IOException: Task process exit with nonzero status of
>> 1.
>> >        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:236)
>> >
>> > 11/07/11 10:49:11 WARN mapreduce.Job: Error reading task
>> >
>> outputhttp://nc2:50060/tasklog?plaintext=true&attemptid=attempt_201107111031_0003_m_000002_0&filter=stdout
>> > 11/07/11 10:49:11 WARN mapreduce.Job: Error reading task
>> >
>> outputhttp://nc2:50060/tasklog?plaintext=true&attemptid=attempt_201107111031_0003_m_000002_0&filter=stderr
>> > *
>> >
>> >    I google the " Task process exit with nonzero status of 1." They say
>> > 'it's an OS limit on the number of sub-directories that can be related
>> in
>> > another directory.' But I can create any sub-directories related in
>> another
>> > directory.
>> >
>> >    Please, could anybody help me to solve this problem? Thanks
>> > --
>> > Yours sincerely
>> > Hu Shengqiu
>>
>>
>


-- 
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com

Re: "java.lang.Throwable: Child Error " And " Task process exit with nonzero status of 1."

Posted by Sudharsan Sampath <su...@gmail.com>.
Hi,

The issue could be attributed to many causes. A few of them are:

1) Unable to create logs, due to insufficient space in the logs directory
or a permissions issue.
2) A ulimit threshold that causes insufficient allocation of memory.
3) OOM in the child, or being unable to allocate the configured memory while
spawning the child.
4) A bug in the child args configuration in mapred-site.xml.
5) Unable to write the temp outputs (due to a space or permissions issue).

The limit that you mentioned comes from the file system spec and usually
shows up only in a complex environment. It is highly unlikely to be the
issue when running the wordcount example. A quick checklist for the points
above is sketched below.
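
Something along these lines covers (1), (2), (4) and (5); the paths and the
property name assume a default single-node install under
/usr/local/hadoop/hadoop:

    # (1)/(5) space and permissions on the log dir and the mapred local dir
    df -h /usr/local/hadoop/hadoop/logs
    ls -ld /usr/local/hadoop/hadoop/logs /tmp/hadoop-micah29/mapred/local

    # (2) limits for the user that runs the TaskTracker
    ulimit -a

    # (3)/(4) heap given to each child JVM; a malformed value here (for
    # example a broken -Xmx) makes every child exit immediately with status 1
    grep -A1 "mapred.child.java.opts" /usr/local/hadoop/hadoop/conf/mapred-site.xml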

Thanks
Sudhan S

On Mon, Jul 11, 2011 at 11:50 AM, Devaraj Das <dd...@hortonworks.com> wrote:

> Moving this to mapreduce-user (this is the right list)..
>
> Could you please look at the TaskTracker logs around the time when you see
> the task failure. That might have something more useful for debugging..
>
>
> On Jul 10, 2011, at 8:14 PM, Michael Hu wrote:
>
> > Hi,all,
> >    The hadoop is set up. Whenever I run a job, I always got the same
> error.
> > Error is:
> >
> >    micah29@nc2:/usr/local/hadoop/hadoop$ ./bin/hadoop jar
> > hadoop-mapred-examples-0.21.0.jar wordcount test testout
> >
> > *11/07/11 10:48:59 INFO mapreduce.Job: Running job: job_201107111031_0003
> > 11/07/11 10:49:00 INFO mapreduce.Job:  map 0% reduce 0%
> > 11/07/11 10:49:11 INFO mapreduce.Job: Task Id :
> > attempt_201107111031_0003_m_000002_0, Status : FAILED
> > java.lang.Throwable: Child Error
> >        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:249)
> > Caused by: java.io.IOException: Task process exit with nonzero status of
> 1.
> >        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:236)
> >
> > 11/07/11 10:49:11 WARN mapreduce.Job: Error reading task
> >
> outputhttp://nc2:50060/tasklog?plaintext=true&attemptid=attempt_201107111031_0003_m_000002_0&filter=stdout
> > 11/07/11 10:49:11 WARN mapreduce.Job: Error reading task
> >
> outputhttp://nc2:50060/tasklog?plaintext=true&attemptid=attempt_201107111031_0003_m_000002_0&filter=stderr
> > *
> >
> >    I google the " Task process exit with nonzero status of 1." They say
> > 'it's an OS limit on the number of sub-directories that can be related in
> > another directory.' But I can create any sub-directories related in
> another
> > directory.
> >
> >    Please, could anybody help me to solve this problem? Thanks
> > --
> > Yours sincerely
> > Hu Shengqiu
>
>

Re: "java.lang.Throwable: Child Error " And " Task process exit with nonzero status of 1."

Posted by Devaraj Das <dd...@hortonworks.com>.
Moving this to mapreduce-user (this is the right list)..

Could you please look at the TaskTracker logs around the time when you see the task failure? They might have something more useful for debugging.
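
A minimal way to do that (assuming the default log location, and that the
attempt ran on nc2 with the TaskTracker started by micah29):

    # Show TaskTracker log lines around the failed attempt
    grep -B5 -A20 "attempt_201107111031_0003_m_000002_0" \
        /usr/local/hadoop/hadoop/logs/hadoop-micah29-tasktracker-nc2.log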


On Jul 10, 2011, at 8:14 PM, Michael Hu wrote:

> Hi,all,
>    The hadoop is set up. Whenever I run a job, I always got the same error.
> Error is:
> 
>    micah29@nc2:/usr/local/hadoop/hadoop$ ./bin/hadoop jar
> hadoop-mapred-examples-0.21.0.jar wordcount test testout
> 
> *11/07/11 10:48:59 INFO mapreduce.Job: Running job: job_201107111031_0003
> 11/07/11 10:49:00 INFO mapreduce.Job:  map 0% reduce 0%
> 11/07/11 10:49:11 INFO mapreduce.Job: Task Id :
> attempt_201107111031_0003_m_000002_0, Status : FAILED
> java.lang.Throwable: Child Error
>        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:249)
> Caused by: java.io.IOException: Task process exit with nonzero status of 1.
>        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:236)
> 
> 11/07/11 10:49:11 WARN mapreduce.Job: Error reading task
> outputhttp://nc2:50060/tasklog?plaintext=true&attemptid=attempt_201107111031_0003_m_000002_0&filter=stdout
> 11/07/11 10:49:11 WARN mapreduce.Job: Error reading task
> outputhttp://nc2:50060/tasklog?plaintext=true&attemptid=attempt_201107111031_0003_m_000002_0&filter=stderr
> *
> 
>    I google the " Task process exit with nonzero status of 1." They say
> 'it's an OS limit on the number of sub-directories that can be related in
> another directory.' But I can create any sub-directories related in another
> directory.
> 
>    Please, could anybody help me to solve this problem? Thanks
> -- 
> Yours sincerely
> Hu Shengqiu


Re: "java.lang.Throwable: Child Error " And " Task process exit with nonzero status of 1."

Posted by Devaraj Das <dd...@hortonworks.com>.
Moving this to mapreduce-user (this is the right list)..

Could you please look at the TaskTracker logs around the time when you see the task failure. That might have something more useful for debugging..


On Jul 10, 2011, at 8:14 PM, Michael Hu wrote:

> Hi,all,
>    The hadoop is set up. Whenever I run a job, I always got the same error.
> Error is:
> 
>    micah29@nc2:/usr/local/hadoop/hadoop$ ./bin/hadoop jar
> hadoop-mapred-examples-0.21.0.jar wordcount test testout
> 
> *11/07/11 10:48:59 INFO mapreduce.Job: Running job: job_201107111031_0003
> 11/07/11 10:49:00 INFO mapreduce.Job:  map 0% reduce 0%
> 11/07/11 10:49:11 INFO mapreduce.Job: Task Id :
> attempt_201107111031_0003_m_000002_0, Status : FAILED
> java.lang.Throwable: Child Error
>        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:249)
> Caused by: java.io.IOException: Task process exit with nonzero status of 1.
>        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:236)
> 
> 11/07/11 10:49:11 WARN mapreduce.Job: Error reading task
> outputhttp://nc2:50060/tasklog?plaintext=true&attemptid=attempt_201107111031_0003_m_000002_0&filter=stdout
> 11/07/11 10:49:11 WARN mapreduce.Job: Error reading task
> outputhttp://nc2:50060/tasklog?plaintext=true&attemptid=attempt_201107111031_0003_m_000002_0&filter=stderr
> *
> 
>    I google the " Task process exit with nonzero status of 1." They say
> 'it's an OS limit on the number of sub-directories that can be related in
> another directory.' But I can create any sub-directories related in another
> directory.
> 
>    Please, could anybody help me to solve this problem? Thanks
> -- 
> Yours sincerely
> Hu Shengqiu


Re: "java.lang.Throwable: Child Error " And " Task process exit with nonzero status of 1."

Posted by Michael Hu <me...@gmail.com>.
Hi all,
    I have run into a very strange problem. When I submit input data that
has to be split into more than two tasks, the last task's status always
stays at "initializing". I checked the TaskTracker logs, the userlogs, and
the DataNode's logs, and there is no error reported in any of them.

    Does anybody know how this can happen? Thanks

-- 
Yours sincerely
Hu Shengqiu

Re: "java.lang.Throwable: Child Error " And " Task process exit with nonzero status of 1."

Posted by Harsh J <ha...@cloudera.com>.
The job may have succeeded because the task ran successfully on
another TaskTracker after a retry attempt was scheduled. This probably
means one of your TTs has something bad on it, which should be easy to
identify from the web UI (or from the job history, as sketched below).

If all TTs are bad, your job will fail -- so yes, better to fix it than
to live with the expected failures.
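
One way to see which nodes the failed attempts landed on, assuming the
job's output directory from the original mail (testout) is still around
with its _logs/history files:

    # Print per-task details for the finished job, including failed
    # attempts and the hosts they ran on
    ./bin/hadoop job -history testout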

On Mon, Jul 11, 2011 at 11:53 PM, C.V.Krishnakumar Iyer
<f2...@gmail.com> wrote:
> Hi,
>
> I get this error too. But the Job completes properly. Is this error any cause for concern? As in, would any computation be hampered because of this?
>
> Thanks !
>
> Regards,
> Krishnakumar
> On Jul 11, 2011, at 10:53 AM, Bharath Mundlapudi wrote:
>
>> That number is around 40K (I think). I am not sure if you have certain configurations to cleanup user task logs periodically. We have solved this problem in MAPREDUCE-2415 which part of 0.20.204.
>>
>>
>> But you cleanup the task logs periodically, you will not run into this problem.
>>
>> -Bharath
>
>



-- 
Harsh J

Re: "java.lang.Throwable: Child Error " And " Task process exit with nonzero status of 1."

Posted by "C.V.Krishnakumar Iyer" <f2...@gmail.com>.
Hi,

I get this error too, but the job completes properly. Is this error any cause for concern? As in, would any computation be hampered because of it?

Thanks !

Regards,
Krishnakumar
On Jul 11, 2011, at 10:53 AM, Bharath Mundlapudi wrote:

> That number is around 40K (I think). I am not sure if you have certain configurations to cleanup user task logs periodically. We have solved this problem in MAPREDUCE-2415 which part of 0.20.204. 
> 
> 
> But you cleanup the task logs periodically, you will not run into this problem.
> 
> -Bharath


Re: "java.lang.Throwable: Child Error " And " Task process exit with nonzero status of 1."

Posted by Bharath Mundlapudi <bh...@yahoo.com>.
That number is around 40K (I think). I am not sure whether you have any configuration in place to clean up user task logs periodically. We solved this problem in MAPREDUCE-2415, which is part of 0.20.204.


If you clean up the task logs periodically, you will not run into this problem. The retention window itself is configurable, as sketched below.
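
For reference, the knob that controls how long the TaskTracker keeps
per-attempt user logs is mapred.userlog.retain.hours (default 24); the
paths below assume an install under /usr/local/hadoop/hadoop:

    # See whether a retention period is set explicitly (if not, the
    # 24-hour default from mapred-default.xml applies)
    grep -A1 "mapred.userlog.retain.hours" /usr/local/hadoop/hadoop/conf/mapred-site.xml

    # Rough count of what has piled up under userlogs so far
    ls /usr/local/hadoop/hadoop/logs/userlogs | wc -l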

-Bharath




________________________________
From: Michael Hu <me...@gmail.com>
To: common-dev@hadoop.apache.org
Sent: Sunday, July 10, 2011 8:14 PM
Subject: "java.lang.Throwable: Child Error " And " Task process exit with nonzero status of 1."

Hi all,
    Hadoop is set up, but whenever I run a job I get the same error.
The error is:

    micah29@nc2:/usr/local/hadoop/hadoop$ ./bin/hadoop jar
hadoop-mapred-examples-0.21.0.jar wordcount test testout

*11/07/11 10:48:59 INFO mapreduce.Job: Running job: job_201107111031_0003
11/07/11 10:49:00 INFO mapreduce.Job:  map 0% reduce 0%
11/07/11 10:49:11 INFO mapreduce.Job: Task Id :
attempt_201107111031_0003_m_000002_0, Status : FAILED
java.lang.Throwable: Child Error
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:249)
Caused by: java.io.IOException: Task process exit with nonzero status of 1.
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:236)

11/07/11 10:49:11 WARN mapreduce.Job: Error reading task
outputhttp://nc2:50060/tasklog?plaintext=true&attemptid=attempt_201107111031_0003_m_000002_0&filter=stdout
11/07/11 10:49:11 WARN mapreduce.Job: Error reading task
outputhttp://nc2:50060/tasklog?plaintext=true&attemptid=attempt_201107111031_0003_m_000002_0&filter=stderr
*

    I googled "Task process exit with nonzero status of 1." People say
it's an OS limit on the number of sub-directories that can be created in
another directory, but I can still create sub-directories under that
directory without any problem.

    Could anybody please help me solve this problem? Thanks
-- 
Yours sincerely
Hu Shengqiu