Posted to common-user@hadoop.apache.org by Robin Verlangen <ro...@us2.nl> on 2012/09/13 14:38:55 UTC

Hadoop failing jobs non zero exit status 7

Hi there,

Today we started deploying MapR M3 into production. However, we're having
problems completing jobs. During a typical job, the job returns this:

12/09/11 16:33:20 INFO mapred.JobClient: Task Id :
attempt_201209111629_0002_r_000001_2, Status : FAILED on node
cl004.flxviz.com
java.lang.Throwable: Child Error
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
Caused by: java.io.IOException: Task process exit with nonzero status of 7.
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254)
12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output
http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stdout
12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output
http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stderr

When I get the logs of the tasktracker, I see things like:

2012-09-11 16:32:43,204 INFO org.apache.hadoop.mapred.TaskInProgress:
Error from attempt_201209111629_0002_r_000002_1: java.lang.Throwable:
Child Error
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
Caused by: java.io.IOException: Task process exit with nonzero status of 7.
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254)
on tasktracker tracker_cl004.flxviz.com:localhost/127.0.0.1:53126
2012-09-11 16:32:46,234 INFO org.apache.hadoop.mapred.JobTracker:
Removing task 'attempt_201209111629_0002_r_000002_1'
2012-09-11 16:32:46,512 INFO org.apache.hadoop.mapred.JobTracker:
Adding task (JOB_SETUP) 'attempt_201209111629_0002_m_000011_2' to tip
task_201209111629_0002_m_000011, for tracker
'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
2012-09-11 16:32:48,027 INFO org.apache.hadoop.mapred.TaskInProgress:
Error from attempt_201209111629_0002_m_000011_2: java.lang.Throwable:
Child Error
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
Caused by: java.io.IOException: Task process exit with nonzero status of 7.
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254)
on tasktracker tracker_cl003.flxviz.com:localhost/127.0.0.1:42339
2012-09-11 16:32:51,055 INFO org.apache.hadoop.mapred.JobTracker:
Adding task (JOB_SETUP) 'attempt_201209111629_0002_r_000002_2' to tip
task_201209111629_0002_r_000002, for tracker
'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
2012-09-11 16:32:51,056 INFO org.apache.hadoop.mapred.JobTracker:
Removing task 'attempt_201209111629_0002_m_000011_2'
2012-09-11 16:32:51,359 INFO org.apache.hadoop.mapred.TaskInProgress:
Error from attempt_201209111629_0002_r_000002_2: java.lang.Throwable:
Child Error

Does anyone have a clue where to start? It doesn't seem to be a
MapR-specific problem, which is why I'm posting this to the Hadoop mailing list.

Some additional information:
OS: CentOS 6.3 x64
16 GB RAM
2x quad-core processor
12x 1 TB hard drive

Best regards,

Robin Verlangen
Software engineer

W http://www.robinverlangen.nl
E robin@us2.nl

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.

Re: Hadoop failing jobs non zero exit status 7

Posted by Harsh J <ha...@cloudera.com>.
Robin,

I am unsure. The following works for me on a Mac:

➜  Downloads  export HADOOP_CLIENT_OPTS="-XX:ErrorFile=/opt/cores/mapreduce_java_error%p.log"
➜  Downloads  hadoop fs -ls /
…

So I'm guessing it's not wrong. (But where are your heap values? Leaving
tasks on the default heap size is a bad idea, as it is a percentage of
system RAM, which is too much for a single small task.) But you could
try reverting it and retrying. I'd also check whether the heap request
(which currently may be too high) exceeds the configured ulimit, as that
causes this as well (though it's odd that the return code is 7).
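
The heap-versus-ulimit comparison can be sketched roughly as follows. This is
only an illustration in Python, not something taken from Hadoop or MapR: the
helper names parse_xmx and heap_exceeds_limit are made up here, it assumes a
Linux node where "ulimit -v" corresponds to RLIMIT_AS, and it would have to
run as the same user that launches the task JVMs to see the relevant limit.

```python
import re
import resource

# Multipliers for the size suffixes accepted by -Xmx (k, m, g; case-insensitive).
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_xmx(java_opts):
    """Return the -Xmx value in bytes, or None if the opts string sets no -Xmx."""
    match = re.search(r"(?:^|\s)-Xmx(\d+)([kKmMgG]?)(?=\s|$)", java_opts)
    if match is None:
        return None
    return int(match.group(1)) * _UNITS[match.group(2).lower()]


def heap_exceeds_limit(java_opts, limit_bytes):
    """True if the requested max heap is larger than the address-space limit."""
    heap = parse_xmx(java_opts)
    # No explicit heap request, or an unlimited ulimit: nothing to compare.
    if heap is None or limit_bytes == resource.RLIM_INFINITY:
        return False
    return heap > limit_bytes


if __name__ == "__main__":
    # The child opts quoted in this thread; they carry no -Xmx at all.
    opts = "-XX:ErrorFile=/opt/cores/mapreduce_java_error%p.log"
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_AS)
    print("explicit -Xmx in bytes:", parse_xmx(opts))
    print("exceeds ulimit -v:", heap_exceeds_limit(opts, soft_limit))
```

With no -Xmx in the opts, the sketch has nothing to compare, which is the
situation the heap question above is pointing at: the JVM then picks its own
default, and that default is what would need checking against the limit.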

Does your app do a System.exit anywhere?

On Thu, Sep 13, 2012 at 6:30 PM, Robin Verlangen <ro...@us2.nl> wrote:
> These values are:
>
> <property>
>   <name>mapred.map.child.java.opts</name>
>   <value>-XX:ErrorFile=/opt/cores/mapreduce_java_error%p.log</value>
> </property>
>
> <property>
>   <name>mapred.reduce.child.java.opts</name>
>   <value>-XX:ErrorFile=/opt/cores/mapreduce_java_error%p.log</value>
> </property>
>
> Is this wrong?
>
> Best regards,
>
> Robin Verlangen
> Software engineer
>
> W http://www.robinverlangen.nl
> E robin@us2.nl
>
> Disclaimer: The information contained in this message and attachments is
> intended solely for the attention and use of the named addressee and may be
> confidential. If you are not the intended recipient, you are reminded that
> the information remains the property of the sender. You must not use,
> disclose, distribute, copy, print or rely on this e-mail. If you have
> received this message in error, please contact the sender immediately and
> irrevocably delete this message and any copies.
>
>
>
> 2012/9/13 Harsh J <ha...@cloudera.com>
>>
>> No idea if this is MapR specific, but looks like your
>> mapred.child.java.opts (or map/reduce specific opts) may be incorrect
>> for the failing jobs. Check its values up in its job.xml.
>>
>> For MapR specific issues, contact MapR directly.
>>
>>
>>
>> --
>> Harsh J
>
>



-- 
Harsh J

Re: Hadoop failing jobs non zero exit status 7

Posted by Robin Verlangen <ro...@us2.nl>.
These values are:

<property>
  <name>mapred.map.child.java.opts</name>
  <value>-XX:ErrorFile=/opt/cores/mapreduce_java_error%p.log</value>
</property>

<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-XX:ErrorFile=/opt/cores/mapreduce_java_error%p.log</value>
</property>

Is this wrong?
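
For checking what a failing job actually received, a minimal Python sketch
along these lines could list the child JVM options from a job.xml and flag the
ones without a heap setting. The function names and the SAMPLE document are
illustrative assumptions; only the three property names and the values shown
come from this thread, and the sketch assumes the usual <configuration> root
element of a Hadoop configuration file.

```python
import xml.etree.ElementTree as ET

# Property names mentioned in this thread.
CHILD_OPTS_KEYS = (
    "mapred.child.java.opts",
    "mapred.map.child.java.opts",
    "mapred.reduce.child.java.opts",
)


def child_java_opts(job_xml_text):
    """Map each child-opts property found in a job.xml document to its value."""
    root = ET.fromstring(job_xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in CHILD_OPTS_KEYS:
            found[name] = (prop.findtext("value") or "").strip()
    return found


def missing_heap_setting(opts_by_name):
    """Names of the properties whose value carries no -Xmx flag."""
    return sorted(name for name, value in opts_by_name.items() if "-Xmx" not in value)


# Stand-in for a real job.xml, built from the two properties quoted above.
SAMPLE = """<configuration>
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-XX:ErrorFile=/opt/cores/mapreduce_java_error%p.log</value>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-XX:ErrorFile=/opt/cores/mapreduce_java_error%p.log</value>
</property>
</configuration>"""

if __name__ == "__main__":
    opts = child_java_opts(SAMPLE)
    for name in missing_heap_setting(opts):
        print(name, "has no -Xmx:", opts[name])
```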

Best regards,

Robin Verlangen
Software engineer

W http://www.robinverlangen.nl
E robin@us2.nl

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.



2012/9/13 Harsh J <ha...@cloudera.com>

> No idea if this is MapR specific, but looks like your
> mapred.child.java.opts (or map/reduce specific opts) may be incorrect
> for the failing jobs. Check its values up in its job.xml.
>
> For MapR specific issues, contact MapR directly.
>
> On Thu, Sep 13, 2012 at 6:08 PM, Robin Verlangen <ro...@us2.nl> wrote:
> > Hi there,
> >
> > Today we started deploying Mapr M3 into production. However we're having
> > problems completing jobs. During a typical job the job return this:
> >
> > 12/09/11 16:33:20 INFO mapred.JobClient: Task Id :
> > attempt_201209111629_0002_r_000001_2, Status : FAILED on node
> > cl004.flxviz.com
> > java.lang.Throwable: Child Error
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> > Caused by: java.io.IOException: Task process exit with nonzero status of
> 7.
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254)
> > 12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output
> > http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stdout
> > 12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output
> > http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stderr*
> >
> > When I get the logs of the tasktracker, I see things like:
> >
> > 2012-09-11 16:32:43,204 INFO org.apache.hadoop.mapred.TaskInProgress: Error
> > from attempt_201209111629_0002_r_000002_1: java.lang.Throwable: Child Error
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> > Caused by: java.io.IOException: Task process exit with nonzero status of 7.
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254) on
> > tasktracker tracker_cl004.flxviz.com:localhost/127.0.0.1:53126
> > 2012-09-11 16:32:46,234 INFO org.apache.hadoop.mapred.JobTracker: Removing
> > task 'attempt_201209111629_0002_r_000002_1'
> > 2012-09-11 16:32:46,512 INFO org.apache.hadoop.mapred.JobTracker: Adding
> > task (JOB_SETUP) 'attempt_201209111629_0002_m_000011_2' to tip
> > task_201209111629_0002_m_000011, for tracker
> > 'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
> > 2012-09-11 16:32:48,027 INFO org.apache.hadoop.mapred.TaskInProgress: Error
> > from attempt_201209111629_0002_m_000011_2: java.lang.Throwable: Child Error
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> > Caused by: java.io.IOException: Task process exit with nonzero status of 7.
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254) on
> > tasktracker tracker_cl003.flxviz.com:localhost/127.0.0.1:42339
> > 2012-09-11 16:32:51,055 INFO org.apache.hadoop.mapred.JobTracker: Adding
> > task (JOB_SETUP) 'attempt_201209111629_0002_r_000002_2' to tip
> > task_201209111629_0002_r_000002, for tracker
> > 'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
> > 2012-09-11 16:32:51,056 INFO org.apache.hadoop.mapred.JobTracker: Removing
> > task 'attempt_201209111629_0002_m_000011_2'
> > 2012-09-11 16:32:51,359 INFO org.apache.hadoop.mapred.TaskInProgress: Error
> > from attempt_201209111629_0002_r_000002_2: java.lang.Throwable: Child Error*
> >
> > Does anyone have a clue where to start? It doesn't seem to be a MapR
> > specific problem, that's why I post this in the hadoop mailinglist.
> >
> > Some additional information:
> > OS: Centos 6.3 x64
> > 16GB Ram
> > 2x quad core processor
> > 12x 1TB harddrive
> >
> > Best regards,
> >
> > Robin Verlangen
> > Software engineer
> >
> > W http://www.robinverlangen.nl
> > E robin@us2.nl
> >
> > Disclaimer: The information contained in this message and attachments is
> > intended solely for the attention and use of the named addressee and may be
> > confidential. If you are not the intended recipient, you are reminded that
> > the information remains the property of the sender. You must not use,
> > disclose, distribute, copy, print or rely on this e-mail. If you have
> > received this message in error, please contact the sender immediately and
> > irrevocably delete this message and any copies.
> >
>
>
>
> --
> Harsh J
>

Re: Hadoop failing jobs non zero exit status 7

Posted by Robin Verlangen <ro...@us2.nl>.
These values are:

<property>
  <name>mapred.map.child.java.opts</name>
  <value>-XX:ErrorFile=/opt/cores/mapreduce_java_error%p.log</value>
</property>

<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-XX:ErrorFile=/opt/cores/mapreduce_java_error%p.log</value>
</property>

Is this wrong?

Best regards,

Robin Verlangen
Software engineer

W http://www.robinverlangen.nl
E robin@us2.nl

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.



2012/9/13 Harsh J <ha...@cloudera.com>

> No idea if this is MapR specific, but looks like your
> mapred.child.java.opts (or map/reduce specific opts) may be incorrect
> for the failing jobs. Check its values up in its job.xml.
>
> For MapR specific issues, contact MapR directly.
>
> On Thu, Sep 13, 2012 at 6:08 PM, Robin Verlangen <ro...@us2.nl> wrote:
> > Hi there,
> >
> > Today we started deploying Mapr M3 into production. However we're having
> > problems completing jobs. During a typical job the job return this:
> >
> > 12/09/11 16:33:20 INFO mapred.JobClient: Task Id :
> > attempt_201209111629_0002_r_000001_2, Status : FAILED on node
> > cl004.flxviz.com
> > java.lang.Throwable: Child Error
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> > Caused by: java.io.IOException: Task process exit with nonzero status of 7.
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254)
> > 12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output
> > http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stdout
> > 12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output
> > http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stderr*
> >
> > When I get the logs of the tasktracker, I see things like:
> >
> > 2012-09-11 16:32:43,204 INFO org.apache.hadoop.mapred.TaskInProgress: Error
> > from attempt_201209111629_0002_r_000002_1: java.lang.Throwable: Child Error
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> > Caused by: java.io.IOException: Task process exit with nonzero status of 7.
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254) on
> > tasktracker tracker_cl004.flxviz.com:localhost/127.0.0.1:53126
> > 2012-09-11 16:32:46,234 INFO org.apache.hadoop.mapred.JobTracker: Removing
> > task 'attempt_201209111629_0002_r_000002_1'
> > 2012-09-11 16:32:46,512 INFO org.apache.hadoop.mapred.JobTracker: Adding
> > task (JOB_SETUP) 'attempt_201209111629_0002_m_000011_2' to tip
> > task_201209111629_0002_m_000011, for tracker
> > 'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
> > 2012-09-11 16:32:48,027 INFO org.apache.hadoop.mapred.TaskInProgress: Error
> > from attempt_201209111629_0002_m_000011_2: java.lang.Throwable: Child Error
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> > Caused by: java.io.IOException: Task process exit with nonzero status of 7.
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254) on
> > tasktracker tracker_cl003.flxviz.com:localhost/127.0.0.1:42339
> > 2012-09-11 16:32:51,055 INFO org.apache.hadoop.mapred.JobTracker: Adding
> > task (JOB_SETUP) 'attempt_201209111629_0002_r_000002_2' to tip
> > task_201209111629_0002_r_000002, for tracker
> > 'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
> > 2012-09-11 16:32:51,056 INFO org.apache.hadoop.mapred.JobTracker: Removing
> > task 'attempt_201209111629_0002_m_000011_2'
> > 2012-09-11 16:32:51,359 INFO org.apache.hadoop.mapred.TaskInProgress: Error
> > from attempt_201209111629_0002_r_000002_2: java.lang.Throwable: Child Error*
> >
> > Does anyone have a clue where to start? It doesn't seem to be a MapR
> > specific problem, that's why I post this in the hadoop mailinglist.
> >
> > Some additional information:
> > OS: Centos 6.3 x64
> > 16GB Ram
> > 2x quad core processor
> > 12x 1TB harddrive
> >
> > Best regards,
> >
> > Robin Verlangen
> > Software engineer
> >
> > W http://www.robinverlangen.nl
> > E robin@us2.nl
> >
> > Disclaimer: The information contained in this message and attachments is
> > intended solely for the attention and use of the named addressee and may be
> > confidential. If you are not the intended recipient, you are reminded that
> > the information remains the property of the sender. You must not use,
> > disclose, distribute, copy, print or rely on this e-mail. If you have
> > received this message in error, please contact the sender immediately and
> > irrevocably delete this message and any copies.
> >
>
>
>
> --
> Harsh J
>

Re: Hadoop failing jobs non zero exit status 7

Posted by Harsh J <ha...@cloudera.com>.
I don't know whether this is MapR-specific, but it looks like your
mapred.child.java.opts (or the map/reduce-specific opts) may be incorrect
for the failing jobs. Check their values in the job's job.xml.
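
As a rough illustration of that check, the Python sketch below pulls the
child JVM option properties out of a job.xml-style document. The three
property names come from this thread; the helper name child_java_opts and
the inline SAMPLE document are made up for the example, and a real job.xml
would be read from wherever your cluster keeps it.

```python
import xml.etree.ElementTree as ET

# Property names mentioned in this thread.
CHILD_OPTS_KEYS = (
    "mapred.child.java.opts",
    "mapred.map.child.java.opts",
    "mapred.reduce.child.java.opts",
)


def child_java_opts(job_xml_text):
    """Return {property name: value} for the child JVM option keys present."""
    root = ET.fromstring(job_xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in CHILD_OPTS_KEYS:
            found[name] = (prop.findtext("value") or "").strip()
    return found


# Illustrative stand-in for a job.xml; only the two values quoted in the
# thread are included, wrapped in an assumed <configuration> root element.
SAMPLE = """<configuration>
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-XX:ErrorFile=/opt/cores/mapreduce_java_error%p.log</value>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-XX:ErrorFile=/opt/cores/mapreduce_java_error%p.log</value>
</property>
</configuration>"""

if __name__ == "__main__":
    for name, value in child_java_opts(SAMPLE).items():
        print(name, "=", value)
```

Seeing the options side by side makes it easier to spot a flag the JVM
would reject or a path that does not exist on the node; the sketch itself
does not judge whether a value is correct.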

For MapR specific issues, contact MapR directly.

On Thu, Sep 13, 2012 at 6:08 PM, Robin Verlangen <ro...@us2.nl> wrote:
> Hi there,
>
> Today we started deploying Mapr M3 into production. However we're having
> problems completing jobs. During a typical job the job return this:
>
> 12/09/11 16:33:20 INFO mapred.JobClient: Task Id :
> attempt_201209111629_0002_r_000001_2, Status : FAILED on node
> cl004.flxviz.com
> java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> Caused by: java.io.IOException: Task process exit with nonzero status of 7.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254)
> 12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output
> http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stdout
> 12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output
> http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stderr*
>
> When I get the logs of the tasktracker, I see things like:
>
> 2012-09-11 16:32:43,204 INFO org.apache.hadoop.mapred.TaskInProgress: Error
> from attempt_201209111629_0002_r_000002_1: java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> Caused by: java.io.IOException: Task process exit with nonzero status of 7.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254) on
> tasktracker tracker_cl004.flxviz.com:localhost/127.0.0.1:53126
> 2012-09-11 16:32:46,234 INFO org.apache.hadoop.mapred.JobTracker: Removing
> task 'attempt_201209111629_0002_r_000002_1'
> 2012-09-11 16:32:46,512 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task (JOB_SETUP) 'attempt_201209111629_0002_m_000011_2' to tip
> task_201209111629_0002_m_000011, for tracker
> 'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
> 2012-09-11 16:32:48,027 INFO org.apache.hadoop.mapred.TaskInProgress: Error
> from attempt_201209111629_0002_m_000011_2: java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> Caused by: java.io.IOException: Task process exit with nonzero status of 7.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254) on
> tasktracker tracker_cl003.flxviz.com:localhost/127.0.0.1:42339
> 2012-09-11 16:32:51,055 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task (JOB_SETUP) 'attempt_201209111629_0002_r_000002_2' to tip
> task_201209111629_0002_r_000002, for tracker
> 'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
> 2012-09-11 16:32:51,056 INFO org.apache.hadoop.mapred.JobTracker: Removing
> task 'attempt_201209111629_0002_m_000011_2'
> 2012-09-11 16:32:51,359 INFO org.apache.hadoop.mapred.TaskInProgress: Error
> from attempt_201209111629_0002_r_000002_2: java.lang.Throwable: Child Error*
>
> Does anyone have a clue where to start? It doesn't seem to be a MapR
> specific problem, that's why I post this in the hadoop mailinglist.
>
> Some additional information:
> OS: Centos 6.3 x64
> 16GB Ram
> 2x quad core processor
> 12x 1TB harddrive
>
> Best regards,
>
> Robin Verlangen
> Software engineer
>
> W http://www.robinverlangen.nl
> E robin@us2.nl
>
> Disclaimer: The information contained in this message and attachments is
> intended solely for the attention and use of the named addressee and may be
> confidential. If you are not the intended recipient, you are reminded that
> the information remains the property of the sender. You must not use,
> disclose, distribute, copy, print or rely on this e-mail. If you have
> received this message in error, please contact the sender immediately and
> irrevocably delete this message and any copies.
>



-- 
Harsh J

Re: Hadoop failing jobs non zero exit status 7

Posted by Aaron Eng <ae...@maprtech.com>.
Hi Robin,

"Task process exit with nonzero status of 7." is being printed by the
TaskTracker to indicate the child JVM spawned to run the task attempt in
question exited unexpectedly. This also means the task was not killed
administratively (either by TaskTracker or by you, the admin).  So
basically, the TaskTracker tried to launch a JVM and it exited.

You didn't post all the details for the attempt from the TaskTracker log, so
it's hard to say exactly when and how this happened.  I'm not familiar with
exit code 7 being returned by a JVM, but it would have been generated by the
JVM process itself, not by any user code you tried to run in the attempt.  It
could be that the JVM has some internal issue, a bug of some sort; what Java
version are you using?  Or it could be that the JVM needs something from the
environment that is not available or not permitted in the context in which
it is being executed.  For instance, you could be hitting some limit that is
in place in the execution environment of the TaskTracker.
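
One way to see which limits apply is sketched below in Python, using the
standard resource module (Unix only). The choice of these four limits is an
assumption for illustration; nothing in this thread says which limit, if
any, is involved. The helper names fmt and current_limits are hypothetical.

```python
import resource

# Limits that often matter when a daemon forks child JVMs. This selection
# is an assumption, not a diagnosis of the failure in this thread.
LIMITS = {
    "open files (RLIMIT_NOFILE)": resource.RLIMIT_NOFILE,
    "processes (RLIMIT_NPROC)": resource.RLIMIT_NPROC,
    "address space (RLIMIT_AS)": resource.RLIMIT_AS,
    "core file size (RLIMIT_CORE)": resource.RLIMIT_CORE,
}


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {label: (soft, hard)} for the limits of this process."""
    out = {}
    for label, res in LIMITS.items():
        soft, hard = resource.getrlimit(res)
        out[label] = (fmt(soft), fmt(hard))
    return out


if __name__ == "__main__":
    for label, (soft, hard) in current_limits().items():
        print(f"{label}: soft={soft} hard={hard}")
```

The numbers describe the process that runs the script, so it only says
something about the TaskTracker if it is run as the same user and from the
same launch context. On Linux, /proc/<pid>/limits shows the limits of an
already running process.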

If nothing else, you can note down the way in which the JVM is being
spawned and try to spawn it manually.  If the failure is immediately
reproducible, knowing whether it comes up when you spawn the JVM directly
from the shell vs. via the TaskTracker is a useful bit of info.
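
A minimal sketch of such a manual run, in Python: it starts a command,
waits for it, and reports the exit status and anything written to stderr.
The helper name run_and_report is hypothetical, and the command used here
is only a stand-in child that exits with status 7; the real java command
line would have to be copied from your own TaskTracker log.

```python
import subprocess
import sys


def run_and_report(argv):
    """Run argv and return (exit status, captured stderr text)."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    # Stand-in for the real child JVM command line (an assumption).
    stand_in = [sys.executable, "-c", "import sys; sys.exit(7)"]
    status, err = run_and_report(stand_in)
    print("exit status:", status)
    if err:
        print("stderr:", err)
```

With subprocess, a negative return code means the child was terminated by
a signal, which helps separate "the process chose to exit" from "something
killed it".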

If you can't identify the cause, feel free to post in answers.mapr.com or
send an email to support@mapr.com for some more assistance.

Best Regards,
Aaron Eng

On Thu, Sep 13, 2012 at 5:38 AM, Robin Verlangen <ro...@us2.nl> wrote:

> Hi there,
>
> Today we started deploying Mapr M3 into production. However we're having
> problems completing jobs. During a typical job the job return this:
>
> 12/09/11 16:33:20 INFO mapred.JobClient: Task Id : attempt_201209111629_0002_r_000001_2, Status : FAILED on node cl004.flxviz.com
> java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> Caused by: java.io.IOException: Task process exit with nonzero status of 7.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254)
> 12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stdout
> 12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stderr*
>
> When I get the logs of the tasktracker, I see things like:
>
> 2012-09-11 16:32:43,204 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201209111629_0002_r_000002_1: java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> Caused by: java.io.IOException: Task process exit with nonzero status of 7.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254) on tasktracker tracker_cl004.flxviz.com:localhost/127.0.0.1:53126
> 2012-09-11 16:32:46,234 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201209111629_0002_r_000002_1'
> 2012-09-11 16:32:46,512 INFO org.apache.hadoop.mapred.JobTracker: Adding task (JOB_SETUP) 'attempt_201209111629_0002_m_000011_2' to tip task_201209111629_0002_m_000011, for tracker 'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
> 2012-09-11 16:32:48,027 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201209111629_0002_m_000011_2: java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> Caused by: java.io.IOException: Task process exit with nonzero status of 7.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254) on tasktracker tracker_cl003.flxviz.com:localhost/127.0.0.1:42339
> 2012-09-11 16:32:51,055 INFO org.apache.hadoop.mapred.JobTracker: Adding task (JOB_SETUP) 'attempt_201209111629_0002_r_000002_2' to tip task_201209111629_0002_r_000002, for tracker 'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
> 2012-09-11 16:32:51,056 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201209111629_0002_m_000011_2'
> 2012-09-11 16:32:51,359 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201209111629_0002_r_000002_2: java.lang.Throwable: Child Error*
>
> Does anyone have a clue where to start? It doesn't seem to be a MapR
> specific problem, that's why I post this in the hadoop mailinglist.
>
> Some additional information:
> OS: Centos 6.3 x64
> 16GB Ram
> 2x quad core processor
> 12x 1TB harddrive
> Best regards,
>
> Robin Verlangen
> Software engineer
>
> W http://www.robinverlangen.nl
> E robin@us2.nl
>
> Disclaimer: The information contained in this message and attachments is
> intended solely for the attention and use of the named addressee and may be
> confidential. If you are not the intended recipient, you are reminded that
> the information remains the property of the sender. You must not use,
> disclose, distribute, copy, print or rely on this e-mail. If you have
> received this message in error, please contact the sender immediately and
> irrevocably delete this message and any copies.
>
>

Re: Hadoop failing jobs non zero exit status 7

Posted by Harsh J <ha...@cloudera.com>.
No idea if this is MapR specific, but looks like your
mapred.child.java.opts (or map/reduce specific opts) may be incorrect
for the failing jobs. Check its values up in its job.xml.

For MapR specific issues, contact MapR directly.

On Thu, Sep 13, 2012 at 6:08 PM, Robin Verlangen <ro...@us2.nl> wrote:
> Hi there,
>
> Today we started deploying Mapr M3 into production. However we're having
> problems completing jobs. During a typical job the job return this:
>
> 12/09/11 16:33:20 INFO mapred.JobClient: Task Id :
> attempt_201209111629_0002_r_000001_2, Status : FAILED on node
> cl004.flxviz.com
> java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> Caused by: java.io.IOException: Task process exit with nonzero status of 7.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254)
> 12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output
> http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stdout
> 12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output
> http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stderr*
>
> When I get the logs of the tasktracker, I see things like:
>
> 2012-09-11 16:32:43,204 INFO org.apache.hadoop.mapred.TaskInProgress: Error
> from attempt_201209111629_0002_r_000002_1: java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> Caused by: java.io.IOException: Task process exit with nonzero status of 7.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254) on
> tasktracker tracker_cl004.flxviz.com:localhost/127.0.0.1:53126
> 2012-09-11 16:32:46,234 INFO org.apache.hadoop.mapred.JobTracker: Removing
> task 'attempt_201209111629_0002_r_000002_1'
> 2012-09-11 16:32:46,512 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task (JOB_SETUP) 'attempt_201209111629_0002_m_000011_2' to tip
> task_201209111629_0002_m_000011, for tracker
> 'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
> 2012-09-11 16:32:48,027 INFO org.apache.hadoop.mapred.TaskInProgress: Error
> from attempt_201209111629_0002_m_000011_2: java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> Caused by: java.io.IOException: Task process exit with nonzero status of 7.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254) on
> tasktracker tracker_cl003.flxviz.com:localhost/127.0.0.1:42339
> 2012-09-11 16:32:51,055 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task (JOB_SETUP) 'attempt_201209111629_0002_r_000002_2' to tip
> task_201209111629_0002_r_000002, for tracker
> 'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
> 2012-09-11 16:32:51,056 INFO org.apache.hadoop.mapred.JobTracker: Removing
> task 'attempt_201209111629_0002_m_000011_2'
> 2012-09-11 16:32:51,359 INFO org.apache.hadoop.mapred.TaskInProgress: Error
> from attempt_201209111629_0002_r_000002_2: java.lang.Throwable: Child Error*
>
> Does anyone have a clue where to start? It doesn't seem to be a MapR
> specific problem, that's why I post this in the hadoop mailinglist.
>
> Some additional information:
> OS: Centos 6.3 x64
> 16GB Ram
> 2x quad core processor
> 12x 1TB harddrive
>
> Best regards,
>
> Robin Verlangen
> Software engineer
>
> W http://www.robinverlangen.nl
> E robin@us2.nl
>
> Disclaimer: The information contained in this message and attachments is
> intended solely for the attention and use of the named addressee and may be
> confidential. If you are not the intended recipient, you are reminded that
> the information remains the property of the sender. You must not use,
> disclose, distribute, copy, print or rely on this e-mail. If you have
> received this message in error, please contact the sender immediately and
> irrevocably delete this message and any copies.
>



-- 
Harsh J

Re: Hadoop failing jobs non zero exit status 7

Posted by Aaron Eng <ae...@maprtech.com>.
Hi Robin,

"Task process exit with nonzero status of 7." is being printed by the
TaskTracker to indicate the child JVM spawned to run the task attempt in
question exited unexpectedly. This also means the task was not killed
administratively (either by TaskTracker or by you, the admin).  So
basically, the TaskTracker tried to launch a JVM and it exited.

You didn't post all the details for the attempt from the TaskTracker log, so
it's hard to say exactly when and how this happened.  I'm not familiar with
exit code 7 being returned by a JVM, but it would have been generated by the
JVM process itself, not by any user code you tried to run in the attempt.  It
could be that the JVM has some internal issue, a bug of sorts.  What Java
version are you using?  Or it could be that the JVM needs something from the
environment that is not available or not permitted in the context in which
it is being executed.  For instance, you could be hitting some limit that is
in place in the execution environment of the TaskTracker.

If nothing else, you can note down the way in which the JVM is being
spawned and try to spawn it manually.  If it's immediately reproducible,
knowing whether the failure comes up when you spawn it directly from the
shell vs. when it is spawned via the TaskTracker is a useful bit of info.
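
A minimal Python sketch of that experiment, under some assumptions: it
prints a few resource limits of the current process and then runs a command,
reporting its exit status and stderr. The helper names are made up for
illustration, and the child command below is only a stand-in that exits with
status 7; the real java command line has to be copied from your own
TaskTracker log or process list.

```python
import resource
import subprocess
import sys

# Limits that commonly matter for a freshly spawned JVM. RLIMIT_NPROC is
# not defined on every platform, so it is looked up defensively.
LIMIT_NAMES = {
    "max open files": "RLIMIT_NOFILE",
    "max user processes": "RLIMIT_NPROC",
    "max address space": "RLIMIT_AS",
}


def current_limits():
    """Return {label: (soft, hard)} for the limits this platform exposes."""
    limits = {}
    for label, attr in LIMIT_NAMES.items():
        res = getattr(resource, attr, None)
        if res is not None:
            limits[label] = resource.getrlimit(res)
    return limits


def run_and_report(argv):
    """Run argv, wait for it, and return (exit status, captured stderr)."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stderr


for label, (soft, hard) in current_limits().items():
    # A value of -1 means "unlimited" (resource.RLIM_INFINITY).
    print(label, "soft =", soft, "hard =", hard)

# Stand-in for the real child JVM command line (an assumption, not the
# actual command the TaskTracker uses).
stand_in = [sys.executable, "-c", "import sys; sys.exit(7)"]
status, err = run_and_report(stand_in)
print("exit status:", status)
print("stderr:", err)
```

Run it as the same user the TaskTracker runs as, since limits are per user
and per session. If the manual launch succeeds from the shell but fails under
the TaskTracker, comparing the limits in the two environments is a
reasonable next step.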

If you can't identify the cause, feel free to post in answers.mapr.com or
send an email to support@mapr.com for some more assistance.

Best Regards,
Aaron Eng

On Thu, Sep 13, 2012 at 5:38 AM, Robin Verlangen <ro...@us2.nl> wrote:

> Hi there,
>
> Today we started deploying Mapr M3 into production. However we're having
> problems completing jobs. During a typical job the job return this:
>
> 12/09/11 16:33:20 INFO mapred.JobClient: Task Id : attempt_201209111629_0002_r_000001_2, Status : FAILED on node cl004.flxviz.com
> java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> Caused by: java.io.IOException: Task process exit with nonzero status of 7.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254)
> 12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stdout
> 12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stderr*
>
> When I get the logs of the tasktracker, I see things like:
>
> 2012-09-11 16:32:43,204 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201209111629_0002_r_000002_1: java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> Caused by: java.io.IOException: Task process exit with nonzero status of 7.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254) on tasktracker tracker_cl004.flxviz.com:localhost/127.0.0.1:53126
> 2012-09-11 16:32:46,234 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201209111629_0002_r_000002_1'
> 2012-09-11 16:32:46,512 INFO org.apache.hadoop.mapred.JobTracker: Adding task (JOB_SETUP) 'attempt_201209111629_0002_m_000011_2' to tip task_201209111629_0002_m_000011, for tracker 'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
> 2012-09-11 16:32:48,027 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201209111629_0002_m_000011_2: java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> Caused by: java.io.IOException: Task process exit with nonzero status of 7.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254) on tasktracker tracker_cl003.flxviz.com:localhost/127.0.0.1:42339
> 2012-09-11 16:32:51,055 INFO org.apache.hadoop.mapred.JobTracker: Adding task (JOB_SETUP) 'attempt_201209111629_0002_r_000002_2' to tip task_201209111629_0002_r_000002, for tracker 'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
> 2012-09-11 16:32:51,056 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201209111629_0002_m_000011_2'
> 2012-09-11 16:32:51,359 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201209111629_0002_r_000002_2: java.lang.Throwable: Child Error*
>
> Does anyone have a clue where to start? It doesn't seem to be a MapR
> specific problem, that's why I post this in the hadoop mailinglist.
>
> Some additional information:
> OS: Centos 6.3 x64
> 16GB Ram
> 2x quad core processor
> 12x 1TB harddrive
> Best regards,
>
> Robin Verlangen
> Software engineer
>
> W http://www.robinverlangen.nl
> E robin@us2.nl
>
> Disclaimer: The information contained in this message and attachments is
> intended solely for the attention and use of the named addressee and may be
> confidential. If you are not the intended recipient, you are reminded that
> the information remains the property of the sender. You must not use,
> disclose, distribute, copy, print or rely on this e-mail. If you have
> received this message in error, please contact the sender immediately and
> irrevocably delete this message and any copies.
>
>
