Posted to common-user@hadoop.apache.org by Jay Vyas <ja...@gmail.com> on 2014/02/14 05:16:32 UTC

How to ascertain why LinuxContainer dies?

I have a linux container that dies.  The nodemanager logs only say:

WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor:
Exception from container-launch :
org.apache.hadoop.util.Shell$ExitCodeException:
  at org.apache.hadoop.util.Shell.runCommand(Shell.java:202)
  at org.apache.hadoop.util.Shell.run(Shell.java:129)
  at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:322)
  at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:230)
  at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:242)
  at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:68)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:662)

Where can I find the root cause of the non-zero exit code?

-- 
Jay Vyas
http://jayunit100.blogspot.com

Re: How to ascertain why LinuxContainer dies?

Posted by Jay Vyas <ja...@gmail.com>.
Okay Harsh: your hint was enough to get me back on track! I found the
Linux container logs and they are wonderful :)... I guess at the end of
each container run, logs get propagated into the distributed file system's
/var/log directories.

In any case, once I dug in there, I found the cryptic failure was because
my done_intermediate permissions were bad.
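
One way to inspect a directory like that is to look at its mode in the output of hadoop fs -ls -d. The sketch below rests on assumptions that are not stated in this thread: the helper names are made up, the directory is whatever mapreduce.jobhistory.intermediate-done-dir points to on your cluster, and the expected mode drwxrwxrwt (1777) is the commonly documented setting for the intermediate done directory, not something confirmed here.

```shell
# Return success when an ls-style mode string describes a world-writable
# directory with the sticky bit set (drwxrwxrwt).
# Assumption: this is the mode you want; the thread does not say.
has_sticky_world_writable_mode() {
  case "$1" in
    drwxrwxrwt*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the mode column for one HDFS directory.
# Needs a configured Hadoop client on the PATH.
hdfs_dir_mode() {
  hadoop fs -ls -d "$1" | awk '/^[d-]/ {print $1; exit}'
}
```

With a working client, something like has_sticky_world_writable_mode "$(hdfs_dir_mode <your done_intermediate dir>)" || echo "check permissions" would flag a directory whose mode differs.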

Anyway, thanks for the hint, Harsh! After monitoring the local
/var/log/hadoop-yarn/container/ directory, I was able to see that the
stdout/stderr files were being deleted, and then after some googling I
found a post about how YARN aggregates logs into the DFS.
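
Since the local files disappear once aggregation and cleanup kick in, it can help to copy them aside while the container directory still exists. Below is a rough sketch with a made-up helper name (snapshot_container_logs); the local log root is assumed to be whatever yarn.nodemanager.log-dirs is set to on the node. Separately, yarn.nodemanager.delete.debug-delay-sec in yarn-site.xml can delay the NodeManager's cleanup of a finished application's directories, which leaves more time to look.

```shell
# Copy any container stdout/stderr/syslog files found under a local
# NodeManager log root into a snapshot directory, keeping the
# application/container directory layout.
# Pass the log root without a trailing slash.
snapshot_container_logs() {
  log_root="$1"
  dest="$2"
  mkdir -p "$dest" || return 1
  find "$log_root" -type f \( -name stdout -o -name stderr -o -name syslog \) |
  while IFS= read -r f; do
    rel="${f#"$log_root"/}"              # path relative to the log root
    mkdir -p "$dest/$(dirname "$rel")"
    cp -p "$f" "$dest/$rel"
  done
}
```

Running it in a loop while the job executes (for example, snapshot_container_logs /var/log/hadoop-yarn/container /tmp/container-snapshot) keeps a copy even after the originals are removed.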

Anyway, problem solved.  For those curious: if you are debugging
YARN Linux containers that are dying (as shown in the [local]
/var/log/hadoop-yarn/ nodemanager logs), you can dig further after the task
dies by running

hadoop fs -cat /var/log/hadoop-yarn/apps/<oozie_user>/logs/application_1392385522708_0008/*
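
The path in that command appears to follow the layout <remote log root>/<user>/logs/<application id>, where the remote log root is set by yarn.nodemanager.remote-app-log-dir (apparently /var/log/hadoop-yarn/apps on this cluster). Here is a minimal sketch that builds the path, plus a wrapper around yarn logs -applicationId, which prints aggregated logs in readable form. The function names are hypothetical, and the "logs" path component assumes the default yarn.nodemanager.remote-app-log-dir-suffix.

```shell
# Build the HDFS directory holding aggregated logs for one application.
# Arguments: remote log root, submitting user, application id.
aggregated_log_dir() {
  printf '%s/%s/logs/%s\n' "$1" "$2" "$3"
}

# Print the aggregated logs for an application with the YARN CLI.
# Needs a configured Hadoop client and log aggregation enabled.
show_app_logs() {
  yarn logs -applicationId "$1"
}
```

For example, hadoop fs -ls "$(aggregated_log_dir /var/log/hadoop-yarn/apps <oozie_user> application_1392385522708_0008)" lists the per-node log files, and show_app_logs application_1392385522708_0008 prints them. The aggregated files are not plain text, so hadoop fs -cat output may contain some binary noise; yarn logs is usually easier to read.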



On Fri, Feb 14, 2014 at 9:17 AM, German Florez-Larrahondo <
german.fl@samsung.com> wrote:

> I believe that errors on containers are not propagated to the standard
> "Java" logs.
>
> You have to look into the std* and syslog files of the container:
>
> Here is an example:
>
> .../userlogs/application_1391549207212_0006/container_1391549207212_0006_01_000027
>
> [htf@gfldesktop container_1391549207212_0006_01_000027]$ ls -lart
> total 60
> -rw-rw-r--  1 htf htf     0 Feb  4 17:27 stdout
> -rw-rw-r--  1 htf htf     0 Feb  4 17:27 stderr
> drwx--x--- 28 htf htf  4096 Feb  4 17:27 ..
> drwx--x---  2 htf htf  4096 Feb  4 17:27 .
> -rw-rw-r--  1 htf htf 50471 Feb  4 17:31 syslog
>
> Regards
> ./g




RE: How to ascertain why LinuxContainer dies?

Posted by German Florez-Larrahondo <ge...@samsung.com>.
I believe that errors on containers are not propagated to the standard “Java” logs.

You have to look into the std* and syslog files of the container:

Here is an example:

.../userlogs/application_1391549207212_0006/container_1391549207212_0006_01_000027

[htf@gfldesktop container_1391549207212_0006_01_000027]$ ls -lart
total 60
-rw-rw-r--  1 htf htf     0 Feb  4 17:27 stdout
-rw-rw-r--  1 htf htf     0 Feb  4 17:27 stderr
drwx--x--- 28 htf htf  4096 Feb  4 17:27 ..
drwx--x---  2 htf htf  4096 Feb  4 17:27 .
-rw-rw-r--  1 htf htf 50471 Feb  4 17:31 syslog
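
The directory names in that example follow a pattern: the application id is embedded in the container id. Here is a small sketch that derives one from the other and builds the local directory to look in. The helper names are hypothetical, and the log root is whatever yarn.nodemanager.log-dirs points to on the node (it is elided as "..." in the example).

```shell
# Derive the application id from a container id, e.g.
# container_1391549207212_0006_01_000027 -> application_1391549207212_0006.
# An optional epoch field (container_e17_...) used by newer releases is skipped.
app_id_from_container() {
  printf '%s\n' "$1" |
    sed -E 's/^container_(e[0-9]+_)?([0-9]+)_([0-9]+)_.*$/application_\2_\3/'
}

# Build the local log directory for a container under a given log root.
# Arguments: local log root, container id.
container_log_dir() {
  printf '%s/%s/%s\n' "$1" "$(app_id_from_container "$2")" "$2"
}
```

Given the container id from the NodeManager log, ls -lart "$(container_log_dir <log root> <container id>)" should show the same stdout, stderr and syslog files as the listing, as long as they have not been cleaned up yet.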

 

Regards
./g

 


Re: How to ascertain why LinuxContainer dies?

Posted by Jay Vyas <ja...@gmail.com>.
I'm not sure where the containers dump standard out/error. I figured it would be propagated to the node manager logs if anywhere, right?


> On Feb 14, 2014, at 4:46 AM, Harsh J <ha...@cloudera.com> wrote:
> 
> Hi,
> 
> Does your container command generate any stderr/stdout outputs that
> you can check under the container's work directory after it fails?
> 
>> On Fri, Feb 14, 2014 at 9:46 AM, Jay Vyas <ja...@gmail.com> wrote:
>> I have a linux container that dies.  The nodemanager logs only say:
>> 
>> WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor:
>> Exception from container-launch :
>> org.apache.hadoop.util.Shell$ExitCodeException:
>>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:202)
>>   at org.apache.hadoop.util.Shell.run(Shell.java:129)
>>   at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:322)
>>   at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:230)
>>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:242)
>>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:68)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>   at java.lang.Thread.run(Thread.java:662)
>> 
>> where can i find the root cause of the non-zero exit code ?
>> 
>> --
>> Jay Vyas
>> http://jayunit100.blogspot.com
> 
> 
> 
> -- 
> Harsh J

Re: How to ascertain why LinuxContainer dies?

Posted by Harsh J <ha...@cloudera.com>.
Hi,

Does your container command generate any stderr/stdout outputs that
you can check under the container's work directory after it fails?
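
A minimal sketch of that check, assuming the usual YARN layout in which each container gets a container_<id> directory under one of the yarn.nodemanager.log-dirs locations, and that the launch command redirects stdout/stderr into it. The helper name find_container_logs and the default root path are illustrative assumptions, not part of Hadoop; point it at whatever your NodeManager is configured to use.

```python
import os
import sys


def find_container_logs(log_root, names=("stderr", "stdout", "syslog")):
    """Return sorted (path, size) pairs for non-empty container log files.

    Hypothetical helper: it only walks the directory tree and does not
    talk to YARN. log_root is assumed to be one of the directories listed
    in yarn.nodemanager.log-dirs.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(log_root):
        # Only look inside per-container directories (container_<id>).
        if not os.path.basename(dirpath).startswith("container_"):
            continue
        for name in filenames:
            if name in names:
                path = os.path.join(dirpath, name)
                size = os.path.getsize(path)
                if size > 0:  # empty files carry no diagnostics
                    found.append((path, size))
    return sorted(found)


if __name__ == "__main__":
    # Assumed default; replace with your own yarn.nodemanager.log-dirs value.
    root = sys.argv[1] if len(sys.argv) > 1 else "/var/log/hadoop-yarn/container"
    for path, size in find_container_logs(root):
        print(f"{size:>10}  {path}")
```

Run it on the node where the container failed, soon after the failure, since the NodeManager may clean up or aggregate these files once the application finishes.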

On Fri, Feb 14, 2014 at 9:46 AM, Jay Vyas <ja...@gmail.com> wrote:
> I have a linux container that dies.  The nodemanager logs only say:
>
> WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor:
> Exception from container-launch :
> org.apache.hadoop.util.Shell$ExitCodeException:
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:202)
>   at org.apache.hadoop.util.Shell.run(Shell.java:129)
>   at
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:322)
>   at
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:230)
>   at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:242)
>   at
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:68)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
>
> where can i find the root cause of the non-zero exit code ?
>
> --
> Jay Vyas
> http://jayunit100.blogspot.com



-- 
Harsh J
