Posted to user@flink.apache.org by prateekarora <pr...@gmail.com> on 2016/05/31 23:22:25 UTC

YARN kills container due to running beyond physical memory limits [How can I debug this memory issue?]

Hi

I am running Flink 1.0.2 on YARN.

After the application has been running for a while, YARN kills my container for
running beyond physical memory limits.

How can I debug this memory issue?

Below are the logs:

Container container_1463184272818_0165_01_000012 is completed with
diagnostics: Container
[pid=19349,containerID=container_1463184272818_0165_01_000012] is running
beyond physical memory limits. Current usage: 6.0 GB of 6 GB physical memory
used; 9.1 GB of 12.6 GB virtual memory used. Killing container.

Dump of the process-tree for container_1463184272818_0165_01_000012 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 19356 19349 19349 19349 (java) 39350 9110 9711140864 1581168
/usr/lib/jvm/java-7-oracle-cloudera/bin/java -Xms4608m -Xmx4608m
-XX:MaxDirectMemorySize=4608m
-Djava.library.path=/home/nativelibraries/native_lib/
-Dlog.file=/var/log/hadoop-yarn/container/application_1463184272818_0165/container_1463184272818_0165_01_000012/taskmanager.log
-Dlogback.configurationFile=file:logback.xml
-Dlog4j.configuration=file:log4j.properties
org.apache.flink.yarn.YarnTaskManagerRunner --configDir .
        |- 19349 19345 19349 19349 (bash) 0 0 11456512 359 /bin/bash -c
/usr/lib/jvm/java-7-oracle-cloudera/bin/java -Xms4608m -Xmx4608m
-XX:MaxDirectMemorySize=4608m
-Djava.library.path=/home/nativelibraries/native_lib/
-Dlog.file=/var/log/hadoop-yarn/container/application_1463184272818_0165/container_1463184272818_0165_01_000012/taskmanager.log
-Dlogback.configurationFile=file:logback.xml
-Dlog4j.configuration=file:log4j.properties
org.apache.flink.yarn.YarnTaskManagerRunner --configDir . 1>
/var/log/hadoop-yarn/container/application_1463184272818_0165/container_1463184272818_0165_01_000012/taskmanager.out
2>
/var/log/hadoop-yarn/container/application_1463184272818_0165/container_1463184272818_0165_01_000012/taskmanager.err

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143




Re: YARN kills container due to running beyond physical memory limits [How can I debug this memory issue?]

Posted by Alexis Gendronneau <a....@gmail.com>.
That's rather strange. Have you tried setting the minimum container allocation
to a bigger size as well?
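
(I am thinking of yarn.scheduler.minimum-allocation-mb in yarn-site.xml; purely
as an illustration:

    yarn.scheduler.minimum-allocation-mb = 2048

so that every container is rounded up to at least that size.)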

2016-06-02 0:04 GMT+02:00 prateekarora <pr...@gmail.com>:

> Hi
>
> I have not changed the "yarn.heap-cutoff-ratio" or "yarn.heap-cutoff-min"
> configuration.
>
> According to the log, Flink assigns 4608 MB out of the 6 GB container, so I
> thought the configuration was working fine:
> /usr/lib/jvm/java-7-oracle-cloudera/bin/java -Xms4608m -Xmx4608m
> -XX:MaxDirectMemorySize=4608m
>
>
> Regards
> Prateek
> -
>
>
>
>



-- 
Alexis Gendronneau

alexis.gendronneau@corp.ovh.com
a.gendronneau@gmail.com

Re: YARN kills container due to running beyond physical memory limits [How can I debug this memory issue?]

Posted by prateekarora <pr...@gmail.com>.
Hi

I have not changed the "yarn.heap-cutoff-ratio" or "yarn.heap-cutoff-min"
configuration.

According to the log, Flink assigns 4608 MB out of the 6 GB container, so I
thought the configuration was working fine:
/usr/lib/jvm/java-7-oracle-cloudera/bin/java -Xms4608m -Xmx4608m
-XX:MaxDirectMemorySize=4608m
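
If I read the YARN setup docs correctly, that 4608m is just the default cutoff
being applied: with -ytm 6144 and the default yarn.heap-cutoff-ratio of 0.25,

    6144 MB - 0.25 * 6144 MB = 6144 MB - 1536 MB = 4608 MB

so the heap on its own fits into the container, and the overrun presumably
comes from memory outside the JVM heap (direct / native memory).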


Regards
Prateek
-




Re: YARN kills container due to running beyond physical memory limits [How can I debug this memory issue?]

Posted by Konstantin Knauf <ko...@tngtech.com>.
Hi Prateek,

did you change "yarn.heap-cutoff-ratio" or "yarn.heap-cutoff-min"
[1]?

Cheers,

Konstantin

[1]
https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#yarn
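
If not, one thing you could try is reserving a larger share of the container
for non-heap memory, e.g. in flink-conf.yaml (values only as an illustration):

    # fraction of the container memory withheld from the JVM heap
    yarn.heap-cutoff-ratio: 0.4
    # never cut off less than this many MB
    yarn.heap-cutoff-min: 1024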


On 01.06.2016 17:46, prateekarora wrote:
> Hi
> 
> Thanks for the reply 
> 
> I have a 6-node YARN cluster with 107.66 GB of memory and 48 vcores in total.
> 
> Configuration:
>     5 nodes:
>             each node configured with 19.53 GiB
>             (yarn.nodemanager.resource.memory-mb = 19.53 GB)
>
>     1 node:
>             configured with 10 GiB
>             (yarn.nodemanager.resource.memory-mb = 10 GB)
>
>
>     Total: around 107.66 GB
> 
> 
> Currently I am running my Flink application using the command below:
> 
>             flink run -m yarn-cluster -yn 15 -ytm  6144 -ys 1 
> <application_jar>
> 
>      If I run my application with the configuration below instead, I face the
> same issue:
>
>            flink run -m yarn-cluster -yn 15 -ytm  4096 -ys 1 
> <application_jar>
> 
> 
> Regards
> Prateek
> 
> 
> 
> 
> 
> 

-- 
Konstantin Knauf * konstantin.knauf@tngtech.com * +49-174-3413182
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Sitz: Unterföhring * Amtsgericht München * HRB 135082

Re: YARN kills container due to running beyond physical memory limits [How can I debug this memory issue?]

Posted by prateekarora <pr...@gmail.com>.
Hi

Thanks for the reply.

I have a 6-node YARN cluster with 107.66 GB of memory and 48 vcores in total.

Configuration:
    5 nodes:
            each node configured with 19.53 GiB
            (yarn.nodemanager.resource.memory-mb = 19.53 GB)

    1 node:
            configured with 10 GiB
            (yarn.nodemanager.resource.memory-mb = 10 GB)


    Total: around 107.66 GB


Currently I am running my Flink application using the command below:

            flink run -m yarn-cluster -yn 15 -ytm  6144 -ys 1 
<application_jar>

     If I run my application with the configuration below instead, I face the
same issue:

           flink run -m yarn-cluster -yn 15 -ytm  4096 -ys 1 
<application_jar>
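
For what it's worth, the overall request should comfortably fit into the
cluster:

            15 TaskManagers * 6144 MB = 92160 MB (about 90 GiB)

plus one JobManager container, which is well below the ~107 GB available, so
the cluster as a whole is not over-allocated.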


Regards
Prateek






Re: YARN kills container due to running beyond physical memory limits [How can I debug this memory issue?]

Posted by Alexis Gendronneau <a....@gmail.com>.
Hello,

How much memory are your YARN containers configured with? This error may come
from running a Flink on YARN cluster that requests more memory than your
containers allow. Could you check that, and maybe set the container memory to
a more suitable value?
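
For example, compare what Flink requests per TaskManager with what YARN is
allowed to hand out; on the YARN side the relevant settings are usually
(values below are purely illustrative):

    yarn.nodemanager.resource.memory-mb  = 20000    (memory a NodeManager offers)
    yarn.scheduler.maximum-allocation-mb = 8192     (largest single container YARN will grant)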

regards

2016-06-01 1:22 GMT+02:00 prateekarora <pr...@gmail.com>:

> Hi
>
> I am running Flink 1.0.2 on YARN.
>
> After the application has been running for a while, YARN kills my container
> for running beyond physical memory limits.
>
> How can I debug this memory issue?
>
> Below are the logs:
>
> Container container_1463184272818_0165_01_000012 is completed with
> diagnostics: Container
> [pid=19349,containerID=container_1463184272818_0165_01_000012] is running
> beyond physical memory limits. Current usage: 6.0 GB of 6 GB physical
> memory
> used; 9.1 GB of 12.6 GB virtual memory used. Killing container.
>
> Dump of the process-tree for container_1463184272818_0165_01_000012 :
>         |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>         |- 19356 19349 19349 19349 (java) 39350 9110 9711140864 1581168
> /usr/lib/jvm/java-7-oracle-cloudera/bin/java -Xms4608m -Xmx4608m
> -XX:MaxDirectMemorySize=4608m
> -Djava.library.path=/home/nativelibraries/native_lib/
>
> -Dlog.file=/var/log/hadoop-yarn/container/application_1463184272818_0165/container_1463184272818_0165_01_000012/taskmanager.log
> -Dlogback.configurationFile=file:logback.xml
> -Dlog4j.configuration=file:log4j.properties
> org.apache.flink.yarn.YarnTaskManagerRunner --configDir .
>         |- 19349 19345 19349 19349 (bash) 0 0 11456512 359 /bin/bash -c
> /usr/lib/jvm/java-7-oracle-cloudera/bin/java -Xms4608m -Xmx4608m
> -XX:MaxDirectMemorySize=4608m
> -Djava.library.path=/home/nativelibraries/native_lib/
>
> -Dlog.file=/var/log/hadoop-yarn/container/application_1463184272818_0165/container_1463184272818_0165_01_000012/taskmanager.log
> -Dlogback.configurationFile=file:logback.xml
> -Dlog4j.configuration=file:log4j.properties
> org.apache.flink.yarn.YarnTaskManagerRunner --configDir . 1>
>
> /var/log/hadoop-yarn/container/application_1463184272818_0165/container_1463184272818_0165_01_000012/taskmanager.out
> 2>
>
> /var/log/hadoop-yarn/container/application_1463184272818_0165/container_1463184272818_0165_01_000012/taskmanager.err
>
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
>
>
>
>



-- 
Alexis Gendronneau

alexis.gendronneau@corp.ovh.com
a.gendronneau@gmail.com