Posted to user@flink.apache.org by sohimankotia <so...@gmail.com> on 2017/04/14 07:55:40 UTC

Container is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 2.9 GB of 4.2 GB virtual memory used. Killing container.

I am running a Flink streaming job with parallelism 1.

After 4 hours the job suddenly failed. It showed:

Container container_e39_1492083788459_0676_01_000002 is completed with
diagnostics: Container
[pid=79546,containerID=container_e39_1492083788459_0676_01_000002] is
running beyond physical memory limits. Current usage: 2.0 GB of 2 GB
physical memory used; 2.9 GB of 4.2 GB virtual memory used. Killing
container. 

        
I monitored the task manager with jmap and did not find anything that could
cause an out-of-memory condition. There is no out-of-memory error in the
logs either.
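
For reference, the figures in the diagnostics above can be pulled apart with a short script. This is only a sketch: the function name parse_usage and the regular expression are illustrative assumptions, not part of YARN or Flink. It shows that the physical limit is fully used while the virtual limit is not, and that the 4.2 GB virtual limit is 2.1 times the 2 GB physical limit. That factor is consistent with YARN's default yarn.nodemanager.vmem-pmem-ratio of 2.1, although the cluster's actual setting is not stated in this thread.

```python
import re

# Matches the usage part of the YARN diagnostics quoted in the message above.
USAGE_RE = re.compile(
    r"Current usage: (?P<pmem_used>[\d.]+) GB of (?P<pmem_limit>[\d.]+) GB "
    r"physical memory used; (?P<vmem_used>[\d.]+) GB of (?P<vmem_limit>[\d.]+) GB "
    r"virtual memory used"
)


def parse_usage(diagnostics: str) -> dict:
    """Extract the four GB figures from a 'beyond physical memory limits' message."""
    # The message is line-wrapped in the email, so collapse all whitespace first.
    flat = " ".join(diagnostics.split())
    match = USAGE_RE.search(flat)
    if match is None:
        raise ValueError("no usage figures found in diagnostics")
    return {name: float(value) for name, value in match.groupdict().items()}


message = """
running beyond physical memory limits. Current usage: 2.0 GB of 2 GB
physical memory used; 2.9 GB of 4.2 GB virtual memory used. Killing
container.
"""

usage = parse_usage(message)
pmem_fraction = usage["pmem_used"] / usage["pmem_limit"]     # physical limit reached
vmem_fraction = usage["vmem_used"] / usage["vmem_limit"]     # virtual limit not reached
vmem_pmem_ratio = usage["vmem_limit"] / usage["pmem_limit"]  # virtual limit / physical limit

print(f"physical: {pmem_fraction:.0%} of limit")
print(f"virtual:  {vmem_fraction:.0%} of limit")
print(f"vmem/pmem limit ratio: {vmem_pmem_ratio:.1f}")
```

Since only the physical figure is at its limit, the container was killed by the physical-memory check, not the virtual-memory one.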



--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Container-is-is-running-beyond-physical-memory-limits-Current-usage-2-0-GB-of-2-GB-physical-memory-u-tp12615.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.

Re: Container is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 2.9 GB of 4.2 GB virtual memory used. Killing container.

Posted by sohimankotia <so...@gmail.com>.
Hi Shannon,

Thanks for your response.

Yes, I am running Flink on YARN, and my job is running with parallelism 1.

Here are a few points that may help you narrow down a solution:

1. I have other jobs running in the same cluster with parallelism greater
than 1, and those are running fine.
2. The application itself shows no sign of running out of memory.
3. If I run the job with 2 GB of memory, it fails after 4-6 hours. If I run
it with 4 GB, it fails after 21-24 hours.
4. I took a jmap heap dump of the task manager process every 15 minutes, and
everything seems fine.
5. Checkpointing runs every 30 seconds, the state size is 1.17 KB, and there
is no backpressure.
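
A note on point 4: a jmap heap dump covers only the JVM heap, whereas the container was killed for its physical (resident) memory, which also includes off-heap usage such as thread stacks, metaspace, direct buffers and native allocations. A rough way to watch that number is to sample VmRSS from /proc on the NodeManager host. The sketch below assumes a Linux host and a known task manager PID; the function names are hypothetical, and it reads a single process, while YARN accounts for the container's whole process tree, so it can under-report.

```python
import re
from pathlib import Path


def parse_vm_rss_kb(status_text: str) -> int:
    """Return the VmRSS value in kB from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS line not found in status text")
    return int(match.group(1))


def read_rss_mb(pid: int) -> float:
    """Read the resident set size of a live process in MB (Linux only)."""
    status_text = Path(f"/proc/{pid}/status").read_text()
    return parse_vm_rss_kb(status_text) / 1024.0


def headroom_mb(rss_mb: float, container_limit_mb: float) -> float:
    """Memory left before the process alone reaches the container limit."""
    return container_limit_mb - rss_mb


# Hypothetical status text for illustration; real files contain many more fields.
sample_status = "Name:\tjava\nVmPeak:\t 3040870 kB\nVmRSS:\t 2097152 kB\nThreads:\t87\n"

rss_mb = parse_vm_rss_kb(sample_status) / 1024.0
print(f"RSS: {rss_mb:.0f} MB, headroom: {headroom_mb(rss_mb, 2048.0):.0f} MB")
```

Calling read_rss_mb(pid) at a fixed interval and logging the result next to the heap usage reported by jmap would show whether resident memory keeps growing while the heap stays flat.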

Just a naive thought:

1. Can running with parallelism 1 cause any problems?






Re: Container is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 2.9 GB of 4.2 GB virtual memory used. Killing container.

Posted by Shannon Carey <sc...@expedia.com>.
I've had similar problems when running Flink on YARN. A Flink task manager fails, Flink can't restart the jobs because there aren't enough slots, and eventually YARN decides to terminate Flink. You then lose all your jobs and state, because Flink regards it as a graceful shutdown. My latest attempt to solve the issue was to disable the vmem and pmem checks in YARN with the "yarn.nodemanager.pmem-check-enabled" and "yarn.nodemanager.vmem-check-enabled" settings. It's been OK so far, but I'm not totally sure whether it was a good idea.
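
For illustration, the two settings named above would typically go into yarn-site.xml on each NodeManager. The fragment below is only a sketch: the value false is an assumption based on "disable" in the paragraph above (both checks are enabled by default), and the exact file location and restart procedure depend on the cluster.

```xml
<!-- yarn-site.xml (sketch): turn off the NodeManager memory checks -->
<property>
  <name>yarn.nodemanager.pmem-check-enabled</name>
  <value>false</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
```

With both checks off, YARN no longer kills a container for exceeding its memory allocation, so a process with growing memory can affect other containers on the same node.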

Of course, I'm not sure that's exactly the same problem you're having, because I don't know whether you're running Flink on YARN.

-Shannon
 

