Posted to user-zh@flink.apache.org by admin <wy...@163.com> on 2021/03/31 05:57:00 UTC

Container is running beyond physical memory limits. Current usage: 5.0 GB of 5 GB physical memory used; 7.0 GB of 25 GB virtual memory used. Killing container.

java.lang.Exception: Container [pid=17248,containerID=container_1597847003686_12235_01_001336] is running beyond physical memory limits. Current usage: 5.0 GB of 5 GB physical memory used; 7.0 GB of 25 GB virtual memory used. Killing container.
Dump of the process-tree for container_1597847003686_12235_01_001336 :
    |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
    |- 17283 17248 17248 17248 (java) 1025867 190314 7372083200 1311496 /usr/local/jdk1.8/bin/java -Xmx2147483611 -Xms2147483611 -XX:MaxDirectMemorySize=590558009 -XX:MaxMetaspaceSize=268435456 -server -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 -XX:ParallelGCThreads=4 -XX:+AlwaysPreTouch -XX:NewRatio=1 -DjobName=fastmidu-deeplink-tuid-20200203 -Dlog.file=/data1/yarn/containers/application_1597847003686_12235/container_1597847003686_12235_01_001336/taskmanager.log -Dlog4j.configuration=file:./log4j.properties org.apache.flink.yarn.YarnTaskExecutorRunner -D taskmanager.memory.framework.off-heap.size=134217728b -D taskmanager.memory.network.max=456340281b -D taskmanager.memory.network.min=456340281b -D taskmanager.memory.framework.heap.size=134217728b -D taskmanager.memory.managed.size=1825361124b -D taskmanager.cpu.cores=5.0 -D taskmanager.memory.task.heap.size=2013265883b -D taskmanager.memory.task.off-heap.size=0b --configDir . -Djobmanager.rpc.address=di-h4-dn-134.h.ab1.qttsite.net -Dweb.port=0 -Dweb.tmpdir=/tmp/flink-web-f63d543b-a75a-4dc4-be93-979eebd8062d -Djobmanager.rpc.port=43423 -Drest.address=di-h4-dn-134.h.ab1.qttsite.net 
    |- 17248 17246 17248 17248 (bash) 0 0 116015104 353 /bin/bash -c /usr/local/jdk1.8/bin/java -Xmx2147483611 -Xms2147483611 -XX:MaxDirectMemorySize=590558009 -XX:MaxMetaspaceSize=268435456 -server -XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 -XX:ParallelGCThreads=4 -XX:+AlwaysPreTouch -XX:NewRatio=1 -DjobName=fastmidu-deeplink-tuid-20200203 -Dlog.file=/data1/yarn/containers/application_1597847003686_12235/container_1597847003686_12235_01_001336/taskmanager.log -Dlog4j.configuration=file:./log4j.properties org.apache.flink.yarn.YarnTaskExecutorRunner -D taskmanager.memory.framework.off-heap.size=134217728b -D taskmanager.memory.network.max=456340281b -D taskmanager.memory.network.min=456340281b -D taskmanager.memory.framework.heap.size=134217728b -D taskmanager.memory.managed.size=1825361124b -D taskmanager.cpu.cores=5.0 -D taskmanager.memory.task.heap.size=2013265883b -D taskmanager.memory.task.off-heap.size=0b --configDir . -Djobmanager.rpc.address='di-h4-dn-134.h.ab1.qttsite.net' -Dweb.port='0' -Dweb.tmpdir='/tmp/flink-web-f63d543b-a75a-4dc4-be93-979eebd8062d' -Djobmanager.rpc.port='43423' -Drest.address='di-h4-dn-134.h.ab1.qttsite.net' 1> /data1/yarn/containers/application_1597847003686_12235/container_1597847003686_12235_01_001336/taskmanager.out 2> /data1/yarn/containers/application_1597847003686_12235/container_1597847003686_12235_01_001336/taskmanager.err 

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

    at org.apache.flink.yarn.YarnResourceManager.lambda$onContainersCompleted$0(YarnResourceManager.java:343)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:402)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:195)
    at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
    at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
    at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
    at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
    at akka.actor.ActorCell.invoke(ActorCell.scala:561)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
    at akka.dispatch.Mailbox.run(Mailbox.scala:225)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
    at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

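Hi,
    We run Flink 1.10.1 in production. Jobs restart frequently, and the UI then shows the error above.

So far we have handled this by increasing the memory of each TaskManager. Are there other ways to deal with it? Is there a concrete, practical way to troubleshoot this, and what is the actual cause?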
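
As a starting point for troubleshooting, the memory caps on the java command line in the process-tree dump can be added up and compared with the 5 GB container limit. The sketch below only does that arithmetic; every number is copied from the dump, except the 4096-byte page size, which is an assumption (the usual value on x86 Linux) and is not stated in the log.

```python
# Rough budget check for the killed container; values are bytes copied from the dump.
GIB = 1024 ** 3
MIB = 1024 ** 2

container_limit = 5 * GIB  # "5 GB physical memory" in the YARN message

# Sanity checks: the JVM flags equal the sums of the Flink -D options.
assert 134217728 + 2013265883 == 2147483611      # framework heap + task heap = -Xmx
assert 134217728 + 0 + 456340281 == 590558009    # framework off-heap + task off-heap + network

budget = {
    "heap (-Xmx)": 2147483611,
    "direct (-XX:MaxDirectMemorySize)": 590558009,
    "metaspace (-XX:MaxMetaspaceSize)": 268435456,
    "managed (taskmanager.memory.managed.size)": 1825361124,
}

configured = sum(budget.values())
headroom = container_limit - configured  # what is left for everything else in the process

PAGE_SIZE = 4096  # ASSUMPTION: not in the log; typical for x86 Linux
rss_pages = {"java": 1311496, "bash": 353}  # RSSMEM_USAGE(PAGES) column
rss_bytes = sum(rss_pages.values()) * PAGE_SIZE
overshoot = rss_bytes - container_limit

for name, size in budget.items():
    print(f"{name:45s} {size / MIB:10.1f} MiB")
print(f"{'configured total':45s} {configured / MIB:10.1f} MiB")
print(f"{'headroom under container limit':45s} {headroom / MIB:10.1f} MiB")
print(f"{'resident set size (java + bash)':45s} {rss_bytes / MIB:10.1f} MiB")
print(f"{'over the limit by':45s} {overshoot / MIB:10.1f} MiB")
```

Under these assumptions, the explicit caps already account for all but roughly 512 MiB of the container, and the resident set exceeds the limit by only a few MiB. That does not identify the cause by itself, but it suggests looking at memory that none of these flags bound, such as thread stacks, GC and code-cache structures, or native allocations made by libraries.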