Posted to issues@flink.apache.org by "Ufuk Celebi (JIRA)" <ji...@apache.org> on 2016/08/18 16:01:21 UTC

[jira] [Commented] (FLINK-4425) "Out Of Memory" during savepoint deserialization

    [ https://issues.apache.org/jira/browse/FLINK-4425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426698#comment-15426698 ] 

Ufuk Celebi commented on FLINK-4425:
------------------------------------

Thanks for reporting this. 

(1) Could you share your user program along with some sample data?

(2) If that is not possible, could you trigger the savepoint while the job uses a MemoryStateBackend and share the resulting savepoint file? With that backend the savepoint is self-contained, so you can attach it here.

I can then try to reproduce it.
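
For option (2), a minimal configuration sketch, assuming Flink 1.1.x configuration keys and a savepoint directory chosen only for illustration (the path below is a placeholder, not taken from the report):

```yaml
# flink-conf.yaml (sketch for Flink 1.1.x; adjust to your setup)

# "jobmanager" selects the MemoryStateBackend for operator state.
state.backend: jobmanager

# Write savepoints to a file system so there is a file to share.
savepoints.state.backend: filesystem

# Hypothetical target directory; replace with a path of your own.
savepoints.state.backend.fs.dir: file:///tmp/flink-savepoints
```

With this in place, triggering a savepoint (for example with "bin/flink savepoint <jobID>") should produce a single file that carries the state itself rather than pointers to separate checkpoint files. Setting the backend in the program via env.setStateBackend(new MemoryStateBackend()) should have the same effect for that job.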

> "Out Of Memory" during savepoint deserialization
> ------------------------------------------------
>
>                 Key: FLINK-4425
>                 URL: https://issues.apache.org/jira/browse/FLINK-4425
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>            Reporter: Sergii Koshel
>
> I've created a savepoint and am trying to start the job from it (via the -s parameter), but I get the exception below:
> {code:title=Exception|borderStyle=solid}
> java.lang.OutOfMemoryError: Java heap space
>         at org.apache.flink.runtime.checkpoint.savepoint.SavepointV1Serializer.deserialize(SavepointV1Serializer.java:167)
>         at org.apache.flink.runtime.checkpoint.savepoint.SavepointV1Serializer.deserialize(SavepointV1Serializer.java:42)
>         at org.apache.flink.runtime.checkpoint.savepoint.FsSavepointStore.loadSavepoint(FsSavepointStore.java:133)
>         at org.apache.flink.runtime.checkpoint.savepoint.SavepointCoordinator.restoreSavepoint(SavepointCoordinator.java:201)
>         at org.apache.flink.runtime.executiongraph.ExecutionGraph.restoreSavepoint(ExecutionGraph.java:983)
>         at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1302)
>         at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1291)
>         at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1291)
>         at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>         at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>         at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>         at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
>         at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>         at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>         at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}
> jobmanager.heap.mb: 1280
> taskmanager.heap.mb: 1024
> Java 1.8
> savepoint + checkpoint size < 1 MB in total



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)