Posted to issues@flink.apache.org by "lvshuang (Jira)" <ji...@apache.org> on 2022/01/10 17:45:00 UTC

[jira] [Commented] (FLINK-22411) Checkpoint failed caused by Mkdirs failed to create file, the path for Flink state.checkpoints.dir in docker-compose can not work from Flink Operations Playground

    [ https://issues.apache.org/jira/browse/FLINK-22411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472186#comment-17472186 ] 

lvshuang commented on FLINK-22411:
----------------------------------

I have already created `/tmp/flink-checkpoints-directory/` and `/tmp/flink-savepoints-directory/` on my machine, but I still hit this problem. Can you help me, @Serge?

```

jobmanager_1            | 2022-01-10 17:38:07,900 WARN  org.apache.flink.runtime.jobmaster.JobMaster                 [] - Error while processing AcknowledgeCheckpoint message
jobmanager_1            | org.apache.flink.runtime.checkpoint.CheckpointException: Could not finalize the pending checkpoint 1327. Failure reason: Failure to finalize checkpoint.
jobmanager_1            |     at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.completePendingCheckpoint(CheckpointCoordinator.java:1199) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
jobmanager_1            |     at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.receiveAcknowledgeMessage(CheckpointCoordinator.java:1072) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
jobmanager_1            |     at org.apache.flink.runtime.scheduler.ExecutionGraphHandler.lambda$acknowledgeCheckpoint$1(ExecutionGraphHandler.java:89) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
jobmanager_1            |     at org.apache.flink.runtime.scheduler.ExecutionGraphHandler.lambda$processCheckpointCoordinatorMessage$3(ExecutionGraphHandler.java:119) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
jobmanager_1            |     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_302]
jobmanager_1            |     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_302]
jobmanager_1            |     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_302]
jobmanager_1            |     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_302]
jobmanager_1            |     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
jobmanager_1            |     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
jobmanager_1            |     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
jobmanager_1            | Caused by: java.io.IOException: Mkdirs failed to create file:/tmp/flink-checkpoints-directory/63687e9b34fd9ef2dcadc58c139ebaac/chk-1327
jobmanager_1            |     at org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:262) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
jobmanager_1            |     at org.apache.flink.runtime.state.filesystem.FsCheckpointMetadataOutputStream.<init>(FsCheckpointMetadataOutputStream.java:65) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
jobmanager_1            |     at org.apache.flink.runtime.state.filesystem.FsCheckpointStorageLocation.createMetadataOutputStream(FsCheckpointStorageLocation.java:109) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
jobmanager_1            |     at org.apache.flink.runtime.checkpoint.PendingCheckpoint.finalizeCheckpoint(PendingCheckpoint.java:321) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
jobmanager_1            |     at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.completePendingCheckpoint(CheckpointCoordinator.java:1182) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
jobmanager_1            |     ... 10 more

```
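
One possible cause, which is an assumption and not confirmed anywhere in this thread, is that the directories exist on the host but are not writable by the user the Flink containers run as. A minimal sketch to create both directories and inspect their ownership and mode is below. The helper name `check_dir` is hypothetical, and the `-w` test only reflects the host user running the script, so the `ls -ld` output still has to be compared against the container's user by hand.

```bash
#!/usr/bin/env bash
# Sketch only: directory paths come from the issue description; treating
# permissions as the cause of "Mkdirs failed" is an assumption.

# check_dir is a hypothetical helper: it creates the directory if needed and
# reports whether the *current host user* can write to it.
check_dir() {
  local dir="$1"
  mkdir -p "$dir" || { echo "cannot create: $dir"; return 1; }
  if [ -w "$dir" ]; then
    echo "writable: $dir"
  else
    echo "not writable: $dir"
    return 1
  fi
}

for dir in /tmp/flink-checkpoints-directory /tmp/flink-savepoints-directory; do
  check_dir "$dir"
  # Show owner and mode so they can be compared with the container's user.
  ls -ld "$dir"
done
```

If a directory turns out to be owned by a different user than the one inside the container, adjusting its ownership or mode on the host is one thing to try.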

> Checkpoint failed caused by Mkdirs failed to create file, the path for Flink state.checkpoints.dir in docker-compose can not work from Flink Operations Playground
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-22411
>                 URL: https://issues.apache.org/jira/browse/FLINK-22411
>             Project: Flink
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 1.12.2
>            Reporter: Serge
>            Priority: Minor
>              Labels: auto-deprioritized-major, pull-request-available
>         Attachments: screenshot-1.png
>
>
> docker-compose starts correctly, but after several minutes of work Apache Flink has to create checkpoints and there is a problem with access to the file system. The next step, [Observing Failure & Recovery|https://ci.apache.org/projects/flink/flink-docs-release-1.12/try-flink/flink-operations-playground.html#observing-failure–recovery], cannot be carried out.
> Exception:
> {code:java}
> org.apache.flink.runtime.checkpoint.CheckpointException: Could not finalize the pending checkpoint 104. Failure reason: Failure to finalize checkpoint.
>     at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.completePendingCheckpoint(CheckpointCoordinator.java:1216) ~[flink-dist_2.11-1.12.1.jar:1.12.1]
> …..
> Caused by: org.apache.flink.util.SerializedThrowable: Mkdirs failed to create file:/tmp/flink-checkpoints-directory/d73c2f87b0d7ea6748a1913ee4b50afe/chk-104
>     at org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:262) ~[flink-dist_2.11-1.12.1.jar:1.12.1]
> {code}
> It works after adding this step:
> Create the checkpoint and savepoint directories on the Docker host machine (these volumes are mounted by the jobmanager and taskmanager, as specified in docker-compose.yaml):
> {code:bash}
> mkdir -p /tmp/flink-checkpoints-directory
> mkdir -p /tmp/flink-savepoints-directory
> {code}


