You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Martijn Visser (Jira)" <ji...@apache.org> on 2022/03/24 09:02:00 UTC

[jira] [Updated] (FLINK-21427) Recovering from savepoint key does not exist in S3

     [ https://issues.apache.org/jira/browse/FLINK-21427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martijn Visser updated FLINK-21427:
-----------------------------------
        Parent: FLINK-26820
    Issue Type: Sub-task  (was: Bug)

> Recovering from savepoint key does not exist in S3
> --------------------------------------------------
>
>                 Key: FLINK-21427
>                 URL: https://issues.apache.org/jira/browse/FLINK-21427
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Connectors / Cassandra
>    Affects Versions: 1.11.2, 1.12.1, 1.13.0
>            Reporter: Kenzyme Le
>            Priority: Not a Priority
>              Labels: auto-deprioritized-major, auto-deprioritized-minor, auto-unassigned
>
> Hi,
> I was able to stop and generate a savepoint successfully, but resuming the job caused repeated errors in the logs *specified key does not exist* in S3.
> Plugin used: *flink-s3-fs-presto-1.11.2.jar*
>  
> {code:java}
> [] - Could not commit checkpoint.
> com.facebook.presto.hive.s3.PrestoS3FileSystem$UnrecoverableS3OperationException: com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: D76810E62E37680C; S3 Extended Request ID: h5MRYXOZj5UnPEVjYtMOnUqXUSeJ784eFz3PEQjdT8B7499ZvV+3DHNLII8WLVbVhJ1/ujPG7Bo=), S3 Extended Request ID: h5MRYXOZj5UnPEVjYtMOnUqXUSeJ784eFz3PEQjdT8B7499ZvV+3DHNLII8WLVbVhJ1/ujPG7Bo= (Path: s3p://app/flink/checkpoints/prod/613240ac4a3ebb2e1a428bbd1a973433/taskowned/7a252162-f002-4afc-a45a-04b0a622c204)
> 	at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$openStream$1(PrestoS3FileSystem.java:917) ~[?:?]
> 	at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138) ~[?:?]
> 	at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.openStream(PrestoS3FileSystem.java:902) ~[?:?]
> 	at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.openStream(PrestoS3FileSystem.java:887) ~[?:?]
> 	at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.seekStream(PrestoS3FileSystem.java:880) ~[?:?]
> 	at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$read$0(PrestoS3FileSystem.java:819) ~[?:?]
> 	at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138) ~[?:?]
> 	at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.read(PrestoS3FileSystem.java:818) ~[?:?]
> 	at java.io.BufferedInputStream.fill(BufferedInputStream.java:252) ~[?:?]
> 	at java.io.BufferedInputStream.read(BufferedInputStream.java:271) ~[?:?]
> 	at java.io.FilterInputStream.read(FilterInputStream.java:83) ~[?:?]
> 	at org.apache.flink.fs.s3presto.common.HadoopDataInputStream.read(HadoopDataInputStream.java:84) ~[?:?]
> 	at org.apache.flink.core.fs.FSDataInputStreamWrapper.read(FSDataInputStreamWrapper.java:51) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at java.io.DataInputStream.readByte(DataInputStream.java:270) ~[?:?]
> 	at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:435) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.runtime.io.disk.InputViewIterator.next(InputViewIterator.java:43) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.runtime.util.ReusingMutableToRegularIteratorWrapper.hasNext(ReusingMutableToRegularIteratorWrapper.java:61) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at com.app.streams.kernel.sinks.JdbcBatchWriteAheadSink.sendValues(JdbcBatchWriteAheadSink.java:52) ~[blob_p-642a2d12ebc0fdfb4a406ab5e9ebff24a2edf335-29b861b5042a27f11426951b8a753b1f:?]
> 	at org.apache.flink.streaming.runtime.operators.GenericWriteAheadSink.notifyCheckpointComplete(GenericWriteAheadSink.java:233) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.notifyCheckpointComplete(StreamOperatorWrapper.java:107) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.notifyCheckpointComplete(SubtaskCheckpointCoordinatorImpl.java:283) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.notifyCheckpointComplete(StreamTask.java:987) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointCompleteAsync$10(StreamTask.java:958) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointOperation$12(StreamTask.java:974) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:78) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxExecutorImpl.yield(MailboxExecutorImpl.java:79) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.runSynchronousSavepointMailboxLoop(StreamTask.java:406) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:881) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.io.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:113) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.io.CheckpointBarrierAligner.processBarrier(CheckpointBarrierAligner.java:198) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.io.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:93) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:158) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:351) ~[flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191) [flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181) [flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:566) [flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:536) [flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) [flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) [flink-dist_2.12-1.11.2.jar:1.11.2]
> 	at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: D76810E62E37680C; S3 Extended Request ID: h5MRYXOZj5UnPEVjYtMOnUqXUSeJ784eFz3PEQjdT8B7499ZvV+3DHNLII8WLVbVhJ1/ujPG7Bo=)
> 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1799) ~[?:?]
> 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1383) ~[?:?]
> 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1359) ~[?:?]
> 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1139) ~[?:?]
> 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:796) ~[?:?]
> 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764) ~[?:?]
> 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738) ~[?:?]
> 	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698) ~[?:?]
> 	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680) ~[?:?]
> 	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544) ~[?:?]
> 	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524) ~[?:?]
> 	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5054) ~[?:?]
> 	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5000) ~[?:?]
> 	at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1486) ~[?:?]
> 	at com.facebook.presto.hive.s3.PrestoS3FileSystem$PrestoS3InputStream.lambda$openStream$1(PrestoS3FileSystem.java:905) ~[?:?]
> 	... 41 more
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)