Posted to issues@flink.apache.org by "Chesnay Schepler (JIRA)" <ji...@apache.org> on 2018/02/05 11:55:02 UTC
[jira] [Assigned] (FLINK-8559) Exceptions in
RocksDBIncrementalSnapshotOperation#takeSnapshot cause job to get stuck
[ https://issues.apache.org/jira/browse/FLINK-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chesnay Schepler reassigned FLINK-8559:
---------------------------------------
Assignee: Chesnay Schepler
> Exceptions in RocksDBIncrementalSnapshotOperation#takeSnapshot cause job to get stuck
> -------------------------------------------------------------------------------------
>
> Key: FLINK-8559
> URL: https://issues.apache.org/jira/browse/FLINK-8559
> Project: Flink
> Issue Type: Bug
> Components: State Backends, Checkpointing, Tests
> Affects Versions: 1.5.0
> Reporter: Chesnay Schepler
> Assignee: Chesnay Schepler
> Priority: Blocker
>
> In {{RocksDBKeyedStateBackend#snapshotIncrementally}} we find the following code:
>
> {code:java}
> final RocksDBIncrementalSnapshotOperation<K> snapshotOperation =
>     new RocksDBIncrementalSnapshotOperation<>(
>         this,
>         checkpointStreamFactory,
>         checkpointId,
>         checkpointTimestamp);
>
> snapshotOperation.takeSnapshot();
>
> return new FutureTask<KeyedStateHandle>(
>     new Callable<KeyedStateHandle>() {
>         @Override
>         public KeyedStateHandle call() throws Exception {
>             return snapshotOperation.materializeSnapshot();
>         }
>     }
> ) {
>     @Override
>     public boolean cancel(boolean mayInterruptIfRunning) {
>         snapshotOperation.stop();
>         return super.cancel(mayInterruptIfRunning);
>     }
>
>     @Override
>     protected void done() {
>         snapshotOperation.releaseResources(isCancelled());
>     }
> };
> {code}
> In the constructor of {{RocksDBIncrementalSnapshotOperation}} we call {{acquireResource()}} on the RocksDB {{ResourceGuard}}. If {{snapshotOperation.takeSnapshot()}} fails with an exception, these resources are never released, because the {{FutureTask}} whose {{done()}} callback would release them is never created. When the task is then shut down because of the exception, it gets stuck while releasing RocksDB.
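
The leak described above can be reproduced outside Flink with a small sketch. Everything in it is a hypothetical stand-in, not Flink code: `LeaseCounter` only counts open leases in place of the real `ResourceGuard`, and `Operation` mimics an operation that acquires a lease in its constructor. The guarded variant shows one possible way to avoid the leak (release on failure of `takeSnapshot()` before rethrowing); it is an illustration under these assumptions, not necessarily the fix that was applied for this issue.

```java
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicInteger;

public class SnapshotLeakSketch {

    /** Hypothetical stand-in for the RocksDB ResourceGuard: it only counts open leases. */
    static final class LeaseCounter {
        private final AtomicInteger leases = new AtomicInteger();
        void acquire() { leases.incrementAndGet(); }
        void release() { leases.decrementAndGet(); }
        int open() { return leases.get(); }
    }

    /** Hypothetical stand-in for the snapshot operation; acquires a lease in its constructor. */
    static final class Operation {
        private final LeaseCounter guard;
        private final boolean failInTakeSnapshot;

        Operation(LeaseCounter guard, boolean failInTakeSnapshot) {
            this.guard = guard;
            this.failInTakeSnapshot = failInTakeSnapshot;
            guard.acquire();
        }

        void takeSnapshot() throws Exception {
            if (failInTakeSnapshot) {
                throw new Exception("simulated takeSnapshot failure");
            }
        }

        String materializeSnapshot() { return "handle"; }

        void releaseResources(boolean cancelled) { guard.release(); }
    }

    /** Wraps the operation so that done() releases the lease, as in the quoted code. */
    static FutureTask<String> wrap(final Operation op) {
        return new FutureTask<String>(op::materializeSnapshot) {
            @Override
            protected void done() {
                op.releaseResources(isCancelled());
            }
        };
    }

    /** Same shape as the quoted code: an exception in takeSnapshot() skips wrap(). */
    static FutureTask<String> snapshotUnguarded(LeaseCounter guard, boolean fail) throws Exception {
        Operation op = new Operation(guard, fail);
        op.takeSnapshot();
        return wrap(op);
    }

    /** One possible mitigation: release the lease before propagating the failure. */
    static FutureTask<String> snapshotGuarded(LeaseCounter guard, boolean fail) throws Exception {
        Operation op = new Operation(guard, fail);
        try {
            op.takeSnapshot();
        } catch (Exception e) {
            op.releaseResources(true);
            throw e;
        }
        return wrap(op);
    }

    /** Returns the number of leases still open after a failing takeSnapshot(). */
    public static int openLeasesAfterFailure(boolean guarded) {
        LeaseCounter guard = new LeaseCounter();
        try {
            if (guarded) {
                snapshotGuarded(guard, true);
            } else {
                snapshotUnguarded(guard, true);
            }
        } catch (Exception expected) {
            // The simulated failure is expected; only the lease count matters here.
        }
        return guard.open();
    }

    /** Returns the number of leases still open after a successful snapshot has run. */
    public static int openLeasesAfterSuccess() throws Exception {
        LeaseCounter guard = new LeaseCounter();
        FutureTask<String> task = snapshotGuarded(guard, false);
        task.run(); // done() fires after completion and releases the lease
        return guard.open();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("unguarded, after failure: " + openLeasesAfterFailure(false)); // 1
        System.out.println("guarded, after failure:   " + openLeasesAfterFailure(true));  // 0
        System.out.println("guarded, after success:   " + openLeasesAfterSuccess());      // 0
    }
}
```

In the unguarded variant one lease stays open after the failure, which corresponds to the situation where shutdown blocks waiting for outstanding leases. In the guarded variant the count returns to zero on both the failure path and the normal path.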
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)