Posted to issues@flink.apache.org by "Aljoscha Krettek (JIRA)" <ji...@apache.org> on 2017/10/23 14:00:00 UTC
[jira] [Commented] (FLINK-7873) Introduce HybridStreamStateHandle
for quick recovery from checkpoint.
[ https://issues.apache.org/jira/browse/FLINK-7873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16215170#comment-16215170 ]
Aljoscha Krettek commented on FLINK-7873:
-----------------------------------------
[~srichter] could you please have a look at this?
> Introduce HybridStreamStateHandle for quick recovery from checkpoint.
> ---------------------------------------------------------------------
>
> Key: FLINK-7873
> URL: https://issues.apache.org/jira/browse/FLINK-7873
> Project: Flink
> Issue Type: New Feature
> Components: State Backends, Checkpointing
> Affects Versions: 1.3.2
> Reporter: Sihua Zhou
> Assignee: Sihua Zhou
>
> The current recovery strategy always reads checkpoint data from a remote FileStream (HDFS). This consumes a lot of network bandwidth when the state is large (e.g. 1 TB); that cost can be avoided by reading the checkpoint data from local disk. So I introduce a HybridStreamStateHandle, which tries to create a local input stream first and, if that fails, creates a remote input stream. Its prototype looks like this:
> {code:java}
> class HybridStreamStateHandle {
>     private FileStateHandle localHandle;
>     private FileStateHandle remoteHandle;
>     ......
>     public FSDataInputStream openInputStream() throws IOException {
>         // Prefer the local copy. A failed open surfaces as an IOException
>         // rather than a null stream, so fall back to the remote copy here.
>         try {
>             return localHandle.openInputStream();
>         } catch (IOException e) {
>             return remoteHandle.openInputStream();
>         }
>     }
>     .....
> }
> {code}
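The local-first fallback described in the issue can be tried out without Flink. The following is only a minimal sketch under stated assumptions: `HybridFileHandle` is a hypothetical stand-in (not a Flink class), plain `java.nio.file` paths replace `FileStateHandle`, and the "remote" copy is just a second path standing in for the HDFS file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for the proposed handle; not Flink's actual API. */
final class HybridFileHandle {
    private final Path localPath;   // local copy; may be missing, e.g. after a machine loss
    private final Path remotePath;  // stands in for the durable remote (e.g. HDFS) copy

    HybridFileHandle(Path localPath, Path remotePath) {
        this.localPath = localPath;
        this.remotePath = remotePath;
    }

    /** Opens the local copy if possible, otherwise falls back to the remote copy. */
    InputStream openInputStream() throws IOException {
        try {
            return Files.newInputStream(localPath);
        } catch (IOException localFailure) {
            try {
                return Files.newInputStream(remotePath);
            } catch (IOException remoteFailure) {
                // Keep the local failure visible when both copies are unreadable.
                remoteFailure.addSuppressed(localFailure);
                throw remoteFailure;
            }
        }
    }
}
```

The fallback is driven by the `IOException` from the failed local open rather than by a null check, because opening a missing file throws instead of returning null. If both copies are unreadable, the remote failure is rethrown with the local one attached as a suppressed exception.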
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)