Posted to issues@flink.apache.org by "Roman Khachatryan (Jira)" <ji...@apache.org> on 2022/06/23 12:38:00 UTC

[jira] [Commented] (FLINK-27155) Reduce multiple reads to the same Changelog file in the same taskmanager during restore

    [ https://issues.apache.org/jira/browse/FLINK-27155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558041#comment-17558041 ] 

Roman Khachatryan commented on FLINK-27155:
-------------------------------------------

Many thanks for writing the design doc, [~Feifan Wang].
In general, the described approach makes a lot of sense to me.

I wanted to add these comments (I've sent a request to get commenter access):
2.3: We could probably reuse a common thread pool, such as RuntimeEnvironment.asyncOperationsThreadPool
2.4: Could you explain why StateChangeFormat methods have to be static?
2.4: Just to clarify: the new "cache" component will essentially customize two lines from StateChangeFormat:
    FSDataInputStream stream = handle.openInputStream();
    DataInputViewStreamWrapper input = wrap(stream);
    Right?
    So after some refactoring we can have a non-caching version and then add a caching one. 
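
A minimal sketch of that refactoring, using plain java.io types instead of Flink classes so that it stands alone. Handle, StreamOpener, DirectOpener and CachingOpener are hypothetical names made up for illustration; they are not part of StateChangeFormat or any existing Flink API. The in-memory byte[] cache is just one of the options mentioned in the issue (memory or local file), and eviction is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ChangelogStreamSketch {

    /** Hypothetical stand-in for a handle pointing at a changelog file on DFS. */
    interface Handle {
        String fileId();

        InputStream openInputStream() throws IOException;
    }

    /** Hypothetical seam: the one step that differs between the two variants. */
    interface StreamOpener {
        InputStream open(Handle handle) throws IOException;
    }

    /** Non-caching variant: every call opens the underlying file again. */
    static final class DirectOpener implements StreamOpener {
        @Override
        public InputStream open(Handle handle) throws IOException {
            return handle.openInputStream();
        }
    }

    /** Caching variant: reads a file fully on first use, then serves it from memory. */
    static final class CachingOpener implements StreamOpener {
        private final Map<String, byte[]> cache = new ConcurrentHashMap<>();

        @Override
        public InputStream open(Handle handle) throws IOException {
            byte[] bytes = cache.get(handle.fileId());
            if (bytes == null) {
                try (InputStream in = handle.openInputStream()) {
                    bytes = in.readAllBytes(); // requires Java 9+
                }
                // Concurrent first readers may each fetch the file once;
                // putIfAbsent keeps a single copy in the cache afterwards.
                byte[] previous = cache.putIfAbsent(handle.fileId(), bytes);
                if (previous != null) {
                    bytes = previous;
                }
            }
            return new ByteArrayInputStream(bytes);
        }
    }
}
```

With a seam like this, the reading code would depend only on StreamOpener, so the non-caching behaviour can be kept as the default and the caching one swapped in where several readers in the same taskmanager share a file.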

2.5.3 / 2.5.4: I'm not sure those approaches can handle pipelined regions properly and without further complication

> Reduce multiple reads to the same Changelog file in the same taskmanager during restore
> ---------------------------------------------------------------------------------------
>
>                 Key: FLINK-27155
>                 URL: https://issues.apache.org/jira/browse/FLINK-27155
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / State Backends
>            Reporter: Feifan Wang
>            Assignee: Feifan Wang
>            Priority: Major
>             Fix For: 1.16.0
>
>
> h3. Background
> In the current implementation, state changes of different operators in the same taskmanager may be written to the same changelog file, which effectively reduces the number of files and requests to DFS.
> On the other hand, the current implementation also reads the same changelog file multiple times on recovery. More specifically, the number of times the same changelog file is accessed is related to the number of ChangeSets contained in it. Since each read has to skip the preceding bytes, the network traffic spent on those bytes is also wasted.
> The result is a lot of unnecessary requests to DFS when there are multiple slots and keyed state in the same taskmanager.
> h3. Proposal
> We can reduce multiple reads to the same changelog file in the same taskmanager during restore.
> One possible approach is to read the whole changelog file at once and cache it in memory or in a local file for a period of time.
> I think this could be a subtask of [v2 FLIP-158: Generalized incremental checkpoints|https://issues.apache.org/jira/browse/FLINK-25842] .
> Hi [~ym], [~roman], what do you think?



--
This message was sent by Atlassian Jira
(v8.20.7#820007)