Posted to issues@flink.apache.org by "Yu Li (Jira)" <ji...@apache.org> on 2019/12/09 07:34:00 UTC

[jira] [Updated] (FLINK-12692) Support disk spilling in HeapKeyedStateBackend

     [ https://issues.apache.org/jira/browse/FLINK-12692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated FLINK-12692:
--------------------------
    Fix Version/s:     (was: 1.10.0)
                   1.11.0

Sorry, but we have to postpone this work to 1.11.0 due to comparatively limited review resources. We will try to provide a trial version in [flink-packages|https://flink-packages.org] for those who'd like to try this out in production, and will post a note here once the trial version is ready.

> Support disk spilling in HeapKeyedStateBackend
> ----------------------------------------------
>
>                 Key: FLINK-12692
>                 URL: https://issues.apache.org/jira/browse/FLINK-12692
>             Project: Flink
>          Issue Type: New Feature
>          Components: Runtime / State Backends
>            Reporter: Yu Li
>            Assignee: Yu Li
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.11.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{HeapKeyedStateBackend}} is one of the two {{KeyedStateBackend}} implementations in Flink. Since state lives as Java objects on the heap and de/serialization only happens during state snapshot and restore, it outperforms {{RocksDBKeyedStateBackend}} when all data can reside in memory.
> However, along with this advantage, {{HeapKeyedStateBackend}} also has its shortcomings. The most painful one is the difficulty of estimating the maximum heap size (Xmx) to set, and we suffer from GC impact once the heap memory is not enough to hold all state data. There are several (inevitable) causes of such a scenario, including (but not limited to):
> * Memory overhead of the Java object representation (which can be tens of times the serialized data size).
> * Data flood caused by burst traffic.
> * Data accumulation caused by source malfunction.
> To resolve this problem, we propose supporting spilling of state data to disk before heap memory is exhausted. We will monitor the heap usage, choose the coldest data to spill, and automatically reload it once heap memory is freed up again after data removal or TTL expiration.
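> The following is a minimal, hypothetical sketch of such a heap-watermark-based spill policy. It is not the actual implementation; all names ({{SpillPolicySketch}}, {{StateChunk}}, {{maybeSpill}}, etc.) are made up for illustration, and "coldness" is approximated here simply by last access time:
> {code:java}
> import java.util.Comparator;
> import java.util.HashMap;
> import java.util.Map;
>
> /** Hypothetical sketch of a heap-usage-driven spill policy. */
> public class SpillPolicySketch {
>
>     /** Per-key-group bookkeeping; last access time stands in for "coldness". */
>     static final class StateChunk {
>         final int keyGroup;
>         long lastAccessMillis;
>         final long estimatedBytes;
>
>         StateChunk(int keyGroup, long estimatedBytes) {
>             this.keyGroup = keyGroup;
>             this.estimatedBytes = estimatedBytes;
>             this.lastAccessMillis = System.currentTimeMillis();
>         }
>     }
>
>     private final Map<Integer, StateChunk> onHeap = new HashMap<>();
>     private final double spillWatermark; // e.g. 0.85 -> spill once 85% of heap is used
>
>     SpillPolicySketch(double spillWatermark) {
>         this.spillWatermark = spillWatermark;
>     }
>
>     void register(StateChunk chunk) {
>         onHeap.put(chunk.keyGroup, chunk);
>     }
>
>     /** Called periodically: if heap usage crosses the watermark, spill the coldest chunk. */
>     void maybeSpill() {
>         Runtime rt = Runtime.getRuntime();
>         double used = (double) (rt.totalMemory() - rt.freeMemory()) / rt.maxMemory();
>         if (used < spillWatermark || onHeap.isEmpty()) {
>             return;
>         }
>         // Coldest = least recently accessed; a real heuristic could also weigh size.
>         StateChunk coldest = onHeap.values().stream()
>                 .min(Comparator.comparingLong(c -> c.lastAccessMillis))
>                 .get();
>         onHeap.remove(coldest.keyGroup);
>         spillToDisk(coldest);
>     }
>
>     private void spillToDisk(StateChunk chunk) {
>         // Placeholder: serialize the chunk's state and write it to a local file.
>         System.out.printf("spilling key group %d (~%d bytes) to disk%n",
>                 chunk.keyGroup, chunk.estimatedBytes);
>     }
>
>     public static void main(String[] args) {
>         SpillPolicySketch policy = new SpillPolicySketch(0.85);
>         policy.register(new StateChunk(0, 1 << 20));
>         policy.register(new StateChunk(1, 2 << 20));
>         policy.maybeSpill(); // likely a no-op on a freshly started JVM
>     }
> }
> {code}
> A real implementation would additionally need to reload spilled state transparently on access and integrate with checkpointing, which this sketch omits.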
> For more details, please refer to the design doc and mailing-list discussion.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)