Posted to user@flink.apache.org by lan tran <in...@gmail.com> on 2022/09/06 05:39:57 UTC

Out of memory in heap memory when working with state

Hi team,  
  
I am currently hitting OutOfMemoryError: Java heap space. I believe this is
because I am using the FsStateBackend: the working state for each task
manager is held in memory (on the JVM heap), and only the state backups
(checkpoints) go to a distributed file system, e.g., HDFS. Is there any way
to free the state held in memory and use the state on S3 directly?







Re: Out of memory in heap memory when working with state

Posted by Hangxiang Yu <ma...@gmail.com>.
Hi, lan.
I guess you are using an old version of Flink.
In newer versions you could use the RocksDB state backend
(EmbeddedRocksDBStateBackend) [1]. It stores working state in RocksDB, off
the JVM heap, and spills it to local disk when the state is large, which
avoids using too much heap memory.
BTW, in the current internal mechanism, state on external storage such as
S3 is only used for checkpointing and restoring.
Using state directly on external storage such as S3 at runtime (sometimes
called compute-storage separation) is not currently supported, although
some products such as [2] support it.
You could also maintain the state in external storage yourself, but you
would then need to consider performance and state consistency carefully.
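
As a rough illustration of the RocksDB suggestion above, a flink-conf.yaml
fragment along these lines should switch the job from heap-based state to
RocksDB while still checkpointing to S3. This is only a sketch assuming
Flink 1.15 and a configured S3 filesystem plugin; the bucket name and the
local directory are placeholders, not values from this thread.

```yaml
# flink-conf.yaml (sketch; bucket and local path are hypothetical placeholders)

# Keep working state in RocksDB (off-heap memory plus local disk) instead of the JVM heap.
state.backend: rocksdb

# Checkpoints are still written to external storage; S3 is used only for
# checkpoint and restore, not for state access at runtime.
state.checkpoints.dir: s3://my-bucket/flink/checkpoints

# Optional: upload only the changes since the last checkpoint, which helps with large state.
state.backend.incremental: true

# Optional: local directory where RocksDB keeps its files on each task manager.
state.backend.rocksdb.localdir: /mnt/flink/rocksdb
```

The same choice can be made per job in code by setting an
EmbeddedRocksDBStateBackend on the StreamExecutionEnvironment, which may be
preferable if only some jobs have large state.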

[1]
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/state_backends/#the-embeddedrocksdbstatebackend
[2]
https://www.alibabacloud.com/help/en/realtime-compute-for-apache-flink/latest/geministatebackend-configurations



-- 
Best,
Hangxiang.