
Posted to user@flink.apache.org by SmileSmile <a5...@163.com> on 2020/07/03 06:30:36 UTC

Re: Checkpoint is disable, will history data in rocksdb be leak when job restart?

Hi Yun Tang,

I have not enabled checkpointing, so when my job restarts, how does Flink clean up the historical state?

My pod is only killed after the job has restarted again and again. When that happens, I have to rebuild the Flink cluster.






On 07/03/2020 14:22, Yun Tang wrote:
Hi


If your job does not need checkpoints, why would you still restore it from a checkpoint?

Actually, I did not totally understand what you want. Are you afraid that the state restored from the last checkpoint would not be cleared? Since the event-time timers are also stored in the checkpoint, after you restore from a checkpoint the event-time windows would still be triggered and clean up the historical state.

In the end, I think you just want to know why the pod is killed after some time? Please consider increasing the process memory, which also increases the JVM overhead and so provides more buffer space for native memory usage [1]. Since Flink 1.10, RocksDB steadily uses 100% of the managed memory, so once any extra memory is used on top of that, the pod may be treated as OOM and killed.


[1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_detail.html#overview
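
A minimal sketch of the arithmetic behind this advice, assuming the behaviour described in [1]: JVM overhead is derived as a fraction of the total process memory and then bounded by a minimum and a maximum. The default values used here (fraction 0.1, min 192 MiB, max 1 GiB, corresponding to taskmanager.memory.jvm-overhead.fraction / .min / .max) are assumptions that should be checked against the docs for your version. The class and method names are hypothetical and exist only for this example.

```java
// Hypothetical helper, not a Flink API: reproduces the "fraction, clamped to
// [min, max]" rule that the Flink 1.10 memory docs describe for JVM overhead.
final class JvmOverheadSketch {
    static final long MIB = 1024L * 1024L;

    // Derive JVM overhead from the total process size.
    static long jvmOverheadBytes(long processBytes, double fraction,
                                 long minBytes, long maxBytes) {
        long derived = (long) (processBytes * fraction);
        return Math.max(minBytes, Math.min(maxBytes, derived));
    }

    public static void main(String[] args) {
        // Assumed defaults; verify against the documentation for your version.
        double fraction = 0.1;
        long min = 192 * MIB;
        long max = 1024 * MIB;

        for (long processMiB : new long[] {1024, 4096, 8192, 16384}) {
            long overhead = jvmOverheadBytes(processMiB * MIB, fraction, min, max);
            System.out.printf("process=%d MiB -> JVM overhead=%d MiB%n",
                    processMiB, overhead / MIB);
        }
    }
}
```

With these assumed defaults, a 4096 MiB process gets roughly 409 MiB of JVM overhead. Raising the fraction to 0.2 would give roughly 819 MiB, taken from the other memory components because the process size itself stays fixed.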



Best
Yun Tang

From: SmileSmile <a5...@163.com>
Sent: Friday, July 3, 2020 14:01
To: 'user@flink.apache.org' <us...@flink.apache.org>
Subject: Checkpoint is disable, will history data in rocksdb be leak when job restart?
 

Hi

My job runs on Flink 1.10.1 with event time. Container memory usage rises by 2 GB after one restart, and after several restarts the pod is killed by the OS.

I found that historical data is cleared when new data arrives: onEventTime() is called and invokes clearAllState. But my job does not need checkpoints. When the job restarts, will the historical data stay in off-heap memory and never be cleared?

This only happens when I use the RocksDB state backend; the heap backend is fine.

Can anyone help me figure out how to deal with this?




Re: Checkpoint is disable, will history data in rocksdb be leak when job restart?

Posted by Congxian Qiu <qc...@gmail.com>.

Hi SmileSmile

As for the OOM problem, maybe you can try to take a memory dump before the
OOM happens. With the memory dump you can find out what consumes more memory
than expected.

Best,
Congxian


On Fri, Jul 3, 2020 at 3:04 PM, Yun Tang <my...@live.com> wrote:

> Hi
>
> If you have not enabled checkpointing, have you ever restored from a
> checkpoint for the new job? As I said, the timers would also be restored
> and event time would still advance, so the subsequent onEventTime() calls
> would also be triggered to clean up the historical data.
>
> As for the second question, why does your job restart again and again? I
> think that problem should be looked into first.
>
> Best
> Yun Tang

Re: Checkpoint is disable, will history data in rocksdb be leak when job restart?

Posted by Yun Tang <my...@live.com>.

Hi

If you have not enabled checkpointing, have you ever restored from a checkpoint for the new job? As I said, the timers would also be restored and event time would still advance, so the subsequent onEventTime() calls would also be triggered to clean up the historical data.
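
The point about timers can be illustrated with a toy model. This is plain Java, not the Flink API, and every name in it is hypothetical. It assumes that a checkpoint captures keyed state together with the event-time timers that will later clear it. A job restored from such a snapshot still cleans up once the watermark passes the timers, while a job started without a restore begins with neither state nor timers. The model covers only this logical view and says nothing about native memory that a previous RocksDB instance may still hold.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of keyed state plus event-time cleanup timers (not Flink code).
final class TimerCleanupSketch {
    // key -> stored value
    final Map<String, Long> state = new HashMap<>();
    // timer timestamp -> keys whose state is cleared when the timer fires
    final TreeMap<Long, List<String>> timers = new TreeMap<>();

    // Store a value and register a cleanup timer for it.
    void onElement(String key, long value, long cleanupTime) {
        state.put(key, value);
        timers.computeIfAbsent(cleanupTime, t -> new ArrayList<>()).add(key);
    }

    // Fire every timer at or before the watermark; each one clears its keys,
    // playing the role of onEventTime() calling a clear method.
    void advanceWatermark(long watermark) {
        while (!timers.isEmpty() && timers.firstKey() <= watermark) {
            for (String key : timers.pollFirstEntry().getValue()) {
                state.remove(key);
            }
        }
    }

    // Stand-in for "checkpoint, then restore": state and timers are copied together.
    TimerCleanupSketch snapshotAndRestore() {
        TimerCleanupSketch restored = new TimerCleanupSketch();
        restored.state.putAll(state);
        timers.forEach((t, keys) -> restored.timers.put(t, new ArrayList<>(keys)));
        return restored;
    }

    public static void main(String[] args) {
        TimerCleanupSketch job = new TimerCleanupSketch();
        job.onElement("a", 1L, 100L);
        job.onElement("b", 2L, 200L);

        // Restart with restore: timers come back and clean the restored state.
        TimerCleanupSketch restored = job.snapshotAndRestore();
        restored.advanceWatermark(250L);
        System.out.println("restored job: state=" + restored.state.size()
                + ", timers=" + restored.timers.size());

        // Restart without restore: nothing is carried over at this level.
        TimerCleanupSketch fresh = new TimerCleanupSketch();
        System.out.println("fresh job: state=" + fresh.state.size()
                + ", timers=" + fresh.timers.size());
    }
}
```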

As for the second question, why does your job restart again and again? I think that problem should be looked into first.

Best
Yun Tang