You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@flink.apache.org by Yu Li <ca...@gmail.com> on 2019/08/13 16:06:27 UTC

[DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend

Hi All,

We ever held a discussion about this feature before [1] but now opening
another thread because after a second thought introducing a new backend
instead of modifying the existing heap backend is a better option to
prevent causing any regression or surprise to existing in-production usage.
And since introducing a new backend is relatively big change, we regard it
as a FLIP and need another discussion and voting process according to our
newly drafted bylaw [2].

Please allow me to quote the brief description from the old thread [1] for
the convenience of those who noticed this feature for the first time:


*HeapKeyedStateBackend is one of the two KeyedStateBackends in Flink, since
state lives as Java objects on the heap in HeapKeyedStateBackend and the
de/serialization only happens during state snapshot and restore, it
outperforms RocksDBKeyeStateBackend when all data could reside in
memory.**However,
along with the advantage, HeapKeyedStateBackend also has its shortcomings,
and the most painful one is the difficulty to estimate the maximum heap
size (Xmx) to set, and we will suffer from GC impact once the heap memory
is not enough to hold all state data. There’re several (inevitable) causes
for such scenario, including (but not limited to):*



** Memory overhead of Java object representation (tens of times of the
serialized data size).* Data flood caused by burst traffic.* Data
accumulation caused by source malfunction.**To resolve this problem, we
proposed a solution to support spilling state data to disk before heap
memory is exhausted. We will monitor the heap usage and choose the coldest
data to spill, and reload them when heap memory is regained after data
removing or TTL expiration, automatically. Furthermore, *to prevent causing
unexpected regression to existing usage of HeapKeyedStateBackend, we plan
to introduce a new SpillableHeapKeyedStateBackend and change it to default
in future if proven to be stable.

Please let us know your point of the feature and any comment is
welcomed/appreciated. Thanks.

[1] https://s.apache.org/pxeif
[2]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026

Best Regards,
Yu

Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend

Posted by Yu Li <ca...@gmail.com>.

Done. Thanks for the reminder Tison!

Best Regards,
Yu


On Thu, 29 Aug 2019 at 21:03, Zili Chen <wa...@gmail.com> wrote:

> Hi Yu,
>
> Notice that the wiki is still marked as "*Under Discussion*" state.
>
> I think you can update it correspondingly.
>
> Best,
> tison.
>
>
> Yu Li <ca...@gmail.com> 于2019年8月20日周二 下午10:28写道：
>
> > Sorry for the lag but since we've got a consensus days ago, I started a
> > vote thread which will have a result by EOD, thus I'm closing this
> > discussion thread. Thanks all for the participation and
> > comments/suggestions!
> >
> > Best Regards,
> > Yu
> >
> >
> > On Fri, 16 Aug 2019 at 09:09, Till Rohrmann <tr...@apache.org>
> wrote:
> >
> > > +1 for this FLIP and the feature. I think this feature will be super
> > > helpful for many Flink users.
> > >
> > > Once the SpillableHeapKeyedStateBackend has proven to be superior to
> the
> > > HeapKeyedStateBackend we should think about removing the latter
> > completely
> > > to reduce maintenance burden.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Fri, Aug 16, 2019 at 4:06 AM Congxian Qiu <qc...@gmail.com>
> > > wrote:
> > >
> > > > Big +1 for this feature.
> > > >
> > > > This FLIP can help improves at least the following two scenarios:
> > > > - Temporary data peak when using Heap StateBackend
> > > > - Heap State Backend has better performance than RocksDBStateBackend,
> > > > especially on SATA disk. there are some guys ever told me that they
> > > > increased the parallelism of operators(and use HeapStateBackend)
> other
> > > than
> > > > use RocksDBStateBackend to get better performance. But increase
> > > parallelism
> > > > will have some other problems, after this FLIP, we can run Flink Job
> > with
> > > > the same parallelism as RocksDBStateBackend and get better
> performance
> > > > also.
> > > >
> > > > Best,
> > > > Congxian
> > > >
> > > >
> > > > Yu Li <ca...@gmail.com> 于2019年8月16日周五 上午12:14写道：
> > > >
> > > > > Thanks all for the reviews and comments!
> > > > >
> > > > > bq. From the implementation plan, it looks like this exists purely
> > in a
> > > > new
> > > > > module and does not require any changes in other parts of Flink's
> > code.
> > > > Can
> > > > > you confirm that?
> > > > > Confirmed, thanks!
> > > > >
> > > > > Best Regards,
> > > > > Yu
> > > > >
> > > > >
> > > > > On Thu, 15 Aug 2019 at 18:04, Tzu-Li (Gordon) Tai <
> > tzulitai@apache.org
> > > >
> > > > > wrote:
> > > > >
> > > > > > +1 to start a VOTE for this FLIP.
> > > > > >
> > > > > > Given the properties of this new state backend and that it will
> > exist
> > > > as
> > > > > a
> > > > > > new module without touching the original heap backend, I don't
> see
> > a
> > > > harm
> > > > > > in including this.
> > > > > > Regarding design of the feature, I've already mentioned my
> comments
> > > in
> > > > > the
> > > > > > original discussion thread.
> > > > > >
> > > > > > Cheers,
> > > > > > Gordon
> > > > > >
> > > > > > On Thu, Aug 15, 2019 at 5:53 PM Yun Tang <my...@live.com>
> wrote:
> > > > > >
> > > > > > > Big +1 for this feature.
> > > > > > >
> > > > > > > Our customers including me, have ever met dilemma where we have
> > to
> > > > use
> > > > > > > window to aggregate events in applications like real-time
> > > monitoring.
> > > > > The
> > > > > > > larger of timer and window state, the poor performance of
> > RocksDB.
> > > > > > However,
> > > > > > > switching to use FsStateBackend would always make me feel fear
> > > about
> > > > > the
> > > > > > > OOM errors.
> > > > > > >
> > > > > > > Look forward for more powerful enrichment to state-backend, and
> > > help
> > > > > > Flink
> > > > > > > to achieve better performance together.
> > > > > > >
> > > > > > > Best
> > > > > > > Yun Tang
> > > > > > > ________________________________
> > > > > > > From: Stephan Ewen <se...@apache.org>
> > > > > > > Sent: Thursday, August 15, 2019 23:07
> > > > > > > To: dev <de...@flink.apache.org>
> > > > > > > Subject: Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State
> > Backend
> > > > > > >
> > > > > > > +1 for this feature. I think this will be appreciated by users,
> > as
> > > a
> > > > > way
> > > > > > to
> > > > > > > use the HeapStateBackend with a safety-net against OOM errors.
> > > > > > > And having had major production exposure is great.
> > > > > > >
> > > > > > > From the implementation plan, it looks like this exists purely
> > in a
> > > > new
> > > > > > > module and does not require any changes in other parts of
> Flink's
> > > > code.
> > > > > > Can
> > > > > > > you confirm that?
> > > > > > >
> > > > > > > Other that that, I have no further questions and we could
> proceed
> > > to
> > > > > vote
> > > > > > > on this FLIP, from my side.
> > > > > > >
> > > > > > > Best,
> > > > > > > Stephan
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Aug 13, 2019 at 10:00 PM Yu Li <ca...@gmail.com>
> wrote:
> > > > > > >
> > > > > > > > Sorry for forgetting to give the link of the FLIP, here it
> is:
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > Best Regards,
> > > > > > > > Yu
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, 13 Aug 2019 at 18:06, Yu Li <ca...@gmail.com>
> wrote:
> > > > > > > >
> > > > > > > > > Hi All,
> > > > > > > > >
> > > > > > > > > We ever held a discussion about this feature before [1] but
> > now
> > > > > > opening
> > > > > > > > > another thread because after a second thought introducing a
> > new
> > > > > > backend
> > > > > > > > > instead of modifying the existing heap backend is a better
> > > option
> > > > > to
> > > > > > > > > prevent causing any regression or surprise to existing
> > > > > in-production
> > > > > > > > usage.
> > > > > > > > > And since introducing a new backend is relatively big
> change,
> > > we
> > > > > > regard
> > > > > > > > it
> > > > > > > > > as a FLIP and need another discussion and voting process
> > > > according
> > > > > to
> > > > > > > our
> > > > > > > > > newly drafted bylaw [2].
> > > > > > > > >
> > > > > > > > > Please allow me to quote the brief description from the old
> > > > thread
> > > > > > [1]
> > > > > > > > for
> > > > > > > > > the convenience of those who noticed this feature for the
> > first
> > > > > time:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > *HeapKeyedStateBackend is one of the two KeyedStateBackends
> > in
> > > > > Flink,
> > > > > > > > > since state lives as Java objects on the heap in
> > > > > > HeapKeyedStateBackend
> > > > > > > > and
> > > > > > > > > the de/serialization only happens during state snapshot and
> > > > > restore,
> > > > > > it
> > > > > > > > > outperforms RocksDBKeyeStateBackend when all data could
> > reside
> > > in
> > > > > > > > memory.**However,
> > > > > > > > > along with the advantage, HeapKeyedStateBackend also has
> its
> > > > > > > > shortcomings,
> > > > > > > > > and the most painful one is the difficulty to estimate the
> > > > maximum
> > > > > > heap
> > > > > > > > > size (Xmx) to set, and we will suffer from GC impact once
> the
> > > > heap
> > > > > > > memory
> > > > > > > > > is not enough to hold all state data. There’re several
> > > > (inevitable)
> > > > > > > > causes
> > > > > > > > > for such scenario, including (but not limited to):*
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > ** Memory overhead of Java object representation (tens of
> > times
> > > > of
> > > > > > the
> > > > > > > > > serialized data size).* Data flood caused by burst
> traffic.*
> > > Data
> > > > > > > > > accumulation caused by source malfunction.**To resolve this
> > > > > problem,
> > > > > > we
> > > > > > > > > proposed a solution to support spilling state data to disk
> > > before
> > > > > > heap
> > > > > > > > > memory is exhausted. We will monitor the heap usage and
> > choose
> > > > the
> > > > > > > > coldest
> > > > > > > > > data to spill, and reload them when heap memory is regained
> > > after
> > > > > > data
> > > > > > > > > removing or TTL expiration, automatically. Furthermore, *to
> > > > prevent
> > > > > > > > > causing unexpected regression to existing usage of
> > > > > > > HeapKeyedStateBackend,
> > > > > > > > > we plan to introduce a new SpillableHeapKeyedStateBackend
> and
> > > > > change
> > > > > > it
> > > > > > > > to
> > > > > > > > > default in future if proven to be stable.
> > > > > > > > >
> > > > > > > > > Please let us know your point of the feature and any
> comment
> > is
> > > > > > > > > welcomed/appreciated. Thanks.
> > > > > > > > >
> > > > > > > > > [1] https://s.apache.org/pxeif
> > > > > > > > > [2]
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> > > > > > > > >
> > > > > > > > > Best Regards,
> > > > > > > > > Yu
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend

Posted by Zili Chen <wa...@gmail.com>.

Hi Yu,

Notice that the wiki is still marked as "*Under Discussion*" state.

I think you can update it correspondingly.

Best,
tison.


Yu Li <ca...@gmail.com> 于2019年8月20日周二 下午10:28写道：

> Sorry for the lag but since we've got a consensus days ago, I started a
> vote thread which will have a result by EOD, thus I'm closing this
> discussion thread. Thanks all for the participation and
> comments/suggestions!
>
> Best Regards,
> Yu
>
>
> On Fri, 16 Aug 2019 at 09:09, Till Rohrmann <tr...@apache.org> wrote:
>
> > +1 for this FLIP and the feature. I think this feature will be super
> > helpful for many Flink users.
> >
> > Once the SpillableHeapKeyedStateBackend has proven to be superior to the
> > HeapKeyedStateBackend we should think about removing the latter
> completely
> > to reduce maintenance burden.
> >
> > Cheers,
> > Till
> >
> > On Fri, Aug 16, 2019 at 4:06 AM Congxian Qiu <qc...@gmail.com>
> > wrote:
> >
> > > Big +1 for this feature.
> > >
> > > This FLIP can help improves at least the following two scenarios:
> > > - Temporary data peak when using Heap StateBackend
> > > - Heap State Backend has better performance than RocksDBStateBackend,
> > > especially on SATA disk. there are some guys ever told me that they
> > > increased the parallelism of operators(and use HeapStateBackend) other
> > than
> > > use RocksDBStateBackend to get better performance. But increase
> > parallelism
> > > will have some other problems, after this FLIP, we can run Flink Job
> with
> > > the same parallelism as RocksDBStateBackend and get better performance
> > > also.
> > >
> > > Best,
> > > Congxian
> > >
> > >
> > > Yu Li <ca...@gmail.com> 于2019年8月16日周五 上午12:14写道：
> > >
> > > > Thanks all for the reviews and comments!
> > > >
> > > > bq. From the implementation plan, it looks like this exists purely
> in a
> > > new
> > > > module and does not require any changes in other parts of Flink's
> code.
> > > Can
> > > > you confirm that?
> > > > Confirmed, thanks!
> > > >
> > > > Best Regards,
> > > > Yu
> > > >
> > > >
> > > > On Thu, 15 Aug 2019 at 18:04, Tzu-Li (Gordon) Tai <
> tzulitai@apache.org
> > >
> > > > wrote:
> > > >
> > > > > +1 to start a VOTE for this FLIP.
> > > > >
> > > > > Given the properties of this new state backend and that it will
> exist
> > > as
> > > > a
> > > > > new module without touching the original heap backend, I don't see
> a
> > > harm
> > > > > in including this.
> > > > > Regarding design of the feature, I've already mentioned my comments
> > in
> > > > the
> > > > > original discussion thread.
> > > > >
> > > > > Cheers,
> > > > > Gordon
> > > > >
> > > > > On Thu, Aug 15, 2019 at 5:53 PM Yun Tang <my...@live.com> wrote:
> > > > >
> > > > > > Big +1 for this feature.
> > > > > >
> > > > > > Our customers including me, have ever met dilemma where we have
> to
> > > use
> > > > > > window to aggregate events in applications like real-time
> > monitoring.
> > > > The
> > > > > > larger of timer and window state, the poor performance of
> RocksDB.
> > > > > However,
> > > > > > switching to use FsStateBackend would always make me feel fear
> > about
> > > > the
> > > > > > OOM errors.
> > > > > >
> > > > > > Look forward for more powerful enrichment to state-backend, and
> > help
> > > > > Flink
> > > > > > to achieve better performance together.
> > > > > >
> > > > > > Best
> > > > > > Yun Tang
> > > > > > ________________________________
> > > > > > From: Stephan Ewen <se...@apache.org>
> > > > > > Sent: Thursday, August 15, 2019 23:07
> > > > > > To: dev <de...@flink.apache.org>
> > > > > > Subject: Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State
> Backend
> > > > > >
> > > > > > +1 for this feature. I think this will be appreciated by users,
> as
> > a
> > > > way
> > > > > to
> > > > > > use the HeapStateBackend with a safety-net against OOM errors.
> > > > > > And having had major production exposure is great.
> > > > > >
> > > > > > From the implementation plan, it looks like this exists purely
> in a
> > > new
> > > > > > module and does not require any changes in other parts of Flink's
> > > code.
> > > > > Can
> > > > > > you confirm that?
> > > > > >
> > > > > > Other that that, I have no further questions and we could proceed
> > to
> > > > vote
> > > > > > on this FLIP, from my side.
> > > > > >
> > > > > > Best,
> > > > > > Stephan
> > > > > >
> > > > > >
> > > > > > On Tue, Aug 13, 2019 at 10:00 PM Yu Li <ca...@gmail.com> wrote:
> > > > > >
> > > > > > > Sorry for forgetting to give the link of the FLIP, here it is:
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > Best Regards,
> > > > > > > Yu
> > > > > > >
> > > > > > >
> > > > > > > On Tue, 13 Aug 2019 at 18:06, Yu Li <ca...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi All,
> > > > > > > >
> > > > > > > > We ever held a discussion about this feature before [1] but
> now
> > > > > opening
> > > > > > > > another thread because after a second thought introducing a
> new
> > > > > backend
> > > > > > > > instead of modifying the existing heap backend is a better
> > option
> > > > to
> > > > > > > > prevent causing any regression or surprise to existing
> > > > in-production
> > > > > > > usage.
> > > > > > > > And since introducing a new backend is relatively big change,
> > we
> > > > > regard
> > > > > > > it
> > > > > > > > as a FLIP and need another discussion and voting process
> > > according
> > > > to
> > > > > > our
> > > > > > > > newly drafted bylaw [2].
> > > > > > > >
> > > > > > > > Please allow me to quote the brief description from the old
> > > thread
> > > > > [1]
> > > > > > > for
> > > > > > > > the convenience of those who noticed this feature for the
> first
> > > > time:
> > > > > > > >
> > > > > > > >
> > > > > > > > *HeapKeyedStateBackend is one of the two KeyedStateBackends
> in
> > > > Flink,
> > > > > > > > since state lives as Java objects on the heap in
> > > > > HeapKeyedStateBackend
> > > > > > > and
> > > > > > > > the de/serialization only happens during state snapshot and
> > > > restore,
> > > > > it
> > > > > > > > outperforms RocksDBKeyeStateBackend when all data could
> reside
> > in
> > > > > > > memory.**However,
> > > > > > > > along with the advantage, HeapKeyedStateBackend also has its
> > > > > > > shortcomings,
> > > > > > > > and the most painful one is the difficulty to estimate the
> > > maximum
> > > > > heap
> > > > > > > > size (Xmx) to set, and we will suffer from GC impact once the
> > > heap
> > > > > > memory
> > > > > > > > is not enough to hold all state data. There’re several
> > > (inevitable)
> > > > > > > causes
> > > > > > > > for such scenario, including (but not limited to):*
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > ** Memory overhead of Java object representation (tens of
> times
> > > of
> > > > > the
> > > > > > > > serialized data size).* Data flood caused by burst traffic.*
> > Data
> > > > > > > > accumulation caused by source malfunction.**To resolve this
> > > > problem,
> > > > > we
> > > > > > > > proposed a solution to support spilling state data to disk
> > before
> > > > > heap
> > > > > > > > memory is exhausted. We will monitor the heap usage and
> choose
> > > the
> > > > > > > coldest
> > > > > > > > data to spill, and reload them when heap memory is regained
> > after
> > > > > data
> > > > > > > > removing or TTL expiration, automatically. Furthermore, *to
> > > prevent
> > > > > > > > causing unexpected regression to existing usage of
> > > > > > HeapKeyedStateBackend,
> > > > > > > > we plan to introduce a new SpillableHeapKeyedStateBackend and
> > > > change
> > > > > it
> > > > > > > to
> > > > > > > > default in future if proven to be stable.
> > > > > > > >
> > > > > > > > Please let us know your point of the feature and any comment
> is
> > > > > > > > welcomed/appreciated. Thanks.
> > > > > > > >
> > > > > > > > [1] https://s.apache.org/pxeif
> > > > > > > > [2]
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> > > > > > > >
> > > > > > > > Best Regards,
> > > > > > > > Yu
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend

Posted by Yu Li <ca...@gmail.com>.

Sorry for the lag but since we've got a consensus days ago, I started a
vote thread which will have a result by EOD, thus I'm closing this
discussion thread. Thanks all for the participation and
comments/suggestions!

Best Regards,
Yu


On Fri, 16 Aug 2019 at 09:09, Till Rohrmann <tr...@apache.org> wrote:

> +1 for this FLIP and the feature. I think this feature will be super
> helpful for many Flink users.
>
> Once the SpillableHeapKeyedStateBackend has proven to be superior to the
> HeapKeyedStateBackend we should think about removing the latter completely
> to reduce maintenance burden.
>
> Cheers,
> Till
>
> On Fri, Aug 16, 2019 at 4:06 AM Congxian Qiu <qc...@gmail.com>
> wrote:
>
> > Big +1 for this feature.
> >
> > This FLIP can help improves at least the following two scenarios:
> > - Temporary data peak when using Heap StateBackend
> > - Heap State Backend has better performance than RocksDBStateBackend,
> > especially on SATA disk. there are some guys ever told me that they
> > increased the parallelism of operators(and use HeapStateBackend) other
> than
> > use RocksDBStateBackend to get better performance. But increase
> parallelism
> > will have some other problems, after this FLIP, we can run Flink Job with
> > the same parallelism as RocksDBStateBackend and get better performance
> > also.
> >
> > Best,
> > Congxian
> >
> >
> > Yu Li <ca...@gmail.com> 于2019年8月16日周五 上午12:14写道：
> >
> > > Thanks all for the reviews and comments!
> > >
> > > bq. From the implementation plan, it looks like this exists purely in a
> > new
> > > module and does not require any changes in other parts of Flink's code.
> > Can
> > > you confirm that?
> > > Confirmed, thanks!
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Thu, 15 Aug 2019 at 18:04, Tzu-Li (Gordon) Tai <tzulitai@apache.org
> >
> > > wrote:
> > >
> > > > +1 to start a VOTE for this FLIP.
> > > >
> > > > Given the properties of this new state backend and that it will exist
> > as
> > > a
> > > > new module without touching the original heap backend, I don't see a
> > harm
> > > > in including this.
> > > > Regarding design of the feature, I've already mentioned my comments
> in
> > > the
> > > > original discussion thread.
> > > >
> > > > Cheers,
> > > > Gordon
> > > >
> > > > On Thu, Aug 15, 2019 at 5:53 PM Yun Tang <my...@live.com> wrote:
> > > >
> > > > > Big +1 for this feature.
> > > > >
> > > > > Our customers including me, have ever met dilemma where we have to
> > use
> > > > > window to aggregate events in applications like real-time
> monitoring.
> > > The
> > > > > larger of timer and window state, the poor performance of RocksDB.
> > > > However,
> > > > > switching to use FsStateBackend would always make me feel fear
> about
> > > the
> > > > > OOM errors.
> > > > >
> > > > > Look forward for more powerful enrichment to state-backend, and
> help
> > > > Flink
> > > > > to achieve better performance together.
> > > > >
> > > > > Best
> > > > > Yun Tang
> > > > > ________________________________
> > > > > From: Stephan Ewen <se...@apache.org>
> > > > > Sent: Thursday, August 15, 2019 23:07
> > > > > To: dev <de...@flink.apache.org>
> > > > > Subject: Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend
> > > > >
> > > > > +1 for this feature. I think this will be appreciated by users, as
> a
> > > way
> > > > to
> > > > > use the HeapStateBackend with a safety-net against OOM errors.
> > > > > And having had major production exposure is great.
> > > > >
> > > > > From the implementation plan, it looks like this exists purely in a
> > new
> > > > > module and does not require any changes in other parts of Flink's
> > code.
> > > > Can
> > > > > you confirm that?
> > > > >
> > > > > Other that that, I have no further questions and we could proceed
> to
> > > vote
> > > > > on this FLIP, from my side.
> > > > >
> > > > > Best,
> > > > > Stephan
> > > > >
> > > > >
> > > > > On Tue, Aug 13, 2019 at 10:00 PM Yu Li <ca...@gmail.com> wrote:
> > > > >
> > > > > > Sorry for forgetting to give the link of the FLIP, here it is:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > Best Regards,
> > > > > > Yu
> > > > > >
> > > > > >
> > > > > > On Tue, 13 Aug 2019 at 18:06, Yu Li <ca...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi All,
> > > > > > >
> > > > > > > We ever held a discussion about this feature before [1] but now
> > > > opening
> > > > > > > another thread because after a second thought introducing a new
> > > > backend
> > > > > > > instead of modifying the existing heap backend is a better
> option
> > > to
> > > > > > > prevent causing any regression or surprise to existing
> > > in-production
> > > > > > usage.
> > > > > > > And since introducing a new backend is relatively big change,
> we
> > > > regard
> > > > > > it
> > > > > > > as a FLIP and need another discussion and voting process
> > according
> > > to
> > > > > our
> > > > > > > newly drafted bylaw [2].
> > > > > > >
> > > > > > > Please allow me to quote the brief description from the old
> > thread
> > > > [1]
> > > > > > for
> > > > > > > the convenience of those who noticed this feature for the first
> > > time:
> > > > > > >
> > > > > > >
> > > > > > > *HeapKeyedStateBackend is one of the two KeyedStateBackends in
> > > Flink,
> > > > > > > since state lives as Java objects on the heap in
> > > > HeapKeyedStateBackend
> > > > > > and
> > > > > > > the de/serialization only happens during state snapshot and
> > > restore,
> > > > it
> > > > > > > outperforms RocksDBKeyeStateBackend when all data could reside
> in
> > > > > > memory.**However,
> > > > > > > along with the advantage, HeapKeyedStateBackend also has its
> > > > > > shortcomings,
> > > > > > > and the most painful one is the difficulty to estimate the
> > maximum
> > > > heap
> > > > > > > size (Xmx) to set, and we will suffer from GC impact once the
> > heap
> > > > > memory
> > > > > > > is not enough to hold all state data. There’re several
> > (inevitable)
> > > > > > causes
> > > > > > > for such scenario, including (but not limited to):*
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > ** Memory overhead of Java object representation (tens of times
> > of
> > > > the
> > > > > > > serialized data size).* Data flood caused by burst traffic.*
> Data
> > > > > > > accumulation caused by source malfunction.**To resolve this
> > > problem,
> > > > we
> > > > > > > proposed a solution to support spilling state data to disk
> before
> > > > heap
> > > > > > > memory is exhausted. We will monitor the heap usage and choose
> > the
> > > > > > coldest
> > > > > > > data to spill, and reload them when heap memory is regained
> after
> > > > data
> > > > > > > removing or TTL expiration, automatically. Furthermore, *to
> > prevent
> > > > > > > causing unexpected regression to existing usage of
> > > > > HeapKeyedStateBackend,
> > > > > > > we plan to introduce a new SpillableHeapKeyedStateBackend and
> > > change
> > > > it
> > > > > > to
> > > > > > > default in future if proven to be stable.
> > > > > > >
> > > > > > > Please let us know your point of the feature and any comment is
> > > > > > > welcomed/appreciated. Thanks.
> > > > > > >
> > > > > > > [1] https://s.apache.org/pxeif
> > > > > > > [2]
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> > > > > > >
> > > > > > > Best Regards,
> > > > > > > Yu
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend

Posted by Till Rohrmann <tr...@apache.org>.

+1 for this FLIP and the feature. I think this feature will be super
helpful for many Flink users.

Once the SpillableHeapKeyedStateBackend has proven to be superior to the
HeapKeyedStateBackend we should think about removing the latter completely
to reduce maintenance burden.

Cheers,
Till

On Fri, Aug 16, 2019 at 4:06 AM Congxian Qiu <qc...@gmail.com> wrote:

> Big +1 for this feature.
>
> This FLIP can help improves at least the following two scenarios:
> - Temporary data peak when using Heap StateBackend
> - Heap State Backend has better performance than RocksDBStateBackend,
> especially on SATA disk. there are some guys ever told me that they
> increased the parallelism of operators(and use HeapStateBackend) other than
> use RocksDBStateBackend to get better performance. But increase parallelism
> will have some other problems, after this FLIP, we can run Flink Job with
> the same parallelism as RocksDBStateBackend and get better performance
> also.
>
> Best,
> Congxian
>
>
> Yu Li <ca...@gmail.com> 于2019年8月16日周五 上午12:14写道：
>
> > Thanks all for the reviews and comments!
> >
> > bq. From the implementation plan, it looks like this exists purely in a
> new
> > module and does not require any changes in other parts of Flink's code.
> Can
> > you confirm that?
> > Confirmed, thanks!
> >
> > Best Regards,
> > Yu
> >
> >
> > On Thu, 15 Aug 2019 at 18:04, Tzu-Li (Gordon) Tai <tz...@apache.org>
> > wrote:
> >
> > > +1 to start a VOTE for this FLIP.
> > >
> > > Given the properties of this new state backend and that it will exist
> as
> > a
> > > new module without touching the original heap backend, I don't see a
> harm
> > > in including this.
> > > Regarding design of the feature, I've already mentioned my comments in
> > the
> > > original discussion thread.
> > >
> > > Cheers,
> > > Gordon
> > >
> > > On Thu, Aug 15, 2019 at 5:53 PM Yun Tang <my...@live.com> wrote:
> > >
> > > > Big +1 for this feature.
> > > >
> > > > Our customers including me, have ever met dilemma where we have to
> use
> > > > window to aggregate events in applications like real-time monitoring.
> > The
> > > > larger of timer and window state, the poor performance of RocksDB.
> > > However,
> > > > switching to use FsStateBackend would always make me feel fear about
> > the
> > > > OOM errors.
> > > >
> > > > Look forward for more powerful enrichment to state-backend, and help
> > > Flink
> > > > to achieve better performance together.
> > > >
> > > > Best
> > > > Yun Tang
> > > > ________________________________
> > > > From: Stephan Ewen <se...@apache.org>
> > > > Sent: Thursday, August 15, 2019 23:07
> > > > To: dev <de...@flink.apache.org>
> > > > Subject: Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend
> > > >
> > > > +1 for this feature. I think this will be appreciated by users, as a
> > way
> > > to
> > > > use the HeapStateBackend with a safety-net against OOM errors.
> > > > And having had major production exposure is great.
> > > >
> > > > From the implementation plan, it looks like this exists purely in a
> new
> > > > module and does not require any changes in other parts of Flink's
> code.
> > > Can
> > > > you confirm that?
> > > >
> > > > Other that that, I have no further questions and we could proceed to
> > vote
> > > > on this FLIP, from my side.
> > > >
> > > > Best,
> > > > Stephan
> > > >
> > > >
> > > > On Tue, Aug 13, 2019 at 10:00 PM Yu Li <ca...@gmail.com> wrote:
> > > >
> > > > > Sorry for forgetting to give the link of the FLIP, here it is:
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend
> > > > >
> > > > > Thanks!
> > > > >
> > > > > Best Regards,
> > > > > Yu
> > > > >
> > > > >
> > > > > On Tue, 13 Aug 2019 at 18:06, Yu Li <ca...@gmail.com> wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > We ever held a discussion about this feature before [1] but now
> > > opening
> > > > > > another thread because after a second thought introducing a new
> > > backend
> > > > > > instead of modifying the existing heap backend is a better option
> > to
> > > > > > prevent causing any regression or surprise to existing
> > in-production
> > > > > usage.
> > > > > > And since introducing a new backend is relatively big change, we
> > > regard
> > > > > it
> > > > > > as a FLIP and need another discussion and voting process
> according
> > to
> > > > our
> > > > > > newly drafted bylaw [2].
> > > > > >
> > > > > > Please allow me to quote the brief description from the old
> thread
> > > [1]
> > > > > for
> > > > > > the convenience of those who noticed this feature for the first
> > time:
> > > > > >
> > > > > >
> > > > > > *HeapKeyedStateBackend is one of the two KeyedStateBackends in
> > Flink,
> > > > > > since state lives as Java objects on the heap in
> > > HeapKeyedStateBackend
> > > > > and
> > > > > > the de/serialization only happens during state snapshot and
> > restore,
> > > it
> > > > > > outperforms RocksDBKeyeStateBackend when all data could reside in
> > > > > memory.**However,
> > > > > > along with the advantage, HeapKeyedStateBackend also has its
> > > > > shortcomings,
> > > > > > and the most painful one is the difficulty to estimate the
> maximum
> > > heap
> > > > > > size (Xmx) to set, and we will suffer from GC impact once the
> heap
> > > > memory
> > > > > > is not enough to hold all state data. There’re several
> (inevitable)
> > > > > causes
> > > > > > for such scenario, including (but not limited to):*
> > > > > >
> > > > > >
> > > > > >
> > > > > > ** Memory overhead of Java object representation (tens of times
> of
> > > the
> > > > > > serialized data size).* Data flood caused by burst traffic.* Data
> > > > > > accumulation caused by source malfunction.**To resolve this
> > problem,
> > > we
> > > > > > proposed a solution to support spilling state data to disk before
> > > heap
> > > > > > memory is exhausted. We will monitor the heap usage and choose
> the
> > > > > coldest
> > > > > > data to spill, and reload them when heap memory is regained after
> > > data
> > > > > > removing or TTL expiration, automatically. Furthermore, *to
> prevent
> > > > > > causing unexpected regression to existing usage of
> > > > HeapKeyedStateBackend,
> > > > > > we plan to introduce a new SpillableHeapKeyedStateBackend and
> > change
> > > it
> > > > > to
> > > > > > default in future if proven to be stable.
> > > > > >
> > > > > > Please let us know your point of the feature and any comment is
> > > > > > welcomed/appreciated. Thanks.
> > > > > >
> > > > > > [1] https://s.apache.org/pxeif
> > > > > > [2]
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> > > > > >
> > > > > > Best Regards,
> > > > > > Yu
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend

Posted by Congxian Qiu <qc...@gmail.com>.

Big +1 for this feature.

This FLIP can help improves at least the following two scenarios:
- Temporary data peak when using Heap StateBackend
- Heap State Backend has better performance than RocksDBStateBackend,
especially on SATA disk. there are some guys ever told me that they
increased the parallelism of operators(and use HeapStateBackend) other than
use RocksDBStateBackend to get better performance. But increase parallelism
will have some other problems, after this FLIP, we can run Flink Job with
the same parallelism as RocksDBStateBackend and get better performance also.

Best,
Congxian


Yu Li <ca...@gmail.com> 于2019年8月16日周五 上午12:14写道：

> Thanks all for the reviews and comments!
>
> bq. From the implementation plan, it looks like this exists purely in a new
> module and does not require any changes in other parts of Flink's code. Can
> you confirm that?
> Confirmed, thanks!
>
> Best Regards,
> Yu
>
>
> On Thu, 15 Aug 2019 at 18:04, Tzu-Li (Gordon) Tai <tz...@apache.org>
> wrote:
>
> > +1 to start a VOTE for this FLIP.
> >
> > Given the properties of this new state backend and that it will exist as
> a
> > new module without touching the original heap backend, I don't see a harm
> > in including this.
> > Regarding design of the feature, I've already mentioned my comments in
> the
> > original discussion thread.
> >
> > Cheers,
> > Gordon
> >
> > On Thu, Aug 15, 2019 at 5:53 PM Yun Tang <my...@live.com> wrote:
> >
> > > Big +1 for this feature.
> > >
> > > Our customers including me, have ever met dilemma where we have to use
> > > window to aggregate events in applications like real-time monitoring.
> The
> > > larger of timer and window state, the poor performance of RocksDB.
> > However,
> > > switching to use FsStateBackend would always make me feel fear about
> the
> > > OOM errors.
> > >
> > > Look forward for more powerful enrichment to state-backend, and help
> > Flink
> > > to achieve better performance together.
> > >
> > > Best
> > > Yun Tang
> > > ________________________________
> > > From: Stephan Ewen <se...@apache.org>
> > > Sent: Thursday, August 15, 2019 23:07
> > > To: dev <de...@flink.apache.org>
> > > Subject: Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend
> > >
> > > +1 for this feature. I think this will be appreciated by users, as a
> way
> > to
> > > use the HeapStateBackend with a safety-net against OOM errors.
> > > And having had major production exposure is great.
> > >
> > > From the implementation plan, it looks like this exists purely in a new
> > > module and does not require any changes in other parts of Flink's code.
> > Can
> > > you confirm that?
> > >
> > > Other that that, I have no further questions and we could proceed to
> vote
> > > on this FLIP, from my side.
> > >
> > > Best,
> > > Stephan
> > >
> > >
> > > On Tue, Aug 13, 2019 at 10:00 PM Yu Li <ca...@gmail.com> wrote:
> > >
> > > > Sorry for forgetting to give the link of the FLIP, here it is:
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend
> > > >
> > > > Thanks!
> > > >
> > > > Best Regards,
> > > > Yu
> > > >
> > > >
> > > > On Tue, 13 Aug 2019 at 18:06, Yu Li <ca...@gmail.com> wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > We ever held a discussion about this feature before [1] but now
> > opening
> > > > > another thread because after a second thought introducing a new
> > backend
> > > > > instead of modifying the existing heap backend is a better option
> to
> > > > > prevent causing any regression or surprise to existing
> in-production
> > > > usage.
> > > > > And since introducing a new backend is relatively big change, we
> > regard
> > > > it
> > > > > as a FLIP and need another discussion and voting process according
> to
> > > our
> > > > > newly drafted bylaw [2].
> > > > >
> > > > > Please allow me to quote the brief description from the old thread
> > [1]
> > > > for
> > > > > the convenience of those who noticed this feature for the first
> time:
> > > > >
> > > > >
> > > > > *HeapKeyedStateBackend is one of the two KeyedStateBackends in
> Flink,
> > > > > since state lives as Java objects on the heap in
> > HeapKeyedStateBackend
> > > > and
> > > > > the de/serialization only happens during state snapshot and
> restore,
> > it
> > > > > outperforms RocksDBKeyeStateBackend when all data could reside in
> > > > memory.**However,
> > > > > along with the advantage, HeapKeyedStateBackend also has its
> > > > shortcomings,
> > > > > and the most painful one is the difficulty to estimate the maximum
> > heap
> > > > > size (Xmx) to set, and we will suffer from GC impact once the heap
> > > memory
> > > > > is not enough to hold all state data. There’re several (inevitable)
> > > > causes
> > > > > for such scenario, including (but not limited to):*
> > > > >
> > > > >
> > > > >
> > > > > ** Memory overhead of Java object representation (tens of times of
> > the
> > > > > serialized data size).* Data flood caused by burst traffic.* Data
> > > > > accumulation caused by source malfunction.**To resolve this
> problem,
> > we
> > > > > proposed a solution to support spilling state data to disk before
> > heap
> > > > > memory is exhausted. We will monitor the heap usage and choose the
> > > > coldest
> > > > > data to spill, and reload them when heap memory is regained after
> > data
> > > > > removing or TTL expiration, automatically. Furthermore, *to prevent
> > > > > causing unexpected regression to existing usage of
> > > HeapKeyedStateBackend,
> > > > > we plan to introduce a new SpillableHeapKeyedStateBackend and
> change
> > it
> > > > to
> > > > > default in future if proven to be stable.
> > > > >
> > > > > Please let us know your point of the feature and any comment is
> > > > > welcomed/appreciated. Thanks.
> > > > >
> > > > > [1] https://s.apache.org/pxeif
> > > > > [2]
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> > > > >
> > > > > Best Regards,
> > > > > Yu
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend

Posted by Yu Li <ca...@gmail.com>.

Thanks all for the reviews and comments!

bq. From the implementation plan, it looks like this exists purely in a new
module and does not require any changes in other parts of Flink's code. Can
you confirm that?
Confirmed, thanks!

Best Regards,
Yu


On Thu, 15 Aug 2019 at 18:04, Tzu-Li (Gordon) Tai <tz...@apache.org>
wrote:

> +1 to start a VOTE for this FLIP.
>
> Given the properties of this new state backend and that it will exist as a
> new module without touching the original heap backend, I don't see a harm
> in including this.
> Regarding design of the feature, I've already mentioned my comments in the
> original discussion thread.
>
> Cheers,
> Gordon
>
> On Thu, Aug 15, 2019 at 5:53 PM Yun Tang <my...@live.com> wrote:
>
> > Big +1 for this feature.
> >
> > Our customers including me, have ever met dilemma where we have to use
> > window to aggregate events in applications like real-time monitoring. The
> > larger of timer and window state, the poor performance of RocksDB.
> However,
> > switching to use FsStateBackend would always make me feel fear about the
> > OOM errors.
> >
> > Look forward for more powerful enrichment to state-backend, and help
> Flink
> > to achieve better performance together.
> >
> > Best
> > Yun Tang
> > ________________________________
> > From: Stephan Ewen <se...@apache.org>
> > Sent: Thursday, August 15, 2019 23:07
> > To: dev <de...@flink.apache.org>
> > Subject: Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend
> >
> > +1 for this feature. I think this will be appreciated by users, as a way
> to
> > use the HeapStateBackend with a safety-net against OOM errors.
> > And having had major production exposure is great.
> >
> > From the implementation plan, it looks like this exists purely in a new
> > module and does not require any changes in other parts of Flink's code.
> Can
> > you confirm that?
> >
> > Other that that, I have no further questions and we could proceed to vote
> > on this FLIP, from my side.
> >
> > Best,
> > Stephan
> >
> >
> > On Tue, Aug 13, 2019 at 10:00 PM Yu Li <ca...@gmail.com> wrote:
> >
> > > Sorry for forgetting to give the link of the FLIP, here it is:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend
> > >
> > > Thanks!
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Tue, 13 Aug 2019 at 18:06, Yu Li <ca...@gmail.com> wrote:
> > >
> > > > Hi All,
> > > >
> > > > We ever held a discussion about this feature before [1] but now
> opening
> > > > another thread because after a second thought introducing a new
> backend
> > > > instead of modifying the existing heap backend is a better option to
> > > > prevent causing any regression or surprise to existing in-production
> > > usage.
> > > > And since introducing a new backend is relatively big change, we
> regard
> > > it
> > > > as a FLIP and need another discussion and voting process according to
> > our
> > > > newly drafted bylaw [2].
> > > >
> > > > Please allow me to quote the brief description from the old thread
> [1]
> > > for
> > > > the convenience of those who noticed this feature for the first time:
> > > >
> > > >
> > > > *HeapKeyedStateBackend is one of the two KeyedStateBackends in Flink,
> > > > since state lives as Java objects on the heap in
> HeapKeyedStateBackend
> > > and
> > > > the de/serialization only happens during state snapshot and restore,
> it
> > > > outperforms RocksDBKeyeStateBackend when all data could reside in
> > > memory.**However,
> > > > along with the advantage, HeapKeyedStateBackend also has its
> > > shortcomings,
> > > > and the most painful one is the difficulty to estimate the maximum
> heap
> > > > size (Xmx) to set, and we will suffer from GC impact once the heap
> > memory
> > > > is not enough to hold all state data. There’re several (inevitable)
> > > causes
> > > > for such scenario, including (but not limited to):*
> > > >
> > > >
> > > >
> > > > ** Memory overhead of Java object representation (tens of times of
> the
> > > > serialized data size).* Data flood caused by burst traffic.* Data
> > > > accumulation caused by source malfunction.**To resolve this problem,
> we
> > > > proposed a solution to support spilling state data to disk before
> heap
> > > > memory is exhausted. We will monitor the heap usage and choose the
> > > coldest
> > > > data to spill, and reload them when heap memory is regained after
> data
> > > > removing or TTL expiration, automatically. Furthermore, *to prevent
> > > > causing unexpected regression to existing usage of
> > HeapKeyedStateBackend,
> > > > we plan to introduce a new SpillableHeapKeyedStateBackend and change
> it
> > > to
> > > > default in future if proven to be stable.
> > > >
> > > > Please let us know your point of the feature and any comment is
> > > > welcomed/appreciated. Thanks.
> > > >
> > > > [1] https://s.apache.org/pxeif
> > > > [2]
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> > > >
> > > > Best Regards,
> > > > Yu
> > > >
> > >
> >
>

Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend

Posted by "Tzu-Li (Gordon) Tai" <tz...@apache.org>.

+1 to start a VOTE for this FLIP.

Given the properties of this new state backend and that it will exist as a
new module without touching the original heap backend, I don't see a harm
in including this.
Regarding design of the feature, I've already mentioned my comments in the
original discussion thread.

Cheers,
Gordon

On Thu, Aug 15, 2019 at 5:53 PM Yun Tang <my...@live.com> wrote:

> Big +1 for this feature.
>
> Our customers including me, have ever met dilemma where we have to use
> window to aggregate events in applications like real-time monitoring. The
> larger of timer and window state, the poor performance of RocksDB. However,
> switching to use FsStateBackend would always make me feel fear about the
> OOM errors.
>
> Look forward for more powerful enrichment to state-backend, and help Flink
> to achieve better performance together.
>
> Best
> Yun Tang
> ________________________________
> From: Stephan Ewen <se...@apache.org>
> Sent: Thursday, August 15, 2019 23:07
> To: dev <de...@flink.apache.org>
> Subject: Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend
>
> +1 for this feature. I think this will be appreciated by users, as a way to
> use the HeapStateBackend with a safety-net against OOM errors.
> And having had major production exposure is great.
>
> From the implementation plan, it looks like this exists purely in a new
> module and does not require any changes in other parts of Flink's code. Can
> you confirm that?
>
> Other that that, I have no further questions and we could proceed to vote
> on this FLIP, from my side.
>
> Best,
> Stephan
>
>
> On Tue, Aug 13, 2019 at 10:00 PM Yu Li <ca...@gmail.com> wrote:
>
> > Sorry for forgetting to give the link of the FLIP, here it is:
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend
> >
> > Thanks!
> >
> > Best Regards,
> > Yu
> >
> >
> > On Tue, 13 Aug 2019 at 18:06, Yu Li <ca...@gmail.com> wrote:
> >
> > > Hi All,
> > >
> > > We ever held a discussion about this feature before [1] but now opening
> > > another thread because after a second thought introducing a new backend
> > > instead of modifying the existing heap backend is a better option to
> > > prevent causing any regression or surprise to existing in-production
> > usage.
> > > And since introducing a new backend is relatively big change, we regard
> > it
> > > as a FLIP and need another discussion and voting process according to
> our
> > > newly drafted bylaw [2].
> > >
> > > Please allow me to quote the brief description from the old thread [1]
> > for
> > > the convenience of those who noticed this feature for the first time:
> > >
> > >
> > > *HeapKeyedStateBackend is one of the two KeyedStateBackends in Flink,
> > > since state lives as Java objects on the heap in HeapKeyedStateBackend
> > and
> > > the de/serialization only happens during state snapshot and restore, it
> > > outperforms RocksDBKeyeStateBackend when all data could reside in
> > memory.**However,
> > > along with the advantage, HeapKeyedStateBackend also has its
> > shortcomings,
> > > and the most painful one is the difficulty to estimate the maximum heap
> > > size (Xmx) to set, and we will suffer from GC impact once the heap
> memory
> > > is not enough to hold all state data. There’re several (inevitable)
> > causes
> > > for such scenario, including (but not limited to):*
> > >
> > >
> > >
> > > ** Memory overhead of Java object representation (tens of times of the
> > > serialized data size).* Data flood caused by burst traffic.* Data
> > > accumulation caused by source malfunction.**To resolve this problem, we
> > > proposed a solution to support spilling state data to disk before heap
> > > memory is exhausted. We will monitor the heap usage and choose the
> > coldest
> > > data to spill, and reload them when heap memory is regained after data
> > > removing or TTL expiration, automatically. Furthermore, *to prevent
> > > causing unexpected regression to existing usage of
> HeapKeyedStateBackend,
> > > we plan to introduce a new SpillableHeapKeyedStateBackend and change it
> > to
> > > default in future if proven to be stable.
> > >
> > > Please let us know your point of the feature and any comment is
> > > welcomed/appreciated. Thanks.
> > >
> > > [1] https://s.apache.org/pxeif
> > > [2]
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> > >
> > > Best Regards,
> > > Yu
> > >
> >
>

Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend

Posted by Yun Tang <my...@live.com>.

Big +1 for this feature.

Our customers including me, have ever met dilemma where we have to use window to aggregate events in applications like real-time monitoring. The larger of timer and window state, the poor performance of RocksDB. However, switching to use FsStateBackend would always make me feel fear about the OOM errors.

Look forward for more powerful enrichment to state-backend, and help Flink to achieve better performance together.

Best
Yun Tang
________________________________
From: Stephan Ewen <se...@apache.org>
Sent: Thursday, August 15, 2019 23:07
To: dev <de...@flink.apache.org>
Subject: Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend

+1 for this feature. I think this will be appreciated by users, as a way to
use the HeapStateBackend with a safety-net against OOM errors.
And having had major production exposure is great.

From the implementation plan, it looks like this exists purely in a new
module and does not require any changes in other parts of Flink's code. Can
you confirm that?

Other that that, I have no further questions and we could proceed to vote
on this FLIP, from my side.

Best,
Stephan


On Tue, Aug 13, 2019 at 10:00 PM Yu Li <ca...@gmail.com> wrote:

> Sorry for forgetting to give the link of the FLIP, here it is:
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend
>
> Thanks!
>
> Best Regards,
> Yu
>
>
> On Tue, 13 Aug 2019 at 18:06, Yu Li <ca...@gmail.com> wrote:
>
> > Hi All,
> >
> > We ever held a discussion about this feature before [1] but now opening
> > another thread because after a second thought introducing a new backend
> > instead of modifying the existing heap backend is a better option to
> > prevent causing any regression or surprise to existing in-production
> usage.
> > And since introducing a new backend is relatively big change, we regard
> it
> > as a FLIP and need another discussion and voting process according to our
> > newly drafted bylaw [2].
> >
> > Please allow me to quote the brief description from the old thread [1]
> for
> > the convenience of those who noticed this feature for the first time:
> >
> >
> > *HeapKeyedStateBackend is one of the two KeyedStateBackends in Flink,
> > since state lives as Java objects on the heap in HeapKeyedStateBackend
> and
> > the de/serialization only happens during state snapshot and restore, it
> > outperforms RocksDBKeyeStateBackend when all data could reside in
> memory.**However,
> > along with the advantage, HeapKeyedStateBackend also has its
> shortcomings,
> > and the most painful one is the difficulty to estimate the maximum heap
> > size (Xmx) to set, and we will suffer from GC impact once the heap memory
> > is not enough to hold all state data. There’re several (inevitable)
> causes
> > for such scenario, including (but not limited to):*
> >
> >
> >
> > ** Memory overhead of Java object representation (tens of times of the
> > serialized data size).* Data flood caused by burst traffic.* Data
> > accumulation caused by source malfunction.**To resolve this problem, we
> > proposed a solution to support spilling state data to disk before heap
> > memory is exhausted. We will monitor the heap usage and choose the
> coldest
> > data to spill, and reload them when heap memory is regained after data
> > removing or TTL expiration, automatically. Furthermore, *to prevent
> > causing unexpected regression to existing usage of HeapKeyedStateBackend,
> > we plan to introduce a new SpillableHeapKeyedStateBackend and change it
> to
> > default in future if proven to be stable.
> >
> > Please let us know your point of the feature and any comment is
> > welcomed/appreciated. Thanks.
> >
> > [1] https://s.apache.org/pxeif
> > [2]
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> >
> > Best Regards,
> > Yu
> >
>

Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend

Posted by Stephan Ewen <se...@apache.org>.

+1 for this feature. I think this will be appreciated by users, as a way to
use the HeapStateBackend with a safety-net against OOM errors.
And having had major production exposure is great.

From the implementation plan, it looks like this exists purely in a new
module and does not require any changes in other parts of Flink's code. Can
you confirm that?

Other that that, I have no further questions and we could proceed to vote
on this FLIP, from my side.

Best,
Stephan


On Tue, Aug 13, 2019 at 10:00 PM Yu Li <ca...@gmail.com> wrote:

> Sorry for forgetting to give the link of the FLIP, here it is:
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend
>
> Thanks!
>
> Best Regards,
> Yu
>
>
> On Tue, 13 Aug 2019 at 18:06, Yu Li <ca...@gmail.com> wrote:
>
> > Hi All,
> >
> > We ever held a discussion about this feature before [1] but now opening
> > another thread because after a second thought introducing a new backend
> > instead of modifying the existing heap backend is a better option to
> > prevent causing any regression or surprise to existing in-production
> usage.
> > And since introducing a new backend is relatively big change, we regard
> it
> > as a FLIP and need another discussion and voting process according to our
> > newly drafted bylaw [2].
> >
> > Please allow me to quote the brief description from the old thread [1]
> for
> > the convenience of those who noticed this feature for the first time:
> >
> >
> > *HeapKeyedStateBackend is one of the two KeyedStateBackends in Flink,
> > since state lives as Java objects on the heap in HeapKeyedStateBackend
> and
> > the de/serialization only happens during state snapshot and restore, it
> > outperforms RocksDBKeyeStateBackend when all data could reside in
> memory.**However,
> > along with the advantage, HeapKeyedStateBackend also has its
> shortcomings,
> > and the most painful one is the difficulty to estimate the maximum heap
> > size (Xmx) to set, and we will suffer from GC impact once the heap memory
> > is not enough to hold all state data. There’re several (inevitable)
> causes
> > for such scenario, including (but not limited to):*
> >
> >
> >
> > ** Memory overhead of Java object representation (tens of times of the
> > serialized data size).* Data flood caused by burst traffic.* Data
> > accumulation caused by source malfunction.**To resolve this problem, we
> > proposed a solution to support spilling state data to disk before heap
> > memory is exhausted. We will monitor the heap usage and choose the
> coldest
> > data to spill, and reload them when heap memory is regained after data
> > removing or TTL expiration, automatically. Furthermore, *to prevent
> > causing unexpected regression to existing usage of HeapKeyedStateBackend,
> > we plan to introduce a new SpillableHeapKeyedStateBackend and change it
> to
> > default in future if proven to be stable.
> >
> > Please let us know your point of the feature and any comment is
> > welcomed/appreciated. Thanks.
> >
> > [1] https://s.apache.org/pxeif
> > [2]
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
> >
> > Best Regards,
> > Yu
> >
>

Re: [DISCUSS] FLIP-50: Spill-able Heap Keyed State Backend

Posted by Yu Li <ca...@gmail.com>.

Sorry for forgetting to give the link of the FLIP, here it is:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-50%3A+Spill-able+Heap+Keyed+State+Backend

Thanks!

Best Regards,
Yu


On Tue, 13 Aug 2019 at 18:06, Yu Li <ca...@gmail.com> wrote:

> Hi All,
>
> We ever held a discussion about this feature before [1] but now opening
> another thread because after a second thought introducing a new backend
> instead of modifying the existing heap backend is a better option to
> prevent causing any regression or surprise to existing in-production usage.
> And since introducing a new backend is relatively big change, we regard it
> as a FLIP and need another discussion and voting process according to our
> newly drafted bylaw [2].
>
> Please allow me to quote the brief description from the old thread [1] for
> the convenience of those who noticed this feature for the first time:
>
>
> *HeapKeyedStateBackend is one of the two KeyedStateBackends in Flink,
> since state lives as Java objects on the heap in HeapKeyedStateBackend and
> the de/serialization only happens during state snapshot and restore, it
> outperforms RocksDBKeyeStateBackend when all data could reside in memory.**However,
> along with the advantage, HeapKeyedStateBackend also has its shortcomings,
> and the most painful one is the difficulty to estimate the maximum heap
> size (Xmx) to set, and we will suffer from GC impact once the heap memory
> is not enough to hold all state data. There’re several (inevitable) causes
> for such scenario, including (but not limited to):*
>
>
>
> ** Memory overhead of Java object representation (tens of times of the
> serialized data size).* Data flood caused by burst traffic.* Data
> accumulation caused by source malfunction.**To resolve this problem, we
> proposed a solution to support spilling state data to disk before heap
> memory is exhausted. We will monitor the heap usage and choose the coldest
> data to spill, and reload them when heap memory is regained after data
> removing or TTL expiration, automatically. Furthermore, *to prevent
> causing unexpected regression to existing usage of HeapKeyedStateBackend,
> we plan to introduce a new SpillableHeapKeyedStateBackend and change it to
> default in future if proven to be stable.
>
> Please let us know your point of the feature and any comment is
> welcomed/appreciated. Thanks.
>
> [1] https://s.apache.org/pxeif
> [2]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026
>
> Best Regards,
> Yu
>