Posted to dev@flink.apache.org by Naveen Kumar <na...@flipkart.com.INVALID> on 2018/12/27 08:49:09 UTC

Hbase state backend in Flink

Hi,

I am exploring whether we can plug in HBase as a state backend in Flink. We
need streaming jobs with large window state, high throughput, and
reliability.

I would like to know whether implementing a Flink state backend on HBase or
another distributed KV store is possible. Any documentation or pointers
would be helpful.

Thanks,
Naveen

Re: Hbase state backend in Flink

Posted by Chen Qin <qi...@gmail.com>.

Hi Yu,

Very cool! I might be out of date on what’s new in Flink already…
Just wondering whether there are efforts to support seconds-level barrier
alignment?

Chen

> On Dec 27, 2018, at 23:26, Yu Li <ca...@gmail.com> wrote:
> 
> FWIW, one major advantage of adopting HBase as a Flink state backend is that
> it supports direct read/write on the DFS, which disaggregates storage and
> compute (DisAgg). DisAgg has several benefits, such as supporting elastic
> computing in the cloud and much better (order-of-magnitude) recovery speed
> when rescaling up/down (as Gyula also mentioned). We could eliminate the
> performance regression compared to local RW through techniques like adding
> a local L2 cache. For more information, please refer to our talk
> <https://files.alicdn.com/tpsservice/1df9ccb8a7b6b2782a558d3c32d40c19.pdf>
> at this year's Flink Forward China; we could discuss more in another
> thread if there is interest.
>
> Back to @Naveen's question: we need to make HBase support an embedded mode
> before adopting it as a Flink state backend. We have done some initial
> work; please refer to HBASE-17743
> <https://issues.apache.org/jira/browse/HBASE-17743> and the design doc
> there for more details. We will certainly upstream our work when it is
> ready (smile).
> 
> Best Regards,
> Yu

Re: Hbase state backend in Flink

Posted by Yu Li <ca...@gmail.com>.

FWIW, one major advantage of adopting HBase as a Flink state backend is that
it supports direct read/write on the DFS, which disaggregates storage and
compute (DisAgg). DisAgg has several benefits, such as supporting elastic
computing in the cloud and much better (order-of-magnitude) recovery speed
when rescaling up/down (as Gyula also mentioned). We could eliminate the
performance regression compared to local RW through techniques like adding
a local L2 cache. For more information, please refer to our talk
<https://files.alicdn.com/tpsservice/1df9ccb8a7b6b2782a558d3c32d40c19.pdf>
at this year's Flink Forward China; we could discuss more in another
thread if there is interest.
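
As a rough illustration of the local-cache idea (a toy sketch only, not
HBase or Flink code; RemoteStore and CachedState are hypothetical names
made up for this example), a read-through LRU cache in front of a remote
store keeps repeated reads of hot keys off the remote path:

```python
from collections import OrderedDict


class RemoteStore:
    """Stand-in for state kept on a DFS; counts reads to show the cache effect."""

    def __init__(self, data):
        self._data = dict(data)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class CachedState:
    """Read-through, write-through LRU cache in front of a remote store."""

    def __init__(self, remote, capacity):
        self._remote = remote
        self._capacity = capacity
        self._cache = OrderedDict()

    def get(self, key):
        if key in self._cache:
            self._cache.move_to_end(key)   # mark as most recently used
            return self._cache[key]
        value = self._remote.get(key)      # cache miss: go to the remote store
        self._insert(key, value)
        return value

    def put(self, key, value):
        self._remote.put(key, value)       # remote copy stays authoritative
        self._insert(key, value)

    def _insert(self, key, value):
        self._cache[key] = value
        self._cache.move_to_end(key)
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict the least recently used key


remote = RemoteStore({"user-1": 10, "user-2": 20, "user-3": 30})
state = CachedState(remote, capacity=2)
hot_reads = [state.get("user-1") for _ in range(100)]
```

Here the 100 reads of the same key cost a single remote read. A real
implementation would also have to deal with cache sizing, persistence
across restarts, and consistency with checkpoints, none of which this
sketch attempts.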

Back to @Naveen's question: we need to make HBase support an embedded mode
before adopting it as a Flink state backend. We have done some initial
work; please refer to HBASE-17743
<https://issues.apache.org/jira/browse/HBASE-17743> and the design doc
there for more details. We will certainly upstream our work when it is
ready (smile).

Best Regards,
Yu

On Fri, 28 Dec 2018 at 13:12, Chen Qin <qi...@gmail.com> wrote:

> Hi Naveen,
>
> AFAIK, there are two levels of storage in a typical state backend
> (local/remote). I think it is roughly analogous to a PC's main memory and
> disk.
>
> Take the RocksDB state backend as an example: window state (typically a
> very large ListState) is persisted in partitioned local RocksDB files, so
> adding an element to a window is local and cheap. When a checkpoint
> starts, each of those RocksDB instances uploads to its corresponding HDFS
> directory separately. This is good in the sense that any intermediate
> state between two successful checkpoints can be overwritten, and local
> snapshots can be taken cheaply and asynchronously.
>
> I heard folks tried to build a MySQL backend (deprecated), remote RocksDB
> as a service backend (hard to scale and a performance bottleneck), and
> Cassandra (hard to snapshot). All of them share the same trait: a lack of
> local, parallelizable snapshot semantics.
>
> Hope this helps!
> Chen

Re: Hbase state backend in Flink

Posted by Chen Qin <qi...@gmail.com>.

Hi Naveen,

AFAIK, there are two levels of storage in a typical state backend
(local/remote). I think it is roughly analogous to a PC's main memory and
disk.

Take the RocksDB state backend as an example: window state (typically a
very large ListState) is persisted in partitioned local RocksDB files, so
adding an element to a window is local and cheap. When a checkpoint
starts, each of those RocksDB instances uploads to its corresponding HDFS
directory separately. This is good in the sense that any intermediate
state between two successful checkpoints can be overwritten, and local
snapshots can be taken cheaply and asynchronously.
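
To make the local/remote split concrete, here is a toy model (not Flink
code; LocalStateBackend and the hdfs dict are hypothetical stand-ins, and
the snapshot is a synchronous copy, whereas the real backend snapshots
asynchronously). Appends touch only local state, each partition uploads
its own snapshot, and anything written after the last checkpoint is lost
on restore:

```python
import copy


class LocalStateBackend:
    """Toy model: a partition keeps state locally and snapshots it to remote storage."""

    def __init__(self, partition_id, remote_storage):
        self.partition_id = partition_id
        # Stand-in for HDFS: {(checkpoint_id, partition_id): state copy}
        self._remote = remote_storage
        # window key -> list of elements (similar in spirit to a ListState)
        self._windows = {}

    def add(self, window, element):
        # Purely local append: no remote call on the hot path.
        self._windows.setdefault(window, []).append(element)

    def snapshot(self, checkpoint_id):
        # Each partition uploads its own copy independently of the others.
        key = (checkpoint_id, self.partition_id)
        self._remote[key] = copy.deepcopy(self._windows)

    def restore(self, checkpoint_id):
        key = (checkpoint_id, self.partition_id)
        self._windows = copy.deepcopy(self._remote[key])

    def get(self, window):
        return list(self._windows.get(window, []))


hdfs = {}
backends = [LocalStateBackend(p, hdfs) for p in range(2)]
backends[0].add("w1", "a")
backends[1].add("w1", "b")
for backend in backends:
    backend.snapshot(checkpoint_id=1)

backends[0].add("w1", "c")             # written after checkpoint 1, never uploaded
backends[0].restore(checkpoint_id=1)   # simulated failure: roll back to checkpoint 1
```

After the restore, partition 0 holds only the element that was present at
checkpoint 1. A remote KV store used directly as the backend has no such
cheap per-partition rollback point, which is the difficulty described in
the next paragraph.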

I heard folks tried to build a MySQL backend (deprecated), remote RocksDB
as a service backend (hard to scale and a performance bottleneck), and
Cassandra (hard to snapshot). All of them share the same trait: a lack of
local, parallelizable snapshot semantics.

Hope this helps!
Chen

On Thu, Dec 27, 2018 at 8:27 AM miki haiat <mi...@gmail.com> wrote:

> Did you try to use RocksDB [1] as the state backend?
>
> [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/state_backends.html#the-rocksdbstatebackend

Re: Hbase state backend in Flink

Posted by miki haiat <mi...@gmail.com>.

Did you try to use RocksDB [1] as the state backend?

[1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/state_backends.html#the-rocksdbstatebackend


On Thu, 27 Dec 2018, 18:17 Naveen Kumar <naveenkumar.g@flipkart.com.invalid
wrote:

> Hi,
>
> I am exploring whether we can plug in HBase as a state backend in Flink. We
> need streaming jobs with large window state, high throughput, and
> reliability.
>
> I would like to know whether implementing a Flink state backend on HBase or
> another distributed KV store is possible. Any documentation or pointers
> would be helpful.
>
> Thanks,
> Naveen
>

Re: Hbase state backend in Flink

Posted by Gyula Fóra <gy...@gmail.com>.

Hi!

While certainly possible, I think it’s a bad idea in general.

I think state size itself shouldn’t be a problem with the RocksDB backend,
as you can always increase parallelism to shard more while keeping the
insanely good performance compared to a remote KV store. We and other users
have successfully used the RocksDB state backend with incremental snapshots
with several terabytes of state in production for years.
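
For a feel of how more parallelism shards the state, here is a small
sketch of key-group style assignment. It is an approximation under stated
assumptions: MAX_PARALLELISM = 128 is an assumed value, CRC32 stands in
for the murmur hash Flink applies to the key's hash code, and the helper
names are made up for this example.

```python
import zlib
from collections import Counter

MAX_PARALLELISM = 128  # number of key groups; an assumed value for this sketch


def key_group(key):
    # Stand-in hash (CRC32); Flink itself uses a murmur hash of key.hashCode().
    return zlib.crc32(key.encode("utf-8")) % MAX_PARALLELISM


def subtask_for(key, parallelism):
    # Contiguous ranges of key groups are assigned to each parallel subtask.
    return key_group(key) * parallelism // MAX_PARALLELISM


def keys_per_subtask(keys, parallelism):
    # How many keys (and hence roughly how much keyed state) each subtask owns.
    return Counter(subtask_for(k, parallelism) for k in keys)


keys = [f"user-{i}" for i in range(10_000)]
load_4 = keys_per_subtask(keys, parallelism=4)
load_16 = keys_per_subtask(keys, parallelism=16)
```

Comparing load_4 and load_16 shows each subtask owning a smaller slice of
the keys at higher parallelism, so each local RocksDB instance holds less
state. Because whole key-group ranges move between subtasks, rescaling
redistributes ranges rather than individual keys.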

The main advantage I see for HBase and similar KV stores as a state
backend is the instant recovery you get, but even in that case you
probably want an implementation that combines an embedded and a remote KV
store.

Also, the RocksDB backend without any external dependency will be infinitely
more reliable in practice.

Cheers
Gyula

On Thu, 27 Dec 2018 at 17:17, Naveen Kumar
<na...@flipkart.com.invalid> wrote:

> Hi,
>
> I am exploring whether we can plug in HBase as a state backend in Flink. We
> need streaming jobs with large window state, high throughput, and
> reliability.
>
> I would like to know whether implementing a Flink state backend on HBase or
> another distributed KV store is possible. Any documentation or pointers
> would be helpful.
>
> Thanks,
> Naveen
>
