Posted to dev@flink.apache.org by Ron Crocker <rc...@newrelic.com.INVALID> on 2022/08/28 23:29:27 UTC

Rescuing Queryable State from deprecation

Hi -

For those of you who didn’t see my talk from Flink Forward, here’s a link <https://www.slideshare.net/FlinkForward/using-queryable-state-for-fun-and-profit> to the slides. 
By implementing the ideas first brought forward in the blog post <https://www.ververica.com/blog/queryable-state-use-case-demo>, we’ve found we can save a pretty substantial amount of expense using Flink Queryable State as a replacement for certain off-board caches.

Given that, I’m quite keen on rescuing queryable state from deprecation. However, I’m not at all sure how to start that conversation (other than by just starting it).

I believe my argument for rescuing this feature is supported in my talk, though I’m open to further discussion. 

Perhaps the next thing to understand is why, in concrete terms, queryable state was added to the deprecation list. Can someone enlighten me on that?

Ron
—
Ron Crocker
New Relic Fellow & Architect
( ( •)) New Relic
rcrocker@newrelic.com
M: +1 630 363 8835


Re: Rescuing Queryable State from deprecation

Posted by Yuan Mei <yu...@gmail.com>.
Hey Ron,

Sorry for the late response. Thank you for bringing up this topic, and I
appreciate your efforts to rescue Queryable State :-). I will do my best to
answer your questions!

*Why is Queryable State on the deprecation list?*
I've seen quite a few users ask for "state" to be made queryable over the
past few years, and I personally feel state querying has a lot of potential
as well.
However, across all those use cases, I feel the current architecture of
queryable state is not well aligned with what really needs to be queried (I
explain why in the third section below). I think this is the main reason
the "queryable state" feature has not been actively maintained or improved,
which led to it being added to the deprecation list.

*What is the plan/future for supporting state querying?*
The motivation for introducing Queryable State was to give external
applications access to real-time results from Flink state without depending
on external (k-v) stores. There are different ways to achieve this. While
Queryable State can support some of the use cases, I am not sure it is the
direction we want to take before we answer the following two questions:

1. What kind of data/query/result does a user expect?
2. What does "real-time" mean here? Does it mean we have to read
uncommitted data (through the queryable state)? Or, if the checkpoint
interval is short, is reading committed data (through the State Processor
API) good enough?
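
To make the second question concrete, here is a minimal Python sketch of the
difference between reading uncommitted and committed data. It is a toy model
under stated assumptions, not Flink code: the class ToyKeyedState and its
methods are made up for illustration, and a plain dict copy stands in for a
completed checkpoint.

```python
import copy


class ToyKeyedState:
    """Hypothetical stand-in for keyed state; not Flink's state backend."""

    def __init__(self):
        self.live = {}       # updated by record processing (uncommitted)
        self.committed = {}  # copy taken at the last completed checkpoint

    def update(self, key, value):
        """Apply an update to the live state only."""
        self.live[key] = value

    def checkpoint(self):
        """Snapshot the live state as the new committed view."""
        self.committed = copy.deepcopy(self.live)

    def fail_and_restore(self):
        """Model recovery: live state rewinds to the last checkpoint."""
        self.live = copy.deepcopy(self.committed)


state = ToyKeyedState()
state.update("user-1", 10)
state.checkpoint()
state.update("user-1", 11)  # processed after the checkpoint

uncommitted_read = state.live["user-1"]     # fresh, but may be rolled back
committed_read = state.committed["user-1"]  # stable, but lags behind

state.fail_and_restore()
after_failover = state.live["user-1"]       # the update to 11 is gone
```

In this model a reader of the live state sees the newer value first and then
watches it disappear after recovery, while a reader of the committed view
never sees it at all until the next checkpoint completes.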

From talking to users/customers (and from my own experience), we do feel
that there is data already stored in state that should not go to waste.
However, *Flink state is architected in a way that is optimized for fast
stream-processing access, fast failover, and rescaling. Hence, state
querying is far more complicated than simply "exposing state and making it
accessible"*, which is what the current version of queryable state does.
Without scoping "state querying" well, it is difficult to make state
querying really usable and production-ready.

*What is the problem with the queryable state?*
As I mentioned before, the problem mainly lies in the architecture. Here is
a short list based on my observations:

1. The queryable state service is unavailable during a failover.
2. A rollback to a previous checkpoint rolls the queried state back as
well. We can solve this problem by only reading committed data, but in that
case, why not simply use the State Processor API? In the queryable state's
current model, it has to maintain multiple snapshots and wait for
checkpoint completion, which complicates the read model of the state store.
3. It has to support multi-threaded reads from the state, which complicates
the access model and design.
4. It cannot do complicated query analysis (anything other than simple
look-ups) with the current architecture.
5. The queryable state's life cycle is bound to the life cycle of the Flink
job. If a job is restarted, the job ID changes, and the query has to change
accordingly, even for the same state.

All in all, the queryable state is neither scalable nor highly available,
and it has the potential to affect normal data processing if read access is
excessive.
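
Point 5 in the list can be illustrated with a small Python sketch. This is a
toy model under stated assumptions, not Flink's client API: ToyCluster,
submit, resubmit, and query are hypothetical names, and a restart is
modelled as a resubmission that assigns a fresh job ID.

```python
import uuid


class ToyCluster:
    """Hypothetical registry of running jobs; not Flink's actual API."""

    def __init__(self):
        # job_id -> {state_name: {key: value}}
        self.jobs = {}

    def submit(self, states):
        """Register a job's state under a freshly generated job ID."""
        job_id = uuid.uuid4().hex
        self.jobs[job_id] = states
        return job_id

    def resubmit(self, job_id):
        """Model a restart: same state, but a new job ID replaces the old one."""
        return self.submit(self.jobs.pop(job_id))

    def query(self, job_id, state_name, key):
        """Look up one key; the job ID is part of the address."""
        if job_id not in self.jobs:
            raise LookupError("no running job with ID " + job_id)
        return self.jobs[job_id][state_name][key]


cluster = ToyCluster()
old_id = cluster.submit({"counts": {"user-1": 10}})
value_before = cluster.query(old_id, "counts", "user-1")

new_id = cluster.resubmit(old_id)

# A client still holding the old job ID can no longer reach the state.
try:
    cluster.query(old_id, "counts", "user-1")
    stale_query_failed = False
except LookupError:
    stale_query_failed = True

# The same state is reachable again only under the new job ID.
value_after = cluster.query(new_id, "counts", "user-1")
```

The state itself is unchanged across the restart in this sketch; only its
address changed, which is why every client has to be told the new job ID.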

My colleagues and I have devoted ourselves to maintaining and improving
Flink's state-related components in the Flink community over the last
couple of years. For the reasons mentioned above (the past, the future, and
the problems with the queryable state), I'd be hesitant to remove the
queryable state from the deprecation list unless we have a clear vision of
where "state querying" is headed (the two questions I raised above).

At the same time, I'd be open and happy to discuss this in more detail
online/offline.


Best Regards,

Yuan




On Tue, Aug 30, 2022 at 11:33 AM Yun Tang <my...@live.com> wrote:

> Hi Ron,
>
> From my understanding, the queryable state feature was introduced to
> Flink very early but lacks maintainers. In other words, this feature is
> currently a toy rather than a production-ready tool.
>
> I have heard from users many times asking for this feature in
> production environments, and I believe it would benefit the Flink
> community if someone could take it on.
>
>
> Best
> Yun Tang

Re: Rescuing Queryable State from deprecation

Posted by Yun Tang <my...@live.com>.
Hi Ron,

From my understanding, the queryable state feature was introduced to Flink very early but lacks maintainers. In other words, this feature is currently a toy rather than a production-ready tool.

I have heard from users many times asking for this feature in production environments, and I believe it would benefit the Flink community if someone could take it on.


Best
Yun Tang
________________________________
From: Konstantin Knauf <kn...@apache.org>
Sent: Monday, August 29, 2022 17:54
To: dev <de...@flink.apache.org>
Subject: Re: Rescuing Queryable State from deprecation

Hi Ron,

Thank you for sharing your use case and for your initiative to save
Queryable State. Queryable State was deprecated due to a lack of
maintainers, which meant the community did not have the resources to
improve and develop it further. The deprecation notice was added to signal
this lack of attention to our users. On the other hand, I am not aware of
anyone who is actively working towards actually dropping Queryable State,
so I don't think this will happen anytime soon.

Personally, I see a lot of potential in Queryable State, but to make it
really, really useful it still needs a lot of work, and in my opinion the
current architecture/implementation might need to be rethought from
scratch. In that case, the current implementation might need to be phased
out again, but, as opposed to today, there would be an alternative then. I
am not aware of anyone at Immerok who currently has time to review and
support a major effort in this direction. Just keeping the current
implementation alive without any fundamental changes might not be too much
work, though.

So, all in all, I think Queryable State will neither be fundamentally
improved nor dropped any time soon. If you are committed to maintaining the
current implementation of Queryable State in terms of addressing bugs, test
instabilities, and potentially small feature requests, I am +1 to removing
it from the deprecation list again, provided that we keep the
APIs @Experimental.

Cheers,

Konstantin

--
https://twitter.com/snntrable
https://github.com/knaufk

Re: Rescuing Queryable State from deprecation

Posted by Konstantin Knauf <kn...@apache.org>.
Hi Ron,

Thank you for sharing your use case and for your initiative to save
Queryable State. Queryable State was deprecated due to a lack of
maintainers, which meant the community did not have the resources to
improve and develop it further. The deprecation notice was added to signal
this lack of attention to our users. On the other hand, I am not aware of
anyone who is actively working towards actually dropping Queryable State,
so I don't think this will happen anytime soon.

Personally, I see a lot of potential in Queryable State, but to make it
really, really useful it still needs a lot of work, and in my opinion the
current architecture/implementation might need to be rethought from
scratch. In that case, the current implementation might need to be phased
out again, but, as opposed to today, there would be an alternative then. I
am not aware of anyone at Immerok who currently has time to review and
support a major effort in this direction. Just keeping the current
implementation alive without any fundamental changes might not be too much
work, though.

So, all in all, I think Queryable State will neither be fundamentally
improved nor dropped any time soon. If you are committed to maintaining the
current implementation of Queryable State in terms of addressing bugs, test
instabilities, and potentially small feature requests, I am +1 to removing
it from the deprecation list again, provided that we keep the
APIs @Experimental.

Cheers,

Konstantin

-- 
https://twitter.com/snntrable
https://github.com/knaufk