Posted to user@flink.apache.org by Maciek Próchniak <mp...@touk.pl> on 2021/03/09 06:07:51 UTC

Future of QueryableState

Hello,


We are using QueryableState in some Nussknacker deployments as a nice
addition, allowing end users to peek inside job state for a given key
(we mostly use custom operators).


Judging by the mailing list and the feature radar proposal by Stephan:
https://github.com/StephanEwen/flink-web/blob/feature_radar/img/flink_feature_radar.svg


this feature is not widely used or supported. I'd like to ask:

- Are there any alternative ways of accessing state during job
execution? The State Processor API is very nice, but it operates on
checkpoints, and loading the whole state to look up one key seems a bit
heavy.

- Are there any inherent problems in the QueryableState design (e.g. it's
not feasible to use it in Kubernetes settings, performance considerations),
or is it just a lack of interest/support (in that case we may offer some
help)?


thanks,

maciek


Re: Future of QueryableState

Posted by Arvid Heise <ar...@apache.org>.
Hi Maciek,

Thanks for reaching out. Only through these interactions do we learn how
important certain features are to users.

Queryable State has some limitations and makes the whole system rather
fragile. Most users who try it out are disappointed that there is actually
no SQL support. Even if we could support it, expensive queries would slow
down the actual application. So if there is enough interest in the
community, we would rather replace Queryable State with some way to
replicate state to an external system that supports proper queries and
has no influence on the live application.

FLIP-158 [1] was just accepted and would make it easier to replicate state
to an external system. Replicating to an external system is not planned yet,
but it's one of the ideas that are floating around. Could you imagine
having your Flink state replicated into some key/value store, log stream, or
database for your use case? What would be your preference?

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-158%3A+Generalized+incremental+checkpoints
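
The replication idea above can be modelled in a few lines. This is only a toy sketch in plain Java, not Flink or FLIP-158 API: the class names (`Job`, `ExternalStore`, `Change`) and the in-memory list standing in for a log stream are assumptions made for illustration. It shows the job appending every state change to a log, and an external store replaying that log so queries never touch the live state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of state replication; all names here are hypothetical. */
public class StateReplicationSketch {

    /** One state change; a null value marks a deletion. */
    public static final class Change {
        public final String key;
        public final Long value;

        public Change(String key, Long value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Job side: owns the live state and appends every change to a log. */
    public static final class Job {
        private final Map<String, Long> state = new HashMap<>();
        private final List<Change> changelog = new ArrayList<>();

        public void update(String key, long value) {
            state.put(key, value);
            changelog.add(new Change(key, value));
        }

        public void remove(String key) {
            state.remove(key);
            changelog.add(new Change(key, null));
        }

        public List<Change> changelog() {
            return changelog;
        }
    }

    /** External side: replays the log into its own copy and serves queries. */
    public static final class ExternalStore {
        private final Map<String, Long> copy = new HashMap<>();
        private int nextOffset = 0; // first log entry not yet applied

        public void catchUp(List<Change> changelog) {
            while (nextOffset < changelog.size()) {
                Change c = changelog.get(nextOffset++);
                if (c.value == null) {
                    copy.remove(c.key);
                } else {
                    copy.put(c.key, c.value);
                }
            }
        }

        /** Reads only the replicated copy, never the job's live state. */
        public Long query(String key) {
            return copy.get(key);
        }
    }

    public static void main(String[] args) {
        Job job = new Job();
        ExternalStore store = new ExternalStore();
        job.update("user-1", 3L);
        job.update("user-2", 7L);
        job.remove("user-2");
        store.catchUp(job.changelog());
        System.out.println(store.query("user-1")); // 3
        System.out.println(store.query("user-2")); // null
    }
}
```

The trade-off in this model is that the external copy lags behind the job until `catchUp` runs, but a slow or expensive query only loads the external store.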

On Wed, Mar 10, 2021 at 2:44 PM Maciek Próchniak <mp...@touk.pl> wrote:

> Hi Konstantin,
>
> Thanks for the detailed answer. I also thought about a CoFunction, but it
> is a bit too heavy for us at the moment (each state would need an
> additional Kafka producer/consumer).
>
> Guess we'll use QueryableState for now and try to phase it out slowly...
>
>
> thanks,
>
> maciek
>

Re: Future of QueryableState

Posted by Maciek Próchniak <mp...@touk.pl>.
Hi Konstantin,

Thanks for the detailed answer. I also thought about a CoFunction, but it is
a bit too heavy for us at the moment (each state would need an additional
Kafka producer/consumer).

Guess we'll use QueryableState for now and try to phase it out slowly...


thanks,

maciek


On 09.03.2021 17:42, Konstantin Knauf wrote:
> Hi Maciek,
>
> Thank you for reaching out. I'll try to answer your questions separately.
>
> - Nothing comparable. You already mentioned the State Processor API.
> Besides that, I can only think of a side channel (CoFunction) that is
> used to request a certain state, which is then sent to a side output and
> ultimately to a sink, e.g. Kafka State Request Topic -> Flink -> Kafka
> State Response Topic. This puts the complexity into the Flink job,
> though.
>
> - I think it is a combination of both. Queryable State works well
> within its limitations. In the case of the RocksDBStateBackend these are
> mainly the availability of the job and the fact that you might read
> "uncommitted" state updates. In the case of the heap-based state backends
> there are also synchronization issues, e.g. you might read stale
> values. You also mentioned the fact that Queryable State has been an
> afterthought when it comes to more recent deployment options. I am not
> aware of any committer who currently has the time to work on this to
> the degree that would be required. So we thought it would be fairer
> and more realistic to mark Queryable State as "approaching end of
> life" in the sense that there is no active development on that
> component anymore.
>
> Best,
>
> Konstantin

Re: Future of QueryableState

Posted by Konstantin Knauf <kn...@apache.org>.
Hi Maciek,

Thank you for reaching out. I'll try to answer your questions separately.

- Nothing comparable. You already mentioned the State Processor API. Besides
that, I can only think of a side channel (CoFunction) that is used to
request a certain state, which is then sent to a side output and ultimately
to a sink, e.g. Kafka State Request Topic -> Flink -> Kafka State Response
Topic. This puts the complexity into the Flink job, though.

- I think it is a combination of both. Queryable State works well within
its limitations. In the case of the RocksDBStateBackend these are mainly the
availability of the job and the fact that you might read "uncommitted"
state updates. In the case of the heap-based state backends there are also
synchronization issues, e.g. you might read stale values. You also mentioned
the fact that Queryable State has been an afterthought when it comes to
more recent deployment options. I am not aware of any committer who
currently has the time to work on this to the degree that would be
required. So we thought it would be fairer and more realistic to mark
Queryable State as "approaching end of life" in the sense that there is no
active development on that component anymore.
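
To make the side-channel idea from the first point more concrete, here is a minimal, dependency-free model in plain Java. It is not Flink code: the class and method names are hypothetical, a `HashMap` stands in for keyed state, and a list stands in for the side output feeding the response topic. In a real job the two methods would roughly correspond to `processElement1` and `processElement2` of a `KeyedCoProcessFunction`, with both inputs keyed the same way.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Dependency-free model of a state-request side channel; not Flink API. */
public class StateSideChannel {

    /** A lookup request, as it might arrive on a request topic. */
    public static final class StateRequest {
        public final String requestId;
        public final String key;

        public StateRequest(String requestId, String key) {
            this.requestId = requestId;
            this.key = key;
        }
    }

    /** The answer, as it might be written to a response topic. */
    public static final class StateResponse {
        public final String requestId;
        public final String key;
        public final Long value; // null if the key has no state

        public StateResponse(String requestId, String key, Long value) {
            this.requestId = requestId;
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return requestId + ": " + key + " -> " + value;
        }
    }

    // Stands in for keyed state: one running count per key.
    private final Map<String, Long> countsByKey = new HashMap<>();

    // Stands in for the side output that feeds the response sink.
    private final List<StateResponse> responses = new ArrayList<>();

    /** Main input: a regular event updates the state for its key. */
    public void processEvent(String key) {
        countsByKey.merge(key, 1L, Long::sum);
    }

    /** Second input: a request reads the state and emits a response. */
    public void processRequest(StateRequest request) {
        Long value = countsByKey.get(request.key);
        responses.add(new StateResponse(request.requestId, request.key, value));
    }

    public List<StateResponse> responses() {
        return responses;
    }

    public static void main(String[] args) {
        StateSideChannel operator = new StateSideChannel();
        operator.processEvent("user-1");
        operator.processEvent("user-1");
        operator.processRequest(new StateRequest("req-1", "user-1"));
        operator.processRequest(new StateRequest("req-2", "user-2"));
        operator.responses().forEach(System.out::println);
    }
}
```

Because requests flow through the same operator as regular events, a response reflects exactly the events processed before the request, at the cost of wiring an extra input and output into every stateful operator that should be inspectable.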

Best,

Konstantin

-- 

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk