Posted to dev@flink.apache.org by Stefano Bortoli <st...@huawei.com> on 2018/03/02 09:58:07 UTC

StreamSQL queriable state

Hi guys,

I am checking out the queryable state API, and it seems that most of the tooling is already available. However, queryable state is available only in the streaming API, not at the StreamSQL API level. In principle, since flink-table is aware of the query semantics and the output data type, it should be possible to configure the query compilation to nest queryable state in the process/window functions. Are there any plans in this direction?

Best,
Stefano

Re: StreamSQL queriable state

Posted by Renjie Liu <li...@gmail.com>.
Hi, Timo:
https://docs.google.com/document/d/18XbrFuwzzRzHR84a48j17BIGpTndSS1q-shSgCBEd8I/edit?usp=sharing
is a small design doc for the implementation.
I agree with you that we should put the SQL client integration in another PR
and focus on the sink implementation in this PR.

--
Liu, Renjie
Software Engineer, MVAD

Re: StreamSQL queriable state

Posted by Timo Walther <tw...@apache.org>.
Hi Renjie,

it would be great if we could reuse my QueryableStateTableSink
implementation. As far as I know, the queryable state interfaces have changed
a bit in the last months and a new queryable state client has been added.
So my branch needs to be rebased onto the current APIs.

We don't need a big design document, but at least a few paragraphs on what the
APIs should look like. How do you want to extend the SQL client for
that? We can also introduce the QueryableStateTableSink in one pull
request and do the API in a follow-up PR.

What do you think?

Regards,
Timo



Re: StreamSQL queriable state

Posted by Renjie Liu <li...@gmail.com>.
Hi, Stefano:

I agree with Fabian that it's not a good idea to expose internal state as
queryable to users, for the reasons he listed.

As you pointed out, duplicating internal state in the current
QueryableStateSink design is inefficient. But this is a limitation of the
table sink API, since a sink can only consume the change stream of a dynamic
table.
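
To make the limitation concrete, here is a minimal sketch in Python. It is a toy model, not Flink's API: QueryableUpsertSink, on_change and point_query are names invented for illustration, and the change stream is a hand-written list. It shows that a sink which only receives changes has to materialize its own copy of the latest row per key before it can answer point queries.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Row = Tuple  # a result row, e.g. (user, count)


@dataclass
class QueryableUpsertSink:
    """Toy stand-in for a queryable table sink (hypothetical, not Flink's API)."""

    # Latest row per key, materialized from the change stream.
    rows: Dict[str, Row] = field(default_factory=dict)

    def on_change(self, is_upsert: bool, key: str, row: Optional[Row]) -> None:
        # The sink only sees changes: an upsert replaces the row for a key,
        # a delete removes it. It never sees the upstream operator's state.
        if is_upsert:
            self.rows[key] = row
        else:
            self.rows.pop(key, None)

    def point_query(self, key: str) -> Optional[Row]:
        # What a client-side point query would read.
        return self.rows.get(key)


# Hand-written change stream of an assumed per-user count query.
changes = [
    (True, "alice", ("alice", 1)),
    (True, "bob", ("bob", 1)),
    (True, "alice", ("alice", 2)),
    (False, "bob", None),
]

sink = QueryableUpsertSink()
for is_upsert, key, row in changes:
    sink.on_change(is_upsert, key, row)

print(sink.point_query("alice"))
print(sink.point_query("bob"))
```

The dictionary inside the sink is the duplicated state: the upstream aggregation would typically hold the same counts in its own state.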

-- 
Liu, Renjie
Software Engineer, MVAD

Re: StreamSQL queriable state

Posted by Fabian Hueske <fh...@gmail.com>.
I don't think it is a good idea to expose the internal state of a query as
queryable state for the following reasons:

1. plan generation: the streaming programs and operators are created based
on the optimized plan. A user cannot know which operators a query will run
on. Naming the queryable states would be another issue.
2. query internals: an operator should use the most efficient state
representation. A user cannot easily know how the state is organized
without looking into the internals of the operator. Hence, querying the
state would not provide much value because the result will be hard to
interpret without knowing the details.
3. backwards compatibility: we need to be able to reimplement operators
(which also includes the design of state). If we expose state as queryable,
that would not be possible without breaking compatibility. Better
optimization might even remove or merge certain operators.
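
Points 2 and 3 can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not Flink code: both classes are hypothetical implementations of a keyed running average. They emit the same results but keep different internal state, so a client reading the state directly would have to know which layout is in use and would break if the operator were reimplemented.

```python
class AvgWithSumCount:
    """Keeps (sum, count) per key; one plausible internal layout."""

    def __init__(self):
        self.state = {}

    def process(self, key, value):
        total, count = self.state.get(key, (0.0, 0))
        total, count = total + value, count + 1
        self.state[key] = (total, count)
        return total / count


class AvgWithRunningMean:
    """Keeps (mean, count) per key; a hypothetical reimplementation."""

    def __init__(self):
        self.state = {}

    def process(self, key, value):
        mean, count = self.state.get(key, (0.0, 0))
        count += 1
        mean += (value - mean) / count  # incremental mean update
        self.state[key] = (mean, count)
        return mean


events = [("a", 2.0), ("a", 4.0)]
op_a, op_b = AvgWithSumCount(), AvgWithRunningMean()
out_a = [op_a.process(k, v) for k, v in events]
out_b = [op_b.process(k, v) for k, v in events]

print(out_a == out_b)                    # the query results agree
print(op_a.state["a"], op_b.state["a"])  # the internal state does not
```

Here the first operator stores a sum where the second stores a mean, which is the kind of detail a user cannot see from the SQL query alone.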

Regarding the queryable state table sink: Yes, in the initial design the
result might be replicated in the state, but also might not. This depends
on the previous operation. We can later still add optimizations that unify
replicated state.

Best,
Fabian




RE: StreamSQL queriable state

Posted by Stefano Bortoli <st...@huawei.com>.
Hi Timo, Renjie,

Well, the idea is that stream processing could become a complex pipeline of multiple queries, and writing data to a separate sink just for monitoring does not seem efficient. In fact, pulling state values on demand would make it possible to monitor different parts of the stream processing pipeline without also having to deal with arrival/output rates at the monitoring level in a sink.

I'm aware it would require some modifications at the table level. However, wouldn't a separate sink duplicate part of the data with respect to the state and the output sink? In fact, you would also have to keep this data in state for stateful processing.

As you are anyway going to have a configuration parameter that is interpreted during StreamSQL query compilation, perhaps hooking the queryable state in as a parameter for the processing functions (and other streaming operators) would be easier and would not create any overhead (besides the queries to the state, of course).

My 2c.

Best,
Stefano


Re: StreamSQL queriable state

Posted by Renjie Liu <li...@gmail.com>.
Hi, Timo:
I've read your QueryableStateTableSink implementation, and it basically
implements what I want to do. I also want to extend the SQL client so that
users can run point queries against the table sink. Do we still need a design
doc for that? It seems that I just need to finish the remaining part and
test it.

Hi, Stefano:
Your requirement needs some changes to the Flink table implementation, but I
don't understand why you need that. For debugging? The operator state is
internal and subject to optimization logic, so I think it may be meaningless
to expose it.

On Fri, Mar 2, 2018 at 9:37 PM Stefano Bortoli <st...@huawei.com>
wrote:

> Hi Timo, Renjie,
>
> What I had in mind did not involve the QueryableStateTableSink, but
> rather tapping directly into the state of a streaming operator. Perhaps it
> is the same thing, but it just does not sound intuitive to consider it a
> sink.
>
> So, we would need a way to configure the environment for the query so that
> the "state name" is shared before the query is executed, and then use this
> name to create the hook for the queryable state in the operator. Perhaps we
> could extend the current codegen and operator implementations to take the
> StateDescriptor to be queried as a parameter.
>
> Looking forward to the design document; I will be happy to give you
> feedback.
>
> Best,
> Stefano
>
> -----Original Message-----
> From: Renjie Liu [mailto:liurenjie2008@gmail.com]
> Sent: Friday, March 02, 2018 11:42 AM
> To: dev@flink.apache.org
> Subject: Re: StreamSQL queriable state
>
> Great, thank you.
> I'll start by writing a design doc.
>
> On Fri, Mar 2, 2018 at 6:40 PM Timo Walther <tw...@apache.org> wrote:
>
> > I gave you contributor permissions in Jira. You should be able to
> > assign it to yourself now.
> >
> > On 3/2/18 at 11:33 AM, Renjie Liu wrote:
> > > Hi, Timo:
> > > It seems that I can't assign it to myself. Could you please assign it
> > > to me?
> > > My Jira username is liurenjie1024 and my email is
> > > liurenjie2008@gmail.com
> > >
> > > On Fri, Mar 2, 2018 at 6:24 PM Timo Walther <tw...@apache.org> wrote:
> > >
> > >> Hi Renjie,
> > >>
> > >> that would be great. There is already a Jira issue for it:
> > >> https://issues.apache.org/jira/browse/FLINK-6968
> > >>
> > >> Feel free to assign it to yourself. You can reuse parts of my code
> > >> if you want. But maybe it would make sense to have a little design
> > >> document first about what we want to support.
> > >>
> > >> Regards,
> > >> Timo
> > >>
> > >> On 3/2/18 at 11:10 AM, Renjie Liu wrote:
> > >>> Hi, Timo, I've been planning on the same thing and would like to
> > >>> contribute that.
> > >>>
> > >>> On Fri, Mar 2, 2018 at 6:05 PM Timo Walther <tw...@apache.org> wrote:
> > >>>
> > >>>> Hi Stefano,
> > >>>>
> > >>>> yes, there are plans in this direction. Actually, I already worked
> > >>>> on such a QueryableStateTableSink [1] in the past but never
> > >>>> finished it because of priority shifts. It would be great if
> > >>>> somebody wants to contribute this functionality :)
> > >>>>
> > >>>> Regards,
> > >>>> Timo
> > >>>>
> > >>>> [1] https://github.com/twalthr/flink/tree/QueryableTableSink
> > >>>>
> > >>>> On 3/2/18 at 10:58 AM, Stefano Bortoli wrote:
> > >>>>> Hi guys,
> > >>>>>
> > >>>>> I am checking out the queryable state API, and it seems that
> > >>>>> most of the tooling is already available. However, queryable
> > >>>>> state is available only in the streaming API, not at the
> > >>>>> StreamSQL API level. In principle, since flink-table is aware of
> > >>>>> the query semantics and the output data type, it should be
> > >>>>> possible to configure the query compilation to nest queryable
> > >>>>> state in the process/window functions. Are there any plans in
> > >>>>> this direction?
> > >>>>>
> > >>>>> Best,
> > >>>>> Stefano
> > >>>>>
> > >>> --
> > >>> Liu, Renjie
> > >>> Software Engineer, MVAD
> > >>>
> > > --
> > > Liu, Renjie
> > > Software Engineer, MVAD
> > >
> --
> Liu, Renjie
> Software Engineer, MVAD
>
-- 
Liu, Renjie
Software Engineer, MVAD

RE: StreamSQL queriable state

Posted by Stefano Bortoli <st...@huawei.com>.
Hi Timo, Renjie,

What I had in mind did not involve the QueryableStateTableSink, but rather tapping directly into the state of a streaming operator. Perhaps it is the same thing, but it does not sound intuitive to consider it a sink.

So, we would need a way to configure the environment for the query to share the "state name" before the query is executed, and then use this to create the hook for the queryable state in the operator. Perhaps extend the current codegen and operator implementations to take the StateDescriptor to be queried as a parameter.
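
The idea in the paragraph above can be sketched roughly as follows. This is only a plain-Java illustration with no Flink dependency, not the Flink API and not an agreed design: QueryableStateRegistry, CountingOperator, and the state name "clicks-per-user" are hypothetical names invented for this sketch. In Flink itself, the DataStream API exposes state by calling setQueryable(name) on a StateDescriptor; how the Table/SQL codegen would pass such a name down to its operators is exactly the open question here.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry: maps a pre-agreed state name to an operator's keyed state. */
final class QueryableStateRegistry {
    // state name -> (key -> current value)
    private final Map<String, Map<String, Long>> exposed = new ConcurrentHashMap<>();

    /** Called by the operator at setup time to publish its state under the shared name. */
    Map<String, Long> expose(String stateName) {
        return exposed.computeIfAbsent(stateName, n -> new ConcurrentHashMap<>());
    }

    /** Called by an external client: point lookup by state name and key. */
    Optional<Long> query(String stateName, String key) {
        Map<String, Long> state = exposed.get(stateName);
        return state == null ? Optional.empty() : Optional.ofNullable(state.get(key));
    }
}

/** Hypothetical stand-in for a generated operator computing a per-key count. */
final class CountingOperator {
    private final Map<String, Long> counts;

    // The state name is configured before the query runs and handed to the operator.
    CountingOperator(QueryableStateRegistry registry, String stateName) {
        this.counts = registry.expose(stateName);
    }

    /** Updates the internal state; the same map is what the client reads. */
    void processElement(String key) {
        counts.merge(key, 1L, Long::sum);
    }
}
```

A client that knows the shared name could then call registry.query("clicks-per-user", "alice") while the operator keeps processing elements. The sketch also shows the concern raised elsewhere in the thread: the client reads the operator's internal state directly, so any change to how the operator lays out that state changes what the client sees.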

Looking forward to the design document; I will be happy to give you feedback.

Best,
Stefano

-----Original Message-----
From: Renjie Liu [mailto:liurenjie2008@gmail.com] 
Sent: Friday, March 02, 2018 11:42 AM
To: dev@flink.apache.org
Subject: Re: StreamSQL queriable state

Great, thank you.
I'll start by writing a design doc.

On Fri, Mar 2, 2018 at 6:40 PM Timo Walther <tw...@apache.org> wrote:

> I gave you contributor permissions in Jira. You should be able to 
> assign it to yourself now.
>
> --
Liu, Renjie
Software Engineer, MVAD

Re: StreamSQL queriable state

Posted by Renjie Liu <li...@gmail.com>.
Great, thank you.
I'll start by writing a design doc.

On Fri, Mar 2, 2018 at 6:40 PM Timo Walther <tw...@apache.org> wrote:

> I gave you contributor permissions in Jira. You should be able to assign
> it to yourself now.
>
> --
Liu, Renjie
Software Engineer, MVAD

Re: StreamSQL queriable state

Posted by Timo Walther <tw...@apache.org>.
I gave you contributor permissions in Jira. You should be able to assign 
it to yourself now.

Am 3/2/18 um 11:33 AM schrieb Renjie Liu:
> Hi, Timo:
> It seems that I can't assign it to myself. Could you please help to assign
> that to me?
> My jira username is liurenjie1024 and my email is liurenjie2008@gmail.com
>


Re: StreamSQL queriable state

Posted by Renjie Liu <li...@gmail.com>.
Hi, Timo:
It seems that I can't assign it to myself. Could you please assign it to
me?
My Jira username is liurenjie1024 and my email is liurenjie2008@gmail.com

On Fri, Mar 2, 2018 at 6:24 PM Timo Walther <tw...@apache.org> wrote:

> Hi Renjie,
>
> that would be great. There is already a Jira issue for it:
> https://issues.apache.org/jira/browse/FLINK-6968
>
> Feel free to assign it to yourself. You can reuse parts of my code if
> you want. But maybe it would make sense to have a little design document
> first about what we want to support.
>
> Regards,
> Timo
>
> --
Liu, Renjie
Software Engineer, MVAD

Re: StreamSQL queriable state

Posted by Timo Walther <tw...@apache.org>.
Hi Renjie,

that would be great. There is already a Jira issue for it: 
https://issues.apache.org/jira/browse/FLINK-6968

Feel free to assign it to yourself. You can reuse parts of my code if 
you want. But maybe it would make sense to have a little design document 
first about what we want to support.

Regards,
Timo


Am 3/2/18 um 11:10 AM schrieb Renjie Liu:
> Hi, Timo, I've been planning on the same thing and would like to contribute
> that.
>


Re: StreamSQL queriable state

Posted by Renjie Liu <li...@gmail.com>.
Hi, Timo, I've been planning the same thing and would like to contribute
it.

On Fri, Mar 2, 2018 at 6:05 PM Timo Walther <tw...@apache.org> wrote:

> Hi Stefano,
>
> Yes, there are plans in this direction. Actually, I already worked on such
> a QueryableStateTableSink [1] in the past but never finished it because
> of priority shifts. It would be great if somebody wants to contribute this
> functionality :)
>
> Regards,
> Timo
>
> [1] https://github.com/twalthr/flink/tree/QueryableTableSink
>
> --
Liu, Renjie
Software Engineer, MVAD

Re: StreamSQL queriable state

Posted by Timo Walther <tw...@apache.org>.
Hi Stefano,

Yes, there are plans in this direction. Actually, I already worked on such
a QueryableStateTableSink [1] in the past but never finished it because
of priority shifts. It would be great if somebody wants to contribute this
functionality :)

Regards,
Timo

[1] https://github.com/twalthr/flink/tree/QueryableTableSink

Am 3/2/18 um 10:58 AM schrieb Stefano Bortoli:
> Hi guys,
>
> I am checking out the queryable state API, and it seems that most of the tooling is already available. However, queryable state is available only for the streaming API, not at the StreamSQL API level. In principle, since flink-table is aware of the query semantics and the output data type, it should be possible to configure the query compilation to nest queryable state in the process/window functions. Is there any plan in this direction?
>
> Best,
> Stefano
>