Posted to user@flink.apache.org by Padarn Wilson <pa...@gmail.com> on 2020/03/13 07:20:39 UTC

Implicit Flink Context Documentation

Hi Users,

I am trying to understand the details of how some aspects of Flink work.

While trying to understand `keyed state`, I kept coming up against the claim
that `there is a specific key implicitly in context`. I would like to
understand how this works, which I'm guessing means understanding the details
of the runtime context. Is there any documentation or FLIP someone can
recommend on this?

Re: Implicit Flink Context Documentation

Posted by Padarn Wilson <pa...@gmail.com>.
Thanks for the clarification. I'll dig in then!

On Mon, 16 Mar 2020, 3:47 pm Piotr Nowojski, <pi...@ververica.com> wrote:

> Hi,
>
> We do not maintain internal docs. We have design docs for newly proposed
> features (previously informal design docs published on the dev mailing
> list, and more recently FLIP documents [1]), but keyed state is such an old
> concept that I’m pretty sure it predates all of that. So you would have to
> dig through the code if you want to understand it.
>
> Piotrek
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals

Re: Implicit Flink Context Documentation

Posted by Piotr Nowojski <pi...@ververica.com>.
Hi,

We do not maintain internal docs. We have design docs for newly proposed features (previously informal design docs published on the dev mailing list, and more recently FLIP documents [1]), but keyed state is such an old concept that I’m pretty sure it predates all of that. So you would have to dig through the code if you want to understand it.

Piotrek

[1] https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals

> On 13 Mar 2020, at 16:14, Padarn Wilson <pa...@gmail.com> wrote:
> 
> Thanks Piotr,
> 
> Conceptually I understand (and use) keyed state quite a lot, but the implementation details are what I was looking for.
> 
> It looks like `org.apache.flink.streaming.api.operators.AbstractStreamOperator#setKeyContextElement1` is what I'm looking for, though. It would still be cool if there were some internals design doc. It is quite hard to dig through the code, as there is a lot tied to how the execution of the job actually happens.
> 
> Padarn


Re: Implicit Flink Context Documentation

Posted by Padarn Wilson <pa...@gmail.com>.
Thanks Piotr,

Conceptually I understand (and use) keyed state quite a lot, but the
implementation details are what I was looking for.

It looks like
`org.apache.flink.streaming.api.operators.AbstractStreamOperator#setKeyContextElement1`
is what I'm looking for, though. It would still be cool if there were some
internals design doc. It is quite hard to dig through the code, as there is
a lot tied to how the execution of the job actually happens.

Padarn

On Fri, Mar 13, 2020 at 9:43 PM Piotr Nowojski <pi...@ververica.com> wrote:

> Hi,
>
> Please take a look for example here:
>
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#keyed-state
> And the example in particular
>
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#using-managed-keyed-state
>
> The part about "there is a specific key implicitly in context" might be
> referring to the fact that, for every instance of `CountWindowAverage`
> running in the cluster, the user doesn’t have to set the key context
> explicitly. Flink sets the key context automatically for the
> `ValueState<Tuple2<Long, Long>> sum;` before every invocation of the
> `CountWindowAverage#flatMap` method.
>
> In other words, within one parallel instance of the `CountWindowAverage`
> function, two consecutive invocations of `CountWindowAverage#flatMap` can
> refer to different underlying values of the `CountWindowAverage#sum` field.
> For details, take a look at the
> `org.apache.flink.streaming.api.operators.AbstractStreamOperator#setKeyContextElement1`
> method and how it is used and implemented.
>
> I hope that helps.
>
> Piotrek

Re: Implicit Flink Context Documentation

Posted by Piotr Nowojski <pi...@ververica.com>.
Hi,

Please take a look for example here:
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#keyed-state
And the example in particular
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#using-managed-keyed-state

The part about "there is a specific key implicitly in context" might be referring to the fact that, for every instance of `CountWindowAverage` running in the cluster, the user doesn’t have to set the key context explicitly. Flink sets the key context automatically for the `ValueState<Tuple2<Long, Long>> sum;` before every invocation of the `CountWindowAverage#flatMap` method.

In other words, within one parallel instance of the `CountWindowAverage` function, two consecutive invocations of `CountWindowAverage#flatMap` can refer to different underlying values of the `CountWindowAverage#sum` field. For details, take a look at the `org.apache.flink.streaming.api.operators.AbstractStreamOperator#setKeyContextElement1` method and how it is used and implemented.
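
To make the idea concrete, here is a minimal plain-Java sketch of the mechanism described above. It is a toy model, not Flink code: `KeyContext`, `KeyedValueState`, and `Operator` are hypothetical names invented for illustration, and the real logic lives in `AbstractStreamOperator#setKeyContextElement1` and the state backend. The only point it tries to show is that the operator sets the current key before each call, so the same `sum` field resolves to a different value per key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KeyContextSketch {

    /** Hypothetical holder for the key of the record currently being processed. */
    static final class KeyContext {
        private Object currentKey;
        void setCurrentKey(Object key) { currentKey = key; }
        Object getCurrentKey() { return currentKey; }
    }

    /** Hypothetical value state: every read and write is scoped to the current key. */
    static final class KeyedValueState<T> {
        private final KeyContext context;
        private final Map<Object, T> valuesByKey = new HashMap<>();

        KeyedValueState(KeyContext context) { this.context = context; }

        T value() { return valuesByKey.get(context.getCurrentKey()); }
        void update(T value) { valuesByKey.put(context.getCurrentKey(), value); }
        void clear() { valuesByKey.remove(context.getCurrentKey()); }
    }

    /** User function: never passes a key when it touches its state. */
    static final class CountWindowAverage {
        private final KeyedValueState<long[]> sum; // {count, running sum}

        CountWindowAverage(KeyedValueState<long[]> sum) { this.sum = sum; }

        void flatMap(long key, long value, List<long[]> out) {
            long[] current = sum.value();
            if (current == null) {
                current = new long[] {0L, 0L};
            }
            current[0] += 1;
            current[1] += value;
            sum.update(current);
            // Emit the average once two values have been seen for this key.
            if (current[0] >= 2) {
                out.add(new long[] {key, current[1] / current[0]});
                sum.clear();
            }
        }
    }

    /** Toy operator: sets the key context before every invocation of the function. */
    static final class Operator {
        private final KeyContext context = new KeyContext();
        private final CountWindowAverage function =
                new CountWindowAverage(new KeyedValueState<>(context));

        void processElement(long key, long value, List<long[]> out) {
            context.setCurrentKey(key); // the "implicit" part, invisible to the user function
            function.flatMap(key, value, out);
        }
    }

    /** Feeds {key, value} records through one operator instance and collects the output. */
    static List<long[]> run(long[][] records) {
        Operator operator = new Operator();
        List<long[]> out = new ArrayList<>();
        for (long[] record : records) {
            operator.processElement(record[0], record[1], out);
        }
        return out;
    }

    public static void main(String[] args) {
        // Records for keys 1 and 2 are interleaved on the same operator instance.
        long[][] records = {{1, 3}, {2, 10}, {1, 5}, {2, 20}};
        for (long[] result : run(records)) {
            System.out.println("(" + result[0] + "," + result[1] + ")");
        }
    }
}
```

With interleaved records for keys 1 and 2, consecutive `flatMap` calls on the single function instance see different `sum` values, and the sketch emits one average per key, (1,4) and (2,15).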

I hope that helps.

Piotrek

> On 13 Mar 2020, at 08:20, Padarn Wilson <pa...@gmail.com> wrote:
> 
> Hi Users,
> 
> I am trying to understand the details of how some aspects of Flink work.
> 
> While trying to understand `keyed state`, I kept coming up against the claim that `there is a specific key implicitly in context`. I would like to understand how this works, which I'm guessing means understanding the details of the runtime context. Is there any documentation or FLIP someone can recommend on this?