Posted to user@flink.apache.org by Rex Fenley <Re...@remind101.com> on 2021/02/19 06:29:14 UTC

How is proctime represented?

Hello,

When using PROCTIME() in CREATE DDL for a source, is the proctime attribute
a timestamp generated at the time of row ingestion at the source and then
forwarded through the graph execution, or is the proctime attribute a
placeholder that says "fill me in with a timestamp" once it's being used
directly by some operator, by some machine?

Thanks!

-- 

Rex Fenley  |  Software Engineer - Mobile and Backend


Remind.com <https://www.remind.com/>  |  BLOG <http://blog.remind.com/>  |
FOLLOW US <https://twitter.com/remindhq>  |  LIKE US
<https://www.facebook.com/remindhq>

Re: How is proctime represented?

Posted by Rex Fenley <Re...@remind101.com>.
Thanks y'all, this is really helpful!

On Fri, Feb 19, 2021 at 2:40 AM Timo Walther <tw...@apache.org> wrote:

> Chesnay is right. PROCTIME() is lazily evaluated: it is executed only when its
> result is needed as an argument for another expression or function. So
> within the pipeline the column is NULL, but when you want to compute
> something, e.g. CAST(proctime AS TIMESTAMP(3)), it will be materialized
> into the row. If you want to use ingestion time, you should be able to use:
>
> CREATE TABLE (
>    ingest_ts AS CAST(PROCTIME() AS TIMESTAMP(3))
> )
>
> Regards,
> Timo

Re: How is proctime represented?

Posted by Timo Walther <tw...@apache.org>.
Chesnay is right. PROCTIME() is lazily evaluated: it is executed only when its
result is needed as an argument for another expression or function. So
within the pipeline the column is NULL, but when you want to compute
something, e.g. CAST(proctime AS TIMESTAMP(3)), it will be materialized
into the row. If you want to use ingestion time, you should be able to use:

CREATE TABLE (
   ingest_ts AS CAST(PROCTIME() AS TIMESTAMP(3))
)
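
For readers who want something to paste into a SQL client, here is a minimal sketch of the difference described above. The table name `events`, the column names, and the `datagen` connector are assumptions chosen for illustration; they are not from this thread. It assumes a Flink 1.12-era SQL setup like the one in the linked docs.

```sql
-- Hypothetical table: names and connector are illustrative assumptions.
CREATE TABLE events (
  user_id   BIGINT,
  -- Processing-time attribute. Per the explanation above, it is not stored
  -- in the row; an operator that needs it reads its own local clock.
  proc_ts   AS PROCTIME(),
  -- Casting forces evaluation, so a concrete TIMESTAMP(3) is written into
  -- the row where the computed column is evaluated. It then travels with
  -- the row and can serve as an approximation of ingestion time.
  ingest_ts AS CAST(PROCTIME() AS TIMESTAMP(3))
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '5'
);

-- The window operator assigns rows to windows using the clock of the
-- machine it runs on, at the moment each row reaches it.
SELECT
  TUMBLE_START(proc_ts, INTERVAL '10' SECOND) AS window_start,
  MIN(ingest_ts) AS earliest_ingest_ts,
  COUNT(*) AS row_count
FROM events
GROUP BY TUMBLE(proc_ts, INTERVAL '10' SECOND);
```

Note that after the cast `ingest_ts` is an ordinary timestamp column, so it can no longer serve as a time attribute for windowing; that is why the sketch keeps a separate `proc_ts` column.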

Regards,
Timo


On 19.02.21 10:23, Chesnay Schepler wrote:
> Hmm... I can now see where that uncertainty comes from.
> 
> My /impression/ is that PROCTIME is not evaluated eagerly, and that instead
> operators relying on this column generate their own processing
> timestamp. What throws me off is that I cannot tell how you would tell
> Flink to store a processing timestamp as is in a row (to essentially
> create something like ingestion time).
> 
> I'm looping in Timo to provide some clarity.


Re: How is proctime represented?

Posted by Chesnay Schepler <ch...@apache.org>.
Hmm... I can now see where that uncertainty comes from.

My /impression/ is that PROCTIME is not evaluated eagerly, and that instead
operators relying on this column generate their own processing
timestamp. What throws me off is that I cannot tell how you would tell
Flink to store a processing timestamp as is in a row (to essentially
create something like ingestion time).

I'm looping in Timo to provide some clarity.

On 2/19/2021 8:39 AM, Rex Fenley wrote:
> Reading the documentation you posted again after posting this
> question, it does sound like it's simply a placeholder that only gets
> filled in when used by an operator. Then again, that's still not
> exactly what it says, so I only feel 70% confident that that's what is
> happening.


Re: How is proctime represented?

Posted by Rex Fenley <Re...@remind101.com>.
Reading the documentation you posted again after posting this question, it
does sound like it's simply a placeholder that only gets filled in when
used by an operator. Then again, that's still not exactly what it says, so I
only feel 70% confident that that's what is happening.

On Thu, Feb 18, 2021 at 10:55 PM Chesnay Schepler <ch...@apache.org>
wrote:

> Could you check whether this answers your question?
>
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.12/concepts/timely-stream-processing.html#notions-of-time-event-time-and-processing-time

Re: How is proctime represented?

Posted by Chesnay Schepler <ch...@apache.org>.
Could you check whether this answers your question?

https://ci.apache.org/projects/flink/flink-docs-release-1.12/concepts/timely-stream-processing.html#notions-of-time-event-time-and-processing-time
