Posted to user@flink.apache.org by HG <ha...@gmail.com> on 2022/01/26 18:56:43 UTC
Unbounded streaming with table API and large json as one of the columns
Hi,
I need to calculate the elapsed time between the steps of a transaction.
Each step is an event. All steps belonging to a single transaction share
the same transaction id, and every event has a handling time.
All of this information is part of a large JSON structure, but I can have
the incoming source supply transactionId and handlingTime separately.
That would save me retrieving the windowing key (transactionId) and
handlingTime out of the nested JSON.
Basically I want to use the SQL API to do:

select transactionId
     , handlingTime - previousHandlingTime as elapsedTime
     , largeJSON
from (
    select transactionId
         , handlingTime
         , lag(handlingTime) over (partition by transactionId
                                   order by handlingTime) as previousHandlingTime
         , largeJSON
    from source
)

The largeJSON column can be about 100 KB per record.
Would this work?
Regards, Hans-Peter
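[Editor's note: the LAG-over-partition semantics of the query above can be sketched outside Flink in a few lines of plain Python. This is an illustrative sketch only; the field names and sample events below are hypothetical, not the actual source schema.]

```python
from collections import defaultdict

def elapsed_times(events):
    """For each event, compute handlingTime minus the previous handlingTime
    within the same transaction (None for a transaction's first event) --
    mirroring LAG(handlingTime) OVER (PARTITION BY transactionId
    ORDER BY handlingTime)."""
    # Group events by transaction id (the PARTITION BY key).
    by_txn = defaultdict(list)
    for e in events:
        by_txn[e["transactionId"]].append(e)
    results = []
    for txn_id, steps in by_txn.items():
        # Order each partition by handling time (the ORDER BY clause).
        steps.sort(key=lambda e: e["handlingTime"])
        prev = None
        for e in steps:
            elapsed = None if prev is None else e["handlingTime"] - prev
            results.append((txn_id, e["handlingTime"], elapsed))
            prev = e["handlingTime"]
    return results

# Hypothetical sample events (handlingTime as an integer timestamp).
events = [
    {"transactionId": "t1", "handlingTime": 100},
    {"transactionId": "t1", "handlingTime": 250},
    {"transactionId": "t2", "handlingTime": 300},
    {"transactionId": "t1", "handlingTime": 400},
]
print(elapsed_times(events))
# [('t1', 100, None), ('t1', 250, 150), ('t1', 400, 150), ('t2', 300, None)]
```

In the actual streaming job, Flink's OVER aggregation maintains this per-key state incrementally instead of sorting a finished batch; the large JSON column would simply ride along as an extra field in each row.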
Re: Unbounded streaming with table API and large json as one of the columns
Posted by HG <ha...@gmail.com>.
Thanks
On Fri, Jan 28, 2022, 07:47 Caizhi Weng <ts...@gmail.com> wrote:
> Hi!
>
> This job will work as long as your SQL statement is valid. Did you run
> into any difficulties, or is there something specific you are concerned
> about? A 100 KB record is somewhat large, but I've seen quite a lot of
> jobs with records of that size, so it is OK.
>
> HG <ha...@gmail.com> wrote on Thursday, January 27, 2022 at 02:57:
Re: Unbounded streaming with table API and large json as one of the columns
Posted by Caizhi Weng <ts...@gmail.com>.
Hi!
This job will work as long as your SQL statement is valid. Did you run
into any difficulties, or is there something specific you are concerned
about? A 100 KB record is somewhat large, but I've seen quite a lot of
jobs with records of that size, so it is OK.
HG <ha...@gmail.com> wrote on Thursday, January 27, 2022 at 02:57: