Posted to user@flink.apache.org by Ashwin Sinha <as...@go-mmt.com> on 2018/07/16 12:25:25 UTC

rowTime from json nested timestamp field in SQL-Client

Hi Users,

In the Flink 1.5 SQL Client
<https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/table/sqlClient.html>,
we are trying to define rowTime from a nested JSON element, but we are
struggling with the syntax.

JSON data format: https://pastebin.com/ByCLhEnF
YML table config: https://pastebin.com/cgEtQPDQ

In the above config, we want to use *payload.after.modifiedon* as the
rowTime column. We tried a SQL query <https://pastebin.com/yCU4WWhK> with an
aggregation on 'payload.after.modifiedon' as the time attribute, but we get
this <https://pastebin.com/bTbushYN> error.

Is there any way to register a nested timestamp field as rowTime for
the source table?

-- 
*Ashwin Sinha *| Data Engineer
ashwin.sinha@go-mmt.com <sh...@go-mmt.com> | 9452075361
<https://www.makemytrip.com/> <https://www.goibibo.com/>
<https://www.redbus.in/>

-- 


::DISCLAIMER::


----------------------------------------------------------------------------------------------------------------------------------------------------





This message is intended only for the use of the addressee and may 
contain information that is privileged, confidential and exempt from 
disclosure under applicable law. If the reader of this message is not the 
intended recipient, or the employee or agent responsible for delivering the 
message to the intended recipient, you are hereby notified that any 
dissemination, distribution or copying of this communication is strictly 
prohibited. If you have received this e-mail in error, please notify us 
immediately by return e-mail and delete this e-mail and all attachments 
from your system.

Re: rowTime from json nested timestamp field in SQL-Client

Posted by Timo Walther <tw...@apache.org>.
Hi Ashwin,

if you want to make this work quickly, you can look into 
org.apache.flink.table.descriptors.RowtimeValidator#getRowtimeComponents.

This is the component that converts the string property into an 
org.apache.flink.table.sources.tsextractors.TimestampExtractor. You can 
implement your own custom timestamp extractor that performs the logic you need.

Regards,
Timo
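
For illustration, the core logic such a custom extractor has to perform can
be sketched outside Flink. The Python below is not the Flink
TimestampExtractor API; it only shows resolving the dotted path
payload.after.modifiedon in a parsed JSON record and converting the value to
epoch milliseconds. The function name, the sample record, and the accepted
value formats (epoch milliseconds or an ISO-8601 string) are assumptions,
since the actual data sits behind the pastebin link.

```python
import json
from datetime import datetime, timezone


def extract_rowtime_millis(record, path="payload.after.modifiedon"):
    """Walk a dotted path into a nested dict and return epoch milliseconds."""
    value = record
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            raise KeyError(f"missing field '{key}' while resolving '{path}'")
        value = value[key]

    # Assumption: the field holds epoch milliseconds or an ISO-8601 string.
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    if isinstance(value, str):
        parsed = datetime.fromisoformat(value)
        if parsed.tzinfo is None:
            # Treat timestamps without an offset as UTC (an assumption).
            parsed = parsed.replace(tzinfo=timezone.utc)
        return int(parsed.timestamp() * 1000)
    raise TypeError(f"unsupported timestamp value: {value!r}")


# Hypothetical record; the real layout is in the linked pastebin.
sample = json.loads(
    '{"payload": {"after": {"id": 7, "modifiedon": "2018-07-16T12:25:25"}}}'
)
print(extract_rowtime_millis(sample))
```

A Flink-side extractor would apply the same two steps, path resolution and
conversion to a long timestamp, inside whatever class the fork registers in
getRowtimeComponents.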


On 17.07.18 at 14:37, Ashwin Sinha wrote:
> Hi Timo,
>
> We want to add this functionality in a forked branch. Can you guide us 
> through the steps to quickly apply a patch for it?

Re: rowTime from json nested timestamp field in SQL-Client

Posted by Ashwin Sinha <as...@go-mmt.com>.
Hi Timo,

We want to add this functionality in a forked branch. Can you guide us
through the steps to quickly apply a patch for it?

On Mon, Jul 16, 2018 at 9:06 PM Ashwin Sinha <as...@go-mmt.com>
wrote:

> Thanks for the clarification, Timo, but our processing also involves
> aggregations over large amounts of historical data, which processing time
> cannot serve.
>
> Is this a WIP feature?


Re: rowTime from json nested timestamp field in SQL-Client

Posted by Ashwin Sinha <as...@go-mmt.com>.
Thanks for the clarification, Timo, but our processing also involves
aggregations over large amounts of historical data, which processing time
cannot serve.

Is this a WIP feature?
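
To illustrate why processing time does not fit aggregations over historical
data, here is a small sketch. It assumes one-minute tumbling windows and a
hypothetical replay in which three old records are read within the same
second. Windows keyed on the record's own timestamp keep the records apart,
while windows keyed on the read time lump them together. The field name
ingested, the window size, and all values are invented for illustration;
this is plain Python, not Flink code.

```python
from collections import Counter

WINDOW_MS = 60_000  # one-minute tumbling windows (illustrative size)


def window_start(timestamp_ms, size_ms=WINDOW_MS):
    """Return the start of the tumbling window that contains timestamp_ms."""
    return timestamp_ms - (timestamp_ms % size_ms)


def count_per_window(events, time_field):
    """Count events per tumbling window, keyed by the chosen time field."""
    return Counter(window_start(event[time_field]) for event in events)


# Hypothetical replay: records modified minutes apart, all read in one second.
events = [
    {"modifiedon": 0, "ingested": 600_000_100},
    {"modifiedon": 60_000, "ingested": 600_000_200},
    {"modifiedon": 125_000, "ingested": 600_000_300},
]

by_event_time = count_per_window(events, "modifiedon")
by_processing_time = count_per_window(events, "ingested")

print("event time:     ", dict(by_event_time))
print("processing time:", dict(by_processing_time))
```

With event time, each record lands in its own window; with processing time,
all three share a single window, so the aggregate no longer reflects when
the rows were actually modified.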

On Mon, Jul 16, 2018 at 7:29 PM Timo Walther <tw...@apache.org> wrote:

> Hi Ashwin,
>
> the SQL Client is in an early development stage right now and has some
> limitations. Your problem is one of them. I filed an issue for this:
> https://issues.apache.org/jira/browse/FLINK-9864
>
> There is no easy solution to fix this problem. Maybe you can use
> processing-time for your windows?
>
> Regards,
> Timo


Re: rowTime from json nested timestamp field in SQL-Client

Posted by Timo Walther <tw...@apache.org>.
Hi Ashwin,

the SQL Client is in an early development stage right now and has some 
limitations. Your problem is one of them. I filed an issue for this: 
https://issues.apache.org/jira/browse/FLINK-9864

There is no easy solution to fix this problem. Maybe you can use 
processing-time for your windows?

Regards,
Timo

On 16.07.18 at 14:25, Ashwin Sinha wrote:
> Hi Users,
>
> In the Flink 1.5 SQL Client 
> <https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/table/sqlClient.html>, 
> we are trying to define rowTime from a nested JSON element, but we are 
> struggling with the syntax.
>
> JSON data format: https://pastebin.com/ByCLhEnF
> YML table config: https://pastebin.com/cgEtQPDQ
>
> In the above config, we want to use *payload.after.modifiedon* as the 
> rowTime column. We tried a SQL query <https://pastebin.com/yCU4WWhK> 
> with an aggregation on 'payload.after.modifiedon' as the time attribute, 
> but we get this <https://pastebin.com/bTbushYN> error.
>
> Is there any way to register a nested timestamp field as rowTime for 
> the source table?