Posted to user-zh@flink.apache.org by guanyq <dl...@163.com> on 2022/03/05 01:46:46 UTC

flink1.14.0 temporal join hive

A question about cache refreshing when joining a Kafka real-time stream against the latest partition of a Hive table.


'streaming-source.monitor-interval'='12 h'
My understanding of this option is: counting from the job's start time, the latest partition is re-read every 12 hours, is that right?
A further question: if, in the interval between two reads of the latest partition, records arrive on the real-time stream that should join against a newer partition, does that mean those records are effectively still joined against the previously loaded latest partition?


SET table.sql-dialect=hive;
CREATE TABLE dimension_table (
  product_id STRING,
  product_name STRING,
  unit_price DECIMAL(10, 4),
  pv_count BIGINT,
  like_count BIGINT,
  comment_count BIGINT,
  update_time TIMESTAMP(3),
  update_user STRING,
  ...
) PARTITIONED BY (pt_year STRING, pt_month STRING, pt_day STRING) TBLPROPERTIES (
  -- using default partition-name order to load the latest partition every 12h (the most recommended and convenient way)
  'streaming-source.enable' = 'true',
  'streaming-source.partition.include' = 'latest',
  'streaming-source.monitor-interval' = '12 h',
  'streaming-source.partition-order' = 'partition-name',  -- option with default value, can be ignored.
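(For reference, a dimension table declared like the one above is normally consumed on the probe side with a processing-time temporal join. A minimal sketch follows; the probe table orders_table, its columns, and its proctime attribute are assumptions for illustration, not taken from this thread.)

```sql
-- Probe side: switch back to the default dialect for the streaming query.
SET table.sql-dialect=default;

-- orders_table and o.proctime are hypothetical; each order row is joined
-- against the latest-loaded Hive partition as of its processing time.
SELECT o.order_id, o.amount, dim.product_name, dim.unit_price
FROM orders_table AS o
JOIN dimension_table FOR SYSTEM_TIME AS OF o.proctime AS dim
  ON o.product_id = dim.product_id;
```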

Re: flink1.14.0 temporal join hive

Posted by mack143 <ma...@163.com>.
Unsubscribe
On 2022-03-05 09:46:46, "guanyq" <dl...@163.com> wrote: