Posted to issues@flink.apache.org by "Leonard Xu (Jira)" <ji...@apache.org> on 2020/10/22 13:56:00 UTC
[jira] [Commented] (FLINK-19644) Support read specific partition of Hive table in temporal join
[ https://issues.apache.org/jira/browse/FLINK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219021#comment-17219021 ]
Leonard Xu commented on FLINK-19644:
------------------------------------
According to my investigation, a major use case is loading the latest partition of a Hive table as a dimension table, where the
Hive table is also updated by a batch pipeline once per day.
So I'd like to use the following cases to specify the partition:
// case 1: always reload a specific partition
'lookup.join.partition' = 'pt_year=2020;pt_month=09;pt_day=15',
// case 2: load the latest partition on the time partition key, and a specific partition on the other partition keys
'lookup.join.partition' = 'pt_area=china;pt_day=max_partition',
// case 3: always reload the latest partition on every partition key
'lookup.join.partition' = 'pt_month=max_partition();pt_day=max_partition'
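To make the intended semantics of the three cases concrete, here is a minimal Python sketch of how such a spec could be resolved against the partitions that currently exist. It is only an illustration under assumptions: the function names are hypothetical, `max_partition` and `max_partition()` are treated as the same marker, keys are resolved left to right, and "latest" is taken to mean the lexicographically largest value (which works for zero-padded dates). None of this is an existing Flink API.

```python
from typing import Dict, List

# Assumption: both spellings used in the cases above mean "pick the latest value".
MAX_MARKERS = {"max_partition", "max_partition()"}


def parse_partition_spec(spec: str) -> Dict[str, str]:
    """Split 'k1=v1;k2=v2' into a dict of partition key -> value (order preserved)."""
    result: Dict[str, str] = {}
    for item in spec.split(";"):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not key or not value:
            raise ValueError(f"malformed partition entry: {item!r}")
        result[key] = value
    return result


def resolve_partition(spec: str, partitions: List[Dict[str, str]]) -> Dict[str, str]:
    """Replace each max marker with the largest value among matching partitions.

    Keys are resolved left to right; each resolved key narrows the candidate
    partitions used for the keys that follow it.
    """
    resolved: Dict[str, str] = {}
    candidates = list(partitions)
    for key, value in parse_partition_spec(spec).items():
        if value in MAX_MARKERS:
            if not candidates:
                raise LookupError(f"no partition available to resolve {key!r}")
            # Lexicographic max; assumes zero-padded, sortable values.
            value = max(p[key] for p in candidates)
        resolved[key] = value
        candidates = [p for p in candidates if p.get(key) == value]
    return resolved
```

With partitions `(china, 2020-10-20)`, `(china, 2020-10-21)` and `(us, 2020-10-22)`, the case 2 spec `pt_area=china;pt_day=max_partition` resolves to `pt_area=china, pt_day=2020-10-21` in this sketch. The `us` partition is ignored because the fixed key is applied before the latest day is picked.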
[~lirui] [~lzljs3620320] You're familiar with the Hive ecosystem. Do you have any insights?
> Support read specific partition of Hive table in temporal join
> --------------------------------------------------------------
>
> Key: FLINK-19644
> URL: https://issues.apache.org/jira/browse/FLINK-19644
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / Hive, Table SQL / Ecosystem
> Reporter: Leonard Xu
> Assignee: Leonard Xu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.12.0
>
>
> It's a common case to use a Hive partitioned table as a dimension table.
> Currently, a Hive table only supports loading all data. It would be helpful if we could support reading a user-specified partition in a temporal table join.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)