Posted to issues@flink.apache.org by "Leonard Xu (Jira)" <ji...@apache.org> on 2020/10/23 03:12:00 UTC
[jira] [Comment Edited] (FLINK-19644) Support read specific partition of Hive table in temporal join
[ https://issues.apache.org/jira/browse/FLINK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219021#comment-17219021 ]
Leonard Xu edited comment on FLINK-19644 at 10/23/20, 3:11 AM:
---------------------------------------------------------------
According to my investigation, a major use case is to load the latest partition of a Hive table as a dimension table, where the
Hive table is also updated by a daily batch pipeline.
So I'd like to use the following cases to specify the partition:
// case 1 : always reload specific partition
'lookup.join.partition' = 'pt_year=2020;pt_month=09;pt_day=15',
// case 2: load latest partition on time partition key, use specific partition on other partition key:
'lookup.join.partition' = 'pt_area=china;pt_day=max_partition()',
// case 3: always reload the latest partition:
'lookup.join.partition' = 'pt_month=max_partition();pt_day=max_partition()'
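To make the intended semantics of the three cases concrete, here is a minimal sketch of how such a spec string might be resolved against the partitions that currently exist. It is not Flink code. The function names, the left-to-right resolution order, and the plain string comparison used to pick the "max" value are all assumptions for illustration; the proposal above does not define them.

```python
from typing import Dict, List

# Marker taken from the proposal above; everything else here is hypothetical.
MAX_PARTITION = "max_partition()"


def parse_spec(spec: str) -> Dict[str, str]:
    """Split 'k1=v1;k2=v2' into a dict of partition key -> value (order kept)."""
    result: Dict[str, str] = {}
    for part in spec.split(";"):
        key, sep, value = part.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"malformed partition spec entry: {part!r}")
        result[key.strip()] = value.strip()
    return result


def resolve(spec: str, partitions: List[Dict[str, str]]) -> Dict[str, str]:
    """Resolve a spec against existing partitions.

    Keys are handled left to right (an assumption). A fixed value filters
    the candidate partitions; max_partition() picks the largest value among
    the remaining candidates, compared as strings (also an assumption).
    """
    candidates = list(partitions)
    resolved: Dict[str, str] = {}
    for key, value in parse_spec(spec).items():
        if value == MAX_PARTITION:
            values = [p[key] for p in candidates if key in p]
            if not values:
                raise ValueError(f"no partition available for key {key!r}")
            value = max(values)
        candidates = [p for p in candidates if p.get(key) == value]
        resolved[key] = value
    if not candidates:
        raise ValueError(f"no existing partition matches {spec!r}")
    return resolved


# Example partitions, invented for this sketch.
existing = [
    {"pt_area": "china", "pt_month": "09", "pt_day": "14"},
    {"pt_area": "china", "pt_month": "09", "pt_day": "15"},
    {"pt_area": "us", "pt_month": "10", "pt_day": "01"},
]

case2 = resolve("pt_area=china;pt_day=max_partition()", existing)
case3 = resolve("pt_month=max_partition();pt_day=max_partition()", existing)
```

In this sketch, case 2 first narrows the candidates to pt_area=china and then takes the largest pt_day within that area, while case 3 takes the largest pt_month and then the largest pt_day within that month. Whether max_partition() should be resolved per key like this, or over the whole partition name at once, is exactly the kind of detail the option would need to specify.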
[~lirui] [~lzljs3620320] You're familiar with the Hive ecosystem. Do you have any insights?
was (Author: leonard xu):
According to my investigation, a major use case is to load the latest partition of a Hive table as a dimension table, where the
Hive table is also updated by a daily batch pipeline.
So I'd like to use the following cases to specify the partition:
// case 1 : always reload specific partition
'lookup.join.partition' = 'pt_year=2020;pt_month=09;pt_day=15',
// case 2: load latest partition on time partition key, use specific partition on other partition key:
'lookup.join.partition' = 'pt_area=china;pt_day=max_partition',
// case 3: always reload latest specific partition:
'lookup.join.partition' = 'pt_month=max_partition();pt_day=max_partition'
[~lirui] [~lzljs3620320] You're familiar with the Hive ecosystem. Do you have any insights?
> Support read specific partition of Hive table in temporal join
> --------------------------------------------------------------
>
> Key: FLINK-19644
> URL: https://issues.apache.org/jira/browse/FLINK-19644
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / Hive, Table SQL / Ecosystem
> Reporter: Leonard Xu
> Assignee: Leonard Xu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.12.0
>
>
> It's common to use a partitioned Hive table as a dimension table.
> Currently, a Hive table can only be loaded in full. It would be helpful to support reading a user-specified partition in a temporal table join.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)