Posted to issues@flink.apache.org by "Leonard Xu (Jira)" <ji...@apache.org> on 2020/10/23 03:12:00 UTC

[jira] [Comment Edited] (FLINK-19644) Support read specific partition of Hive table in temporal join

    [ https://issues.apache.org/jira/browse/FLINK-19644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219021#comment-17219021 ] 

Leonard Xu edited comment on FLINK-19644 at 10/23/20, 3:11 AM:
---------------------------------------------------------------

According to my investigation, a major use case is to load the latest partition of a Hive table as a dimension table, where the Hive table is also updated by a batch pipeline once per day.

 

So I'd like to use the following cases to specify the partition:

// case 1: always reload a specific partition

'lookup.join.partition' = 'pt_year=2020;pt_month=09;pt_day=15',

// case 2: load the latest partition on the time partition key, and use a specific value for the other partition key:

'lookup.join.partition' = 'pt_area=china;pt_day=max_partition()',

// case 3: always reload the latest partition on every partition key:

'lookup.join.partition' = 'pt_month=max_partition();pt_day=max_partition()'
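
To make the three cases concrete, here is a minimal Python sketch of how such a spec string could be parsed and resolved against a list of existing partitions. It is not Flink's implementation. The helper names `parse_partition_spec` and `resolve_partition` are hypothetical, and the semantics are an assumption: fixed keys filter the partitions first, then each `max_partition()` key takes the largest remaining value by plain string comparison, in the order the keys appear. The sketch accepts the marker with or without parentheses.

```python
from typing import Dict, List

# Assumption: both spellings are treated as the "latest partition" marker.
MAX_MARKERS = {"max_partition()", "max_partition"}


def parse_partition_spec(spec: str) -> Dict[str, str]:
    """Split 'k1=v1;k2=v2' into a {key: value} mapping, preserving key order."""
    result: Dict[str, str] = {}
    for item in spec.split(";"):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not key or not value:
            raise ValueError(f"malformed partition entry: {item!r}")
        result[key] = value
    return result


def resolve_partition(spec: str, partitions: List[Dict[str, str]]) -> Dict[str, str]:
    """Resolve a spec to concrete values for the partition keys it names."""
    wanted = parse_partition_spec(spec)

    # Step 1: keep only partitions that match every fixed (non-marker) key.
    candidates = [
        p for p in partitions
        if all(p.get(k) == v for k, v in wanted.items() if v not in MAX_MARKERS)
    ]

    # Step 2: for each marker key, in spec order, keep the largest value.
    # Values are compared as strings, so they should be zero-padded.
    for key, value in wanted.items():
        if value not in MAX_MARKERS:
            continue
        candidates = [p for p in candidates if key in p]
        if not candidates:
            break
        latest = max(p[key] for p in candidates)
        candidates = [p for p in candidates if p[key] == latest]

    if not candidates:
        raise LookupError(f"no partition matches spec: {spec!r}")

    # All remaining candidates agree on every key named in the spec.
    return {k: candidates[0][k] for k in wanted}
```

For example, with partitions `pt_area=china/pt_day=2020-09-14`, `pt_area=china/pt_day=2020-09-15`, and `pt_area=us/pt_day=2020-09-16`, the case 2 spec would resolve to `pt_area=china, pt_day=2020-09-15` under these assumptions, because the `us` partition is filtered out before the maximum is taken.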

 

[~lirui] [~lzljs3620320] You're familiar with the Hive ecosystem. Do you have any insights?

 


was (Author: leonard xu):
According to my investigation, a major use case is to load the latest partition of a Hive table as a dimension table, where the Hive table is also updated by a batch pipeline once per day.

 

So I'd like to use the following cases to specify the partition:

// case 1: always reload a specific partition

'lookup.join.partition' = 'pt_year=2020;pt_month=09;pt_day=15',

// case 2: load the latest partition on the time partition key, and use a specific value for the other partition key:

'lookup.join.partition' = 'pt_area=china;pt_day=max_partition',

// case 3: always reload the latest partition on every partition key:

'lookup.join.partition'  = 'pt_month=max_partition();pt_day=max_partition'

 

[~lirui] [~lzljs3620320] You're familiar with the Hive ecosystem. Do you have any insights?

 

> Support read specific partition of Hive table in temporal join
> --------------------------------------------------------------
>
>                 Key: FLINK-19644
>                 URL: https://issues.apache.org/jira/browse/FLINK-19644
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Hive, Table SQL / Ecosystem
>            Reporter: Leonard Xu
>            Assignee: Leonard Xu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.12.0
>
>
> It's a common case to use a partitioned Hive table as a dimension table.
> Currently a Hive table only supports loading all data. It would be helpful if we could support reading a user-specified partition in a temporal table join.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)