Posted to dev@hive.apache.org by "mahesh kumar behera (JIRA)" <ji...@apache.org> on 2019/05/22 03:39:00 UTC
[jira] [Created] (HIVE-21773) Supporting external table replication with partition filter.
mahesh kumar behera created HIVE-21773:
------------------------------------------
Summary: Supporting external table replication with partition filter.
Key: HIVE-21773
URL: https://issues.apache.org/jira/browse/HIVE-21773
Project: Hive
Issue Type: Sub-task
Components: HiveServer2, repl
Affects Versions: 4.0.0
Reporter: mahesh kumar behera
Assignee: mahesh kumar behera
Fix For: 4.0.0
Hive replicates external tables differently from managed tables. For an external table, a list is built of the table and partition locations to be replicated. A partition location that lies within the table location is not added to the list, since copying the table location already covers it. A partition location outside the table location is added to the list. During an incremental dump, the data-related events are ignored and only the metadata-related events are dumped. The list of locations is prepared and used for replication. During load, the events are replayed and then the distcp tasks are created, one for each location in the list.
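The location-list rule described above can be sketched as follows. This is a minimal illustration, not Hive's actual implementation: the function names `is_within` and `external_table_locations` are hypothetical, and locations are treated as plain path strings.

```python
from typing import Iterable, List


def is_within(child: str, parent: str) -> bool:
    """Return True if `child` equals `parent` or lies beneath it."""
    parent = parent.rstrip("/")
    child = child.rstrip("/")
    return child == parent or child.startswith(parent + "/")


def external_table_locations(table_location: str,
                             partition_locations: Iterable[str]) -> List[str]:
    """Collect the locations to copy for one external table (no partition filter)."""
    locations = [table_location]
    for loc in partition_locations:
        # A partition under the table directory is already covered by the
        # table copy, so only partitions outside it get their own entry.
        if not is_within(loc, table_location) and loc not in locations:
            locations.append(loc)
    return locations
```

Under this sketch, each entry of the returned list would correspond to one distcp task at load time.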
For partition-level replication, not all partitions will be present in the dump. So even if the partition locations are within the table location, each partition location will be added to the list.
* If a where condition is present in the REPL DUMP command, add the location of each partition that satisfies it, even if the partition location is within the table location.
* If the table is not mentioned in the where clause, follow the older behavior.
* If the table is mentioned with a key that does not match any of its partition columns, fail the REPL DUMP.
* If the table is mentioned with a key, add the location of each partition even when all partitions satisfy the filter condition. This avoids copying partitions that are added using ALTER after the dump.
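The rules above could be expressed roughly as in the sketch below. It is only an illustration under stated assumptions: `Partition`, `locations_to_copy`, and the `where` argument (a key plus a predicate on that key's value) are hypothetical names, not part of Hive. It also assumes that the table location itself is left out of the list when a filter applies, which is how the last rule appears to avoid copying partitions added later.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Partition:
    values: Dict[str, str]  # partition column -> value (hypothetical model)
    location: str


def _is_within(child: str, parent: str) -> bool:
    parent = parent.rstrip("/")
    child = child.rstrip("/")
    return child == parent or child.startswith(parent + "/")


def locations_to_copy(table_location: str,
                      partition_columns: List[str],
                      partitions: List[Partition],
                      where: Optional[Tuple[str, Callable[[str], bool]]] = None
                      ) -> List[str]:
    """Return the locations to replicate for one external table."""
    if where is None:
        # Table not mentioned in the where clause: older behavior, i.e. the
        # table location plus any partition location outside it.
        outside = [p.location for p in partitions
                   if not _is_within(p.location, table_location)]
        return [table_location] + outside

    key, predicate = where
    if key not in partition_columns:
        # The key does not match any partition column: the dump fails.
        raise ValueError(f"'{key}' is not a partition column")

    # Filter present: list every satisfying partition individually, even
    # those inside the table location and even if all partitions match.
    return [p.location for p in partitions if predicate(p.values[key])]
```

Listing partitions one by one, rather than the enclosing table directory, means a partition added after the dump is not picked up by the copy.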
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)