Posted to dev@hive.apache.org by "Sankar Hariappan (JIRA)" <ji...@apache.org> on 2019/05/21 07:37:00 UTC

[jira] [Created] (HIVE-21763) Incremental replication to allow changing include/exclude tables list in replication policy.

Sankar Hariappan created HIVE-21763:
---------------------------------------

             Summary: Incremental replication to allow changing include/exclude tables list in replication policy.
                 Key: HIVE-21763
                 URL: https://issues.apache.org/jira/browse/HIVE-21763
             Project: Hive
          Issue Type: Sub-task
          Components: repl
            Reporter: Sankar Hariappan
            Assignee: Sankar Hariappan


- REPL DUMP takes 2 inputs along with the existing FROM and WITH clauses.
{code}
- REPL DUMP <current_repl_policy> [REPLACE <previous_repl_policy>] FROM <last_repl_id> WITH <key_values_list>;
- current_repl_policy and previous_repl_policy can be in any format mentioned in Point-4.
- The REPLACE clause is to be supported to take the previous repl policy as input. If the REPLACE clause is absent, the policy remains unchanged.
- The rest of the format remains the same.
{code}
- Now, REPL DUMP on this DB will replicate the tables based on current_repl_policy.
- Any table that is added dynamically, either due to a change in the regular expression or by being added to the include list, should be bootstrapped using an independent table-level replication policy.
{code}
- Hive will automatically figure out the list of newly included tables by comparing the current_repl_policy and previous_repl_policy inputs, and will combine the bootstrap dump for the added tables into the incremental dump. A "_bootstrap" directory can be created in the dump dir to accommodate all tables to be bootstrapped.
- If a table is renamed, it may get dynamically added to or removed from replication based on the defined replication policy and include/exclude list. So, Hive will perform bootstrap for a table that becomes included only after the rename.
{code}
- REPL LOAD on an incremental dump should check for the "_bootstrap" directory, perform bootstrap load on those tables first, and then continue with the incremental load based on the events directories.
- REPL LOAD should check for changes in the repl policy and drop the tables/views excluded by the new policy compared to the previous policy.
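
The policy comparison described above could look roughly like the following. This is only an illustrative sketch in Python, not Hive code: it assumes a policy can be modeled as an include regex plus an optional exclude regex over table names, and the names ReplPolicy and diff_policies are hypothetical. The actual policy format is the one referred to in Point-4.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class ReplPolicy:
    """Hypothetical model of a replication policy (assumption, not Hive's class)."""
    include: str = ".*"            # regex for tables to include
    exclude: Optional[str] = None  # regex for tables to exclude, if any

    def matches(self, table: str) -> bool:
        # A table is replicated if it matches the include regex
        # and does not match the exclude regex.
        if re.fullmatch(self.include, table) is None:
            return False
        return self.exclude is None or re.fullmatch(self.exclude, table) is None


def diff_policies(
    tables: Iterable[str], current: ReplPolicy, previous: ReplPolicy
) -> Tuple[List[str], List[str]]:
    """Return (tables to bootstrap, tables to drop) for a policy change."""
    names = list(tables)
    # Newly included tables: covered by the current policy only.
    to_bootstrap = sorted(
        t for t in names if current.matches(t) and not previous.matches(t)
    )
    # Newly excluded tables: covered by the previous policy only.
    to_drop = sorted(
        t for t in names if previous.matches(t) and not current.matches(t)
    )
    return to_bootstrap, to_drop


# Example with made-up table names.
previous = ReplPolicy(include="t1|t2")
current = ReplPolicy(include="t[2-4]", exclude="t4")
to_bootstrap, to_drop = diff_policies(["t1", "t2", "t3", "t4"], current, previous)
print(to_bootstrap, to_drop)  # prints ['t3'] ['t1']
```

In this sketch, t3 would go under the "_bootstrap" directory during REPL DUMP, and t1 would be dropped on the target during REPL LOAD. A table that is covered by both policies (t2 here) continues with plain incremental replication.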



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)