Posted to issues@hive.apache.org by "Ashutosh Bapat (JIRA)" <ji...@apache.org> on 2019/07/05 10:36:00 UTC

[jira] [Commented] (HIVE-21893) Handle concurrent write + drop when ACID tables are getting bootstrapped.

    [ https://issues.apache.org/jira/browse/HIVE-21893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879131#comment-16879131 ] 

Ashutosh Bapat commented on HIVE-21893:
---------------------------------------

[~sankarh], these two issues can happen even in the case of a normal bootstrap for a new policy, not just for the bootstrap triggered during the incremental phase. Anyway, here's my analysis of the problematic cases.

The key point here is the following comment in org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask#getValidTxnListForReplDump()

 
{code:java}
// Key design point for REPL DUMP is to not have any txns older than current txn in which
// dump runs. This is needed to ensure that Repl dump doesn't copy any data files written by
// any open txns mainly for streaming ingest case where one delta file shall have data from
// committed/aborted/open txns. It may also have data inconsistency if the on-going txns
// doesn't have corresponding open/write events captured which means, catch-up incremental
// phase won't be able to replicate those txns. So, the logic is to wait for the given amount
// of time to see if all open txns < current txn is getting aborted/committed. If not, then
// we forcefully abort those txns just like AcidHouseKeeperService.{code}
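The wait-then-abort behaviour described in that comment can be sketched roughly as follows (a toy model; the class and method names here are hypothetical, not actual Hive code):

```java
import java.util.*;

// Minimal sketch of the logic in the comment above: poll for open txns
// older than the dump's own txn, and once the wait budget is exhausted,
// forcefully abort whatever is still open (like AcidHouseKeeperService).
class OpenTxnGate {
    private final Set<Long> openTxns = new TreeSet<>();

    void open(long txnId) { openTxns.add(txnId); }
    void commit(long txnId) { openTxns.remove(txnId); }

    // Returns the txns that had to be forcefully aborted.
    List<Long> waitAndAbortOlderThan(long currentTxn, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            if (olderThan(currentTxn).isEmpty()) return Collections.emptyList();
            // In Hive this would be a timed sleep between polls.
        }
        List<Long> leftOver = olderThan(currentTxn);
        leftOver.forEach(openTxns::remove); // forceful abort
        return leftOver;
    }

    private List<Long> olderThan(long currentTxn) {
        List<Long> out = new ArrayList<>();
        for (long t : openTxns) if (t < currentTxn) out.add(t);
        return out;
    }
}
```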
 

 Case 1
{quote}If Step-11 happens between Step-1 and Step-2. Also, Step-13 completes before we forcefully abort Tx2 from REPL DUMP thread T1. Also, assume Step-14 is done after bootstrap is completed. In this case, bootstrap would replicate the data/writeId written by Tx2. But, the next incremental cycle would also replicate the open_txn, allocate_writeid and commit_txn events which would duplicate the data.
{quote}
If step-11 happens between step-1 and step-2, that itself can cause multiple problems, as the open transaction event is replayed twice (once during bootstrap and once during the next incremental), causing the writeIds on the target to go out of sync with the source. A better solution would be to combine setLastReplIdForDump() and openTransaction() in Driver.compile() for the REPL DUMP case: let openTransaction() return the eventId of the REPL DUMP's own open transaction event, and set that eventId as lastReplIdForDump. The next incremental dump then starts from the events following this open transaction event.
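The proposed combination of the two steps can be sketched like this (hypothetical names and a stand-in event counter, not the actual Driver/metastore API):

```java
// Sketch of the proposal: opening the REPL DUMP txn and fixing the dump's
// last-replicated event id become a single step, so no other event (in
// particular another txn's open event) can slip in between them.
class ReplDumpCoordinator {
    private long nextEventId = 100;          // stand-in for the metastore event counter
    private long lastReplIdForDump = -1;

    // Opening the dump txn writes an OPEN_TXN event; that event's id is
    // used directly as lastReplIdForDump.
    long openDumpTxn() {
        long openTxnEventId = nextEventId++; // event written by openTransaction()
        lastReplIdForDump = openTxnEventId;
        return openTxnEventId;
    }

    long lastReplIdForDump() { return lastReplIdForDump; }

    // The next incremental cycle starts right after the dump's own open event.
    long nextIncrementalStart() { return lastReplIdForDump + 1; }
}
```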

With that, we prohibit step-11 from happening between step-1 and step-2. So step-11 can happen either after step-2 or before step-1.
 # If it happens after step-2, it will not be recorded in the snapshot of the DUMP, so the changes within that transaction will not be replicated during bootstrap. The next incremental will replicate the events.

 # If step-11 happens before step-1 and the transaction commits before we start the dump, its changes will be replicated during bootstrap, since that transaction is visible to the REPL DUMP transaction. If the alloc_writeid event is idempotent for a given transaction on the source, then once the open transaction event has been replicated as part of bootstrap, the same writeId will be allocated however many times the alloc_writeid event is replayed, keeping the writeIds on source and target in sync. Any files written will be marked with the same writeId, so copying them multiple times will not duplicate data. So there's no correctness issue in this case either.
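The idempotency argument in the second point can be made concrete with a toy allocator (hypothetical names; not the actual TxnHandler code) where replaying the same alloc_writeid event always yields the same writeId:

```java
import java.util.*;

// Sketch of why an idempotent allocateWriteId keeps source and target in
// sync: for a given (table, txn) pair, the same writeId comes back no
// matter how many times the alloc_writeid event is replayed.
class WriteIdAllocator {
    private final Map<String, Long> nextWriteId = new HashMap<>();
    private final Map<String, Long> allocated = new HashMap<>(); // "table:txn" -> writeId

    long allocate(String table, long txnId) {
        String key = table + ":" + txnId;
        // Replaying the same alloc_writeid event returns the already-allocated id.
        return allocated.computeIfAbsent(key,
            k -> nextWriteId.merge(table, 1L, Long::sum));
    }
}
```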

Case 2
{quote}If Step-11 to Step-14 in Thread T2 happens after Step-1 in REPL DUMP thread T1. In this case, table is not bootstrapped but the corresponding open_txn, allocate_writeid, commit_txn and drop events would be replicated in next cycle. During next cycle, REPL LOAD would fail on commitTxn event as table is dropped or event is missing.
{quote}
If step-11 to step-14 happen before step-1, they will be covered by bootstrap itself and will not appear in the incremental. I think you meant that step-14 happens before step-4, so the table is not bootstrapped, but any events after the open transaction are part of the next incremental.

This case is covered by test org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTables#testAcidTablesBootstrapWithConcurrentDropTable().

In this case, the ALTER TABLE events created by the INSERT operation are converted to CreateTable on the target, so at the time of commit the table exists; it is then dropped by the subsequent drop event. So there is no correctness issue here either.
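The replay behaviour described above can be illustrated with a toy model of the target side (hypothetical names; the real logic lives in the REPL LOAD event handlers):

```java
import java.util.*;

// Toy replay of the Case 2 event sequence on the target: an ALTER for a
// table that was never bootstrapped is converted into a create, so the
// later commitTxn finds the table, and the final drop removes it again.
class TargetReplayer {
    private final Set<String> tables = new HashSet<>();

    void replayAlter(String table) {
        // Table missing on target (not bootstrapped): treat ALTER as CreateTable.
        tables.add(table);
    }

    boolean replayCommitTxn(String table) { return tables.contains(table); }

    void replayDrop(String table) { tables.remove(table); }

    boolean exists(String table) { return tables.contains(table); }
}
```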

> Handle concurrent write + drop when ACID tables are getting bootstrapped.
> -------------------------------------------------------------------------
>
>                 Key: HIVE-21893
>                 URL: https://issues.apache.org/jira/browse/HIVE-21893
>             Project: Hive
>          Issue Type: Bug
>          Components: repl
>    Affects Versions: 4.0.0
>            Reporter: Sankar Hariappan
>            Assignee: Ashutosh Bapat
>            Priority: Major
>              Labels: DR, Replication
>
> ACID tables will be bootstrapped during the incremental phase in a couple of cases: 
> 1. hive.repl.bootstrap.acid.tables is set to true in WITH clause of REPL DUMP.
> 2. If replication policy is changed using REPLACE clause in REPL DUMP where the ACID table is matching new policy but not old policy.
> REPL DUMP performs the below sequence of operations. Let's say in Thread T1:
> 1. Get Last Repl ID (lastId)
> 2. Open Transaction (Tx1)
> 3. Dump events until lastId.
> 4. Get the list of tables in the given DB.
> 5. If table matches current policy, then bootstrap dump it.
> Let's say, concurrently, another thread (let's say T2) is running as follows.
> 11. Open Transaction (Tx2).
> 12. Insert into ACID table Tbl1.
> 13. Commit Transaction (Tx2)
> 14. Drop table (Tbl1) --> Not necessarily same thread, may be from different thread as well.
> *Problematic Use-cases:*
> 1. If Step-11 happens between Step-1 and Step-2. Also, Step-13 completes before we forcefully abort Tx2 from REPL DUMP thread T1. Also, assume Step-14 is done after bootstrap is completed. In this case, bootstrap would replicate the data/writeId written by Tx2. But, the next incremental cycle would also replicate the open_txn, allocate_writeid and commit_txn events which would duplicate the data.
> 2. If Step-11 to Step-14 in Thread T2 happens after Step-1 in REPL DUMP thread T1. In this case, table is not bootstrapped but the corresponding open_txn, allocate_writeid, commit_txn and drop events would be replicated in next cycle. During next cycle, REPL LOAD would fail on commitTxn event as table is dropped or event is missing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)