Posted to dev@hive.apache.org by "Peter Varga (Jira)" <ji...@apache.org> on 2020/12/03 15:42:00 UTC
[jira] [Created] (HIVE-24481) Skipped compaction can cause data corruption with streaming
Peter Varga created HIVE-24481:
----------------------------------
Summary: Skipped compaction can cause data corruption with streaming
Key: HIVE-24481
URL: https://issues.apache.org/jira/browse/HIVE-24481
Project: Hive
Issue Type: Bug
Reporter: Peter Varga
Assignee: Peter Varga
Timeline:
1. create a partitioned table, add one static partition
2. transaction 1 writes delta_1, and aborts
3. create a streaming connection with a transaction batch size of 3 and withStaticPartitionValues set to the existing partition
4. beginTransaction, write, commitTransaction
5. beginTransaction, write, abortTransaction
6. beginTransaction, write, commitTransaction
7. close connection, count of the table is 2
8. run manual minor compaction on the partition. It skips compaction because deltacount = 1, but still triggers cleaning because of the aborted txn1
9. cleaner will remove both aborted records from txn_components
10. wait for acidhousekeeper to remove empty aborted txns
11. select * from table returns *3* records, because the aborted record is now read
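The timeline above can be sketched as a small toy model in plain Java. This is not Hive code and uses no Hive APIs: the class, its methods, and the shared delta name "delta_2_4" are hypothetical. It rests on three assumptions taken from the reported behaviour rather than from Hive's source: a reader hides a record only while its transaction is still recorded as aborted, the cleaner can drop a delta only when every transaction in it is aborted, and the housekeeper forgets aborted transactions that have no remaining txn_components entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these names are Hive classes or Hive APIs.
class SkippedCompactionModel {
    enum TxnState { COMMITTED, ABORTED }

    // Stand-in for the metastore's list of known transactions.
    final Map<Long, TxnState> txns = new HashMap<>();
    // Stand-in for txn_components: aborted txns that still have data on disk.
    final Set<Long> abortedComponents = new HashSet<>();
    // Delta directory name -> ids of the txns that wrote one record each into it.
    final Map<String, List<Long>> deltas = new LinkedHashMap<>();

    void write(String delta, long txnId, TxnState outcome) {
        deltas.computeIfAbsent(delta, d -> new ArrayList<>()).add(txnId);
        txns.put(txnId, outcome);
        if (outcome == TxnState.ABORTED) {
            abortedComponents.add(txnId);
        }
    }

    // Assumption: a record is hidden only while its txn is still known as ABORTED.
    long count() {
        return deltas.values().stream()
                .flatMap(List::stream)
                .filter(id -> txns.get(id) != TxnState.ABORTED)
                .count();
    }

    // Steps 8-9: compaction is skipped, so the shared delta is never rewritten,
    // yet the cleaner still clears every aborted entry from txn_components.
    void cleanAfterSkippedCompaction() {
        // Assumption: only deltas written entirely by aborted txns are deleted.
        deltas.values().removeIf(
                ids -> ids.stream().allMatch(id -> txns.get(id) == TxnState.ABORTED));
        abortedComponents.clear();
    }

    // Step 10: aborted txns without components are forgotten.
    void houseKeeper() {
        txns.entrySet().removeIf(e -> e.getValue() == TxnState.ABORTED
                && !abortedComponents.contains(e.getKey()));
    }

    // Replays steps 2-11 and returns {count before cleaning, count after}.
    static long[] simulate() {
        SkippedCompactionModel m = new SkippedCompactionModel();
        m.write("delta_1", 1, TxnState.ABORTED);      // step 2
        m.write("delta_2_4", 2, TxnState.COMMITTED);  // step 4
        m.write("delta_2_4", 3, TxnState.ABORTED);    // step 5
        m.write("delta_2_4", 4, TxnState.COMMITTED);  // step 6
        long before = m.count();                      // step 7
        m.cleanAfterSkippedCompaction();              // steps 8-9
        m.houseKeeper();                              // step 10
        long after = m.count();                       // step 11
        return new long[] {before, after};
    }

    public static void main(String[] args) {
        long[] counts = simulate();
        System.out.println("before=" + counts[0] + " after=" + counts[1]); // before=2 after=3
    }
}
```

In this model the record written by the aborted transaction in step 5 stays on disk inside the shared delta, so once the transaction's metadata is gone nothing marks the record as aborted and the count rises from 2 to 3.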
--
This message was sent by Atlassian Jira
(v8.3.4#803005)