Posted to issues@hive.apache.org by "Vipin Vishvkarma (Jira)" <ji...@apache.org> on 2020/08/10 07:08:00 UTC
[jira] [Updated] (HIVE-24020) Automatic Compaction not working in
existing partitions for Streaming Ingest with Dynamic Partition
[ https://issues.apache.org/jira/browse/HIVE-24020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vipin Vishvkarma updated HIVE-24020:
------------------------------------
Component/s: Transactions, Streaming
> Automatic Compaction not working in existing partitions for Streaming Ingest with Dynamic Partition
> ---------------------------------------------------------------------------------------------------
>
> Key: HIVE-24020
> URL: https://issues.apache.org/jira/browse/HIVE-24020
> Project: Hive
> Issue Type: Bug
> Components: Streaming, Transactions
> Affects Versions: 4.0.0, 3.1.2
> Reporter: Vipin Vishvkarma
> Assignee: Vipin Vishvkarma
> Priority: Major
>
> This issue occurs when streaming ingest with dynamic partitioning writes to partitions that already exist. AbstractRecordWriter contains the following check:
>
> {code:java}
> PartitionInfo partitionInfo = conn.createPartitionIfNotExists(partitionValues);
> // collect the newly added partitions. connection.commitTransaction() will report the dynamically added
> // partitions to TxnHandler
> if (!partitionInfo.isExists()) {
>   addedPartitions.add(partitionInfo.getName());
> } else {
>   if (LOG.isDebugEnabled()) {
>     LOG.debug("Partition {} already exists for table {}",
>         partitionInfo.getName(), fullyQualifiedTableName);
>   }
> }
> {code}
> *addedPartitions* above is passed to *addDynamicPartitions* during TransactionBatch commit. For partitions that already exist, *addedPartitions* is therefore empty, and *addDynamicPartitions* does not move entries from TXN_COMPONENTS to COMPLETED_TXN_COMPONENTS. As a result, the Initiator cannot trigger automatic compaction.
> A second issue was also observed: *addedPartitions* is not cleared when the writer is closed, so partition information leaks across transactions.
>
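The two problems described above can be reproduced in a small standalone model. This is only a sketch under stated assumptions: ToyMetastore, ToyWriter, the reportExisting flag, and partitionsToReport() are hypothetical stand-ins and not Hive classes or APIs. Reporting every written partition and clearing the list on close is one possible direction, not necessarily the fix adopted in Hive.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of the reported behavior; none of these types exist in Hive. */
public class PartitionTrackingSketch {

    /** Toy metastore that only remembers which partitions exist. */
    static final class ToyMetastore {
        private final Set<String> partitions = new HashSet<>();

        /** Returns true if the partition already existed (like partitionInfo.isExists()). */
        boolean createPartitionIfNotExists(String name) {
            return !partitions.add(name);
        }
    }

    /** Toy writer that tracks which partitions a commit would report. */
    static final class ToyWriter {
        private final ToyMetastore metastore;
        // Assumption: false models the check quoted in the issue,
        // true models reporting every partition that was written to.
        private final boolean reportExisting;
        private final Set<String> trackedPartitions = new LinkedHashSet<>();

        ToyWriter(ToyMetastore metastore, boolean reportExisting) {
            this.metastore = metastore;
            this.reportExisting = reportExisting;
        }

        void write(String partitionName) {
            boolean existed = metastore.createPartitionIfNotExists(partitionName);
            if (!existed || reportExisting) {
                trackedPartitions.add(partitionName);
            }
        }

        /** Stand-in for the list handed to addDynamicPartitions at commit. */
        List<String> partitionsToReport() {
            return new ArrayList<>(trackedPartitions);
        }

        /** Clears per-transaction state so nothing carries over to the next transaction. */
        void close() {
            trackedPartitions.clear();
        }
    }

    public static void main(String[] args) {
        ToyMetastore metastore = new ToyMetastore();
        metastore.createPartitionIfNotExists("dt=2020-08-10"); // partition exists beforehand

        // Behavior as described in the issue: an existing partition is never reported.
        ToyWriter newOnly = new ToyWriter(metastore, false);
        newOnly.write("dt=2020-08-10");
        System.out.println("new partitions only: " + newOnly.partitionsToReport());

        // Sketch of an alternative: report every partition written to, then clear on close.
        ToyWriter allWritten = new ToyWriter(metastore, true);
        allWritten.write("dt=2020-08-10");
        System.out.println("all written partitions: " + allWritten.partitionsToReport());
        allWritten.close();
        System.out.println("after close: " + allWritten.partitionsToReport());
    }
}
```

In this model the first writer reports nothing for the pre-existing partition, which corresponds to the empty *addedPartitions* case described in the issue, while the second writer reports the partition and leaves no state behind after close().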
--
This message was sent by Atlassian Jira
(v8.3.4#803005)