Posted to issues@hive.apache.org by "Craig Condit (JIRA)" <ji...@apache.org> on 2019/06/24 14:41:00 UTC

[jira] [Updated] (HIVE-21917) COMPLETED_TXN_COMPONENTS table is never cleaned up unless Compactor runs

     [ https://issues.apache.org/jira/browse/HIVE-21917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Craig Condit updated HIVE-21917:
--------------------------------
    Summary: COMPLETED_TXN_COMPONENTS table is never cleaned up unless Compactor runs  (was: COMPLETED_TXN_COMPONENTS table is never cleaned up unless Compator runs)

> COMPLETED_TXN_COMPONENTS table is never cleaned up unless Compactor runs
> ------------------------------------------------------------------------
>
>                 Key: HIVE-21917
>                 URL: https://issues.apache.org/jira/browse/HIVE-21917
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 3.1.0, 3.1.1
>            Reporter: Craig Condit
>            Priority: Major
>
> The Initiator thread in the metastore repeatedly loops over entries in the COMPLETED_TXN_COMPONENTS table to determine which partitions / tables might need to be compacted. However, entries are never removed from this table except by a completed Compactor run.
> In a cluster where most tables / partitions are write-once read-many, stale entries in this table are therefore never cleaned up. In a small test cluster, we have observed approximately 45k entries in this table (virtually equal to the number of partitions in the cluster), while fewer than 100 of these tables have any delta files at all. Since most of the tables will never get enough writes to trigger a compaction (and in fact have only ever been written to once), the initiator thread keeps re-evaluating them on every loop.
> On this test cluster, a single pass over all the entries takes approximately 10 minutes and severely degrades the performance of metastore operations. With the default run interval of 5 minutes, the next pass is already due before the current one finishes, so the initiator effectively never stops running.
> On a production cluster with 2M partitions, this would be a non-starter.
> The initiator thread should proactively remove entries from COMPLETED_TXN_COMPONENTS when it determines that a compaction is not needed, so that they are not evaluated again on the next loop.
>  
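The behavior proposed above can be sketched roughly as follows. This is only an in-memory illustration in Python, not Hive's actual Initiator code: the Component type, the run_initiator_pass function, and the needs_compaction predicate are hypothetical names invented for this sketch, and a set stands in for the COMPLETED_TXN_COMPONENTS table.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Component:
    """Hypothetical stand-in for one row of COMPLETED_TXN_COMPONENTS."""
    database: str
    table: str
    partition: str


def run_initiator_pass(
    completed: Iterable[Component],
    needs_compaction: Callable[[Component], bool],
) -> Tuple[List[Component], Set[Component]]:
    """Evaluate each entry once and prune the ones that need no compaction.

    Returns (to_compact, remaining):
      to_compact: entries for which a compaction should be requested
      remaining:  entries still in the table after this pass
    """
    to_compact: List[Component] = []
    remaining: Set[Component] = set()
    seen: Set[Component] = set()
    for component in completed:
        if component in seen:
            continue  # evaluate each table / partition only once per pass
        seen.add(component)
        if needs_compaction(component):
            # Keep the entry; in this sketch it is assumed to be removed
            # later, once the compaction has completed.
            to_compact.append(component)
            remaining.add(component)
        # Otherwise the entry is dropped so the next pass skips it.
    return to_compact, remaining
```

With pruning of this kind, the cost of each pass would depend on the number of components written since the previous pass rather than on the total number of partitions ever written. A new write to a pruned table or partition is assumed to insert a fresh entry, so it would be evaluated again on the following pass.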


