Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/12/22 16:18:00 UTC
[jira] [Updated] (HIVE-24805) Compactor: Initiator shouldn't fetch table details again and again for partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-24805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HIVE-24805:
----------------------------------
Labels: pull-request-available (was: )
> Compactor: Initiator shouldn't fetch table details again and again for partitioned tables
> -----------------------------------------------------------------------------------------
>
> Key: HIVE-24805
> URL: https://issues.apache.org/jira/browse/HIVE-24805
> Project: Hive
> Issue Type: Improvement
> Components: Transactions
> Reporter: Rajesh Balamohan
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The Initiator shouldn't fetch table details separately for each of a table's partitions. When there are a large number of databases/tables, the Initiator takes a long time to complete its initial iteration, and the load on the DB also goes up.
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L129
> https://github.com/apache/hive/blob/64bb52316f19426ebea0087ee15e282cbde1d852/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L456
> For all the following partitions, the table details would be the same. However, the Initiator ends up fetching them from HMS again and again.
> {noformat}
> 2021-02-22 08:13:16,106 INFO org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451899
> 2021-02-22 08:13:16,124 INFO org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451830
> 2021-02-22 08:13:16,140 INFO org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452586
> 2021-02-22 08:13:16,149 INFO org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452698
> 2021-02-22 08:13:16,158 INFO org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452063
> {noformat}
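
One possible way to avoid the repeated lookups is to memoize table details for the duration of a single Initiator pass. The sketch below is illustrative only and is not the actual Initiator code or the change in the linked pull request: TableCacheSketch, TableInfo, PerPassTableCache and countLookups are hypothetical names, and the loader function merely stands in for the metastore call. It assumes candidates are named db.table.partition, as in the log lines above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not the actual Initiator code: memoize table lookups
// for the duration of one Initiator pass.
class TableCacheSketch {

    // Stand-in for the metastore table object (not Hive's real Table class).
    static final class TableInfo {
        final String qualifiedName;

        TableInfo(String qualifiedName) {
            this.qualifiedName = qualifiedName;
        }
    }

    // Resolves "db.table" to a TableInfo, calling the loader at most once per table.
    static final class PerPassTableCache {
        private final Map<String, TableInfo> cache = new HashMap<>();
        private final Function<String, TableInfo> loader;
        private int loads;

        PerPassTableCache(Function<String, TableInfo> loader) {
            this.loader = loader;
        }

        TableInfo resolve(String db, String table) {
            return cache.computeIfAbsent(db + "." + table, key -> {
                loads++; // counts simulated metastore round trips
                return loader.apply(key);
            });
        }

        int loadCount() {
            return loads;
        }
    }

    // Walks compaction candidates of the form db.table.partition and returns
    // how many table lookups were needed with the cache in place.
    static int countLookups(List<String> candidates) {
        PerPassTableCache cache = new PerPassTableCache(TableInfo::new);
        for (String candidate : candidates) {
            String[] parts = candidate.split("\\.", 3); // db, table, partition
            cache.resolve(parts[0], parts[1]);
        }
        return cache.loadCount();
    }

    public static void main(String[] args) {
        List<String> candidates = List.of(
            "tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451899",
            "tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451830",
            "tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452586");
        // All three partitions belong to the same table, so only one lookup is made.
        System.out.println("table lookups: " + countLookups(candidates));
    }
}
```

A cache like this would presumably be created fresh for each pass (or otherwise invalidated), so that table property changes are still picked up on the next iteration.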