Posted to issues@hive.apache.org by "Antal Sinkovits (Jira)" <ji...@apache.org> on 2022/01/19 15:30:00 UTC

[jira] [Resolved] (HIVE-24805) Compactor: Initiator shouldn't fetch table details again and again for partitioned tables

     [ https://issues.apache.org/jira/browse/HIVE-24805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antal Sinkovits resolved HIVE-24805.
------------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Pushed to master. Thanks for the review [~dkuzmenko].

> Compactor: Initiator shouldn't fetch table details again and again for partitioned tables
> -----------------------------------------------------------------------------------------
>
>                 Key: HIVE-24805
>                 URL: https://issues.apache.org/jira/browse/HIVE-24805
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>            Reporter: Rajesh Balamohan
>            Assignee: Antal Sinkovits
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Initiator shouldn't fetch table details separately for each of a table's partitions. When there are a large number of databases/tables, it takes a lot of time for Initiator to complete its initial iteration, and the load on the DB also goes up.
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L129
> https://github.com/apache/hive/blob/64bb52316f19426ebea0087ee15e282cbde1d852/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L456
> For all of the following partitions, the table details would be the same. However, Initiator ends up fetching the table details from HMS again and again, once per partition.
> {noformat}
> 2021-02-22 08:13:16,106 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451899
> 2021-02-22 08:13:16,124 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451830
> 2021-02-22 08:13:16,140 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452586
> 2021-02-22 08:13:16,149 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452698
> 2021-02-22 08:13:16,158 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452063
> {noformat}
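
The repeated lookups described above could be avoided by memoizing the table lookup for the duration of one Initiator cycle. The following is only a minimal sketch of that idea, not the change that was merged for this issue: `TableLookupCache`, `TableInfo`, and the `loader` function are hypothetical names standing in for the Initiator's real table-resolution code and the metastore call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical sketch: caches table metadata per "db.table" key so that
 * many partitions of the same table trigger only one metastore lookup.
 * A new instance would be created for each Initiator cycle so that
 * stale metadata is not carried over between cycles.
 */
class TableLookupCache {

    /** Stand-in for the metastore's table object; not Hive's real class. */
    static final class TableInfo {
        final String qualifiedName;

        TableInfo(String qualifiedName) {
            this.qualifiedName = qualifiedName;
        }
    }

    private final Map<String, TableInfo> cache = new HashMap<>();
    private final Function<String, TableInfo> loader; // assumed to wrap the real HMS call
    private int metastoreCalls = 0;

    TableLookupCache(Function<String, TableInfo> loader) {
        this.loader = loader;
    }

    /** Returns the cached table details, calling the loader only on a cache miss. */
    TableInfo resolve(String dbName, String tableName) {
        String key = dbName + "." + tableName;
        return cache.computeIfAbsent(key, k -> {
            metastoreCalls++;
            return loader.apply(k);
        });
    }

    /** Number of times the loader (i.e. the simulated metastore) was invoked. */
    int getMetastoreCalls() {
        return metastoreCalls;
    }
}
```

With a cache like this, checking the five `store_returns_tmp2` partitions from the log excerpt would call the loader once instead of five times. The cache is deliberately not thread-safe; if partitions were checked from several threads, a `ConcurrentHashMap` would be the natural substitute.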


