Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/04/14 18:05:00 UTC
[jira] [Work logged] (HIVE-27266) Retrieve only partNames if not need drop data in HMSHandler.dropPartitionsAndGetLocations
[ https://issues.apache.org/jira/browse/HIVE-27266?focusedWorklogId=857133&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-857133 ]
ASF GitHub Bot logged work on HIVE-27266:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 14/Apr/23 18:04
Start Date: 14/Apr/23 18:04
Worklog Time Spent: 10m
Work Description: wecharyu opened a new pull request, #4238:
URL: https://github.com/apache/hive/pull/4238
### What changes were proposed in this pull request?
A small improvement to `HMSHandler.dropPartitionsAndGetLocations`: retrieve only partNames, rather than partName and location pairs, when we do not need to check the locations.
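The idea can be sketched roughly as follows. This is a minimal illustration, not the actual patch: `Store`, `listPartitionNames`, `getPartitionLocations`, `Result`, and the `checkLocation` flag are hypothetical stand-ins and are not taken from the Hive metastore API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: every type and method here is a hypothetical
// stand-in, not the actual Hive metastore API.
final class DropPartitionsSketch {

    // Toy partition store: partition name -> storage location.
    static final class Store {
        private final Map<String, String> locationsByName = new LinkedHashMap<>();

        void add(String partName, String location) {
            locationsByName.put(partName, location);
        }

        // Cheaper lookup: partition names only.
        List<String> listPartitionNames() {
            return new ArrayList<>(locationsByName.keySet());
        }

        // Costlier lookup: partition name and location pairs.
        Map<String, String> getPartitionLocations() {
            return new LinkedHashMap<>(locationsByName);
        }
    }

    // Names of the partitions to drop, plus any locations that must be checked.
    static final class Result {
        final List<String> partNames;
        final List<String> locationsToCheck;

        Result(List<String> partNames, List<String> locationsToCheck) {
            this.partNames = partNames;
            this.locationsToCheck = locationsToCheck;
        }
    }

    static Result collect(Store store, boolean checkLocation) {
        if (!checkLocation) {
            // Locations are not needed, so fetch names only.
            return new Result(store.listPartitionNames(), Collections.emptyList());
        }
        // Locations are needed, so fetch the name and location pairs.
        Map<String, String> pairs = store.getPartitionLocations();
        return new Result(new ArrayList<>(pairs.keySet()), new ArrayList<>(pairs.values()));
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.add("ds=2023-04-13", "/warehouse/t/ds=2023-04-13");
        store.add("ds=2023-04-14", "/warehouse/t/ds=2023-04-14");

        Result namesOnly = collect(store, false);
        System.out.println(namesOnly.partNames + " " + namesOnly.locationsToCheck);

        Result withLocations = collect(store, true);
        System.out.println(withLocations.partNames + " " + withLocations.locationsToCheck);
    }
}
```

In this sketch both branches return the same partition names; only the branch that needs locations pays for fetching them.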
### Why are the changes needed?
Performance improvement, especially when the table has a large number of partitions.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
1. All existing tests pass.
2. Added a new benchmark test, **dropTableMetadataWithPartitions**:
- Before this patch
```bash
Operation                              Mean   Med    Min    Max    Err%
dropTableMetaOnlyWithPartitions.10     23.70  21.87  19.36  31.73  14.48
dropTableMetaOnlyWithPartitions.100    54.42  54.15  45.92  76.68  8.891
dropTableMetaOnlyWithPartitions.1000   462.5  456.1  321.0  654.3  15.96
```
- After this patch
```bash
Operation                              Mean   Med    Min    Max    Err%
dropTableMetaOnlyWithPartitions.10     21.49  21.24  19.30  27.90  6.661
dropTableMetaOnlyWithPartitions.100    51.51  48.30  44.86  85.23  16.91
dropTableMetaOnlyWithPartitions.1000   415.4  407.2  308.8  595.2  14.28
```
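To put the two result sets side by side, the relative reduction in the Mean column can be computed as `(before - after) / before`. The sketch below uses only the Mean values reported above; the class and method names are made up for illustration. The reductions come out at roughly 9%, 5%, and 10% for 10, 100, and 1000 partitions respectively, which is of a similar size to the reported Err% values, so they are best read as indicative.

```java
import java.util.Locale;

// Illustrative helper (hypothetical names) for comparing the reported means.
final class BenchmarkDeltaSketch {

    // Relative reduction of the mean, in percent.
    static double relativeReduction(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        // Columns: partition count, mean before the patch, mean after the patch.
        double[][] means = {
            {10, 23.70, 21.49},
            {100, 54.42, 51.51},
            {1000, 462.5, 415.4},
        };
        for (double[] row : means) {
            System.out.printf(Locale.ROOT, "%d partitions: mean reduced by %.1f%%%n",
                (int) row[0], relativeReduction(row[1], row[2]));
        }
    }
}
```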
Issue Time Tracking
-------------------
Worklog Id: (was: 857133)
Remaining Estimate: 0h
Time Spent: 10m
> Retrieve only partNames if not need drop data in HMSHandler.dropPartitionsAndGetLocations
> -----------------------------------------------------------------------------------------
>
> Key: HIVE-27266
> URL: https://issues.apache.org/jira/browse/HIVE-27266
> Project: Hive
> Issue Type: Improvement
> Components: Hive
> Affects Versions: 4.0.0-alpha-2
> Reporter: Wechar
> Assignee: Wechar
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Following HIVE-19783, we only need partNames instead of partName and location pairs if we do not need to check the locations.
> We add a new benchmark, *dropTableMetadataWithPartitions*, that deletes only the metadata rather than the actual table data.
> Test results:
> * Before the patch:
> {code:bash}
> Operation                              Mean   Med    Min    Max    Err%
> dropTableMetaOnlyWithPartitions.10     23.70  21.87  19.36  31.73  14.48
> dropTableMetaOnlyWithPartitions.100    54.42  54.15  45.92  76.68  8.891
> dropTableMetaOnlyWithPartitions.1000   462.5  456.1  321.0  654.3  15.96
> {code}
> * After the patch:
> {code:bash}
> Operation                              Mean   Med    Min    Max    Err%
> dropTableMetaOnlyWithPartitions.10     21.49  21.24  19.30  27.90  6.661
> dropTableMetaOnlyWithPartitions.100    51.51  48.30  44.86  85.23  16.91
> dropTableMetaOnlyWithPartitions.1000   415.4  407.2  308.8  595.2  14.28
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)