Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/09/07 11:02:00 UTC
[jira] [Work logged] (HIVE-26437) dump unpartitioned Tables in parallel
[ https://issues.apache.org/jira/browse/HIVE-26437?focusedWorklogId=806654&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-806654 ]
ASF GitHub Bot logged work on HIVE-26437:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 07/Sep/22 11:01
Start Date: 07/Sep/22 11:01
Worklog Time Spent: 10m
Work Description: atsaonerk commented on PR #3444:
URL: https://github.com/apache/hive/pull/3444#issuecomment-1239240003
Results of the JMH performance benchmark indicate an improvement in the TableExport operation. When tables are dumped in parallel instead of serially, the operation completes much faster (roughly 80 times faster in this run, going by the mean scores). Below are the results for 500 TableExport operations performed both serially and in parallel.
Result "org.apache.hive.benchmark.ql.exec.TableAndPartitionExportBench.BaseBench.parallel":
N = 5
mean = 640.862 ±(99.9%) 113.354 ms/op
Result "org.apache.hive.benchmark.ql.exec.TableAndPartitionExportBench.BaseBench.serial":
N = 5
mean = 51697.322 ±(99.9%) 322.747 ms/op
Benchmark                                          Mode  Cnt      Score      Error  Units
TableAndPartitionExportBench.BaseBench.parallel      ss    5    640.862 ±  113.354  ms/op
TableAndPartitionExportBench.BaseBench.serial        ss    5  51697.322 ±  322.747  ms/op
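For readers who want to see the shape of the serial versus parallel comparison, here is a minimal sketch in plain Java. It is not the Hive TableExport code and not the JMH benchmark from the PR: the class `ExportTimingSketch`, the `exportTable` stand-in (which only sleeps to simulate I/O), the pool size, and the table count are all hypothetical and chosen for illustration. The `speedup` helper simply divides the two mean scores reported above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ExportTimingSketch {

    // Hypothetical stand-in for one table export: sleeps to simulate I/O latency.
    static void exportTable(int tableId) throws InterruptedException {
        TimeUnit.MILLISECONDS.sleep(5);
    }

    // Exports the tables one after another and returns the elapsed time in ms.
    static long runSerial(int tables) throws Exception {
        long start = System.nanoTime();
        for (int i = 0; i < tables; i++) {
            exportTable(i);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Submits every export to a fixed-size thread pool, waits for all of them,
    // and returns the elapsed time in ms.
    static long runParallel(int tables, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long start = System.nanoTime();
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < tables; i++) {
                final int id = i;
                pending.add(pool.submit(() -> {
                    exportTable(id);
                    return null;
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // rethrows any failure from the export task
            }
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            pool.shutdown();
        }
    }

    // Ratio of serial time to parallel time.
    static double speedup(double serialMs, double parallelMs) {
        return serialMs / parallelMs;
    }

    public static void main(String[] args) throws Exception {
        long serial = runSerial(40);
        long parallel = runParallel(40, 8);
        System.out.println("serial ms:   " + serial);
        System.out.println("parallel ms: " + parallel);
        // Mean scores taken from the benchmark results quoted in this comment.
        System.out.printf("reported speedup: %.1fx%n", speedup(51697.322, 640.862));
    }
}
```

The timings this sketch prints depend on the machine and on the simulated sleep, so they say nothing about Hive itself; only the reported-speedup line is derived from the numbers in the benchmark output.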
Issue Time Tracking
-------------------
Worklog Id: (was: 806654)
Remaining Estimate: 0h
Time Spent: 10m
> dump unpartitioned Tables in parallel
> -------------------------------------
>
> Key: HIVE-26437
> URL: https://issues.apache.org/jira/browse/HIVE-26437
> Project: Hive
> Issue Type: Improvement
> Components: Hive
> Reporter: Amit Saonerkar
> Assignee: Amit Saonerkar
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)