Posted to dev@hive.apache.org by "Eugene Koifman (JIRA)" <ji...@apache.org> on 2018/02/13 21:36:00 UTC
[jira] [Created] (HIVE-18709) Enable Compaction to work on more than one partition per job
Eugene Koifman created HIVE-18709:
-------------------------------------
Summary: Enable Compaction to work on more than one partition per job
Key: HIVE-18709
URL: https://issues.apache.org/jira/browse/HIVE-18709
Project: Hive
Issue Type: Improvement
Components: Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
Currently, compaction launches 1 MR job per partition that needs to be compacted.
The number of tasks is equal to the number of buckets in the table (or the number of writers in the 'widest' write).
The number of AMs (Application Masters) in a cluster is usually limited to a small percentage of the nodes. Since each job needs its own AM, this limits how much compaction can be done in parallel.
Investigate what it would take for a single job to be able to handle multiple partitions.
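To make the trade-off concrete, here is a rough sketch of how grouping partitions would reduce the number of jobs (and so the number of AMs) needed. It is illustrative only and is not Hive code: the class CompactionBatching, the methods batch and tasksForJob, the partition names, and the assumption of one task per (partition, bucket) pair are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; none of these names exist in Hive.
class CompactionBatching {

    // Split the partitions that need compaction into groups, one job per group.
    static List<List<String>> batch(List<String> partitions, int partitionsPerJob) {
        if (partitionsPerJob < 1) {
            throw new IllegalArgumentException("partitionsPerJob must be >= 1");
        }
        List<List<String>> jobs = new ArrayList<>();
        for (int i = 0; i < partitions.size(); i += partitionsPerJob) {
            int end = Math.min(i + partitionsPerJob, partitions.size());
            jobs.add(new ArrayList<>(partitions.subList(i, end)));
        }
        return jobs;
    }

    // Assumption: a multi-partition job runs one task per (partition, bucket) pair.
    static int tasksForJob(List<String> jobPartitions, int bucketsPerPartition) {
        return jobPartitions.size() * bucketsPerPartition;
    }

    public static void main(String[] args) {
        List<String> partitions = Arrays.asList(
                "ds=2018-02-09", "ds=2018-02-10", "ds=2018-02-11",
                "ds=2018-02-12", "ds=2018-02-13");
        int buckets = 4;

        // Current behaviour: one job (and one AM) per partition.
        System.out.println("Jobs with 1 partition per job: " + batch(partitions, 1).size());

        // Proposed direction: several partitions share a job.
        List<List<String>> jobs = batch(partitions, 2);
        System.out.println("Jobs with 2 partitions per job: " + jobs.size());
        for (List<String> job : jobs) {
            System.out.println(job + " -> " + tasksForJob(job, buckets) + " tasks");
        }
    }
}
```

With a fixed number of AM slots, fewer and wider jobs would let more partitions be compacted at the same time, at the cost of more tasks per job.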
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)