Posted to dev@hive.apache.org by "Roshan Naik (JIRA)" <ji...@apache.org> on 2013/08/23 03:09:51 UTC
[jira] [Created] (HIVE-5143) Streaming - Compaction of partitions
Roshan Naik created HIVE-5143:
---------------------------------
Summary: Streaming - Compaction of partitions
Key: HIVE-5143
URL: https://issues.apache.org/jira/browse/HIVE-5143
Project: Hive
Issue Type: Sub-task
Reporter: Roshan Naik
Assignee: Roshan Naik
Task is to support compaction of partitions.
Rationale: Streaming partitions are composed of a large number of small files (each commit is one file). Since compaction can be an expensive operation (e.g., converting to a single ORC file), we do not compact the streaming partition at the time of rolling it into a standard partition. This allows rolling to be quick and atomic.
Compaction will be performed at a later time. The streaming partition is converted as is (typically with many small files) into a standard partition. This new standard partition is then queued for compaction by a separate job.
This decouples the compaction feature from streaming support and makes it generally available for any partition.
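As a rough illustration of the proposed flow, the sketch below models rolling and compaction as two independent steps in memory. It is not Hive code: the names Partition, roll, compact_next, and compaction_queue are hypothetical, and each "file" is simply a list of rows standing in for one commit.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Hypothetical stand-in for a partition; each file holds one commit's rows."""
    name: str
    files: list = field(default_factory=list)
    kind: str = "streaming"


# Hypothetical queue drained by a separate compaction job.
compaction_queue = deque()


def roll(partition):
    """Convert a streaming partition to a standard one without rewriting data."""
    partition.kind = "standard"          # metadata-only change; files stay as is
    compaction_queue.append(partition)   # compaction is deferred
    return partition


def compact_next():
    """Merge the small files of the next queued partition into a single file."""
    if not compaction_queue:
        return None
    partition = compaction_queue.popleft()
    merged = [row for f in partition.files for row in f]
    partition.files = [merged] if merged else []
    return partition
```

Because compact_next only needs a queued standard partition, the same job could in principle serve partitions that never came from streaming, which is the decoupling described above.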
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira