Posted to issues@hive.apache.org by "Wei Zheng (JIRA)" <ji...@apache.org> on 2016/05/03 02:18:13 UTC

[jira] [Commented] (HIVE-13354) Add ability to specify Compaction options per table and per request

    [ https://issues.apache.org/jira/browse/HIVE-13354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267792#comment-15267792 ] 

Wei Zheng commented on HIVE-13354:
----------------------------------

New usage enabled by this improvement:

- Allow new tblproperties on DDL.
    - Specify compactor MR job properties.
          e.g. CREATE TABLE t1 ... TBLPROPERTIES ('compactor.mapreduce.map.memory.mb'='1024');
    - Specify compactor thresholds for triggering compaction (currently, hive.compactor.delta.num.threshold and hive.compactor.delta.pct.threshold).
          e.g. CREATE TABLE t1 ... TBLPROPERTIES ('compactorthreshold.hive.compactor.delta.num.threshold'='5');

- Allow tblproperties on ALTER TABLE .. COMPACT.
    - Specify compactor MR job properties or other Hive properties.
          ALTER TABLE t1 ... COMPACT ... WITH OVERWRITE TBLPROPERTIES ('compactor.mapreduce.map.memory.mb'='1024', 'tblprops.orc.compress.size'='8192');
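
Putting the pieces above together, a fuller sketch might look like the following. The table name, columns, bucket count and property values are hypothetical and only for illustration. The 'transactional'='true' setting, the bucketing and the ORC storage are assumptions about what an ACID table needs in these Hive versions, and the 'major' compaction type stands in for the part elided in the ALTER TABLE example above.

```sql
-- Hypothetical ACID table carrying per-table compaction settings.
CREATE TABLE t1 (
  id   INT,
  name STRING
)
CLUSTERED BY (id) INTO 2 BUCKETS
STORED AS ORC
TBLPROPERTIES (
  'transactional'='true',                                        -- assumed: compaction applies to ACID tables
  'compactor.mapreduce.map.memory.mb'='1024',                    -- MR job property for the compactor
  'compactorthreshold.hive.compactor.delta.num.threshold'='5',   -- per-table override of the delta-count threshold
  'compactorthreshold.hive.compactor.delta.pct.threshold'='0.5'  -- per-table override of the delta-size ratio threshold (value is illustrative)
);

-- Per-request override: applies to this compaction request only.
ALTER TABLE t1 COMPACT 'major'
  WITH OVERWRITE TBLPROPERTIES (
    'compactor.mapreduce.map.memory.mb'='1024',  -- MR job property for this run
    'tblprops.orc.compress.size'='8192'          -- other table property, passed via the tblprops. prefix
  );
```

The prefixes carry the meaning here: compactor. is for MR job properties, compactorthreshold. is for the trigger thresholds, and tblprops. is for other table properties supplied on a single compaction request.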

> Add ability to specify Compaction options per table and per request
> -------------------------------------------------------------------
>
>                 Key: HIVE-13354
>                 URL: https://issues.apache.org/jira/browse/HIVE-13354
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 1.3.0, 2.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>              Labels: TODOC2.1
>
> Currently there are a few options that determine when automatic compaction is triggered.  They are specified once for the warehouse.
> This doesn't make sense - some tables may be more important and need to be compacted more often.
> We should allow specifying these on a per-table basis.
> Also, compaction is an MR job launched from within the metastore.  There is currently no way to control job parameters (like memory, for example) except to specify it in hive-site.xml for metastore which means they are site wide.
> Should add a way to specify these per table (perhaps even per compaction if launched via ALTER TABLE)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)