Posted to issues@hive.apache.org by "Mithun Radhakrishnan (JIRA)" <ji...@apache.org> on 2015/02/23 19:51:12 UTC

[jira] [Commented] (HIVE-9629) HCatClient.dropPartitions() needs speeding up.

    [ https://issues.apache.org/jira/browse/HIVE-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333651#comment-14333651 ] 

Mithun Radhakrishnan commented on HIVE-9629:
--------------------------------------------

An update on the performance numbers (a follow-on to those quoted in HIVE-9588):

1. Dropping 2K partitions from a managed Hive table took 204 seconds on my Hive/HCat test setup (with a remote metastore, backed by Oracle).
2. HIVE-9588 reduced this to 83 seconds.
3. The combination of HIVE-9631, HIVE-9681 and HIVE-9736 has now reduced this to 16 seconds.
(The patch for HIVE-9631 isn't up yet. Selina has an internal patch that works with Oracle.)

I'll be testing this some more. In the meantime, I'd be grateful if the patches (other than HIVE-9631) could be reviewed.
 

> HCatClient.dropPartitions() needs speeding up.
> ----------------------------------------------
>
>                 Key: HIVE-9629
>                 URL: https://issues.apache.org/jira/browse/HIVE-9629
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog, Metastore
>            Reporter: Mithun Radhakrishnan
>            Assignee: Mithun Radhakrishnan
>
> This is an über JIRA for the work required to speed up HCatClient.dropPartitions().
> As it stands right now, {{dropPartitions()}} is slow because it takes N Thrift calls to drop N partitions, and attempts to hold all N partitions in memory while it executes.
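
To illustrate why one Thrift call per partition hurts, here is a minimal sketch that only counts round trips. It is not the HCatClient or metastore API and not the approach taken in any of the patches above: FakeMetastore, dropOne, dropBatch and the batch size of 500 are all hypothetical names and values made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class DropPartitionsSketch {

    /** Hypothetical stand-in for a remote metastore; it only counts round trips. */
    static class FakeMetastore {
        int roundTrips = 0;
        int dropped = 0;

        // One simulated remote call that drops a single partition.
        void dropOne(String partitionName) {
            roundTrips++;
            dropped++;
        }

        // One simulated remote call that drops a whole batch of partitions.
        void dropBatch(List<String> partitionNames) {
            roundTrips++;
            dropped += partitionNames.size();
        }
    }

    /** N partitions cost N round trips. */
    static int dropOneByOne(FakeMetastore ms, List<String> names) {
        for (String name : names) {
            ms.dropOne(name);
        }
        return ms.roundTrips;
    }

    /** N partitions cost ceil(N / batchSize) round trips. */
    static int dropInBatches(FakeMetastore ms, List<String> names, int batchSize) {
        for (int start = 0; start < names.size(); start += batchSize) {
            int end = Math.min(start + batchSize, names.size());
            ms.dropBatch(names.subList(start, end));
        }
        return ms.roundTrips;
    }

    /** Builds made-up partition names such as "dt=0", "dt=1", ... */
    static List<String> partitionNames(int n) {
        List<String> names = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            names.add("dt=" + i);
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = partitionNames(2000);
        System.out.println(dropOneByOne(new FakeMetastore(), names));       // prints 2000
        System.out.println(dropInBatches(new FakeMetastore(), names, 500)); // prints 4
    }
}
```

The sketch covers only the call-count half of the problem. It passes partition names rather than full partition objects, so it says nothing about the memory cost of holding all N partitions at once.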


