Posted to issues@hive.apache.org by "Peter Vary (JIRA)" <ji...@apache.org> on 2018/05/02 13:16:00 UTC

[jira] [Commented] (HIVE-18705) Improve HiveMetaStoreClient.dropDatabase

    [ https://issues.apache.org/jira/browse/HIVE-18705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461004#comment-16461004 ] 

Peter Vary commented on HIVE-18705:
-----------------------------------

{quote}+So here's a question+: should I get rid of the batched scenario as all the tables are queried and are accessible at a time already, and there's little reason for me to query them in batches later (for memory reasons) instead of all of them at once. This way I could have the non-batched (send one dropDB only) scenario only which doesn't suffer from all the slowing effects I described above, and is generally 4-5 times faster than the current implementation.
{quote}
As we discussed offline, I think we should keep the batched scenario. We constantly run into memory problems, and we should strive to remove places in the code where we load every table/partition into memory, not introduce new ones :).
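
To illustrate the memory argument, here is a minimal sketch of batched processing. It is not the HiveMetaStoreClient code; {{BatchedDropSketch}} and {{forEachBatch}} are hypothetical names, and the handler stands in for whatever fetches and drops the tables of one batch. The assumption is that only the table names are held for the whole database, while the heavier table objects are loaded one batch at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class BatchedDropSketch {

    /**
     * Hands the items to the handler in consecutive batches of at most
     * batchSize elements and returns the number of batches processed.
     * Hypothetical helper for illustration only, not part of Hive.
     */
    static <T> int forEachBatch(List<T> items, int batchSize, Consumer<List<T>> handler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the slice so the handler owns an independent list.
            handler.accept(new ArrayList<>(items.subList(start, end)));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        // Placeholder table names; a real client would get these from the metastore.
        List<String> tableNames = Arrays.asList("t1", "t2", "t3", "t4", "t5");
        int batches = forEachBatch(tableNames, 2,
            batch -> System.out.println("would fetch and drop: " + batch));
        System.out.println(batches + " batches");
    }
}
```

With this shape the peak memory use is bounded by the batch size rather than by the number of tables in the database, at the cost of more round trips.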

Also, it would be a good idea to check whether the closure time for the DFSClient can be shortened.

> Improve HiveMetaStoreClient.dropDatabase
> ----------------------------------------
>
>                 Key: HIVE-18705
>                 URL: https://issues.apache.org/jira/browse/HIVE-18705
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>            Priority: Major
>         Attachments: HIVE-18705.0.patch, HIVE-18705.1.patch, HIVE-18705.2.patch, HIVE-18705.4.patch
>
>
> {{HiveMetaStoreClient.dropDatabase}} has a strange implementation to ensure that client-side hooks are handled (for non-native tables, e.g. HBase). Currently it starts by retrieving all the tables from HMS, and then sends {{dropTable}} calls to HMS table by table. At the end it sends a {{dropDatabase}} call, just to be sure :)
> I believe this could be refactored to speed up dropDB in situations where the average table count per DB is very high.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)