Posted to dev@hive.apache.org by "Venki Korukanti (JIRA)" <ji...@apache.org> on 2014/11/15 01:09:34 UTC

[jira] [Commented] (HIVE-5664) Drop cascade database fails when the db has any tables with indexes

    [ https://issues.apache.org/jira/browse/HIVE-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213120#comment-14213120 ] 

Venki Korukanti commented on HIVE-5664:
---------------------------------------

bq. Proposed in current form, a db containing 1000 tables, we will make 1000 trips to metastore from client while doing cascade drop. That will be expensive, it will be good to avoid if we can.
In current Hive (without the fix attached to this JIRA), {{HiveMetaStoreClient}} first fetches all table names and then, for each name, fetches the {{Table}} object inside the {{dropTable()}} method. So this fix does not make the cascade drop any more expensive, but I agree that making n+1 metastore calls is very expensive.

bq. I wonder if instead of getAllTables() , if we can use getTables() or listTableNamesByFilter() to only retrieve base tables and no index tables. Since index tables follow specific naming pattern, we can construct and pass that pattern to one of the above methods to get only base tables. This will avoid multiple round trips to metastore from client.
Index tables can have custom names, as in {{CREATE INDEX temp_tbl3_idx ON TABLE temp_tbl3(id) AS 'COMPACT' with DEFERRED REBUILD IN TABLE temp_tbl3_idx_tbl;}}, so a naming pattern cannot reliably separate them from base tables.

A couple of other alternatives:
1. The failure occurs because no {{Table}} object is found for index tables in the {{dropTable}} method. A simple fix is to ignore {{NoSuchObjectException}} there.
2. Add a method {{getTables(TableType)}} to the MetaStore interface to retrieve {{Table}} objects by table type. We may need to fetch the tables in batches to avoid memory issues.
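
To illustrate alternative 1, here is a rough, self-contained sketch. It does not use the real metastore API: {{FakeMetastore}}, {{dropAllTables}}, and the local {{NoSuchObjectException}} class are hypothetical stand-ins, and the sketch assumes that dropping a base table also removes its index table, which is why a later per-table drop finds nothing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CascadeDropSketch {

    /** Hypothetical stand-in for the metastore's NoSuchObjectException. */
    static class NoSuchObjectException extends Exception {
        NoSuchObjectException(String message) {
            super(message);
        }
    }

    /**
     * Toy in-memory metastore (an assumption, not the Hive API).
     * Maps each table name to the index table it owns, or null if it owns none.
     */
    static class FakeMetastore {
        private final Map<String, String> tables = new LinkedHashMap<>();

        void addTable(String name, String indexTable) {
            tables.put(name, indexTable);
            if (indexTable != null) {
                tables.put(indexTable, null);
            }
        }

        /** Returns base tables and index tables alike, in insertion order. */
        List<String> getAllTables() {
            return new ArrayList<>(tables.keySet());
        }

        /** Assumed behavior: dropping a base table also drops its index table. */
        void dropTable(String name) throws NoSuchObjectException {
            if (!tables.containsKey(name)) {
                throw new NoSuchObjectException(name + " table not found");
            }
            String indexTable = tables.remove(name);
            if (indexTable != null) {
                tables.remove(indexTable);
            }
        }

        boolean isEmpty() {
            return tables.isEmpty();
        }
    }

    /**
     * Drops every listed table, ignoring tables that have already
     * disappeared. Returns how many names were skipped.
     */
    static int dropAllTables(FakeMetastore metastore) {
        int skipped = 0;
        for (String name : metastore.getAllTables()) {
            try {
                metastore.dropTable(name);
            } catch (NoSuchObjectException e) {
                // Already removed together with its base table; safe to ignore.
                skipped++;
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        FakeMetastore metastore = new FakeMetastore();
        metastore.addTable("tab1", "tab1_indx");
        int skipped = dropAllTables(metastore);
        System.out.println("skipped=" + skipped + ", empty=" + metastore.isEmpty());
    }
}
```

With the {{tab1}} / {{tab1_indx}} pair from the issue description, the loop skips the one already-removed index table and leaves the database empty. The trade-off is that swallowing the exception can also hide a table that went missing for an unrelated reason.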

> Drop cascade database fails when the db has any tables with indexes
> -------------------------------------------------------------------
>
>                 Key: HIVE-5664
>                 URL: https://issues.apache.org/jira/browse/HIVE-5664
>             Project: Hive
>          Issue Type: Bug
>          Components: Indexing, Metastore
>    Affects Versions: 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0
>            Reporter: Venki Korukanti
>            Assignee: Venki Korukanti
>         Attachments: HIVE-5664.1.patch.txt, HIVE-5664.2.patch.txt
>
>
> {code}
> CREATE DATABASE db2; 
> USE db2; 
> CREATE TABLE tab1 (id int, name string); 
> CREATE INDEX idx1 ON TABLE tab1(id) as 'COMPACT' with DEFERRED REBUILD IN TABLE tab1_indx; 
> DROP DATABASE db2 CASCADE;
> {code}
> Last DDL fails with the following error:
> {code}
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Database does not exist: db2
> Hive.log has following exception
> 2013-10-27 20:46:16,629 ERROR exec.DDLTask (DDLTask.java:execute(434)) - org.apache.hadoop.hive.ql.metadata.HiveException: Database does not exist: db2
>         at org.apache.hadoop.hive.ql.exec.DDLTask.dropDatabase(DDLTask.java:3473)
>         at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:231)
>         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
>         at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
>         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1441)
>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1219)
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1047)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:915)
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
>         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
> Caused by: NoSuchObjectException(message:db2.tab1_indx table not found)
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1376)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
>         at com.sun.proxy.$Proxy7.get_table(Unknown Source)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:890)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:660)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:652)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:546)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
>         at com.sun.proxy.$Proxy8.dropDatabase(Unknown Source)
>         at org.apache.hadoop.hive.ql.metadata.Hive.dropDatabase(Hive.java:284)
>         at org.apache.hadoop.hive.ql.exec.DDLTask.dropDatabase(DDLTask.java:3470)
>         ... 18 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)