Posted to issues@hive.apache.org by "Pablo Junge (Jira)" <ji...@apache.org> on 2020/02/28 21:32:00 UTC

[jira] [Updated] (HIVE-22566) Drop table involved in materialized view leaves the table in inconsistent state

     [ https://issues.apache.org/jira/browse/HIVE-22566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pablo Junge updated HIVE-22566:
-------------------------------
    Attachment: HIVE-22566.patch
        Status: Patch Available  (was: Open)

> Drop table involved in materialized view leaves the table in inconsistent state
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-22566
>                 URL: https://issues.apache.org/jira/browse/HIVE-22566
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.1.0
>            Reporter: Pablo Junge
>            Assignee: Pablo Junge
>            Priority: Minor
>         Attachments: HIVE-22566.patch
>
>
> If you try to drop a table that is part of the definition of an existing materialized view, the table is not dropped. This is the desired behavior, since the materialized view still depends on it.
> However, the drop is still attempted against the table; it partially executes and then fails, leaving the table in an inconsistent state.
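> Dropping the materialized view before its base table avoids this state; a minimal sketch, assuming the objects created in the repro below:
> {code:java}
> -- Sketch only: drop the MV first so the metastore no longer records it as
> -- using the base table, then drop the table itself.
> DROP MATERIALIZED VIEW mv1;
> DROP TABLE emps;
> {code}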
>  
> Repro:
> -------
> 1) Create tables:
>  
> {code:java}
> CREATE TABLE emps (empid INT, deptno INT, name VARCHAR(256), salary FLOAT, hire_date TIMESTAMP)
>   STORED AS ORC TBLPROPERTIES ('transactional'='true');
> CREATE TABLE depts (deptno INT, deptname VARCHAR(256), locationid INT)
>   STORED AS ORC TBLPROPERTIES ('transactional'='true');
> {code}
>  
> 2) Create the MV:
>  
> {code:java}
> CREATE MATERIALIZED VIEW mv1 AS SELECT empid, deptname, hire_date FROM emps JOIN depts  ON (emps.deptno = depts.deptno) WHERE hire_date >= '2016-01-01';
> {code}
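> After this, the metastore presumably records which base tables mv1 uses; the MV_TABLES_USED table named in the foreign-key error of step 7 can be inspected directly in the backend database. A sketch (SELECT * is used because only the TBL_ID column is confirmed by that stack trace):
> {code:java}
> -- Sketch against the MySQL backend database: show the MV-to-base-table mapping
> -- that later blocks DROP TABLE emps (see step 7).
> SELECT * FROM MV_TABLES_USED;
> {code}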
>  
> 3) The backend database contains the following at this point:
>  
> {code:java}
> mysql> select TBL_ID, DB_ID, SD_ID, TBL_NAME, TBL_TYPE from TBLS where DB_ID=16;
> +--------+-------+-------+----------+-------------------+
> | TBL_ID | DB_ID | SD_ID | TBL_NAME | TBL_TYPE          |
> +--------+-------+-------+----------+-------------------+
> |     81 |    16 |    81 | emps     | MANAGED_TABLE     |
> |     83 |    16 |    83 | depts    | MANAGED_TABLE     |
> |     84 |    16 |    84 | mv1      | MATERIALIZED_VIEW |
> +--------+-------+-------+----------+-------------------+
> 3 rows in set (0.00 sec)
> {code}
>  
> 4) Let's drop the 'emps' table:
>  
> {code:java}
> 0: jdbc:hive2://c1122-node2.squadron.support.> drop table emps;
> INFO  : Compiling command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b): drop table emps
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO  : Completed compiling command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b); Time taken: 0.05 seconds
> INFO  : Executing command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b): drop table emps
> INFO  : Starting task [Stage-0:DDL] in serial mode
> INFO  : Completed executing command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b); Time taken: 10.281 seconds
> INFO  : OK
> No rows affected (16.949 seconds)
> {code}
> No error is displayed on the client, even though the metastore logged the exception shown in step 7.
>  
> 5) List tables:
>  
> {code:java}
> 0: jdbc:hive2://c1122-node2.squadron.support.> show tables;
> INFO  : Compiling command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413): show tables
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
> INFO  : Completed compiling command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413); Time taken: 0.041 seconds
> INFO  : Executing command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413): show tables
> INFO  : Starting task [Stage-0:DDL] in serial mode
> INFO  : Completed executing command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413); Time taken: 0.016 seconds
> INFO  : OK
> +-----------+
> | tab_name  |
> +-----------+
> | depts     |
> | emps      |
> +-----------+
> 2 rows selected (0.08 seconds)
> {code}
>  
> 6) Now, from the backend database's point of view:
>  
> {code:java}
> mysql> select TBL_ID, DB_ID, SD_ID, TBL_NAME, TBL_TYPE from TBLS where DB_ID=16;
> +--------+-------+-------+----------+-------------------+
> | TBL_ID | DB_ID | SD_ID | TBL_NAME | TBL_TYPE          |
> +--------+-------+-------+----------+-------------------+
> |     81 |    16 |  NULL | emps     | MANAGED_TABLE     |
> |     83 |    16 |    83 | depts    | MANAGED_TABLE     |
> |     84 |    16 |    84 | mv1      | MATERIALIZED_VIEW |
> +--------+-------+-------+----------+-------------------+
> 3 rows in set (0.00 sec)
> {code}
> The table is left with a NULL SD_ID, making it unusable: it appears the drop detached the table's storage descriptor before the delete of the TBLS row failed on the MV_TABLES_USED foreign key shown in the stack trace in step 7.
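> A join between TBLS and MV_TABLES_USED shows which tables are still referenced by a materialized view and will therefore fail to drop cleanly. A diagnostic sketch against this backend database, using only columns that appear above and in the stack trace:
> {code:java}
> -- Diagnostic sketch (MySQL backend): tables in DB_ID=16 that are still
> -- referenced from MV_TABLES_USED and would hit the foreign key on DROP TABLE.
> SELECT t.TBL_ID, t.TBL_NAME, t.TBL_TYPE, t.SD_ID
> FROM TBLS t
> JOIN MV_TABLES_USED m ON m.TBL_ID = t.TBL_ID
> WHERE t.DB_ID = 16;
> {code}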
>  
> 7) From metastore.log:
>  
> {code:java}
> 2019-12-02T20:00:25,545 INFO  [pool-6-thread-195]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(907)) - 196: source:172.25.34.150 drop_table : tbl=hive.mvs.emps
> 2019-12-02T20:00:25,545 INFO  [pool-6-thread-195]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(349)) - ugi=hive	ip=172.25.34.150	cmd=source:172.25.34.150 drop_table : tbl=hive.mvs.emps	
> 2019-12-02T20:00:25,580 INFO  [pool-6-thread-195]: metastore.ObjectStore$RetryingExecutor (ObjectStore.java:run(9966)) - Attempting to acquire the DB log notification lock: 0 out of 10 retries
> javax.jdo.JDODataStoreException: Error executing SQL query "select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update".
> 	at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543) ~[datanucleus-api-jdo-4.2.4.jar:?]
> 	at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) ~[datanucleus-api-jdo-4.2.4.jar:?]
> 	at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:216) ~[datanucleus-api-jdo-4.2.4.jar:?]
> 	at org.apache.hadoop.hive.metastore.ObjectStore.lambda$lockForUpdate$0(ObjectStore.java:9936) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at org.apache.hadoop.hive.metastore.ObjectStore$RetryingExecutor.run(ObjectStore.java:9963) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at org.apache.hadoop.hive.metastore.ObjectStore.lockForUpdate(ObjectStore.java:9938) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:10002) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at sun.reflect.GeneratedMethodAccessor55.invoke(Unknown Source) ~[?:?]
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
> 	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> 	at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at com.sun.proxy.$Proxy28.addNotificationEvent(Unknown Source) [?:?]
> 	at org.apache.hive.hcatalog.listener.DbNotificationListener.process(DbNotificationListener.java:968) [hive-hcatalog-server-extensions-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at org.apache.hive.hcatalog.listener.DbNotificationListener.onDropTable(DbNotificationListener.java:198) [hive-hcatalog-server-extensions-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier$19.notify(MetaStoreListenerNotifier.java:99) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:273) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:335) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:2670) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:2842) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112]
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112]
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
> 	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at com.sun.proxy.$Proxy30.drop_table_with_environment_context(Unknown Source) [?:?]
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_table_with_environment_context.getResult(ThriftHiveMetastore.java:15533) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_table_with_environment_context.getResult(ThriftHiveMetastore.java:15517) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at java.security.AccessController.doPrivileged(Native Method) [?:1.8.0_112]
> 	at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_112]
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) [hadoop-common-3.1.1.3.1.0.0-78.jar:?]
> 	at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
> 	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
> Caused by: java.sql.BatchUpdateException: Cannot delete or update a parent row: a foreign key constraint fails ("hive"."MV_TABLES_USED", CONSTRAINT "MV_TABLES_USED_FK2" FOREIGN KEY ("TBL_ID") REFERENCES "TBLS" ("TBL_ID"))
> 	at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2058) ~[mysql-connector-java.jar:?]
> 	at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1471) ~[mysql-connector-java.jar:?]
> 	at com.zaxxer.hikari.pool.ProxyStatement.executeBatch(ProxyStatement.java:125) ~[HikariCP-2.6.1.jar:?]
> 	at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeBatch(HikariProxyPreparedStatement.java) ~[HikariCP-2.6.1.jar:?]
> 	at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeBatch(ParamLoggingPreparedStatement.java:366) ~[datanucleus-rdbms-4.1.19.jar:?]
> 	at org.datanucleus.store.rdbms.SQLController.processConnectionStatement(SQLController.java:676) ~[datanucleus-rdbms-4.1.19.jar:?]
> 	at org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:319) ~[datanucleus-rdbms-4.1.19.jar:?]
> 	at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getPreparedStatementForQuery(RDBMSQueryUtils.java:211) ~[datanucleus-rdbms-4.1.19.jar:?]
> 	at org.datanucleus.store.rdbms.query.SQLQuery.performExecute(SQLQuery.java:633) ~[datanucleus-rdbms-4.1.19.jar:?]
> 	at org.datanucleus.store.query.Query.executeQuery(Query.java:1855) ~[datanucleus-core-4.1.17.jar:?]
> 	at org.datanucleus.store.rdbms.query.SQLQuery.executeWithArray(SQLQuery.java:807) ~[datanucleus-rdbms-4.1.19.jar:?]
> 	at org.datanucleus.store.query.Query.execute(Query.java:1726) ~[datanucleus-core-4.1.17.jar:?]
> 	at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:374) ~[datanucleus-api-jdo-4.2.4.jar:?]
> 	... 37 more
> Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Cannot delete or update a parent row: a foreign key constraint fails ("hive"."MV_TABLES_USED", CONSTRAINT "MV_TABLES_USED_FK2" FOREIGN KEY ("TBL_ID") REFERENCES "TBLS" ("TBL_ID"))
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_112]
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_112]
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_112]
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_112]
> 	at com.mysql.jdbc.Util.handleNewInstance(Util.java:411) ~[mysql-connector-java.jar:?]
> 	at com.mysql.jdbc.Util.getInstance(Util.java:386) ~[mysql-connector-java.jar:?]
> 	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1041) ~[mysql-connector-java.jar:?]
> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4187) ~[mysql-connector-java.jar:?]
> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4119) ~[mysql-connector-java.jar:?]
> 	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2570) ~[mysql-connector-java.jar:?]
> 	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2731) ~[mysql-connector-java.jar:?]
> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2820) ~[mysql-connector-java.jar:?]
> 	at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2159) ~[mysql-connector-java.jar:?]
> 	at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2462) ~[mysql-connector-java.jar:?]
> 	at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2010) ~[mysql-connector-java.jar:?]
> 	at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1471) ~[mysql-connector-java.jar:?]
> 	at com.zaxxer.hikari.pool.ProxyStatement.executeBatch(ProxyStatement.java:125) ~[HikariCP-2.6.1.jar:?]
> 	at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeBatch(HikariProxyPreparedStatement.java) ~[HikariCP-2.6.1.jar:?]
> 	at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeBatch(ParamLoggingPreparedStatement.java:366) ~[datanucleus-rdbms-4.1.19.jar:?]
> 	at org.datanucleus.store.rdbms.SQLController.processConnectionStatement(SQLController.java:676) ~[datanucleus-rdbms-4.1.19.jar:?]
> 	at org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:319) ~[datanucleus-rdbms-4.1.19.jar:?]
> 	at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getPreparedStatementForQuery(RDBMSQueryUtils.java:211) ~[datanucleus-rdbms-4.1.19.jar:?]
> 	at org.datanucleus.store.rdbms.query.SQLQuery.performExecute(SQLQuery.java:633) ~[datanucleus-rdbms-4.1.19.jar:?]
> 	at org.datanucleus.store.query.Query.executeQuery(Query.java:1855) ~[datanucleus-core-4.1.17.jar:?]
> 	at org.datanucleus.store.rdbms.query.SQLQuery.executeWithArray(SQLQuery.java:807) ~[datanucleus-rdbms-4.1.19.jar:?]
> 	at org.datanucleus.store.query.Query.execute(Query.java:1726) ~[datanucleus-core-4.1.17.jar:?]
> 	at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:374) ~[datanucleus-api-jdo-4.2.4.jar:?]
> 	... 37 more
> {code}
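> The failure originates in the metastore schema itself; reconstructed from the error message above (not taken from an actual schema dump), the constraint is roughly:
> {code:java}
> -- Reconstruction from the exception text: MV_TABLES_USED.TBL_ID references
> -- TBLS.TBL_ID, so the TBLS row of a table used by an MV cannot be deleted.
> ALTER TABLE MV_TABLES_USED
>   ADD CONSTRAINT MV_TABLES_USED_FK2 FOREIGN KEY (TBL_ID) REFERENCES TBLS (TBL_ID);
> {code}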
>  
>  
> 8) If you try to query the table:
>  
> {code:java}
> 0: jdbc:hive2://c1122-node2.squadron.support.> select * from emps;
> Error: Error while compiling statement: FAILED: SemanticException Unable to fetch table emps. null (state=42000,code=40000)
> {code}
>  
> It fails as expected.
> 9) If you try to query the MV:
> {code:java}
> 0: jdbc:hive2://c1122-node2.squadron.support.> select * from mv1;
> INFO  : Compiling command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2): select * from mv1
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:mv1.empid, type:int, comment:null), FieldSchema(name:mv1.deptname, type:varchar(256), comment:null), FieldSchema(name:mv1.hire_date, type:timestamp, comment:null)], properties:null)
> INFO  : Completed compiling command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2); Time taken: 0.229 seconds
> INFO  : Executing command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2): select * from mv1
> INFO  : Completed executing command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2); Time taken: 0.01 seconds
> INFO  : OK
> +------------+---------------+----------------+
> | mv1.empid  | mv1.deptname  | mv1.hire_date  |
> +------------+---------------+----------------+
> +------------+---------------+----------------+
> No rows selected (0.276 seconds)
> {code}
> It does not fail: the underlying data has not changed, and the materialized view itself is still treated as valid.
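> From Hive, SHOW MATERIALIZED VIEWS can be used to confirm that mv1 is still registered; a sketch, assuming standard Hive 3 syntax:
> {code:java}
> -- Sketch only: list materialized views in the current database; mv1 is
> -- expected to still appear here despite the broken base table.
> SHOW MATERIALIZED VIEWS;
> {code}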
>  
> 10) Insert data into the "depts" table and rebuild the MV:
> {code:java}
> $ INSERT INTO TABLE depts VALUES (101,'IT',25);
> $ INSERT INTO TABLE depts VALUES (102,'Eng',11);
> 0: jdbc:hive2://c1122-node2.squadron.support.> ALTER MATERIALIZED VIEW mvs.mv1 REBUILD;
> Error: Error while compiling statement: FAILED: SemanticException Unable to fetch table emps. null (state=42000,code=40000)
> {code}
> This fails as expected.
>  


