You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Pablo Junge (Jira)" <ji...@apache.org> on 2020/02/28 21:32:00 UTC
[jira] [Updated] (HIVE-22566) Drop table involved in materialized
view leaves the table in inconsistent state
[ https://issues.apache.org/jira/browse/HIVE-22566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pablo Junge updated HIVE-22566:
-------------------------------
Attachment: HIVE-22566.patch
Status: Patch Available (was: Open)
> Drop table involved in materialized view leaves the table in inconsistent state
> -------------------------------------------------------------------------------
>
> Key: HIVE-22566
> URL: https://issues.apache.org/jira/browse/HIVE-22566
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 3.1.0
> Reporter: Pablo Junge
> Assignee: Pablo Junge
> Priority: Minor
> Attachments: HIVE-22566.patch
>
>
> If you try dropping a table which is part of the definition of a created materialized view, the table is not dropped, which is the desired state as it is part of the materialized view.
> However, there was a "drop" call to the table, so it tried to drop it but did not succeed, leaving it in an inconsistent state.
>
> Repro:
> -------
> 1) Create tables:
>
> {code:java}
> CREATE TABLE emps ( empid INT, deptno INT, name VARCHAR(256), salary FLOAT, hire_date TIMESTAMP)STORED AS ORC TBLPROPERTIES ('transactional'='true');
> CREATE TABLE depts ( deptno INT, deptname VARCHAR(256), locationid INT)STORED AS ORC TBLPROPERTIES ('transactional'='true');
> {code}
>
> 2) Create the VM:
>
> {code:java}
> CREATE MATERIALIZED VIEW mv1 AS SELECT empid, deptname, hire_date FROM emps JOIN depts ON (emps.deptno = depts.deptno) WHERE hire_date >= '2016-01-01';
> {code}
>
> 3) Following is in backend database at this point:
>
> {code:java}
> mysql> select TBL_ID, DB_ID, SD_ID, TBL_NAME, TBL_TYPE from TBLS where DB_ID=16;
> +--------+-------+-------+----------+-------------------+
> | TBL_ID | DB_ID | SD_ID | TBL_NAME | TBL_TYPE |
> +--------+-------+-------+----------+-------------------+
> | 81 | 16 | 81 | emps | MANAGED_TABLE |
> | 83 | 16 | 83 | depts | MANAGED_TABLE |
> | 84 | 16 | 84 | mv1 | MATERIALIZED_VIEW |
> +--------+-------+-------+----------+-------------------+
> 3 rows in set (0.00 sec)
> {code}
>
> 4) Let's drop the 'emps' table:
>
> {code:java}
> 0: jdbc:hive2://c1122-node2.squadron.support.> drop table emps;
> INFO : Compiling command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b): drop table emps
> INFO : Semantic Analysis Completed (retrial = false)
> INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO : Completed compiling command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b); Time taken: 0.05 seconds
> INFO : Executing command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b): drop table emps
> INFO : Starting task [Stage-0:DDL] in serial mode
> INFO : Completed executing command(queryId=hive_20191202200025_c13079d0-8695-4485-8a18-14804b8b014b); Time taken: 10.281 seconds
> INFO : OK
> No rows affected (16.949 seconds)
> {code}
> No issue displayed
>
> 5) List tables:
>
> {code:java}
> 0: jdbc:hive2://c1122-node2.squadron.support.> show tables;
> INFO : Compiling command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413): show tables
> INFO : Semantic Analysis Completed (retrial = false)
> INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
> INFO : Completed compiling command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413); Time taken: 0.041 seconds
> INFO : Executing command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413): show tables
> INFO : Starting task [Stage-0:DDL] in serial mode
> INFO : Completed executing command(queryId=hive_20191202200125_ca12565b-1d4d-4433-a602-ecf685863413); Time taken: 0.016 seconds
> INFO : OK
> +-----------+
> | tab_name |
> +-----------+
> | depts |
> | emps |
> +-----------+
> 2 rows selected (0.08 seconds)
> {code}
>
> 6) Now, from the backend-db point of view:
>
> {code:java}
> mysql> select TBL_ID, DB_ID, SD_ID, TBL_NAME, TBL_TYPE from TBLS where DB_ID=16;
> +--------+-------+-------+----------+-------------------+
> | TBL_ID | DB_ID | SD_ID | TBL_NAME | TBL_TYPE |
> +--------+-------+-------+----------+-------------------+
> | 81 | 16 | NULL | emps | MANAGED_TABLE |
> | 83 | 16 | 83 | depts | MANAGED_TABLE |
> | 84 | 16 | 84 | mv1 | MATERIALIZED_VIEW |
> +--------+-------+-------+----------+-------------------+
> 3 rows in set (0.00 sec)
> {code}
> The table is left with NULL in SD_ID, making it not available.
>
> 7) From Metastore.log
>
> {code:java}
> 2019-12-02T20:00:25,545 INFO [pool-6-thread-195]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(907)) - 196: source:172.25.34.150 drop_table : tbl=hive.mvs.emps
> 2019-12-02T20:00:25,545 INFO [pool-6-thread-195]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(349)) - ugi=hive ip=172.25.34.150 cmd=source:172.25.34.150 drop_table : tbl=hive.mvs.emps
> 2019-12-02T20:00:25,580 INFO [pool-6-thread-195]: metastore.ObjectStore$RetryingExecutor (ObjectStore.java:run(9966)) - Attempting to acquire the DB log notification lock: 0 out of 10 retries
> javax.jdo.JDODataStoreException: Error executing SQL query "select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update".
> at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543) ~[datanucleus-api-jdo-4.2.4.jar:?]
> at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) ~[datanucleus-api-jdo-4.2.4.jar:?]
> at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:216) ~[datanucleus-api-jdo-4.2.4.jar:?]
> at org.apache.hadoop.hive.metastore.ObjectStore.lambda$lockForUpdate$0(ObjectStore.java:9936) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.hadoop.hive.metastore.ObjectStore$RetryingExecutor.run(ObjectStore.java:9963) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.hadoop.hive.metastore.ObjectStore.lockForUpdate(ObjectStore.java:9938) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:10002) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at sun.reflect.GeneratedMethodAccessor55.invoke(Unknown Source) ~[?:?]
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at com.sun.proxy.$Proxy28.addNotificationEvent(Unknown Source) [?:?]
> at org.apache.hive.hcatalog.listener.DbNotificationListener.process(DbNotificationListener.java:968) [hive-hcatalog-server-extensions-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.hive.hcatalog.listener.DbNotificationListener.onDropTable(DbNotificationListener.java:198) [hive-hcatalog-server-extensions-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier$19.notify(MetaStoreListenerNotifier.java:99) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:273) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:335) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:2670) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:2842) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112]
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112]
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
> at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at com.sun.proxy.$Proxy30.drop_table_with_environment_context(Unknown Source) [?:?]
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_table_with_environment_context.getResult(ThriftHiveMetastore.java:15533) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_table_with_environment_context.getResult(ThriftHiveMetastore.java:15517) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at java.security.AccessController.doPrivileged(Native Method) [?:1.8.0_112]
> at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_112]
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) [hadoop-common-3.1.1.3.1.0.0-78.jar:?]
> at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
> Caused by: java.sql.BatchUpdateException: Cannot delete or update a parent row: a foreign key constraint fails ("hive"."MV_TABLES_USED", CONSTRAINT "MV_TABLES_USED_FK2" FOREIGN KEY ("TBL_ID") REFERENCES "TBLS" ("TBL_ID"))
> at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2058) ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1471) ~[mysql-connector-java.jar:?]
> at com.zaxxer.hikari.pool.ProxyStatement.executeBatch(ProxyStatement.java:125) ~[HikariCP-2.6.1.jar:?]
> at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeBatch(HikariProxyPreparedStatement.java) ~[HikariCP-2.6.1.jar:?]
> at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeBatch(ParamLoggingPreparedStatement.java:366) ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.rdbms.SQLController.processConnectionStatement(SQLController.java:676) ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:319) ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getPreparedStatementForQuery(RDBMSQueryUtils.java:211) ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.rdbms.query.SQLQuery.performExecute(SQLQuery.java:633) ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.query.Query.executeQuery(Query.java:1855) ~[datanucleus-core-4.1.17.jar:?]
> at org.datanucleus.store.rdbms.query.SQLQuery.executeWithArray(SQLQuery.java:807) ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.query.Query.execute(Query.java:1726) ~[datanucleus-core-4.1.17.jar:?]
> at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:374) ~[datanucleus-api-jdo-4.2.4.jar:?]
> ... 37 more
> Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Cannot delete or update a parent row: a foreign key constraint fails ("hive"."MV_TABLES_USED", CONSTRAINT "MV_TABLES_USED_FK2" FOREIGN KEY ("TBL_ID") REFERENCES "TBLS" ("TBL_ID"))
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_112]
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_112]
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_112]
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_112]
> at com.mysql.jdbc.Util.handleNewInstance(Util.java:411) ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.Util.getInstance(Util.java:386) ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1041) ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4187) ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4119) ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2570) ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2731) ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2820) ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2159) ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2462) ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:2010) ~[mysql-connector-java.jar:?]
> at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1471) ~[mysql-connector-java.jar:?]
> at com.zaxxer.hikari.pool.ProxyStatement.executeBatch(ProxyStatement.java:125) ~[HikariCP-2.6.1.jar:?]
> at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeBatch(HikariProxyPreparedStatement.java) ~[HikariCP-2.6.1.jar:?]
> at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeBatch(ParamLoggingPreparedStatement.java:366) ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.rdbms.SQLController.processConnectionStatement(SQLController.java:676) ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.rdbms.SQLController.getStatementForQuery(SQLController.java:319) ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getPreparedStatementForQuery(RDBMSQueryUtils.java:211) ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.rdbms.query.SQLQuery.performExecute(SQLQuery.java:633) ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.query.Query.executeQuery(Query.java:1855) ~[datanucleus-core-4.1.17.jar:?]
> at org.datanucleus.store.rdbms.query.SQLQuery.executeWithArray(SQLQuery.java:807) ~[datanucleus-rdbms-4.1.19.jar:?]
> at org.datanucleus.store.query.Query.execute(Query.java:1726) ~[datanucleus-core-4.1.17.jar:?]
> at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:374) ~[datanucleus-api-jdo-4.2.4.jar:?]
> ... 37 more
> {code}
>
>
> 8) If you try to query the table:
>
> {code:java}
> 0: jdbc:hive2://c1122-node2.squadron.support.> select * from emps;
> Error: Error while compiling statement: FAILED: SemanticException Unable to fetch table emps. null (state=42000,code=40000)
> {code}
>
> It fails as expected.
> 9) If you try to query the MV:
> {code:java}
> 0: jdbc:hive2://c1122-node2.squadron.support.> select * from mv1; INFO : Compiling command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2): select * from mv1 INFO : Semantic Analysis Completed (retrial = false) INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:mv1.empid, type:int, comment:null), FieldSchema(name:mv1.deptname, type:varchar(256), comment:null), FieldSchema(name:mv1.hire_date, type:timestamp, comment:null)], properties:null) INFO : Completed compiling command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2); Time taken: 0.229 seconds INFO : Executing command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2): select * from mv1 INFO : Completed executing command(queryId=hive_20191202200818_91bf194d-8133-4670-b8d5-542ee56b6cc2); Time taken: 0.01 seconds INFO : OK +------------+---------------+----------------+ | mv1.empid | mv1.deptname | mv1.hire_date | +------------+---------------+----------------+ +------------+---------------+----------------+ No rows selected (0.276 seconds)
> {code}
> It does not fail, as the underlying data has not changed, and the table is still being shown as valid.
>
> 10) Insert data into "depts" table and rebuild the mv.
> {code:java}
> $ INSERT INTO TABLE depts VALUES (101,'IT',25);
> $ INSERT INTO TABLE depts VALUES (102,'Eng',11);
> 0: jdbc:hive2://c1122-node2.squadron.support.> ALTER MATERIALIZED VIEW mvs.mv1 REBUILD;
> Error: Error while compiling statement: FAILED: SemanticException Unable to fetch table emps. null (state=42000,code=40000)
> {code}
> This fails as expected.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)