Posted to dev@hive.apache.org by "Alexander Kolbasov (JIRA)" <ji...@apache.org> on 2018/03/07 00:01:00 UTC

[jira] [Created] (HIVE-18885) Cascaded alter table + notifications = disaster

Alexander Kolbasov created HIVE-18885:
-----------------------------------------

             Summary: Cascaded alter table + notifications = disaster
                 Key: HIVE-18885
                 URL: https://issues.apache.org/jira/browse/HIVE-18885
             Project: Hive
          Issue Type: Bug
          Components: Hive, Metastore
    Affects Versions: 3.0.0
            Reporter: Alexander Kolbasov


The problem is visible from reading the code, but it has also caused severe problems for a real-life Hive user.

When {{alter table}} has the {{cascade}} option, it does the following:
{code:java}
msdb.openTransaction();
...
// Load every partition of the table (-1 means no limit).
List<Partition> parts = msdb.getPartitions(dbname, name, -1);
for (Partition part : parts) {
  List<FieldSchema> oldCols = part.getSd().getCols();
  part.getSd().setCols(newt.getSd().getCols());
  String oldPartName = Warehouse.makePartName(oldt.getPartitionKeys(), part.getValues());
  updatePartColumnStatsForAlterColumns(msdb, part, oldPartName, part.getValues(), oldCols, part);
  msdb.alterPartition(dbname, name, part.getValues(), part);
}
{code}

So it walks all partitions (and this may be a huge list) and performs non-trivial operations on each of them, all in one single uber-transaction.

When DbNotificationListener is enabled, it adds an event for each partition, all while holding a row lock on the NOTIFICATION_SEQUENCE table. As a result, no other write DDL can proceed while this is happening. This can cause DB lock timeouts, which trigger HMS-level operation retries, which make things even worse.
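
The blocking pattern can be illustrated with a toy model. This is only a sketch and not Hive code: the class NotificationLockSketch, its run method and the Result fields are all hypothetical names, and an in-process ReentrantLock stands in for the database row lock on NOTIFICATION_SEQUENCE. One thread plays the cascaded alter table and keeps the lock while it adds one event per partition; the calling thread plays another write DDL and tries to take the same lock before and after the "commit".

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Toy model only: every name here is hypothetical, nothing is taken from Hive.
class NotificationLockSketch {

    static final class Result {
        final int eventsAdded;
        final boolean otherDdlGotLockDuringCascade;
        final boolean otherDdlGotLockAfterCommit;

        Result(int eventsAdded, boolean during, boolean after) {
            this.eventsAdded = eventsAdded;
            this.otherDdlGotLockDuringCascade = during;
            this.otherDdlGotLockAfterCommit = after;
        }
    }

    static Result run(int partitionCount) {
        // Stands in for the row lock on NOTIFICATION_SEQUENCE (assumption).
        ReentrantLock sequenceLock = new ReentrantLock();
        CountDownLatch cascadeHoldsLock = new CountDownLatch(1);
        CountDownLatch otherDdlTried = new CountDownLatch(1);
        int[] events = {0};

        Thread cascade = new Thread(() -> {
            sequenceLock.lock(); // held for the whole "transaction"
            try {
                cascadeHoldsLock.countDown();
                for (int i = 0; i < partitionCount; i++) {
                    events[0]++; // one notification event per partition
                }
                // Model a long transaction: stay open until the other DDL has tried.
                otherDdlTried.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                sequenceLock.unlock(); // released only at "commit"
            }
        });
        cascade.start();

        try {
            cascadeHoldsLock.await();
            // The calling thread acts as a second write DDL.
            boolean during = sequenceLock.tryLock();
            if (during) {
                sequenceLock.unlock();
            }
            otherDdlTried.countDown();
            cascade.join();
            boolean after = sequenceLock.tryLock();
            if (after) {
                sequenceLock.unlock();
            }
            return new Result(events[0], during, after);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running the sketch", e);
        }
    }

    public static void main(String[] args) {
        Result r = run(1000);
        System.out.println("events added: " + r.eventsAdded);
        System.out.println("other DDL got lock during cascade: " + r.otherDdlGotLockDuringCascade);
        System.out.println("other DDL got lock after commit: " + r.otherDdlGotLockAfterCommit);
    }
}
```

In this model the second writer fails immediately because it uses tryLock; against a real database it would instead wait for the row lock and eventually hit a lock timeout, which is the behaviour described above.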

In one particular case this pretty much made HMS unusable.


