You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2022/01/28 10:06:14 UTC

[GitHub] [hive] zabetak commented on a change in pull request #2984: HIVE-25900: Materialized view registry does not clean non existing views at refresh

zabetak commented on a change in pull request #2984:
URL: https://github.com/apache/hive/pull/2984#discussion_r794306957



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
##########
@@ -97,23 +98,83 @@
   private static final Logger LOG = LoggerFactory.getLogger(HiveMaterializedViewsRegistry.class);
   private static final String CLASS_NAME = HiveMaterializedViewsRegistry.class.getName();
 
-  /* Singleton */
-  private static final HiveMaterializedViewsRegistry SINGLETON = new HiveMaterializedViewsRegistry();
+  /**
+   * Registry for materialized views. The goal of this cache is to avoid parsing and creating
+   * logical plans for the materialized views at query runtime. When a query arrives, we will
+   * just need to consult this cache and extract the logical plans for the views (which had
+   * already been parsed) from it. This cache lives in HS2.
+   */

Review comment:
       It is a bit a confusing that we have the same javadoc twice (here and in the top level class).

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
##########
@@ -97,23 +98,83 @@
   private static final Logger LOG = LoggerFactory.getLogger(HiveMaterializedViewsRegistry.class);
   private static final String CLASS_NAME = HiveMaterializedViewsRegistry.class.getName();
 
-  /* Singleton */
-  private static final HiveMaterializedViewsRegistry SINGLETON = new HiveMaterializedViewsRegistry();
+  /**
+   * Registry for materialized views. The goal of this cache is to avoid parsing and creating
+   * logical plans for the materialized views at query runtime. When a query arrives, we will
+   * just need to consult this cache and extract the logical plans for the views (which had
+   * already been parsed) from it. This cache lives in HS2.
+   */
+  public interface MaterializedViewsRegistry {
 
-  private final MaterializedViewsCache materializedViewsCache = new MaterializedViewsCache();
+    /**
+     * Adds a newly created materialized view to the cache.
+     */
+    default void createMaterializedView(HiveConf conf, Table materializedViewTable) { }
 
-  /* Whether the cache has been initialized or not. */
-  private final AtomicBoolean initialized = new AtomicBoolean(false);
+    /**
+     * Update the materialized view in the registry (if existing materialized view matches).
+     */
+    default void refreshMaterializedView(HiveConf conf, Table materializedViewTable) {
+
+    }
 
-  private HiveMaterializedViewsRegistry() {
+    /**
+     * Update the materialized view in the registry (if existing materialized view matches).
+     */
+    default void refreshMaterializedView(HiveConf conf, Table oldMaterializedViewTable, Table materializedViewTable) {
+
+    }
+
+    /**
+     * Removes the materialized view from the cache (based on table object equality), if exists.
+     */
+    default void dropMaterializedView(Table materializedViewTable) { }
+
+    /**
+     * Removes the materialized view from the cache (based on qualified name), if exists.
+     */
+    default void dropMaterializedView(String dbName, String tableName) { }
+
+    /**
+     * Returns all the materialized views enabled for Calcite based rewriting in the cache.
+     *
+     * @return the collection of materialized views, or the empty collection if none
+     */
+    default List<HiveRelOptMaterialization> getRewritingMaterializedViews() { return Collections.emptyList(); }
+
+    /**
+     * Returns the materialized views in the cache for the given database.
+     *
+     * @return the collection of materialized views, or the empty collection if none
+     */

Review comment:
       The javadoc does not correspond to what this method does.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
##########
@@ -97,23 +98,83 @@
   private static final Logger LOG = LoggerFactory.getLogger(HiveMaterializedViewsRegistry.class);
   private static final String CLASS_NAME = HiveMaterializedViewsRegistry.class.getName();
 
-  /* Singleton */
-  private static final HiveMaterializedViewsRegistry SINGLETON = new HiveMaterializedViewsRegistry();
+  /**
+   * Registry for materialized views. The goal of this cache is to avoid parsing and creating
+   * logical plans for the materialized views at query runtime. When a query arrives, we will
+   * just need to consult this cache and extract the logical plans for the views (which had
+   * already been parsed) from it. This cache lives in HS2.
+   */
+  public interface MaterializedViewsRegistry {
 
-  private final MaterializedViewsCache materializedViewsCache = new MaterializedViewsCache();
+    /**
+     * Adds a newly created materialized view to the cache.

Review comment:
       NIT: Use "cache" or "registry" but not both. Probably we should follow the name of the class/interface.
   
   It seems that some methods mention registry while others mention cache.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
##########
@@ -97,23 +98,83 @@
   private static final Logger LOG = LoggerFactory.getLogger(HiveMaterializedViewsRegistry.class);
   private static final String CLASS_NAME = HiveMaterializedViewsRegistry.class.getName();
 
-  /* Singleton */
-  private static final HiveMaterializedViewsRegistry SINGLETON = new HiveMaterializedViewsRegistry();
+  /**
+   * Registry for materialized views. The goal of this cache is to avoid parsing and creating
+   * logical plans for the materialized views at query runtime. When a query arrives, we will
+   * just need to consult this cache and extract the logical plans for the views (which had
+   * already been parsed) from it. This cache lives in HS2.
+   */
+  public interface MaterializedViewsRegistry {
 
-  private final MaterializedViewsCache materializedViewsCache = new MaterializedViewsCache();
+    /**
+     * Adds a newly created materialized view to the cache.
+     */
+    default void createMaterializedView(HiveConf conf, Table materializedViewTable) { }
 
-  /* Whether the cache has been initialized or not. */
-  private final AtomicBoolean initialized = new AtomicBoolean(false);
+    /**
+     * Update the materialized view in the registry (if existing materialized view matches).
+     */
+    default void refreshMaterializedView(HiveConf conf, Table materializedViewTable) {
+
+    }
 
-  private HiveMaterializedViewsRegistry() {
+    /**
+     * Update the materialized view in the registry (if existing materialized view matches).
+     */
+    default void refreshMaterializedView(HiveConf conf, Table oldMaterializedViewTable, Table materializedViewTable) {
+
+    }
+
+    /**
+     * Removes the materialized view from the cache (based on table object equality), if exists.
+     */
+    default void dropMaterializedView(Table materializedViewTable) { }
+
+    /**
+     * Removes the materialized view from the cache (based on qualified name), if exists.
+     */
+    default void dropMaterializedView(String dbName, String tableName) { }
+
+    /**
+     * Returns all the materialized views enabled for Calcite based rewriting in the cache.
+     *
+     * @return the collection of materialized views, or the empty collection if none
+     */
+    default List<HiveRelOptMaterialization> getRewritingMaterializedViews() { return Collections.emptyList(); }
+
+    /**
+     * Returns the materialized views in the cache for the given database.
+     *
+     * @return the collection of materialized views, or the empty collection if none
+     */
+    default HiveRelOptMaterialization getRewritingMaterializedView(
+            String dbName, String viewName, EnumSet<HiveRelOptMaterialization.RewriteAlgorithm> scope) {
+      return null;
+    }
+
+    default List<HiveRelOptMaterialization> getRewritingMaterializedViews(String querySql) {
+      return Collections.emptyList();
+    }
+
+    default boolean isEmpty() {
+      return true;
+    }

Review comment:
       Missing javadoc

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
##########
@@ -97,23 +98,83 @@
   private static final Logger LOG = LoggerFactory.getLogger(HiveMaterializedViewsRegistry.class);
   private static final String CLASS_NAME = HiveMaterializedViewsRegistry.class.getName();
 
-  /* Singleton */
-  private static final HiveMaterializedViewsRegistry SINGLETON = new HiveMaterializedViewsRegistry();
+  /**
+   * Registry for materialized views. The goal of this cache is to avoid parsing and creating
+   * logical plans for the materialized views at query runtime. When a query arrives, we will
+   * just need to consult this cache and extract the logical plans for the views (which had
+   * already been parsed) from it. This cache lives in HS2.
+   */
+  public interface MaterializedViewsRegistry {
 
-  private final MaterializedViewsCache materializedViewsCache = new MaterializedViewsCache();
+    /**
+     * Adds a newly created materialized view to the cache.
+     */
+    default void createMaterializedView(HiveConf conf, Table materializedViewTable) { }
 
-  /* Whether the cache has been initialized or not. */
-  private final AtomicBoolean initialized = new AtomicBoolean(false);
+    /**
+     * Update the materialized view in the registry (if existing materialized view matches).
+     */
+    default void refreshMaterializedView(HiveConf conf, Table materializedViewTable) {
+
+    }
 
-  private HiveMaterializedViewsRegistry() {
+    /**
+     * Update the materialized view in the registry (if existing materialized view matches).
+     */
+    default void refreshMaterializedView(HiveConf conf, Table oldMaterializedViewTable, Table materializedViewTable) {
+
+    }
+
+    /**
+     * Removes the materialized view from the cache (based on table object equality), if exists.
+     */
+    default void dropMaterializedView(Table materializedViewTable) { }
+
+    /**
+     * Removes the materialized view from the cache (based on qualified name), if exists.
+     */
+    default void dropMaterializedView(String dbName, String tableName) { }
+
+    /**
+     * Returns all the materialized views enabled for Calcite based rewriting in the cache.
+     *
+     * @return the collection of materialized views, or the empty collection if none
+     */
+    default List<HiveRelOptMaterialization> getRewritingMaterializedViews() { return Collections.emptyList(); }
+
+    /**
+     * Returns the materialized views in the cache for the given database.
+     *
+     * @return the collection of materialized views, or the empty collection if none
+     */
+    default HiveRelOptMaterialization getRewritingMaterializedView(
+            String dbName, String viewName, EnumSet<HiveRelOptMaterialization.RewriteAlgorithm> scope) {
+      return null;
+    }
+
+    default List<HiveRelOptMaterialization> getRewritingMaterializedViews(String querySql) {
+      return Collections.emptyList();
+    }
+
+    default boolean isEmpty() {
+      return true;

Review comment:
       ```suggestion
         return getRewritingMaterializedViews().isEmpty();
   ```

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
##########
@@ -139,109 +200,262 @@ public void init() {
     }
   }
 
-  public void init(Hive db) {
+  private static void init(Hive db) {
     final boolean dummy = db.getConf().get(HiveConf.ConfVars.HIVE_SERVER2_MATERIALIZED_VIEWS_REGISTRY_IMPL.varname)
         .equals("DUMMY");
     if (dummy) {
       // Dummy registry does not cache information and forwards all requests to metastore
-      initialized.set(true);
+      SINGLETON = new MaterializedViewsRegistry() {};
       LOG.info("Using dummy materialized views registry");
     } else {
+      SINGLETON = new InMemoryMaterializedViewsRegistry(HiveMaterializedViewsRegistry::createMaterialization);
       // We initialize the cache
       long period = HiveConf.getTimeVar(db.getConf(), ConfVars.HIVE_SERVER2_MATERIALIZED_VIEWS_REGISTRY_REFRESH, TimeUnit.SECONDS);
+      if (period <= 0) {
+        return;
+      }
+      
       ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor(
           new ThreadFactoryBuilder()
               .setDaemon(true)
               .setNameFormat("HiveMaterializedViewsRegistry-%d")
               .build());
-      pool.scheduleAtFixedRate(new Loader(db), 0, period, TimeUnit.SECONDS);
+
+      MaterializedViewObjects objects = db::getAllMaterializedViewObjectsForRewriting;
+      pool.scheduleAtFixedRate(new Loader(db.getConf(), SINGLETON, objects), 0, period, TimeUnit.SECONDS);
     }
   }
 
-  private class Loader implements Runnable {
-    private final Hive db;
+  public interface MaterializedViewObjects {
+    List<Table> getAllMaterializedViewObjectsForRewriting() throws HiveException;
+  }
 
-    private Loader(Hive db) {
-      this.db = db;
+  public static class Loader implements Runnable {
+    protected final HiveConf hiveConf;
+    protected final MaterializedViewsRegistry materializedViewsRegistry;
+    protected final MaterializedViewObjects materializedViewObjects;
+    /* Whether the cache has been initialized or not. */

Review comment:
       Bogus comment?
   ```suggestion
       /* Whether the cache has been initialized or not. */
   ```

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
##########
@@ -97,23 +98,83 @@
   private static final Logger LOG = LoggerFactory.getLogger(HiveMaterializedViewsRegistry.class);
   private static final String CLASS_NAME = HiveMaterializedViewsRegistry.class.getName();
 
-  /* Singleton */
-  private static final HiveMaterializedViewsRegistry SINGLETON = new HiveMaterializedViewsRegistry();
+  /**
+   * Registry for materialized views. The goal of this cache is to avoid parsing and creating
+   * logical plans for the materialized views at query runtime. When a query arrives, we will
+   * just need to consult this cache and extract the logical plans for the views (which had
+   * already been parsed) from it. This cache lives in HS2.
+   */
+  public interface MaterializedViewsRegistry {

Review comment:
       Why do we need to have both a class (`HiveMaterializedViewsRegistry`) and an inner interface (`MaterializedViewsRegistry`)? This pattern is not very easy to follow and it is not clear what are the responsibilities of each.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
##########
@@ -139,109 +200,262 @@ public void init() {
     }
   }
 
-  public void init(Hive db) {
+  private static void init(Hive db) {
     final boolean dummy = db.getConf().get(HiveConf.ConfVars.HIVE_SERVER2_MATERIALIZED_VIEWS_REGISTRY_IMPL.varname)
         .equals("DUMMY");
     if (dummy) {
       // Dummy registry does not cache information and forwards all requests to metastore
-      initialized.set(true);
+      SINGLETON = new MaterializedViewsRegistry() {};
       LOG.info("Using dummy materialized views registry");
     } else {
+      SINGLETON = new InMemoryMaterializedViewsRegistry(HiveMaterializedViewsRegistry::createMaterialization);
       // We initialize the cache
       long period = HiveConf.getTimeVar(db.getConf(), ConfVars.HIVE_SERVER2_MATERIALIZED_VIEWS_REGISTRY_REFRESH, TimeUnit.SECONDS);
+      if (period <= 0) {
+        return;
+      }
+      
       ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor(
           new ThreadFactoryBuilder()
               .setDaemon(true)
               .setNameFormat("HiveMaterializedViewsRegistry-%d")
               .build());
-      pool.scheduleAtFixedRate(new Loader(db), 0, period, TimeUnit.SECONDS);
+
+      MaterializedViewObjects objects = db::getAllMaterializedViewObjectsForRewriting;
+      pool.scheduleAtFixedRate(new Loader(db.getConf(), SINGLETON, objects), 0, period, TimeUnit.SECONDS);
     }
   }
 
-  private class Loader implements Runnable {
-    private final Hive db;
+  public interface MaterializedViewObjects {
+    List<Table> getAllMaterializedViewObjectsForRewriting() throws HiveException;
+  }

Review comment:
       The new interface gives the impression that we are adding complexity instead of simplifying things. Keeping a `List` of tables looks simpler.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
##########
@@ -97,23 +98,83 @@
   private static final Logger LOG = LoggerFactory.getLogger(HiveMaterializedViewsRegistry.class);
   private static final String CLASS_NAME = HiveMaterializedViewsRegistry.class.getName();
 
-  /* Singleton */
-  private static final HiveMaterializedViewsRegistry SINGLETON = new HiveMaterializedViewsRegistry();
+  /**
+   * Registry for materialized views. The goal of this cache is to avoid parsing and creating
+   * logical plans for the materialized views at query runtime. When a query arrives, we will
+   * just need to consult this cache and extract the logical plans for the views (which had
+   * already been parsed) from it. This cache lives in HS2.
+   */
+  public interface MaterializedViewsRegistry {
 
-  private final MaterializedViewsCache materializedViewsCache = new MaterializedViewsCache();
+    /**
+     * Adds a newly created materialized view to the cache.
+     */
+    default void createMaterializedView(HiveConf conf, Table materializedViewTable) { }
 
-  /* Whether the cache has been initialized or not. */
-  private final AtomicBoolean initialized = new AtomicBoolean(false);
+    /**
+     * Update the materialized view in the registry (if existing materialized view matches).
+     */
+    default void refreshMaterializedView(HiveConf conf, Table materializedViewTable) {
+
+    }
 
-  private HiveMaterializedViewsRegistry() {
+    /**
+     * Update the materialized view in the registry (if existing materialized view matches).
+     */
+    default void refreshMaterializedView(HiveConf conf, Table oldMaterializedViewTable, Table materializedViewTable) {
+
+    }
+
+    /**
+     * Removes the materialized view from the cache (based on table object equality), if exists.
+     */
+    default void dropMaterializedView(Table materializedViewTable) { }
+
+    /**
+     * Removes the materialized view from the cache (based on qualified name), if exists.
+     */
+    default void dropMaterializedView(String dbName, String tableName) { }
+
+    /**
+     * Returns all the materialized views enabled for Calcite based rewriting in the cache.
+     *
+     * @return the collection of materialized views, or the empty collection if none
+     */
+    default List<HiveRelOptMaterialization> getRewritingMaterializedViews() { return Collections.emptyList(); }
+
+    /**
+     * Returns the materialized views in the cache for the given database.
+     *
+     * @return the collection of materialized views, or the empty collection if none
+     */
+    default HiveRelOptMaterialization getRewritingMaterializedView(
+            String dbName, String viewName, EnumSet<HiveRelOptMaterialization.RewriteAlgorithm> scope) {
+      return null;
+    }
+
+    default List<HiveRelOptMaterialization> getRewritingMaterializedViews(String querySql) {
+      return Collections.emptyList();
+    }
+
+    default boolean isEmpty() {
+      return true;
+    }
   }
 
+  private HiveMaterializedViewsRegistry() {}
+
+  /* Singleton */
+  private static MaterializedViewsRegistry SINGLETON = null;
+
   /**
    * Get instance of HiveMaterializedViewsRegistry.
    *
    * @return the singleton
    */
-  public static HiveMaterializedViewsRegistry get() {
+  public static MaterializedViewsRegistry get() {
+    if (SINGLETON == null) {
+      init();
+    }

Review comment:
       This kind of lazy initialization pattern suffers from race conditions.
   
   Effective java: ITEM 71: USE LAZY INITIALIZATION JUDICIOUSLY
   Java concurrency in practice: 2.2.2. Example: Race Conditions in Lazy Initialization
   
   We may not care at the moment but if the way we access the registry changes slightly we may end up with hard to debug bugs.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
##########
@@ -139,109 +200,262 @@ public void init() {
     }
   }
 
-  public void init(Hive db) {
+  private static void init(Hive db) {
     final boolean dummy = db.getConf().get(HiveConf.ConfVars.HIVE_SERVER2_MATERIALIZED_VIEWS_REGISTRY_IMPL.varname)
         .equals("DUMMY");
     if (dummy) {
       // Dummy registry does not cache information and forwards all requests to metastore
-      initialized.set(true);
+      SINGLETON = new MaterializedViewsRegistry() {};
       LOG.info("Using dummy materialized views registry");
     } else {
+      SINGLETON = new InMemoryMaterializedViewsRegistry(HiveMaterializedViewsRegistry::createMaterialization);
       // We initialize the cache
       long period = HiveConf.getTimeVar(db.getConf(), ConfVars.HIVE_SERVER2_MATERIALIZED_VIEWS_REGISTRY_REFRESH, TimeUnit.SECONDS);
+      if (period <= 0) {
+        return;
+      }
+      
       ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor(
           new ThreadFactoryBuilder()
               .setDaemon(true)
               .setNameFormat("HiveMaterializedViewsRegistry-%d")
               .build());
-      pool.scheduleAtFixedRate(new Loader(db), 0, period, TimeUnit.SECONDS);
+
+      MaterializedViewObjects objects = db::getAllMaterializedViewObjectsForRewriting;
+      pool.scheduleAtFixedRate(new Loader(db.getConf(), SINGLETON, objects), 0, period, TimeUnit.SECONDS);
     }
   }
 
-  private class Loader implements Runnable {
-    private final Hive db;
+  public interface MaterializedViewObjects {
+    List<Table> getAllMaterializedViewObjectsForRewriting() throws HiveException;
+  }
 
-    private Loader(Hive db) {
-      this.db = db;
+  public static class Loader implements Runnable {

Review comment:
       Does the class need to be public?

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
##########
@@ -139,109 +200,262 @@ public void init() {
     }
   }
 
-  public void init(Hive db) {
+  private static void init(Hive db) {
     final boolean dummy = db.getConf().get(HiveConf.ConfVars.HIVE_SERVER2_MATERIALIZED_VIEWS_REGISTRY_IMPL.varname)
         .equals("DUMMY");
     if (dummy) {
       // Dummy registry does not cache information and forwards all requests to metastore
-      initialized.set(true);
+      SINGLETON = new MaterializedViewsRegistry() {};
       LOG.info("Using dummy materialized views registry");
     } else {
+      SINGLETON = new InMemoryMaterializedViewsRegistry(HiveMaterializedViewsRegistry::createMaterialization);
       // We initialize the cache
       long period = HiveConf.getTimeVar(db.getConf(), ConfVars.HIVE_SERVER2_MATERIALIZED_VIEWS_REGISTRY_REFRESH, TimeUnit.SECONDS);
+      if (period <= 0) {
+        return;
+      }
+      
       ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor(
           new ThreadFactoryBuilder()
               .setDaemon(true)
               .setNameFormat("HiveMaterializedViewsRegistry-%d")
               .build());
-      pool.scheduleAtFixedRate(new Loader(db), 0, period, TimeUnit.SECONDS);
+
+      MaterializedViewObjects objects = db::getAllMaterializedViewObjectsForRewriting;
+      pool.scheduleAtFixedRate(new Loader(db.getConf(), SINGLETON, objects), 0, period, TimeUnit.SECONDS);
     }
   }
 
-  private class Loader implements Runnable {
-    private final Hive db;
+  public interface MaterializedViewObjects {
+    List<Table> getAllMaterializedViewObjectsForRewriting() throws HiveException;
+  }
 
-    private Loader(Hive db) {
-      this.db = db;
+  public static class Loader implements Runnable {
+    protected final HiveConf hiveConf;
+    protected final MaterializedViewsRegistry materializedViewsRegistry;
+    protected final MaterializedViewObjects materializedViewObjects;
+    /* Whether the cache has been initialized or not. */
+
+    Loader(HiveConf hiveConf,
+            MaterializedViewsRegistry materializedViewsRegistry,
+            MaterializedViewObjects materializedViewObjects) {
+      this.hiveConf = hiveConf;
+      this.materializedViewsRegistry = materializedViewsRegistry;
+      this.materializedViewObjects = materializedViewObjects;
     }
 
     @Override
     public void run() {
-      SessionState ss = new SessionState(db.getConf());
-      ss.setIsHiveServerQuery(true); // All is served from HS2, we do not need e.g. Tez sessions
-      SessionState.start(ss);
-      PerfLogger perfLogger = SessionState.getPerfLogger();
+      refresh();
+    }
+
+    public void refresh() {
+      PerfLogger perfLogger = getPerfLogger();
       perfLogger.perfLogBegin(CLASS_NAME, PerfLogger.MATERIALIZED_VIEWS_REGISTRY_REFRESH);
       try {
-        if (initialized.get()) {
-          for (Table mvTable : db.getAllMaterializedViewObjectsForRewriting()) {
-            RelOptMaterialization existingMV = getRewritingMaterializedView(
-                    mvTable.getDbName(), mvTable.getTableName(), ALL);
-            if (existingMV != null) {
-              // We replace if the existing MV is not newer
-              Table existingMVTable = HiveMaterializedViewUtils.extractTable(existingMV);
-              if (existingMVTable.getCreateTime() < mvTable.getCreateTime() ||
-                  (existingMVTable.getCreateTime() == mvTable.getCreateTime() &&
-                      existingMVTable.getMVMetadata().getMaterializationTime() <= mvTable.getMVMetadata().getMaterializationTime())) {
-                refreshMaterializedView(db.getConf(), existingMVTable, mvTable);
-              }
-            } else {
-              // Simply replace if it still does not exist
-              refreshMaterializedView(db.getConf(), null, mvTable);
+        List<Table> materializedViewObjects = this.materializedViewObjects.getAllMaterializedViewObjectsForRewriting();
+        for (Table mvTable : materializedViewObjects) {
+          RelOptMaterialization existingMV = materializedViewsRegistry.getRewritingMaterializedView(
+                  mvTable.getDbName(), mvTable.getTableName(), ALL);
+          if (existingMV != null) {
+            // We replace if the existing MV is not newer
+            Table existingMVTable = HiveMaterializedViewUtils.extractTable(existingMV);
+            if (existingMVTable.getCreateTime() < mvTable.getCreateTime() ||
+                    (existingMVTable.getCreateTime() == mvTable.getCreateTime() &&
+                            existingMVTable.getMVMetadata().getMaterializationTime() <= mvTable.getMVMetadata().getMaterializationTime())) {
+              materializedViewsRegistry.refreshMaterializedView(hiveConf, existingMVTable, mvTable);
             }
+          } else {
+            // Simply replace if it still does not exist
+            materializedViewsRegistry.refreshMaterializedView(hiveConf, null, mvTable);
           }
-          LOG.info("Materialized views registry has been refreshed");
-        } else {
-          for (Table mvTable : db.getAllMaterializedViewObjectsForRewriting()) {
-            refreshMaterializedView(db.getConf(), null, mvTable);
+        }
+
+        for (HiveRelOptMaterialization materialization : materializedViewsRegistry.getRewritingMaterializedViews()) {
+          Table mvTableInCache = HiveMaterializedViewUtils.extractTable(materialization);
+          Table mvTableInHMS = materializedViewObjects.stream()
+                  .filter(table -> table.getDbName().equals(mvTableInCache.getDbName())
+                          && table.getTableName().equals(mvTableInCache.getTableName()))
+                  .findAny()
+                  .orElse(null);
+
+          if (mvTableInHMS == null) {
+            materializedViewsRegistry.dropMaterializedView(mvTableInCache);
           }
-          initialized.set(true);
-          LOG.info("Materialized views registry has been initialized");
         }
+
+        LOG.info("Materialized views registry has been refreshed");
       } catch (HiveException e) {
-        if (initialized.get()) {
-          LOG.error("Problem connecting to the metastore when refreshing the view registry", e);
-        } else {
-          LOG.error("Problem connecting to the metastore when initializing the view registry", e);
-        }
+        LOG.error("Problem connecting to the metastore when refreshing the view registry", e);
       }
       perfLogger.perfLogEnd(CLASS_NAME, PerfLogger.MATERIALIZED_VIEWS_REGISTRY_REFRESH);
     }
+
+    private PerfLogger getPerfLogger() {
+      SessionState ss = new SessionState(hiveConf);
+      ss.setIsHiveServerQuery(true); // All is served from HS2, we do not need e.g. Tez sessions
+      SessionState.start(ss);
+      return SessionState.getPerfLogger();
+    }

Review comment:
       Do we really need the perf logger? Using a session just to obtain a logger seems a bit of an overkill.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java
##########
@@ -139,109 +200,262 @@ public void init() {
     }
   }
 
-  public void init(Hive db) {
+  private static void init(Hive db) {
     final boolean dummy = db.getConf().get(HiveConf.ConfVars.HIVE_SERVER2_MATERIALIZED_VIEWS_REGISTRY_IMPL.varname)
         .equals("DUMMY");
     if (dummy) {
       // Dummy registry does not cache information and forwards all requests to metastore
-      initialized.set(true);
+      SINGLETON = new MaterializedViewsRegistry() {};
       LOG.info("Using dummy materialized views registry");
     } else {
+      SINGLETON = new InMemoryMaterializedViewsRegistry(HiveMaterializedViewsRegistry::createMaterialization);
       // We initialize the cache
       long period = HiveConf.getTimeVar(db.getConf(), ConfVars.HIVE_SERVER2_MATERIALIZED_VIEWS_REGISTRY_REFRESH, TimeUnit.SECONDS);
+      if (period <= 0) {
+        return;
+      }
+      
       ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor(
           new ThreadFactoryBuilder()
               .setDaemon(true)
               .setNameFormat("HiveMaterializedViewsRegistry-%d")
               .build());
-      pool.scheduleAtFixedRate(new Loader(db), 0, period, TimeUnit.SECONDS);
+
+      MaterializedViewObjects objects = db::getAllMaterializedViewObjectsForRewriting;
+      pool.scheduleAtFixedRate(new Loader(db.getConf(), SINGLETON, objects), 0, period, TimeUnit.SECONDS);
     }
   }
 
-  private class Loader implements Runnable {
-    private final Hive db;
+  public interface MaterializedViewObjects {
+    List<Table> getAllMaterializedViewObjectsForRewriting() throws HiveException;
+  }
 
-    private Loader(Hive db) {
-      this.db = db;
+  public static class Loader implements Runnable {
+    protected final HiveConf hiveConf;
+    protected final MaterializedViewsRegistry materializedViewsRegistry;
+    protected final MaterializedViewObjects materializedViewObjects;
+    /* Whether the cache has been initialized or not. */
+
+    Loader(HiveConf hiveConf,
+            MaterializedViewsRegistry materializedViewsRegistry,
+            MaterializedViewObjects materializedViewObjects) {
+      this.hiveConf = hiveConf;
+      this.materializedViewsRegistry = materializedViewsRegistry;
+      this.materializedViewObjects = materializedViewObjects;
     }
 
     @Override
     public void run() {
-      SessionState ss = new SessionState(db.getConf());
-      ss.setIsHiveServerQuery(true); // All is served from HS2, we do not need e.g. Tez sessions
-      SessionState.start(ss);
-      PerfLogger perfLogger = SessionState.getPerfLogger();
+      refresh();
+    }
+
+    public void refresh() {
+      PerfLogger perfLogger = getPerfLogger();
       perfLogger.perfLogBegin(CLASS_NAME, PerfLogger.MATERIALIZED_VIEWS_REGISTRY_REFRESH);
       try {
-        if (initialized.get()) {
-          for (Table mvTable : db.getAllMaterializedViewObjectsForRewriting()) {
-            RelOptMaterialization existingMV = getRewritingMaterializedView(
-                    mvTable.getDbName(), mvTable.getTableName(), ALL);
-            if (existingMV != null) {
-              // We replace if the existing MV is not newer
-              Table existingMVTable = HiveMaterializedViewUtils.extractTable(existingMV);
-              if (existingMVTable.getCreateTime() < mvTable.getCreateTime() ||
-                  (existingMVTable.getCreateTime() == mvTable.getCreateTime() &&
-                      existingMVTable.getMVMetadata().getMaterializationTime() <= mvTable.getMVMetadata().getMaterializationTime())) {
-                refreshMaterializedView(db.getConf(), existingMVTable, mvTable);
-              }
-            } else {
-              // Simply replace if it still does not exist
-              refreshMaterializedView(db.getConf(), null, mvTable);
+        List<Table> materializedViewObjects = this.materializedViewObjects.getAllMaterializedViewObjectsForRewriting();
+        for (Table mvTable : materializedViewObjects) {
+          RelOptMaterialization existingMV = materializedViewsRegistry.getRewritingMaterializedView(
+                  mvTable.getDbName(), mvTable.getTableName(), ALL);
+          if (existingMV != null) {
+            // We replace if the existing MV is not newer
+            Table existingMVTable = HiveMaterializedViewUtils.extractTable(existingMV);
+            if (existingMVTable.getCreateTime() < mvTable.getCreateTime() ||
+                    (existingMVTable.getCreateTime() == mvTable.getCreateTime() &&
+                            existingMVTable.getMVMetadata().getMaterializationTime() <= mvTable.getMVMetadata().getMaterializationTime())) {
+              materializedViewsRegistry.refreshMaterializedView(hiveConf, existingMVTable, mvTable);
             }
+          } else {
+            // Simply replace if it still does not exist
+            materializedViewsRegistry.refreshMaterializedView(hiveConf, null, mvTable);
           }
-          LOG.info("Materialized views registry has been refreshed");
-        } else {
-          for (Table mvTable : db.getAllMaterializedViewObjectsForRewriting()) {
-            refreshMaterializedView(db.getConf(), null, mvTable);
+        }
+
+        for (HiveRelOptMaterialization materialization : materializedViewsRegistry.getRewritingMaterializedViews()) {
+          Table mvTableInCache = HiveMaterializedViewUtils.extractTable(materialization);
+          Table mvTableInHMS = materializedViewObjects.stream()

Review comment:
       How do we keep `materializedViewObjects` up-to-date with the HMS? It seems that we obtain the views once when the `Loader` is created but then I don't see it changed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org