You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/02/26 09:56:04 UTC

[GitHub] [hive] kasakrisz commented on a change in pull request #2013: HIVE-24820: MaterializedViewCache enables adding multiple entries of the same Materialization instance

kasakrisz commented on a change in pull request #2013:
URL: https://github.com/apache/hive/pull/2013#discussion_r583517959



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/MaterializedViewsCache.java
##########
@@ -80,52 +80,11 @@ public void putIfAbsent(Table materializedViewTable, HiveRelOptMaterialization m
     return dbMap;
   }
 
-  public void refresh(
-          Table oldMaterializedViewTable, Table materializedViewTable, HiveRelOptMaterialization newMaterialization) {
-    ConcurrentMap<String, HiveRelOptMaterialization> dbMap = ensureDbMap(materializedViewTable);
-
-    dbMap.compute(materializedViewTable.getTableName(), (mvTableName, existingMaterialization) -> {
-      List<HiveRelOptMaterialization> optMaterializationList = sqlToMaterializedView.computeIfAbsent(
-              materializedViewTable.getViewExpandedText(), s -> new ArrayList<>());
+  public void refresh(Table materializedViewTable, HiveRelOptMaterialization newMaterialization) {
+    remove(materializedViewTable.getDbName(), materializedViewTable.getTableName());

Review comment:
       I updated the implementation of `refresh` to have the logic within a single operation.
   
   What is the reason of checking the old and the cached version are the same?
   
   Please consider the following usecase: we have three threads and two HS2 instances
   0. T1 and T2 are runing on the same HS2 instance Both are executing the same query which can be rewritten to scan the same mat1 materialized view which is already in the cache and up to date.
   1. Both threads generates the new plan. During plan generation both threads got a reference to the V1 version of mat1 form the cache.
   2. Parallelly a T3 thread rebuilds the MV from an other HS2 instance. -> mat1 has a V2 version in Metastore.
   3. T1 thread runs 'Hive.validateMaterializedViewsFromRegistry' and refresh the cache. It can do it because the T1 holds the V1 version of mat1 and the cache also has V1 version. After refresh the cache has V2 of mat1.
   4. T3 thread rebuilds the MV again from an other HS2 instance. -> mat1 has a V3 version in Metastore.
   5. T2 thread runs 'Hive.validateMaterializedViewsFromRegistry' and tries to refresh the cache. It can not do it because T2 holds the V1 version of mat1 and the cache has V2 version. -> The cache still has V2.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org