Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/11/09 15:10:00 UTC

[jira] [Work logged] (HIVE-24259) [CachedStore] Optimise get constraints call by removing redundant table check

     [ https://issues.apache.org/jira/browse/HIVE-24259?focusedWorklogId=509208&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-509208 ]

ASF GitHub Bot logged work on HIVE-24259:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Nov/20 15:09
            Start Date: 09/Nov/20 15:09
    Worklog Time Spent: 10m 
      Work Description: sankarh commented on a change in pull request #1610:
URL: https://github.com/apache/hive/pull/1610#discussion_r519879870



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##########
@@ -2844,15 +2814,10 @@ public SQLAllTableConstraints getAllTableConstraints(String catName, String dbNa
       return rawStore.getAllTableConstraints(catName, dbName, tblName);
     }
 
-    Table tbl = sharedCache.getTableFromCache(catName, dbName, tblName);
-    if (tbl == null) {
-      // The table containing the constraints is not yet loaded in cache
-      return rawStore.getAllTableConstraints(catName, dbName, tblName);
-    }
     SQLAllTableConstraints constraints = sharedCache.listCachedAllTableConstraints(catName, dbName, tblName);
 
-    // if any of the constraint value is missing then there might be the case of partial constraints are stored in cached.
-    // So fall back to raw store for correct values
+    /* If constraint value is missing then there might be the case that table is not stored in cached or

Review comment:
       Add a TODO noting that this check is inefficient: all calls will hit the rawstore if even one of the constraint types is not set for the given table.

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
##########
@@ -2397,7 +2397,7 @@ public SQLAllTableConstraints listCachedAllTableConstraints(String catName, Stri
 
   public List<SQLForeignKey> listCachedForeignKeys(String catName, String foreignDbName, String foreignTblName,
                                                    String parentDbName, String parentTblName) {
-    List<SQLForeignKey> keys = new ArrayList<>();
+    List<SQLForeignKey> keys = null;

Review comment:
       keys, now initialized to null, is still accessed at line 2413.
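
To illustrate the concern in the comment above, here is a minimal sketch of a lookup that returns null for "table not cached" and an empty list for "cached, but no matching keys". The class ForeignKeyLookupSketch and its methods are hypothetical and are not the SharedCache code; the reason for switching the initial value to null in the pull request is an assumption here. The point is only that the list is allocated before anything is added to it, and that null is returned solely on the not-cached path.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a foreign-key cache; not the real SharedCache API. */
class ForeignKeyLookupSketch {
    // Assumption for this sketch: keys are plain strings of the form "parentTable.column".
    private final Map<String, List<String>> keysByTable = new HashMap<>();

    void cacheTable(String table, List<String> keys) {
        keysByTable.put(table, new ArrayList<>(keys));
    }

    /**
     * Returns null when the table is not cached, so the caller can fall back
     * to another store. Returns a (possibly empty) list when it is cached.
     */
    List<String> listKeys(String table, String parentTable) {
        List<String> cached = keysByTable.get(table);
        if (cached == null) {
            return null; // not cached: nothing below touches a null list
        }
        // Allocate before use; adding to a null reference would throw a NullPointerException.
        List<String> keys = new ArrayList<>();
        for (String key : cached) {
            if (parentTable == null || key.startsWith(parentTable + ".")) {
                keys.add(key);
            }
        }
        return keys;
    }
}
```

A caller of such a method has to null-check the result before iterating over it, which is the cost of using null as the "not cached" signal.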

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##########
@@ -2836,14 +2836,32 @@ long getPartsFound() {
   @Override
   public SQLAllTableConstraints getAllTableConstraints(String catName, String dbName, String tblName)
       throws MetaException, NoSuchObjectException {
-    SQLAllTableConstraints sqlAllTableConstraints = new SQLAllTableConstraints();
-    sqlAllTableConstraints.setPrimaryKeys(getPrimaryKeys(catName, dbName, tblName));
-    sqlAllTableConstraints.setForeignKeys(getForeignKeys(catName, null, null, dbName, tblName));
-    sqlAllTableConstraints.setUniqueConstraints(getUniqueConstraints(catName, dbName, tblName));
-    sqlAllTableConstraints.setDefaultConstraints(getDefaultConstraints(catName, dbName, tblName));
-    sqlAllTableConstraints.setCheckConstraints(getCheckConstraints(catName, dbName, tblName));
-    sqlAllTableConstraints.setNotNullConstraints(getNotNullConstraints(catName, dbName, tblName));
-    return sqlAllTableConstraints;
+
+    catName = StringUtils.normalizeIdentifier(catName);
+    dbName = StringUtils.normalizeIdentifier(dbName);
+    tblName = StringUtils.normalizeIdentifier(tblName);
+    if (!shouldCacheTable(catName, dbName, tblName) || (canUseEvents && rawStore.isActiveTransaction())) {
+      return rawStore.getAllTableConstraints(catName, dbName, tblName);
+    }
+
+    Table tbl = sharedCache.getTableFromCache(catName, dbName, tblName);
+    if (tbl == null) {
+      // The table containing the constraints is not yet loaded in cache
+      return rawStore.getAllTableConstraints(catName, dbName, tblName);
+    }
+    SQLAllTableConstraints constraints = sharedCache.listCachedAllTableConstraints(catName, dbName, tblName);
+
+    // if any of the constraint value is missing then there might be the case of partial constraints are stored in cached.
+    // So fall back to raw store for correct values
+    if (constraints != null && CollectionUtils.isNotEmpty(constraints.getPrimaryKeys()) && CollectionUtils

Review comment:
       Why is it hard to set a flag for consistency? If we hit the else flow in this case, we should try to update the cache; if that fails for some reason, set the flag to false, otherwise set it to true. The same applies to the pre-warm and refresh flows.
   Also, we then need to check only the flag to validate the consistency of the cache. There is no need to check for null or empty constraints.
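
The flag-based approach suggested in the comment above could look roughly like the following sketch. Everything in it is hypothetical: ConstraintCacheSketch, Entry, and the string-based constraints are illustrative stand-ins, not the CachedStore or SharedCache API, and the raw store is modelled as a plain function. It shows a read path that trusts only a per-table consistency flag, so a table with no constraints at all is still served from the cache after the first call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical stand-in for a constraint cache; not the real CachedStore API. */
class ConstraintCacheSketch {
    /** Cached constraints plus a flag saying whether they can be trusted. */
    static final class Entry {
        final List<String> constraints;
        final boolean consistent;

        Entry(List<String> constraints, boolean consistent) {
            this.constraints = constraints;
            this.consistent = consistent;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    // Counts raw-store reads for demonstration only; this counter is not thread-safe.
    int rawStoreCalls = 0;

    List<String> getAllConstraints(String table, Function<String, List<String>> rawStore) {
        Entry entry = cache.get(table);
        if (entry != null && entry.consistent) {
            // Only the flag is checked; an empty list is a valid cached answer.
            return entry.constraints;
        }
        rawStoreCalls++;
        List<String> fresh = rawStore.apply(table);
        try {
            // Try to update the cache; mark the entry consistent on success.
            cache.put(table, new Entry(new ArrayList<>(fresh), true));
        } catch (RuntimeException e) {
            // The update failed (here: a null result), so mark the entry inconsistent.
            cache.put(table, new Entry(Collections.<String>emptyList(), false));
        }
        return fresh;
    }
}
```

In this sketch a pre-warm or refresh path would set the same flag in the same way, so the read path never needs to inspect the individual constraint lists.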






Issue Time Tracking
-------------------

    Worklog Id:     (was: 509208)
    Time Spent: 1h 10m  (was: 1h)

> [CachedStore] Optimise get constraints call by removing redundant table check 
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-24259
>                 URL: https://issues.apache.org/jira/browse/HIVE-24259
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Ashish Sharma
>            Assignee: Ashish Sharma
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Description -
> Problem - 
> 1. Redundant check of whether the table is present.
> 2. Currently, getting all constraints from the cached store takes 6 separate calls within the cached store, which leads to 6 separate calls to the raw store.
>  
> DOD
> 1. Check only once whether the table exists in the cached store.
> 2. Instead of fetching each constraint type individually from the cached store, add a method that returns all constraints at once and falls back to the rawstore if the data is not consistent.


