You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/12/10 10:04:21 UTC

[GitHub] [hive] kgyrtkirk commented on a change in pull request #2636: HIVE-24776:Reduce HMS DB calls during stats updates

kgyrtkirk commented on a change in pull request #2636:
URL: https://github.com/apache/hive/pull/2636#discussion_r766527736



##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
##########
@@ -346,11 +346,12 @@ public void alterTable(RawStore msdb, Warehouse wh, String catName, String dbnam
             }
           }
           Deadline.checkTimeout();
+          Table table = msdb.getTable(catName, newDbName, newTblName);
           for (Entry<Partition, ColumnStatistics> partColStats : columnStatsNeedUpdated.entries()) {
             ColumnStatistics newPartColStats = partColStats.getValue();
             newPartColStats.getStatsDesc().setDbName(newDbName);
             newPartColStats.getStatsDesc().setTableName(newTblName);
-            msdb.updatePartitionColumnStatistics(newPartColStats, partColStats.getKey().getValues(),
+            msdb.updatePartitionColumnStatistics(table, newPartColStats, partColStats.getKey().getValues(),

Review comment:
       looking at the above code - I'm wondering why we need this at all; I believe `alterPartitions` clears the stat data - but I've not seen it explicitly - and this added logic here adds it back after that was done
   
   * old values are really removed ?
   * can't we simply retain the old stat values - because at the end of the day that's what happens here...or I've missed something? - doing this would reduce the number of calls drastically; since we would simply retain things
   
   It also seems like the `alterPartitions` / `alterTable` is also killing the basic stat state - after a rename I don't think we must do that....
   

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##########
@@ -9687,31 +9687,35 @@ private void writeMPartitionColumnStatistics(Table table, Partition partition,
       List<ColumnStatisticsObj> statsObjs = colStats.getStatsObj();
       ColumnStatisticsDesc statsDesc = colStats.getStatsDesc();
       String catName = statsDesc.isSetCatName() ? statsDesc.getCatName() : getDefaultCatalog(conf);
-      MTable mTable = ensureGetMTable(catName, statsDesc.getDbName(), statsDesc.getTableName());
-      Table table = convertToTable(mTable);
-      Partition partition = convertToPart(getMPartition(
-          catName, statsDesc.getDbName(), statsDesc.getTableName(), partVals, mTable), false);
-      List<String> colNames = new ArrayList<>();
+      MTable mTable = null;
+      if(table == null) {

Review comment:
       let's not play with `null` -s and alternate code paths
   you decided to change the method signature and added `table` ; fill it out everywhere or remove the parameter.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org