You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2020/07/10 21:42:35 UTC

[GitHub] [hive] vihangk1 commented on a change in pull request #1217: Hive 23767 : Pass ValidWriteIdList in get_* HMS API requests from HMS Client

vihangk1 commented on a change in pull request #1217:
URL: https://github.com/apache/hive/pull/1217#discussion_r453071142



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##########
@@ -3580,7 +3596,17 @@ public boolean dropPartition(String dbName, String tableName, List<String> parti
     List<String> pvals = MetaStoreUtils.getPvals(t.getPartCols(), partSpec);
 
     try {
-      names = getMSC().listPartitionNames(dbName, tblName, pvals, max);
+      GetPartitionNamesPsRequest req = new GetPartitionNamesPsRequest();

Review comment:
       Is there a reason why we don't send the ValidWriteIdList in the getPartitionNames method defined at line 3574?

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##########
@@ -1459,6 +1464,22 @@ public Table getTable(final String dbName, final String tableName, boolean throw
     return new Table(tTable);
   }
 
+  /**
+   * Get ValidWriteIdList for the current transaction.

Review comment:
       I think it would be good to have more details in this javadoc. Specifically, that this fetches the ValidWriteIdList from the metastore for the given table if txnManager has a open transaction.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##########
@@ -3911,20 +3962,27 @@ public boolean dropPartition(String dbName, String tableName, List<String> parti
    * @param tbl The table containing the partitions.
    * @param expr A serialized expression for partition predicates.
    * @param conf Hive config.
-   * @param result the resulting list of partitions
+   * @param partitions the resulting list of partitions
    * @return whether the resulting list contains partitions which may or may not match the expr
    */
   public boolean getPartitionsByExpr(Table tbl, ExprNodeGenericFuncDesc expr, HiveConf conf,
-      List<Partition> result) throws HiveException, TException {
-    assert result != null;
+      List<Partition> partitions) throws HiveException, TException {
+
+    assert partitions != null;

Review comment:
       May be use Preconditions.checkNotNull(partitions) here since assert is not guaranteed to be enabled always.

##########
File path: ql/src/test/org/apache/hadoop/hive/ql/parse/TestUpdateDeleteSemanticAnalyzer.java
##########
@@ -232,6 +235,8 @@ public void setup() throws Exception {
     conf.setVar(HiveConf.ConfVars.HIVEMAPREDMODE, "nonstrict");
     conf.setVar(HiveConf.ConfVars.HIVE_TXN_MANAGER, "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");
     conf.setBoolVar(HiveConf.ConfVars.HIVE_IN_TEST, true);
+    conf.set(ValidTxnList.VALID_TXNS_KEY,

Review comment:
       Why do we need this? If the objective is to cleanup the state may be conf.unset(ValidTxnList.VALID_TXN_KEY) is a better choice?

##########
File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
##########
@@ -1980,6 +1989,7 @@ public boolean listPartitionsSpecByExpr(String catName, String dbName, String tb
     if (maxParts >= 0) {
       req.setMaxParts(maxParts);
     }
+    req.setValidWriteIdList(validWriteIdList);

Review comment:
       Its confusing to me that we can two different ways to pass down the ValidWriteIdList. I see many other methods in this class which use getValidWriteIdList to set the ValidWriteIdList in the requests. Is it intentional to have them different for some methods? If yes, please document why we do it differently for some methods.
   
   On  a side note, the getValidWriteIdList method retrieves the ValidWriteIdList of the table from conf using the key VALID_TABLES_WRITEIDS_KEY which is set during the lock acquisition phase __after__ query compilation. Doesn't that defeat the purpose of sending the ValidWriteIdList since the objective is to compile the query for a given snapshot?

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##########
@@ -1459,6 +1464,22 @@ public Table getTable(final String dbName, final String tableName, boolean throw
     return new Table(tTable);
   }
 
+  /**
+   * Get ValidWriteIdList for the current transaction.
+   * @param dbName
+   * @param tableName
+   * @return
+   * @throws LockException
+   */
+  private ValidWriteIdList getValidWriteIdList(String dbName, String tableName) throws LockException {
+    ValidWriteIdList validWriteIdList = null;
+    long txnId = SessionState.get().getTxnMgr() != null ? SessionState.get().getTxnMgr().getCurrentTxnId() : 0;
+    if (txnId > 0) {

Review comment:
       I think we should we explicitly checking conf.get(ValidTxnList.VALID_TXNS_KEY) != null here instead of txnId > 0 since txnId > 0 doesn't guarantee that validTransactions have been recorded in the conf. On a side note, its bit weird DbTxnManager code using assert statements instead of Preconditions since jvm disables assert statements by default. So for example, if a given code-path doesn't have VALID_TXNS_KEY set in the conf, this method will do and fetch the current ValidWriteIdList for the table which is not the snapshot view of the table during query compilation.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org