You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by "deniskuzZ (via GitHub)" <gi...@apache.org> on 2023/03/01 14:27:39 UTC

[GitHub] [hive] deniskuzZ commented on a diff in pull request #3576: HIVE-26704: Cleaner shouldn't be blocked by global min open txnId

deniskuzZ commented on code in PR #3576:
URL: https://github.com/apache/hive/pull/3576#discussion_r1121822320


##########
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java:
##########
@@ -140,41 +140,36 @@ public void run() {
                     HiveConf.ConfVars.HIVE_COMPACTOR_CLEANER_DURATION_UPDATE_INTERVAL, TimeUnit.MILLISECONDS),
                     new CleanerCycleUpdater(MetricsConstants.COMPACTION_CLEANER_CYCLE_DURATION, startedAt));
           }
-
           long minOpenTxnId = txnHandler.findMinOpenTxnIdForCleaner();
-
           checkInterrupt();
 
           List<CompactionInfo> readyToClean = txnHandler.findReadyToClean(minOpenTxnId, retentionTime);
-
           checkInterrupt();
 
           if (!readyToClean.isEmpty()) {
-            long minTxnIdSeenOpen = txnHandler.findMinTxnIdSeenOpen();
-            final long cleanerWaterMark =
-                minTxnIdSeenOpen < 0 ? minOpenTxnId : Math.min(minOpenTxnId, minTxnIdSeenOpen);
-
-            LOG.info("Cleaning based on min open txn id: " + cleanerWaterMark);
             List<CompletableFuture<Void>> cleanerList = new ArrayList<>();
             // For checking which compaction can be cleaned we can use the minOpenTxnId
             // However findReadyToClean will return all records that were compacted with old version of HMS
             // where the CQ_NEXT_TXN_ID is not set. For these compactions we need to provide minTxnIdSeenOpen
             // to the clean method, to avoid cleaning up deltas needed for running queries
             // when min_history_level is finally dropped, than every HMS will commit compaction the new way
             // and minTxnIdSeenOpen can be removed and minOpenTxnId can be used instead.
-            for (CompactionInfo compactionInfo : readyToClean) {
-
+            for (CompactionInfo ci : readyToClean) {
               //Check for interruption before scheduling each compactionInfo and return if necessary
               checkInterrupt();
-
+              
               CompletableFuture<Void> asyncJob =
                   CompletableFuture.runAsync(
-                          ThrowingRunnable.unchecked(() -> clean(compactionInfo, cleanerWaterMark, metricsEnabled)),
-                          cleanerExecutor)
-                      .exceptionally(t -> {
-                        LOG.error("Error clearing {}", compactionInfo.getFullPartitionName(), t);
-                        return null;
-                      });
+                      ThrowingRunnable.unchecked(() -> {
+                        long minOpenTxn = (ci.minOpenWriteId > 0) ? 
+                            ci.nextTxnId + 1 : Math.min(minOpenTxnId, txnHandler.findMinTxnIdSeenOpen());

Review Comment:
   yea, missed that, if we have minOpenWriteId, wee should call findMinTxnIdSeenOpen
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org