You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@uniffle.apache.org by ck...@apache.org on 2023/01/17 02:55:27 UTC

[incubator-uniffle] 01/04: Fix incorrect metrics of event_queue_size and total_write_handler (#411)

This is an automated email from the ASF dual-hosted git repository.

ckj pushed a commit to branch branch-0.6
in repository https://gitbox.apache.org/repos/asf/incubator-uniffle.git

commit 790ea613284971503a38fbfd57d56f9c9768d838
Author: Junfan Zhang <zu...@apache.org>
AuthorDate: Tue Dec 13 17:45:54 2022 +0800

    Fix incorrect metrics of event_queue_size and total_write_handler (#411)
    
    ### What changes were proposed in this pull request?
    
    Fix incorrect metrics of event_queue_size and total_write_handler
    
    ### Why are the changes needed?
    In current codebase, there are bugs on above metrics.
    
    1. The metric of total_write_handler won't desc when exception happened on flushing to file
    2. The metric of event_queue_size won't show the correct wait queue size. In original logic, if all events are waiting to be operated in flush thread pool, the flushQueue is always 0 and the metric value also will be 0. This is wrong.
    
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    
    ### How was this patch tested?
    Don't need.
---
 .../main/java/org/apache/uniffle/server/ShuffleFlushManager.java   | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/server/src/main/java/org/apache/uniffle/server/ShuffleFlushManager.java b/server/src/main/java/org/apache/uniffle/server/ShuffleFlushManager.java
index 96f89296..619fe9ca 100644
--- a/server/src/main/java/org/apache/uniffle/server/ShuffleFlushManager.java
+++ b/server/src/main/java/org/apache/uniffle/server/ShuffleFlushManager.java
@@ -95,12 +95,13 @@ public class ShuffleFlushManager {
           ShuffleDataFlushEvent event = flushQueue.take();
           threadPoolExecutor.execute(() -> {
             try {
-              ShuffleServerMetrics.gaugeEventQueueSize.set(flushQueue.size());
               ShuffleServerMetrics.gaugeWriteHandler.inc();
               flushToFile(event);
-              ShuffleServerMetrics.gaugeWriteHandler.dec();
             } catch (Exception e) {
               LOG.error("Exception happened when flush data for " + event, e);
+            } finally {
+              ShuffleServerMetrics.gaugeWriteHandler.dec();
+              ShuffleServerMetrics.gaugeEventQueueSize.dec();
             }
           });
         } catch (Exception e) {
@@ -137,6 +138,8 @@ public class ShuffleFlushManager {
   public void addToFlushQueue(ShuffleDataFlushEvent event) {
     if (!flushQueue.offer(event)) {
       LOG.warn("Flush queue is full, discard event: " + event);
+    } else {
+      ShuffleServerMetrics.gaugeEventQueueSize.inc();
     }
   }