You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@uniffle.apache.org by ck...@apache.org on 2023/01/17 02:55:27 UTC
[incubator-uniffle] 01/04: Fix incorrect metrics of event_queue_size and total_write_handler (#411)
This is an automated email from the ASF dual-hosted git repository.
ckj pushed a commit to branch branch-0.6
in repository https://gitbox.apache.org/repos/asf/incubator-uniffle.git
commit 790ea613284971503a38fbfd57d56f9c9768d838
Author: Junfan Zhang <zu...@apache.org>
AuthorDate: Tue Dec 13 17:45:54 2022 +0800
Fix incorrect metrics of event_queue_size and total_write_handler (#411)
### What changes were proposed in this pull request?
Fix incorrect metrics of event_queue_size and total_write_handler
### Why are the changes needed?
In current codebase, there are bugs on above metrics.
1. The metric of total_write_handler won't desc when exception happened on flushing to file
2. The metric of event_queue_size won't show the correct wait queue size. In original logic, if all events are waiting to be operated in flush thread pool, the flushQueue is always 0 and the metric value also will be 0. This is wrong.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Don't need.
---
.../main/java/org/apache/uniffle/server/ShuffleFlushManager.java | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/server/src/main/java/org/apache/uniffle/server/ShuffleFlushManager.java b/server/src/main/java/org/apache/uniffle/server/ShuffleFlushManager.java
index 96f89296..619fe9ca 100644
--- a/server/src/main/java/org/apache/uniffle/server/ShuffleFlushManager.java
+++ b/server/src/main/java/org/apache/uniffle/server/ShuffleFlushManager.java
@@ -95,12 +95,13 @@ public class ShuffleFlushManager {
ShuffleDataFlushEvent event = flushQueue.take();
threadPoolExecutor.execute(() -> {
try {
- ShuffleServerMetrics.gaugeEventQueueSize.set(flushQueue.size());
ShuffleServerMetrics.gaugeWriteHandler.inc();
flushToFile(event);
- ShuffleServerMetrics.gaugeWriteHandler.dec();
} catch (Exception e) {
LOG.error("Exception happened when flush data for " + event, e);
+ } finally {
+ ShuffleServerMetrics.gaugeWriteHandler.dec();
+ ShuffleServerMetrics.gaugeEventQueueSize.dec();
}
});
} catch (Exception e) {
@@ -137,6 +138,8 @@ public class ShuffleFlushManager {
public void addToFlushQueue(ShuffleDataFlushEvent event) {
if (!flushQueue.offer(event)) {
LOG.warn("Flush queue is full, discard event: " + event);
+ } else {
+ ShuffleServerMetrics.gaugeEventQueueSize.inc();
}
}