You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2020/06/12 23:34:27 UTC

[GitHub] [kafka] spal1 opened a new pull request #8863: KAFKA-8528: Expose Trogdor-specific JMX metrics for Tasks and Agents

spal1 opened a new pull request #8863:
URL: https://github.com/apache/kafka/pull/8863


   Adds JMX metrics to Trogdor to keep track of Created, Running, and Done tasks and the number of Active Agents in a Trogdor cluster as seen in [KAFKA-8528](https://issues.apache.org/jira/browse/KAFKA-8528).
   
   Adds a `TrogdorMetrics` class that contains a metric and sensors for tasks in the aforementioned states as well as for active agents. Utilizes the `Platform` interface shared by Tasks and Agents to create  common `TrogdorMetrics` instance in a `MetricsContainer` class.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] cmccabe commented on a change in pull request #8863: KAFKA-8528: Expose Trogdor-specific JMX metrics for Tasks and Agents

Posted by GitBox <gi...@apache.org>.
cmccabe commented on a change in pull request #8863:
URL: https://github.com/apache/kafka/pull/8863#discussion_r450921851



##########
File path: tools/src/main/java/org/apache/kafka/trogdor/agent/Agent.java
##########
@@ -92,6 +93,8 @@
 
     private final Time time;
 
+    final TrogdorMetrics trogdorMetrics;

Review comment:
       There are a few ways you could do this.  One is to have separate classes for agent metrics and coordinator metrics.  Another is to have a boolean or something that you pass in to determine whether agent or coordinator metrics are created.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] spal1 commented on a change in pull request #8863: KAFKA-8528: Expose Trogdor-specific JMX metrics for Tasks and Agents

Posted by GitBox <gi...@apache.org>.
spal1 commented on a change in pull request #8863:
URL: https://github.com/apache/kafka/pull/8863#discussion_r450545145



##########
File path: tools/src/main/java/org/apache/kafka/trogdor/agent/Agent.java
##########
@@ -92,6 +93,8 @@
 
     private final Time time;
 
+    final TrogdorMetrics trogdorMetrics;

Review comment:
       A `TrogdorMetrics` was shared between the `Coordinator` and `Agent` solely to allow for a count of the active agents to be added to the existing metrics. There doesn't seem to be a way to do this only from the `Coordinator`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] cmccabe commented on a change in pull request #8863: KAFKA-8528: Expose Trogdor-specific JMX metrics for Tasks and Agents

Posted by GitBox <gi...@apache.org>.
cmccabe commented on a change in pull request #8863:
URL: https://github.com/apache/kafka/pull/8863#discussion_r450505811



##########
File path: tools/src/main/java/org/apache/kafka/trogdor/coordinator/Coordinator.java
##########
@@ -70,6 +70,8 @@
 
     private final Time time;
 
+    final TrogdorMetrics trogdorMetrics;

Review comment:
       should be static and just "metrics"




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] cmccabe commented on a change in pull request #8863: KAFKA-8528: Expose Trogdor-specific JMX metrics for Tasks and Agents

Posted by GitBox <gi...@apache.org>.
cmccabe commented on a change in pull request #8863:
URL: https://github.com/apache/kafka/pull/8863#discussion_r450504757



##########
File path: tools/src/main/java/org/apache/kafka/trogdor/coordinator/TaskManager.java
##########
@@ -395,11 +401,13 @@ public Void call() throws Exception {
                 log.error("Unable to find nodes for task {}", task.id, e);
                 task.doneMs = time.milliseconds();
                 task.state = TaskStateType.DONE;
+                trogdorMetrics.recordDoneTask();

Review comment:
       It would be better to create a method in ManagedTask like `setState(...)` and increment the appropriate metric if the state had changed.  Then we could move all these `state = foo` lines over to using that API.  This would prevent making a mistake here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] cmccabe commented on a change in pull request #8863: KAFKA-8528: Expose Trogdor-specific JMX metrics for Tasks and Agents

Posted by GitBox <gi...@apache.org>.
cmccabe commented on a change in pull request #8863:
URL: https://github.com/apache/kafka/pull/8863#discussion_r450505811



##########
File path: tools/src/main/java/org/apache/kafka/trogdor/coordinator/Coordinator.java
##########
@@ -70,6 +70,8 @@
 
     private final Time time;
 
+    final TrogdorMetrics trogdorMetrics;

Review comment:
       should be final and just "metrics"




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] spal1 commented on a change in pull request #8863: KAFKA-8528: Expose Trogdor-specific JMX metrics for Tasks and Agents

Posted by GitBox <gi...@apache.org>.
spal1 commented on a change in pull request #8863:
URL: https://github.com/apache/kafka/pull/8863#discussion_r450545145



##########
File path: tools/src/main/java/org/apache/kafka/trogdor/agent/Agent.java
##########
@@ -92,6 +93,8 @@
 
     private final Time time;
 
+    final TrogdorMetrics trogdorMetrics;

Review comment:
       `TrogdorMetrics` was shared between the `Coordinator` and `Agent` solely to allow for a count of the active agents to be added to the existing metrics. There doesn't seem to be a way to do this only from the `Coordinator`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] cmccabe commented on a change in pull request #8863: KAFKA-8528: Expose Trogdor-specific JMX metrics for Tasks and Agents

Posted by GitBox <gi...@apache.org>.
cmccabe commented on a change in pull request #8863:
URL: https://github.com/apache/kafka/pull/8863#discussion_r450505717



##########
File path: tools/src/main/java/org/apache/kafka/trogdor/common/Platform.java
##########
@@ -54,6 +62,29 @@ public static Platform parse(String curNodeName, String path) throws Exception {
         }
     }
 
+    class MetricsContainer {
+        private static TrogdorMetrics trogdorMetrics = null;

Review comment:
       I'm a bit confused by why this is static.  It seems like it should just be tracked by each Coordinator or Agent.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] cmccabe commented on a change in pull request #8863: KAFKA-8528: Expose Trogdor-specific JMX metrics for Tasks and Agents

Posted by GitBox <gi...@apache.org>.
cmccabe commented on a change in pull request #8863:
URL: https://github.com/apache/kafka/pull/8863#discussion_r450504170



##########
File path: tools/src/main/java/org/apache/kafka/trogdor/coordinator/TaskManager.java
##########
@@ -127,7 +127,12 @@
      */
     private long nextWorkerId;
 
-    TaskManager(Platform platform, Scheduler scheduler, long firstWorkerId) {
+    /**
+     * The TrogdorMetrics for this coordinator.
+     */
+    private TrogdorMetrics trogdorMetrics;

Review comment:
       This should be final and probably just be called "metrics" since there are no other metrics that we need to distinguish it from




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] cmccabe commented on a change in pull request #8863: KAFKA-8528: Expose Trogdor-specific JMX metrics for Tasks and Agents

Posted by GitBox <gi...@apache.org>.
cmccabe commented on a change in pull request #8863:
URL: https://github.com/apache/kafka/pull/8863#discussion_r450503568



##########
File path: tools/src/main/java/org/apache/kafka/trogdor/agent/Agent.java
##########
@@ -92,6 +93,8 @@
 
     private final Time time;
 
+    final TrogdorMetrics trogdorMetrics;

Review comment:
       It doesn't seem right for the Agent to share the same metrics with the Coordinator.  For example, `active-agents-count` is meaningless on the agent, since it doesn't know how many other agents there are.  We should probably clarify this in the KIP as well (metrics that are not relevant to the agent are not included there).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org