Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/07/08 00:49:08 UTC

[GitHub] [pulsar] jerrypeng opened a new pull request #7474: Log scheduler stats for Pulsar Functions

jerrypeng opened a new pull request #7474:
URL: https://github.com/apache/pulsar/pull/7474


   ### Motivation
   
   Log stats for the schedule, rebalance, and failure-check routines to help with debugging and performance analysis.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] srkukarni commented on a change in pull request #7474: Log scheduler stats for Pulsar Functions

Posted by GitBox <gi...@apache.org>.
srkukarni commented on a change in pull request #7474:
URL: https://github.com/apache/pulsar/pull/7474#discussion_r451695027



##########
File path: pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/MembershipManager.java
##########
@@ -225,7 +232,8 @@ public void checkFailures(FunctionMetaDataManager functionMetaDataManager,
             functionRuntimeManager.removeAssignments(needRemove);
         }
         if (triggerScheduler) {
-            log.info("Functions that need scheduling/rescheduling: {}", needSchedule);
+            log.info("Failure check - Total number of instances that need to be scheduled/rescheduled: {} | Number of unassigned instances that need to be scheduled: {} | Number of instances on dead workers that need to be reassigned {}",

Review comment:
       I actually liked the old message. What was the reason for the change? 







[GitHub] [pulsar] srkukarni commented on a change in pull request #7474: Log scheduler stats for Pulsar Functions

Posted by GitBox <gi...@apache.org>.
srkukarni commented on a change in pull request #7474:
URL: https://github.com/apache/pulsar/pull/7474#discussion_r451254764



##########
File path: pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/MembershipManager.java
##########
@@ -225,7 +232,8 @@ public void checkFailures(FunctionMetaDataManager functionMetaDataManager,
             functionRuntimeManager.removeAssignments(needRemove);
         }
         if (triggerScheduler) {
-            log.info("Functions that need scheduling/rescheduling: {}", needSchedule);
+            log.info("Failure check - Total number of instances that need to be scheduled/rescheduled: {} | Number of unassigned instances that need to be scheduled: {} | Number of instances on dead workers that need to be reassigned {}",

Review comment:
       Maybe a better wording here?







[GitHub] [pulsar] jerrypeng commented on a change in pull request #7474: Log scheduler stats for Pulsar Functions

Posted by GitBox <gi...@apache.org>.
jerrypeng commented on a change in pull request #7474:
URL: https://github.com/apache/pulsar/pull/7474#discussion_r451310965



##########
File path: pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/MembershipManager.java
##########
@@ -225,7 +232,8 @@ public void checkFailures(FunctionMetaDataManager functionMetaDataManager,
             functionRuntimeManager.removeAssignments(needRemove);
         }
         if (triggerScheduler) {
-            log.info("Functions that need scheduling/rescheduling: {}", needSchedule);
+            log.info("Failure check - Total number of instances that need to be scheduled/rescheduled: {} | Number of unassigned instances that need to be scheduled: {} | Number of instances on dead workers that need to be reassigned {}",

Review comment:
       Do you have a suggestion?







[GitHub] [pulsar] jerrypeng commented on a change in pull request #7474: Log scheduler stats for Pulsar Functions

Posted by GitBox <gi...@apache.org>.
jerrypeng commented on a change in pull request #7474:
URL: https://github.com/apache/pulsar/pull/7474#discussion_r451753300



##########
File path: pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/MembershipManager.java
##########
@@ -225,7 +232,8 @@ public void checkFailures(FunctionMetaDataManager functionMetaDataManager,
             functionRuntimeManager.removeAssignments(needRemove);
         }
         if (triggerScheduler) {
-            log.info("Functions that need scheduling/rescheduling: {}", needSchedule);
+            log.info("Failure check - Total number of instances that need to be scheduled/rescheduled: {} | Number of unassigned instances that need to be scheduled: {} | Number of instances on dead workers that need to be reassigned {}",

Review comment:
       I added "Failure check" at the beginning so the message is easier to search for.







[GitHub] [pulsar] jerrypeng commented on a change in pull request #7474: Log scheduler stats for Pulsar Functions

Posted by GitBox <gi...@apache.org>.
jerrypeng commented on a change in pull request #7474:
URL: https://github.com/apache/pulsar/pull/7474#discussion_r451308305



##########
File path: pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/SchedulerManager.java
##########
@@ -368,8 +390,13 @@ private void invokeRebalance() {
             functionRuntimeManager.processAssignment(assignment);
             // update message id associated with current view of assignments map
             lastMessageProduced = messageId;
+            // update stats
+            schedulerStats.newAssignment(assignment);
         }
-        log.info("Rebalance - Total number of new assignments computed: {}", rebalancedAssignments.size());
+
+        log.info("Rebalance summary - execution time: {} sec | stats: {}\n{}",

Review comment:
       Yes, I was going to work on that in the near future. Stats like scheduler execution time can go into Prometheus. However, a breakdown of how many instances moved, grouped by worker, is not suitable for Prometheus.
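
To illustrate the kind of per-worker breakdown described above, here is a minimal sketch of a stats collector that counts instance moves per worker and renders them as one log message. It is not the `SchedulerStats` class from this pull request: the class name `RebalanceStatsSketch`, its methods, and the worker names are all hypothetical, and the summary format only loosely follows the "Rebalance summary" line in the diff.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not the SchedulerStats class added by the pull request.
class RebalanceStatsSketch {
    // Per-worker counters: index 0 = instances added, index 1 = instances removed.
    private final Map<String, int[]> perWorker = new TreeMap<>();
    private int moved;

    // Record that one instance moved from one worker to another.
    void recordMove(String fromWorker, String toWorker) {
        perWorker.computeIfAbsent(fromWorker, w -> new int[2])[1]++;
        perWorker.computeIfAbsent(toWorker, w -> new int[2])[0]++;
        moved++;
    }

    int totalMoved() {
        return moved;
    }

    // Build a totals line followed by one line per worker, suitable for a single log call.
    String summary(double executionTimeSec) {
        StringBuilder sb = new StringBuilder(String.format(Locale.ROOT,
                "Rebalance summary - execution time: %.3f sec | moved: %d",
                executionTimeSec, moved));
        for (Map.Entry<String, int[]> e : perWorker.entrySet()) {
            sb.append(String.format(Locale.ROOT, "\n  %s: +%d / -%d",
                    e.getKey(), e.getValue()[0], e.getValue()[1]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        RebalanceStatsSketch stats = new RebalanceStatsSketch();
        stats.recordMove("worker-a", "worker-b");
        stats.recordMove("worker-a", "worker-c");
        System.out.println(stats.summary(0.25));
    }
}
```

A scalar such as the execution time maps naturally onto a single metric, while the per-worker lines grow with the number of workers, which is why a log line may be the more practical place for them.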







[GitHub] [pulsar] srkukarni commented on a change in pull request #7474: Log scheduler stats for Pulsar Functions

Posted by GitBox <gi...@apache.org>.
srkukarni commented on a change in pull request #7474:
URL: https://github.com/apache/pulsar/pull/7474#discussion_r451255338



##########
File path: pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/SchedulerManager.java
##########
@@ -526,4 +553,118 @@ static String checkHeartBeatFunction(Instance funInstance) {
 
     public static class RebalanceInProgressException extends RuntimeException {
     }
+
+    @Data

Review comment:
       Is this exported somewhere?







[GitHub] [pulsar] merlimat commented on a change in pull request #7474: Log scheduler stats for Pulsar Functions

Posted by GitBox <gi...@apache.org>.
merlimat commented on a change in pull request #7474:
URL: https://github.com/apache/pulsar/pull/7474#discussion_r451230995



##########
File path: pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/SchedulerManager.java
##########
@@ -368,8 +390,13 @@ private void invokeRebalance() {
             functionRuntimeManager.processAssignment(assignment);
             // update message id associated with current view of assignments map
             lastMessageProduced = messageId;
+            // update stats
+            schedulerStats.newAssignment(assignment);
         }
-        log.info("Rebalance - Total number of new assignments computed: {}", rebalancedAssignments.size());
+
+        log.info("Rebalance summary - execution time: {} sec | stats: {}\n{}",

Review comment:
       Can we also report these stats to Prometheus?







[GitHub] [pulsar] jerrypeng commented on a change in pull request #7474: Log scheduler stats for Pulsar Functions

Posted by GitBox <gi...@apache.org>.
jerrypeng commented on a change in pull request #7474:
URL: https://github.com/apache/pulsar/pull/7474#discussion_r451308599



##########
File path: pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/SchedulerManager.java
##########
@@ -526,4 +553,118 @@ static String checkHeartBeatFunction(Instance funInstance) {
 
     public static class RebalanceInProgressException extends RuntimeException {
     }
+
+    @Data

Review comment:
       I will remove it.







[GitHub] [pulsar] jerrypeng merged pull request #7474: Log scheduler stats for Pulsar Functions

Posted by GitBox <gi...@apache.org>.
jerrypeng merged pull request #7474:
URL: https://github.com/apache/pulsar/pull/7474


   

