You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/12/09 02:22:25 UTC

[GitHub] [pinot] zhtaoxiang opened a new pull request, #9954: report minion task metadata last update time as metric

zhtaoxiang opened a new pull request, #9954:
URL: https://github.com/apache/pinot/pull/9954

   Report minion task metadata last update time as metric. This metric is helpful when we are going to track/debug minion tasks.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] zhtaoxiang commented on a diff in pull request #9954: report minion task metadata last update time as metric

Posted by GitBox <gi...@apache.org>.
zhtaoxiang commented on code in PR #9954:
URL: https://github.com/apache/pinot/pull/9954#discussion_r1048876709


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/MinionTaskMetadataUtils.java:
##########
@@ -63,6 +68,66 @@ private static ZNRecord fetchTaskMetadata(HelixPropertyStore<ZNRecord> propertyS
     return znRecord;
   }
 
+  /**
+   * Gets the last update time (in ms) of all minion task metadata.
+   * @param propertyStore the property store where all minion task metadata is stored.
+   * @return a map storing the last update time (in ms) of all minion task metadata: (tableNameWithType -> taskType
+   *         -> last update time in ms)
+   */
+  @Nonnull
+  public static Map<String, Map<String, Long>> getAllTaskMetadataLastUpdateTimeMs(
+      @Nonnull HelixPropertyStore<ZNRecord> propertyStore) {
+    Map<String, Map<String, Long>> tableTaskLastUpdateTimeMsMap = new HashMap<>();
+    String propertyStorePathForMinionTaskMetadataPrefix =
+        ZKMetadataProvider.getPropertyStorePathForMinionTaskMetadataPrefix();
+    // the old and new path may exist at the same time
+    List<String> tableNameWithTypeOrTaskTypes =
+        propertyStore.getChildNames(propertyStorePathForMinionTaskMetadataPrefix, AccessOption.PERSISTENT);
+    if (tableNameWithTypeOrTaskTypes == null || tableNameWithTypeOrTaskTypes.isEmpty()) {
+      return tableTaskLastUpdateTimeMsMap;
+    }
+    for (String tableNameWithTypeOrTaskType : tableNameWithTypeOrTaskTypes) {
+      String metadataNodeDirectParentPath =
+          StringUtil.join("/", propertyStorePathForMinionTaskMetadataPrefix, tableNameWithTypeOrTaskType);
+      List<String> metadataNodeNames =
+          propertyStore.getChildNames(metadataNodeDirectParentPath, AccessOption.PERSISTENT);
+      if (metadataNodeNames == null || metadataNodeNames.isEmpty()) {
+        continue;
+      }
+      // the new path is MINION_TASK_METADATA/${tableNameWthType}/${taskType}
+      // the old path is MINION_TASK_METADATA/${taskType}/${tableNameWthType}
+      boolean isNewPath =

Review Comment:
   The variable `tableNameWithTypeOrTaskType` stores the first level child name of `MINION_TASK_METADATA`, that why when it ends with `OFFLINE` or `REALTIME`, it is a new path.
   
   Let me add this explanation to make it clearer.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] zhtaoxiang commented on a diff in pull request #9954: report minion task metadata last update time as metric

Posted by GitBox <gi...@apache.org>.
zhtaoxiang commented on code in PR #9954:
URL: https://github.com/apache/pinot/pull/9954#discussion_r1053850337


##########
docker/images/pinot/etc/jmx_prometheus_javaagent/configs/controller.yml:
##########
@@ -112,6 +112,13 @@ rules:
   labels:
     taskType: "$1"
     status: "$2"
+- pattern: "\"org.apache.pinot.common.metrics\"<type=\"ControllerMetrics\", name=\"pinot.controller.timeMsSinceLastMinionTaskMetadataUpdate.(\\w+)_(\\w+)\\.(\\w+)\"><>(\\w+)"

Review Comment:
   yes, tested with java agent, although not using docker



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] snleee merged pull request #9954: report minion task metadata last update time as metric

Posted by GitBox <gi...@apache.org>.
snleee merged PR #9954:
URL: https://github.com/apache/pinot/pull/9954


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #9954: report minion task metadata last update time as metric

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #9954:
URL: https://github.com/apache/pinot/pull/9954#issuecomment-1343961344

   # [Codecov](https://codecov.io/gh/apache/pinot/pull/9954?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#9954](https://codecov.io/gh/apache/pinot/pull/9954?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (18ca704) into [master](https://codecov.io/gh/apache/pinot/commit/60a62fe74de597956818f7a14c37fc1b1a136625?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (60a62fe) will **decrease** coverage by `45.24%`.
   > The diff coverage is `65.38%`.
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #9954       +/-   ##
   =============================================
   - Coverage     70.39%   25.15%   -45.25%     
   + Complexity     5071       44     -5027     
   =============================================
     Files          1986     1974       -12     
     Lines        106777   106472      -305     
     Branches      16195    16163       -32     
   =============================================
   - Hits          75168    26778    -48390     
   - Misses        26357    76963    +50606     
   + Partials       5252     2731     -2521     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `25.15% <65.38%> (+0.09%)` | :arrow_up: |
   | integration2 | `?` | |
   | unittests1 | `?` | |
   | unittests2 | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/pinot/pull/9954?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...he/pinot/common/utils/helix/FakePropertyStore.java](https://codecov.io/gh/apache/pinot/pull/9954/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvaGVsaXgvRmFrZVByb3BlcnR5U3RvcmUuamF2YQ==) | `0.00% <0.00%> (-65.52%)` | :arrow_down: |
   | [...e/pinot/common/minion/MinionTaskMetadataUtils.java](https://codecov.io/gh/apache/pinot/pull/9954/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWluaW9uL01pbmlvblRhc2tNZXRhZGF0YVV0aWxzLmphdmE=) | `64.86% <71.87%> (-20.85%)` | :arrow_down: |
   | [...g/apache/pinot/common/metrics/ControllerGauge.java](https://codecov.io/gh/apache/pinot/pull/9954/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Db250cm9sbGVyR2F1Z2UuamF2YQ==) | `98.07% <100.00%> (+0.03%)` | :arrow_up: |
   | [...lix/core/minion/PinotHelixTaskResourceManager.java](https://codecov.io/gh/apache/pinot/pull/9954/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL21pbmlvbi9QaW5vdEhlbGl4VGFza1Jlc291cmNlTWFuYWdlci5qYXZh) | `32.25% <100.00%> (-10.72%)` | :arrow_down: |
   | [...ntroller/helix/core/minion/TaskMetricsEmitter.java](https://codecov.io/gh/apache/pinot/pull/9954/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL21pbmlvbi9UYXNrTWV0cmljc0VtaXR0ZXIuamF2YQ==) | `88.23% <100.00%> (+2.18%)` | :arrow_up: |
   | [...in/java/org/apache/pinot/spi/utils/BytesUtils.java](https://codecov.io/gh/apache/pinot/pull/9954/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdXRpbHMvQnl0ZXNVdGlscy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/spi/trace/BaseRecording.java](https://codecov.io/gh/apache/pinot/pull/9954/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvQmFzZVJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...java/org/apache/pinot/spi/trace/NoOpRecording.java](https://codecov.io/gh/apache/pinot/pull/9954/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvdHJhY2UvTm9PcFJlY29yZGluZy5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/spi/config/table/FSTType.java](https://codecov.io/gh/apache/pinot/pull/9954/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvY29uZmlnL3RhYmxlL0ZTVFR5cGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...ava/org/apache/pinot/spi/config/user/RoleType.java](https://codecov.io/gh/apache/pinot/pull/9954/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGlub3Qtc3BpL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zcGkvY29uZmlnL3VzZXIvUm9sZVR5cGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [1451 more](https://codecov.io/gh/apache/pinot/pull/9954/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] PrachiKhobragade commented on a diff in pull request #9954: report minion task metadata last update time as metric

Posted by GitBox <gi...@apache.org>.
PrachiKhobragade commented on code in PR #9954:
URL: https://github.com/apache/pinot/pull/9954#discussion_r1046385591


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/MinionTaskMetadataUtils.java:
##########
@@ -63,6 +68,66 @@ private static ZNRecord fetchTaskMetadata(HelixPropertyStore<ZNRecord> propertyS
     return znRecord;
   }
 
+  /**
+   * Gets the last update time (in ms) of all minion task metadata.
+   * @param propertyStore the property store where all minion task metadata is stored.
+   * @return a map storing the last update time (in ms) of all minion task metadata: (tableNameWithType -> taskType
+   *         -> last update time in ms)
+   */
+  @Nonnull
+  public static Map<String, Map<String, Long>> getAllTaskMetadataLastUpdateTimeMs(
+      @Nonnull HelixPropertyStore<ZNRecord> propertyStore) {
+    Map<String, Map<String, Long>> tableTaskLastUpdateTimeMsMap = new HashMap<>();
+    String propertyStorePathForMinionTaskMetadataPrefix =
+        ZKMetadataProvider.getPropertyStorePathForMinionTaskMetadataPrefix();
+    // the old and new path may exist at the same time
+    List<String> tableNameWithTypeOrTaskTypes =
+        propertyStore.getChildNames(propertyStorePathForMinionTaskMetadataPrefix, AccessOption.PERSISTENT);
+    if (tableNameWithTypeOrTaskTypes == null || tableNameWithTypeOrTaskTypes.isEmpty()) {
+      return tableTaskLastUpdateTimeMsMap;
+    }
+    for (String tableNameWithTypeOrTaskType : tableNameWithTypeOrTaskTypes) {
+      String metadataNodeDirectParentPath =
+          StringUtil.join("/", propertyStorePathForMinionTaskMetadataPrefix, tableNameWithTypeOrTaskType);
+      List<String> metadataNodeNames =
+          propertyStore.getChildNames(metadataNodeDirectParentPath, AccessOption.PERSISTENT);
+      if (metadataNodeNames == null || metadataNodeNames.isEmpty()) {
+        continue;
+      }
+      // the new path is MINION_TASK_METADATA/${tableNameWthType}/${taskType}
+      // the old path is MINION_TASK_METADATA/${taskType}/${tableNameWthType}
+      boolean isNewPath =

Review Comment:
   if the old path ends with tableNameWithType shouldn't the isNewPath should be negation of the condition used?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] snleee commented on a diff in pull request #9954: report minion task metadata last update time as metric

Posted by GitBox <gi...@apache.org>.
snleee commented on code in PR #9954:
URL: https://github.com/apache/pinot/pull/9954#discussion_r1044638238


##########
pinot-common/src/main/java/org/apache/pinot/common/metrics/ControllerGauge.java:
##########
@@ -52,6 +52,7 @@ public enum ControllerGauge implements AbstractMetrics.Gauge {
   OFFLINE_TABLE_COUNT("TableCount", true),
   DISABLED_TABLE_COUNT("TableCount", true),
   PERIODIC_TASK_NUM_TABLES_PROCESSED("PeriodicTaskNumTablesProcessed", true),
+  MINION_TASK_METADATA_LAST_UPDATE_TIME_MS("MinionTaskMetadataLastUpdateTimeMs", true),

Review Comment:
   Can we think of normalizing the metrics? Reporting the raw timestamp value will require ppl to re-interpret that to the human readable format so it's going to be a bit hard to use.
   
   For example, the same information can be defined as `time since last update in hours (or days)`. This will show as low number in regular time and it will increase if there's no update at all for a long time. Once we make some progress, it will go down to 0 again.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] zhtaoxiang commented on pull request #9954: report minion task metadata last update time as metric

Posted by GitBox <gi...@apache.org>.
zhtaoxiang commented on PR #9954:
URL: https://github.com/apache/pinot/pull/9954#issuecomment-1360484846

   > LGTM other than the minor comments.
   > 
   > Q: This new metric will provide us some `progress` is happening for each task types that writes data to `MINION_TASK_METADATA`. One good thing about this is that it's not task specific (e.g. mergeRollupTask has its own metric - watermark - to provide the progress). Can we think of some `progress` for tasks that do not write extra mask metadata?
   
   That is a great question. I will look into this and try to come up with a metric. Not sure if it's easy to achieve or not, but let's try


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] snleee commented on a diff in pull request #9954: report minion task metadata last update time as metric

Posted by GitBox <gi...@apache.org>.
snleee commented on code in PR #9954:
URL: https://github.com/apache/pinot/pull/9954#discussion_r1044638238


##########
pinot-common/src/main/java/org/apache/pinot/common/metrics/ControllerGauge.java:
##########
@@ -52,6 +52,7 @@ public enum ControllerGauge implements AbstractMetrics.Gauge {
   OFFLINE_TABLE_COUNT("TableCount", true),
   DISABLED_TABLE_COUNT("TableCount", true),
   PERIODIC_TASK_NUM_TABLES_PROCESSED("PeriodicTaskNumTablesProcessed", true),
+  MINION_TASK_METADATA_LAST_UPDATE_TIME_MS("MinionTaskMetadataLastUpdateTimeMs", true),

Review Comment:
   Can we think of normalizing the metrics? Reporting the raw timestamp value will require ppl to re-interpret that to the human readable format so it's going to be a bit hard to use.
   
   For example, the same information can be defined as `time since last update in hours (or days)`. This will show as low number in regular cases and it will increase if there's no update for a long time. Once we make some progress again, it will go down to 0 again.
   
   For data purge or merge/rollup metrics, we used the similar normalized metrics because it's much easier to understand and also easier to set the alerts if we want to.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] snleee commented on a diff in pull request #9954: report minion task metadata last update time as metric

Posted by GitBox <gi...@apache.org>.
snleee commented on code in PR #9954:
URL: https://github.com/apache/pinot/pull/9954#discussion_r1044638238


##########
pinot-common/src/main/java/org/apache/pinot/common/metrics/ControllerGauge.java:
##########
@@ -52,6 +52,7 @@ public enum ControllerGauge implements AbstractMetrics.Gauge {
   OFFLINE_TABLE_COUNT("TableCount", true),
   DISABLED_TABLE_COUNT("TableCount", true),
   PERIODIC_TASK_NUM_TABLES_PROCESSED("PeriodicTaskNumTablesProcessed", true),
+  MINION_TASK_METADATA_LAST_UPDATE_TIME_MS("MinionTaskMetadataLastUpdateTimeMs", true),

Review Comment:
   Can we think of normalizing the metrics? Reporting the raw timestamp value will require ppl to re-interpret that to the human readable format so it's going to be a bit hard to use.
   
   For example, the same information can be defined as `the time since last update in hours (or days)`. This will show as low number in regular time and it will increase if there's no update at all for a long time. Once we make some progress, it will go down to 0 again.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] snleee commented on a diff in pull request #9954: report minion task metadata last update time as metric

Posted by GitBox <gi...@apache.org>.
snleee commented on code in PR #9954:
URL: https://github.com/apache/pinot/pull/9954#discussion_r1044638238


##########
pinot-common/src/main/java/org/apache/pinot/common/metrics/ControllerGauge.java:
##########
@@ -52,6 +52,7 @@ public enum ControllerGauge implements AbstractMetrics.Gauge {
   OFFLINE_TABLE_COUNT("TableCount", true),
   DISABLED_TABLE_COUNT("TableCount", true),
   PERIODIC_TASK_NUM_TABLES_PROCESSED("PeriodicTaskNumTablesProcessed", true),
+  MINION_TASK_METADATA_LAST_UPDATE_TIME_MS("MinionTaskMetadataLastUpdateTimeMs", true),

Review Comment:
   Can we think of normalizing the metrics? Reporting the raw timestamp value will require ppl to re-interpret that to the human readable format so it's going to be a bit hard to use.
   
   For example, the same information can be defined as `time since last update in hours (or days)`. This will show as low number in regular cases and it will increase if there's no update for a long time. Once we make some progress again, it will go down to 0 again.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] zhtaoxiang commented on a diff in pull request #9954: report minion task metadata last update time as metric

Posted by GitBox <gi...@apache.org>.
zhtaoxiang commented on code in PR #9954:
URL: https://github.com/apache/pinot/pull/9954#discussion_r1048873343


##########
pinot-common/src/main/java/org/apache/pinot/common/metrics/ControllerGauge.java:
##########
@@ -52,6 +52,7 @@ public enum ControllerGauge implements AbstractMetrics.Gauge {
   OFFLINE_TABLE_COUNT("TableCount", true),
   DISABLED_TABLE_COUNT("TableCount", true),
   PERIODIC_TASK_NUM_TABLES_PROCESSED("PeriodicTaskNumTablesProcessed", true),
+  MINION_TASK_METADATA_LAST_UPDATE_TIME_MS("MinionTaskMetadataLastUpdateTimeMs", true),

Review Comment:
   Had an offline discussion with @snleee .
   
   Although it is fairly easy to perform calculation in PromQL (e.g., `now() - last update time` will be the time since the last update time), it may still be more straight forward if we use normalized metrics.
   
   Therefore, we will update this metric to be `the time since the last update time`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] snleee commented on a diff in pull request #9954: report minion task metadata last update time as metric

Posted by GitBox <gi...@apache.org>.
snleee commented on code in PR #9954:
URL: https://github.com/apache/pinot/pull/9954#discussion_r1053005212


##########
pinot-common/src/main/java/org/apache/pinot/common/minion/MinionTaskMetadataUtils.java:
##########
@@ -63,6 +68,68 @@ private static ZNRecord fetchTaskMetadata(HelixPropertyStore<ZNRecord> propertyS
     return znRecord;
   }
 
+  /**
+   * Gets the last update time (in ms) of all minion task metadata.
+   * @param propertyStore the property store where all minion task metadata is stored.
+   * @return a map storing the last update time (in ms) of all minion task metadata: (tableNameWithType -> taskType
+   *         -> last update time in ms)
+   */
+  @Nonnull

Review Comment:
   (@Jackie-Jiang's nit) I think that we want to annotate `Nullable` only and assume `Nonnull` by default. We can remove `@Nonnull` annotation.



##########
docker/images/pinot/etc/jmx_prometheus_javaagent/configs/controller.yml:
##########
@@ -112,6 +112,13 @@ rules:
   labels:
     taskType: "$1"
     status: "$2"
+- pattern: "\"org.apache.pinot.common.metrics\"<type=\"ControllerMetrics\", name=\"pinot.controller.timeMsSinceLastMinionTaskMetadataUpdate.(\\w+)_(\\w+)\\.(\\w+)\"><>(\\w+)"

Review Comment:
   I suppose that this has been tested?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org