You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2023/01/09 19:27:07 UTC

[GitHub] [ozone] xBis7 opened a new pull request, #4164: HDDS-7506. Expose more snapshot metrics under OMMetrics

xBis7 opened a new pull request, #4164:
URL: https://github.com/apache/ozone/pull/4164

   ## What changes were proposed in this pull request?
   
   This PR is adding more snapshot metrics in `OMMetrics` to keep track of the count of each snapshot status. 
   Right now there is no implementation for snapshot delete or reclaim and therefore these metrics are not getting a value anywhere in the code. 
   
   We are going over the snapshot table on OM `start()`, `restart()` and `reloadOMState()` and get a count for every snapshot status. Also we are incrementing the number of active snapshots during every create request. In the future we might want to decrement the number of active when incrementing the number of delete and similarly decrement number of delete when incrementing number of reclaimed.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-7506
   
   ## How was this patch tested?
   
   A new test was added in `TestOmMetrics` under `integration-test` package. Also we might want to expand this test method in the future as more snapshot features are made available.
   
   This patch was also tested manually in a docker cluster like so
   
   create a snapshot
   ```
   ❯ docker-compose up --scale datanode=3 -d
   Creating network "ozone_default" with the default driver
   Creating ozone_recon_1    ... done
   Creating ozone_om_1       ... done
   Creating ozone_s3g_1      ... done
   Creating ozone_scm_1      ... done
   Creating ozone_datanode_1 ... done
   Creating ozone_datanode_2 ... done
   Creating ozone_datanode_3 ... done
   
   ❯ docker exec -it ozone_om_1 bash
   bash-4.2$ ozone sh volume create /vol1
   bash-4.2$ ozone sh bucket create /vol1/bucket1 
   bash-4.2$ ozone sh key put /vol1/bucket1/key1 README.md
   bash-4.2$ ozone sh snapshot create /vol1/bucket1 snap1
   ```
   on `0.0.0.0:9874/jmx`
   ```
       "NumSnapshotActive" : 1,
       "NumSnapshotCreateFails" : 0,
       "NumSnapshotCreates" : 1,
   ```
   
   create another snapshot
   ```
   bash-4.2$ ozone sh key put /vol1/bucket1/key2 /etc/hosts
   bash-4.2$ ozone sh snapshot create /vol1/bucket1 snap2
   bash-4.2$ exit
   exit
   ```
   
   check `0.0.0.0:9874/jmx` again
   ```
       "NumSnapshotActive" : 2,
       "NumSnapshotCreateFails" : 0,
       "NumSnapshotCreates" : 2,
   ```
   
   restart the OM
   ```
   ❯ docker restart ozone_om_1 
   ozone_om_1
   ```
   
   after a minute, on `0.0.0.0:9874/jmx`
   ```
       "NumSnapshotActive" : 2,
       "NumSnapshotCreateFails" : 0,
       "NumSnapshotCreates" : 0,
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] xBis7 commented on pull request #4164: HDDS-7506. [Snapshot] Expose more snapshot metrics under OMMetrics

Posted by GitBox <gi...@apache.org>.
xBis7 commented on PR #4164:
URL: https://github.com/apache/ozone/pull/4164#issuecomment-1396881316

   @GeorgeJahad Thanks for looking into this. I've addressed you comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GeorgeJahad commented on a diff in pull request #4164: HDDS-7506. [Snapshot] Expose more snapshot metrics under OMMetrics

Posted by GitBox <gi...@apache.org>.
GeorgeJahad commented on code in PR #4164:
URL: https://github.com/apache/ozone/pull/4164#discussion_r1073956554


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java:
##########
@@ -1623,6 +1629,51 @@ public void restart() throws IOException {
     omState = State.RUNNING;
   }
 
+  /**
+   * Iterate the Snapshot table, check the status
+   * for every snapshot and update OMMetrics.
+   */
+  private void updateActiveSnapshotMetrics()
+      throws IOException {
+
+    long activeGauge = 0;
+    long deletedGauge = 0;
+    long reclaimedGauge = 0;
+
+    try (TableIterator<String, ? extends
+        KeyValue<String, SnapshotInfo>> keyIter =
+             metadataManager.getSnapshotInfoTable().iterator()) {
+      Table.KeyValue<String, SnapshotInfo> keyValue;
+
+      List<SnapshotInfo> snapshotInfoList = new ArrayList<>();
+
+      while (keyIter.hasNext()) {
+        keyValue = keyIter.next();
+        snapshotInfoList.add(keyValue.getValue());
+      }
+
+      for (SnapshotInfo info : snapshotInfoList) {
+        SnapshotInfo.SnapshotStatus snapshotStatus =
+            info.getSnapshotStatus();
+
+        if (snapshotStatus.equals(SnapshotInfo
+            .SnapshotStatus.SNAPSHOT_ACTIVE)) {
+          activeGauge++;
+        } else if (snapshotStatus.equals(SnapshotInfo
+            .SnapshotStatus.SNAPSHOT_DELETED)) {
+          deletedGauge++;
+        } else if (snapshotStatus.equals(SnapshotInfo
+            .SnapshotStatus.SNAPSHOT_RECLAIMED)) {
+          reclaimedGauge++;
+        }
+      }
+    }
+
+    metrics.setNumSnapshotActive(activeGauge);
+    metrics.setNumSnapshotDeleted(deletedGauge);
+    metrics.setNumSnapshotDeleted(reclaimedGauge);

Review Comment:
    setNumSnapshotReclaimed() ??



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4164: HDDS-7506. [Snapshot] Expose more snapshot metrics under OMMetrics

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4164:
URL: https://github.com/apache/ozone/pull/4164#discussion_r1083082060


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OMMetrics.java:
##########
@@ -448,6 +451,45 @@ public void incNumSnapshotListFails() {
     numSnapshotListFails.incr();
   }
 
+  public void setNumSnapshotActive(long num) {
+    long currVal = numSnapshotActive.value();
+    numSnapshotActive.incr(num - currVal);

Review Comment:
   Got it. Sounds good.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on pull request #4164: HDDS-7506. [Snapshot] Expose more snapshot metrics under OMMetrics

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on PR #4164:
URL: https://github.com/apache/ozone/pull/4164#issuecomment-1399023540

   Thanks @xBis7 for the metrics addition. Thanks @GeorgeJahad for the review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on pull request #4164: HDDS-7506. [Snapshot] Expose more snapshot metrics under OMMetrics

Posted by GitBox <gi...@apache.org>.
smengcl commented on PR #4164:
URL: https://github.com/apache/ozone/pull/4164#issuecomment-1380927950

   @xBis7 Would you rebase this patch as well? Somehow the CI build is failing, even after I retrigger the whole job:
   
   https://github.com/apache/ozone/actions/runs/3904654043/jobs/6672248017
   ```
   Error:  COMPILATION ERROR : 
   [INFO] -------------------------------------------------------------
   Error:  /home/runner/work/ozone/ozone/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOmMetrics.java:[394,33] method createBucketInfo in class org.apache.hadoop.ozone.om.TestOmMetrics cannot be applied to given types;
   ```
   
   Just do a `git remote update apache && git merge apache/HDDS-6517-Snapshot` on your PR branch then push.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GeorgeJahad commented on a diff in pull request #4164: HDDS-7506. [Snapshot] Expose more snapshot metrics under OMMetrics

Posted by GitBox <gi...@apache.org>.
GeorgeJahad commented on code in PR #4164:
URL: https://github.com/apache/ozone/pull/4164#discussion_r1073954558


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java:
##########
@@ -1623,6 +1629,51 @@ public void restart() throws IOException {
     omState = State.RUNNING;
   }
 
+  /**
+   * Iterate the Snapshot table, check the status
+   * for every snapshot and update OMMetrics.
+   */
+  private void updateActiveSnapshotMetrics()
+      throws IOException {
+
+    long activeGauge = 0;
+    long deletedGauge = 0;
+    long reclaimedGauge = 0;
+
+    try (TableIterator<String, ? extends
+        KeyValue<String, SnapshotInfo>> keyIter =
+             metadataManager.getSnapshotInfoTable().iterator()) {
+      Table.KeyValue<String, SnapshotInfo> keyValue;
+
+      List<SnapshotInfo> snapshotInfoList = new ArrayList<>();
+
+      while (keyIter.hasNext()) {
+        keyValue = keyIter.next();
+        snapshotInfoList.add(keyValue.getValue());
+      }
+
+      for (SnapshotInfo info : snapshotInfoList) {

Review Comment:
   why are there two separate loops?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] xBis7 commented on pull request #4164: HDDS-7506. [Snapshot] Expose more snapshot metrics under OMMetrics

Posted by GitBox <gi...@apache.org>.
xBis7 commented on PR #4164:
URL: https://github.com/apache/ozone/pull/4164#issuecomment-1380969693

   @smengcl There was a conflict in `TestOmMetrics`. I rebased and resolved it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] xBis7 commented on pull request #4164: HDDS-7506. Expose more snapshot metrics under OMMetrics

Posted by GitBox <gi...@apache.org>.
xBis7 commented on PR #4164:
URL: https://github.com/apache/ozone/pull/4164#issuecomment-1376171912

   @GeorgeJahad @smengcl Can you take a look at this PR? I'm not that familiar with snapshot code, so I don't know if there is something I missed or another metric you would like me to add. If there is nothing else to be added here, I can convert this to an actual PR. BTW, I have a green workflow build on my fork.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl merged pull request #4164: HDDS-7506. [Snapshot] Expose more snapshot metrics under OMMetrics

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl merged PR #4164:
URL: https://github.com/apache/ozone/pull/4164


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] GeorgeJahad commented on a diff in pull request #4164: HDDS-7506. [Snapshot] Expose more snapshot metrics under OMMetrics

Posted by GitBox <gi...@apache.org>.
GeorgeJahad commented on code in PR #4164:
URL: https://github.com/apache/ozone/pull/4164#discussion_r1073965769


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java:
##########
@@ -1516,6 +1516,9 @@ public void start() throws IOException {
     metrics.setNumFiles(metadataManager
         .countEstimatedRowsInTable(metadataManager.getFileTable()));
 
+    // Snapshot metrics
+    updateActiveSnapshotMetrics();

Review Comment:
   instead of invoking this in 3 places, lets invoke in instantiateServices() right after the call to:
   ```
       omSnapshotManager = new OmSnapshotManager(this);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4164: HDDS-7506. [Snapshot] Expose more snapshot metrics under OMMetrics

Posted by GitBox <gi...@apache.org>.
smengcl commented on code in PR #4164:
URL: https://github.com/apache/ozone/pull/4164#discussion_r1081976308


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OMMetrics.java:
##########
@@ -448,6 +451,45 @@ public void incNumSnapshotListFails() {
     numSnapshotListFails.incr();
   }
 
+  public void setNumSnapshotActive(long num) {
+    long currVal = numSnapshotActive.value();
+    numSnapshotActive.incr(num - currVal);

Review Comment:
   Serviceable. But I wonder if there is a better alternative to `MutableCounterLong` in this case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] xBis7 commented on a diff in pull request #4164: HDDS-7506. [Snapshot] Expose more snapshot metrics under OMMetrics

Posted by GitBox <gi...@apache.org>.
xBis7 commented on code in PR #4164:
URL: https://github.com/apache/ozone/pull/4164#discussion_r1081158602


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java:
##########
@@ -1623,6 +1629,51 @@ public void restart() throws IOException {
     omState = State.RUNNING;
   }
 
+  /**
+   * Iterate the Snapshot table, check the status
+   * for every snapshot and update OMMetrics.
+   */
+  private void updateActiveSnapshotMetrics()
+      throws IOException {
+
+    long activeGauge = 0;
+    long deletedGauge = 0;
+    long reclaimedGauge = 0;
+
+    try (TableIterator<String, ? extends
+        KeyValue<String, SnapshotInfo>> keyIter =
+             metadataManager.getSnapshotInfoTable().iterator()) {
+      Table.KeyValue<String, SnapshotInfo> keyValue;
+
+      List<SnapshotInfo> snapshotInfoList = new ArrayList<>();
+
+      while (keyIter.hasNext()) {
+        keyValue = keyIter.next();
+        snapshotInfoList.add(keyValue.getValue());
+      }
+
+      for (SnapshotInfo info : snapshotInfoList) {

Review Comment:
   You are right, this is unnecessary.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] xBis7 commented on a diff in pull request #4164: HDDS-7506. [Snapshot] Expose more snapshot metrics under OMMetrics

Posted by GitBox <gi...@apache.org>.
xBis7 commented on code in PR #4164:
URL: https://github.com/apache/ozone/pull/4164#discussion_r1082411069


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OMMetrics.java:
##########
@@ -448,6 +451,45 @@ public void incNumSnapshotListFails() {
     numSnapshotListFails.incr();
   }
 
+  public void setNumSnapshotActive(long num) {
+    long currVal = numSnapshotActive.value();
+    numSnapshotActive.incr(num - currVal);

Review Comment:
   @smengcl In the initial commit, I declared the new metrics as [gauges](https://github.com/apache/ozone/blob/b961f153b6c0244f5d30802be118cfe780393db7/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OMMetrics.java#L124) instead of counters. I changed them to counters because it seems cleaner in cases such as creating a snapshot where we need to increment `NumSnapshotActive` or in deleting a snapshot where we need to decreament `NumSnapshotActive` and increment `NumSnapshotDeleted`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org