Posted to issues@ozone.apache.org by "hemantk-12 (via GitHub)" <gi...@apache.org> on 2023/11/05 21:05:12 UTC

Re: [PR] HDDS-9559. Synchronized OmSnapshotMetrics initialization [ozone]

hemantk-12 commented on PR #5512:
URL: https://github.com/apache/ozone/pull/5512#issuecomment-1793845485

   Thanks for the review @ayushtkn.
   
   > Is this just a test issue?
   
   As of now, only this test hits the issue. My guess is that the metric is not initialized for the integration test.
    
   > What are those two threads?
   
   For snapshot calculation, we load the snapshotted metadata for both the source snapshot and the target snapshot. My guess is that the CacheLoader loads the source and the target simultaneously.
   
   >  Is there a possibility of a poor cleanup by a previous test, which registered the metrics but didn't unregister?
   
   I doubt that. If I'm not wrong, we spin up the cluster for acceptance tests and reuse it for all the tests (at least for a set of tests).
   
   > Are you able to repro the issue?
   
   I used the repeated test workflow to try to repro this, but couldn't: https://github.com/hemantk-12/ozone/actions/runs/6762836386/job/18379403350
   
   > Does the test fail standalone?
   
   I didn't try that, but if my theory is correct, it would.
   
   > Anything around parallel execution, like test running in parallel, using maven parallel execution, it is a problem there (https://maven.apache.org/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html#parallel-test-execution)
   
   I doubt it is related to that. Once OM is up, it doesn't matter.
   
   > I was looking through the Hadoop code, to see how these metrics are handled there, but I didn't find any doing the sync logic
   
   Based on the [documentation](https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/metrics2/MetricsSystem.html#register-java.lang.String-java.lang.String-T-), the metric name is supposed to be unique.
   
   https://github.com/c9n/hadoop/blob/master/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/DefaultMetricsSystem.java#L135
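   
   To illustrate the theory, here is a minimal sketch rather than the actual Ozone code. `ToyMetricsRegistry` is a hypothetical stand-in for a metrics system that rejects duplicate source names, and `SnapshotMetricsSketch` is a hypothetical lazily created singleton. Without `synchronized`, two threads could both see `instance == null` and both try to register under the same name.
   
   ```java
   import java.util.Map;
   import java.util.concurrent.ConcurrentHashMap;

   // Hypothetical stand-in for a metrics system that requires unique source names.
   final class ToyMetricsRegistry {
     private static final Map<String, Object> SOURCES = new ConcurrentHashMap<>();

     // Registers the source, or throws if the name is already taken.
     static <T> T register(String name, T source) {
       if (SOURCES.putIfAbsent(name, source) != null) {
         throw new IllegalStateException("Metrics source " + name + " already exists!");
       }
       return source;
     }

     static int size() {
       return SOURCES.size();
     }
   }

   // Hypothetical metrics holder with lazy, synchronized initialization.
   final class SnapshotMetricsSketch {
     private static final String SOURCE_NAME = "SnapshotMetricsSketch";
     private static SnapshotMetricsSketch instance;

     private SnapshotMetricsSketch() {
     }

     // synchronized makes the null check and the registration a single atomic
     // step, so concurrent callers cannot both attempt to register.
     static synchronized SnapshotMetricsSketch getInstance() {
       if (instance == null) {
         instance = ToyMetricsRegistry.register(SOURCE_NAME, new SnapshotMetricsSketch());
       }
       return instance;
     }
   }
   ```
   
   With this shape, concurrent callers (for example, the loads for the source and target snapshots) get the same instance and the name is registered only once.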
   
   > Can you share some more details over here, & steps to repro and how did you validate the fix
   
   Failed attempt to repro the issue: https://github.com/hemantk-12/ozone/actions/runs/6762836386
   Run with the fix: https://github.com/hemantk-12/ozone/actions/runs/6762837904
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

