You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by "sadanand48 (via GitHub)" <gi...@apache.org> on 2023/02/06 09:21:37 UTC

[GitHub] [ozone] sadanand48 opened a new pull request, #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

sadanand48 opened a new pull request, #4249:
URL: https://github.com/apache/ozone/pull/4249

   ## What changes were proposed in this pull request?
   The timeline for CreateSnapshot request is as follows
   
   t1-> Update snapshot info in Cache and return response to the client after bucket lock released. The OM DoubleBuffer thread has not yet flushed this operation to DB so actual rocksdb is not updated and checkpoint has not yet been created.
   
   t2-> Lets say Client issues read on the renamed path, During read it finds the Snapshot info object  and the checkpoint dir location from the Snapshot Table Cache, however the checkpoint is actually created when actual RocksDB is updated in OMCreateSnapshotResponse#addToDBBatch and the read fails
   
   t3-> Add snapshotInfo to DB and create checkpoint.
   
   The  fix here is to wait for the checkpoint dir creation during read if the snapshot info is in cache.
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-7906
   
   ## How was this patch tested?
   Unit test


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1098760638


##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestRootedOzoneFileSystem.java:
##########
@@ -2209,4 +2209,46 @@ private void createLinkBucket(String linkVolume, String linkBucket,
     OzoneVolume ozoneVolume = objectStore.getVolume(linkVolume);
     ozoneVolume.createBucket(linkBucket, builder.build());
   }
+
+  @Test
+  public void testSnapshotRead() throws Exception {
+    // Init data
+    OzoneBucket bucket1 =
+        TestDataUtil.createVolumeAndBucket(cluster, bucketLayout);
+    Path volume1Path = new Path(OZONE_URI_DELIMITER, bucket1.getVolumeName());
+    Path bucket1Path = new Path(volume1Path, bucket1.getName());
+    Path file1 = new Path(bucket1Path, "key1");
+    Path file2 = new Path(bucket1Path, "key2");
+
+    ContractTestUtils.touch(fs, file1);
+    ContractTestUtils.touch(fs, file2);
+
+    OzoneBucket bucket2 =
+        TestDataUtil.createVolumeAndBucket(cluster, bucketLayout);
+    Path volume2Path = new Path(OZONE_URI_DELIMITER, bucket2.getVolumeName());
+    Path bucket2Path = new Path(volume2Path, bucket2.getName());
+
+    fs.mkdirs(bucket2Path);
+    Path snapPath1 = fs.createSnapshot(bucket1Path, "snap1");
+    Path snapPath2 = fs.createSnapshot(bucket2Path, "snap1");
+
+    Path file3 = new Path(bucket1Path, "key3");
+    ContractTestUtils.touch(fs, file3);
+
+    Path snapPath3 = fs.createSnapshot(bucket1Path, "snap2");
+
+    try {
+      FileStatus[] f1 = fs.listStatus(snapPath1);
+      FileStatus[] f2 = fs.listStatus(snapPath2);
+      FileStatus[] f3 = fs.listStatus(snapPath3);
+      Assert.assertEquals(2, f1.length);
+      Assert.assertEquals(0, f2.length);
+      Assert.assertEquals(3, f3.length);
+    } catch (Exception e) {
+      Assert.fail("Failed to read/list on snapshotPath, exception: " + e);
+    }
+

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1113303598


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotManager.java:
##########
@@ -94,6 +95,10 @@ public OmSnapshot load(@Nonnull String snapshotTableKey)
         // see if the snapshot exists
         snapshotInfo = getSnapshotInfo(snapshotTableKey);
 
+        boolean isSnapshotInCache =
+            ozoneManager.getMetadataManager().getSnapshotInfoTable()
+                .getCacheValue(new CacheKey<>(snapshotTableKey)) != null;

Review Comment:
   Is this accounting for the case where the snapshot entry is a tombstone in cache (`CacheValue` is `Optional.absent()` like [here](https://github.com/apache/ozone/blob/ddbe71d3f90ae14ecf29f51c2265f1f9c3171668/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyDeleteRequest.java#L158-L162)), which could happen when the snapshot eventually deleted from the `snapshotInfoTable`? Doesn't look to be the case to me?
   
   Therefore, IMO in addition to checking null, we would need to check whether the value is `Optional.absent()`. If it is `Optional.absent()`, `isSnapshotInCache` should still be `false` as we won't want to wait for a checkpoint dir that is deleted. Or we would have to put checks before this logic is even reached.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1113355015


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotManager.java:
##########
@@ -94,6 +95,10 @@ public OmSnapshot load(@Nonnull String snapshotTableKey)
         // see if the snapshot exists
         snapshotInfo = getSnapshotInfo(snapshotTableKey);
 
+        boolean isSnapshotInCache =
+            ozoneManager.getMetadataManager().getSnapshotInfoTable()
+                .getCacheValue(new CacheKey<>(snapshotTableKey)) != null;

Review Comment:
   I ran a UT and `getCacheValue` also returns `null` in this case
   
   <img width="1535" alt="cachevalue" src="https://user-images.githubusercontent.com/50227127/220413948-aa5f3b38-670e-43bb-b714-2e0efda14415.png">
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] hemantk-12 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "hemantk-12 (via GitHub)" <gi...@apache.org>.
hemantk-12 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1097841708


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -911,6 +918,22 @@ private boolean isKeyPresentInTableCache(String keyPrefix,
     return false;
   }
 
+
+  static boolean isSnapshotPresentInTableCache(String keyPrefix, Table table) {

Review Comment:
   1. The function name is wrong because `isSnapshotPresentInTableCache` seems specific to snapshot but we are passing table to make it generic. Either change the function name to something generic or move it to `OmSnapshotManager` .
   1. Why do we need to iterator over the table instead of just using [`get`](https://github.com/apache/ozone/blob/074d225f2ab0c9658ffa35f021c6cba2116f40cc/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/Table.java#L85) or [getSkipCache](https://github.com/apache/ozone/blob/074d225f2ab0c9658ffa35f021c6cba2116f40cc/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/Table.java#L96)?
   1. Wouldn't it always return `true` because [entry is in cache](https://github.com/hemantk-12/ozone/blob/ccc814ee7f8a7cac70b6d64dcae12056e375199d/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotCreateRequest.java#L149)? Or it is because iterator is over table not cache?
   1. One more doubt if flushing the RocksDB operations queue actually creates checkpointing dir because it could be another async operation. Please double check that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1098772534


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -302,14 +303,19 @@ protected OmMetadataManagerImpl() {
   }
 
   // metadata constructor for snapshots
-  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName)
-      throws IOException {
+  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName,
+      boolean isSnapshotInCache) throws IOException {
     lock = new OmReadOnlyLock();
     omEpoch = 0;
     String snapshotDir = OMStorage.getOmDbDir(conf) +
         OM_KEY_PREFIX + OM_SNAPSHOT_DIR;
-    setStore(loadDB(conf, new File(snapshotDir),
-        OM_DB_NAME + snapshotDirName, true));
+    File metaDir = new File(snapshotDir);
+    String dbName = OM_DB_NAME + snapshotDirName;
+    if (isSnapshotInCache) {
+      File checkpoint = Paths.get(metaDir.toPath().toString(), dbName).toFile();
+      RDBCheckpointManager.waitForCheckpointDirectoryExist(checkpoint);

Review Comment:
   One more idea is to create the checkpoint in OmCreateSnapshotCreateRequest#validateAndUpdateCache inside the bucket lock, but the problem here is that if OM crashes/shuts down during the time the entry is in cache and not flushed to DB, a waste checkpoint will be created since it won't be accessed anymore, i.e a failed createSnapshotRequest will create checkpoints and adds up to the space. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1113356832


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotManager.java:
##########
@@ -94,6 +95,10 @@ public OmSnapshot load(@Nonnull String snapshotTableKey)
         // see if the snapshot exists
         snapshotInfo = getSnapshotInfo(snapshotTableKey);
 
+        boolean isSnapshotInCache =
+            ozoneManager.getMetadataManager().getSnapshotInfoTable()
+                .getCacheValue(new CacheKey<>(snapshotTableKey)) != null;

Review Comment:
   Uh oops. My bad. The value inside the object is `null`. But the object itself is not `null`.
   
   @sadanand48 Would you like to post an addendum to this jira?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] hemantk-12 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "hemantk-12 (via GitHub)" <gi...@apache.org>.
hemantk-12 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1097841708


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -911,6 +918,22 @@ private boolean isKeyPresentInTableCache(String keyPrefix,
     return false;
   }
 
+
+  static boolean isSnapshotPresentInTableCache(String keyPrefix, Table table) {

Review Comment:
   1. The function name is wrong because `isSnapshotPresentInTableCache` seems specific to snapshot but we are passing table to make it generic. Either change the function name to something generic or move it to `OmSnapshotManager` .
   1. Why do we need to iterator over the table instead of just using [`get`](https://github.com/apache/ozone/blob/074d225f2ab0c9658ffa35f021c6cba2116f40cc/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/Table.java#L85) or [getSkipCache](https://github.com/apache/ozone/blob/074d225f2ab0c9658ffa35f021c6cba2116f40cc/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/Table.java#L96)?
   1. Wouldn't it always return `true` because [entry is in cache](https://github.com/hemantk-12/ozone/blob/ccc814ee7f8a7cac70b6d64dcae12056e375199d/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotCreateRequest.java#L149)? Please correct me if I'm wrong.
   1. One more doubt if flushing the RocksDB operations queue actually creates checkpointing dir because it could be another async operation. Please double check that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] hemantk-12 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "hemantk-12 (via GitHub)" <gi...@apache.org>.
hemantk-12 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1097971112


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -302,14 +303,19 @@ protected OmMetadataManagerImpl() {
   }
 
   // metadata constructor for snapshots
-  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName)
-      throws IOException {
+  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName,
+      boolean isSnapshotInCache) throws IOException {
     lock = new OmReadOnlyLock();
     omEpoch = 0;
     String snapshotDir = OMStorage.getOmDbDir(conf) +
         OM_KEY_PREFIX + OM_SNAPSHOT_DIR;
-    setStore(loadDB(conf, new File(snapshotDir),
-        OM_DB_NAME + snapshotDirName, true));
+    File metaDir = new File(snapshotDir);
+    String dbName = OM_DB_NAME + snapshotDirName;
+    if (isSnapshotInCache) {

Review Comment:
   I don't know how add this check is helping. Because test passes even if you remove the `if condition` and just add the wait time.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1113381868


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotManager.java:
##########
@@ -94,6 +95,10 @@ public OmSnapshot load(@Nonnull String snapshotTableKey)
         // see if the snapshot exists
         snapshotInfo = getSnapshotInfo(snapshotTableKey);
 
+        boolean isSnapshotInCache =
+            ozoneManager.getMetadataManager().getSnapshotInfoTable()
+                .getCacheValue(new CacheKey<>(snapshotTableKey)) != null;

Review Comment:
   Sure I can raise another PR with this change, Is a revert required here or should I only include this particular check in new PR?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1100486374


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -302,14 +303,19 @@ protected OmMetadataManagerImpl() {
   }
 
   // metadata constructor for snapshots
-  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName)
-      throws IOException {
+  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName,
+      boolean isSnapshotInCache) throws IOException {
     lock = new OmReadOnlyLock();
     omEpoch = 0;
     String snapshotDir = OMStorage.getOmDbDir(conf) +
         OM_KEY_PREFIX + OM_SNAPSHOT_DIR;
-    setStore(loadDB(conf, new File(snapshotDir),
-        OM_DB_NAME + snapshotDirName, true));
+    File metaDir = new File(snapshotDir);
+    String dbName = OM_DB_NAME + snapshotDirName;
+    if (isSnapshotInCache) {
+      File checkpoint = Paths.get(metaDir.toPath().toString(), dbName).toFile();
+      RDBCheckpointManager.waitForCheckpointDirectoryExist(checkpoint);

Review Comment:
   I asked whether Rocksdb createCheckpoint call  is synchronous in the rocksdb community group and this is what I got.
   > 'checkpoint is always sync  after the CreateCheckpoint is completed a point in time copy of your database exists in the output directory'
   
   That aside, the problem of unnecessary checkpoint creation when snapshot create fails is still an issue. 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on PR #4249:
URL: https://github.com/apache/ozone/pull/4249#issuecomment-1421294099

   > I ran [test](https://github.com/apache/ozone/pull/4249/files#diff-195575b08642fad83fad37b0cbe2ba5fa39f9db550028ac540523de01ecff3a9R56) in IntelliJ with your changes and it failed.
   
   I have fixed the test case in the latest commit , It also doesn’t fail in the latest CI run, Although will add the check to verify whether it is returning null, Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl merged pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl merged PR #4249:
URL: https://github.com/apache/ozone/pull/4249


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1113314661


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotManager.java:
##########
@@ -94,6 +95,10 @@ public OmSnapshot load(@Nonnull String snapshotTableKey)
         // see if the snapshot exists
         snapshotInfo = getSnapshotInfo(snapshotTableKey);
 
+        boolean isSnapshotInCache =
+            ozoneManager.getMetadataManager().getSnapshotInfoTable()
+                .getCacheValue(new CacheKey<>(snapshotTableKey)) != null;

Review Comment:
   nvm. I checked that this case is also handled by the `null` check.



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotManager.java:
##########
@@ -94,6 +95,10 @@ public OmSnapshot load(@Nonnull String snapshotTableKey)
         // see if the snapshot exists
         snapshotInfo = getSnapshotInfo(snapshotTableKey);
 
+        boolean isSnapshotInCache =
+            ozoneManager.getMetadataManager().getSnapshotInfoTable()
+                .getCacheValue(new CacheKey<>(snapshotTableKey)) != null;

Review Comment:
   nvm. I checked that this case is also covered by the `null` check.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1113312060


##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestRootedOzoneFileSystem.java:
##########
@@ -2213,4 +2213,43 @@ private void createLinkBucket(String linkVolume, String linkBucket,
     OzoneVolume ozoneVolume = objectStore.getVolume(linkVolume);
     ozoneVolume.createBucket(linkBucket, builder.build());
   }
+
+  @Test
+  public void testSnapshotRead() throws Exception {

Review Comment:
   yes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on PR #4249:
URL: https://github.com/apache/ozone/pull/4249#issuecomment-1438786384

   Thanks @sadanand48 for the patch. Thanks @prashantpogde @hemantk-12 for reviewing this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1098755252


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -911,6 +918,22 @@ private boolean isKeyPresentInTableCache(String keyPrefix,
     return false;
   }
 
+
+  static boolean isSnapshotPresentInTableCache(String keyPrefix, Table table) {

Review Comment:
   1,2 . Removed the method and just used getCacheValue here.
   3.  The cache gets [cleaned up here](https://github.com/apache/ozone/blob/07629324c0641456e12887e267205e9129a97895/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.java#L372) after the transaction is committed and the db is updated. So once it is updated cache won't contain the value.
   4. OzoneManagerDoubleBuffer#flushBatch calls addToBatch which inturn calls checkAndUpdateDb where the actual checkpoint creation takes place in case of create snapshot.  I think creating a checkpoint i.e rocksdb's create checkpoint API too is synchronous.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] hemantk-12 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "hemantk-12 (via GitHub)" <gi...@apache.org>.
hemantk-12 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1097826971


##########
hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/ratis/TestOzoneManagerDoubleBuffer.java:
##########
@@ -68,9 +69,9 @@ class TestOzoneManagerDoubleBuffer {
   private static OMClientResponse omBucketCreateResponse =
       mock(OMBucketCreateResponse.class);
   private static OMClientResponse omSnapshotCreateResponse1 =
-      mock(OMBucketCreateResponse.class);
+      mock(OMSnapshotCreateResponse.class);

Review Comment:
   Thanks for fixing this.



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.java:
##########
@@ -440,10 +440,10 @@ private String addToBatch(Queue<DoubleBufferEntry<OMClientResponse>> buffer,
       // 1. It is first element in the response,
       // 2. Current request is createSnapshot request.
       // 3. Previous request was createSnapshot request.
-      if (response.isEmpty() ||
-          omResponse.getCreateSnapshotResponse() != null ||
-          (previousOmResponse != null &&
-              previousOmResponse.getCreateSnapshotResponse() != null)) {
+      if (response.isEmpty() || omResponse.getCreateSnapshotResponse()

Review Comment:
   Wouldn't `omResponse.getCreateSnapshotResponse().hasSnapshotInfo()` throw NPE if `omResponse.getCreateSnapshotResponse()` returns null? Or that's not possible?



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -911,6 +918,22 @@ private boolean isKeyPresentInTableCache(String keyPrefix,
     return false;
   }
 
+
+  static boolean isSnapshotPresentInTableCache(String keyPrefix, Table table) {

Review Comment:
   1. The function name is wrong because `isSnapshotPresentInTableCache` seems specific to snapshot but we are passing table to make it generic. Either change the function to something generic or move it to `OmSnapshotManager` .
   1. Why do we need to iterator over the table instead of just using [`get`](https://github.com/apache/ozone/blob/074d225f2ab0c9658ffa35f021c6cba2116f40cc/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/Table.java#L85) or [getSkipCache](https://github.com/apache/ozone/blob/074d225f2ab0c9658ffa35f021c6cba2116f40cc/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/Table.java#L96)?
   1. Wouldn't it always return `true` because [entry is in cache](https://github.com/hemantk-12/ozone/blob/ccc814ee7f8a7cac70b6d64dcae12056e375199d/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotCreateRequest.java#L149? Please correct me if I'm wrong.
   1. One more doubt if flushing the RocksDB operations queue actually creates checkpointing dir because it could be another async operation. Please double check that.



##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestRootedOzoneFileSystem.java:
##########
@@ -2209,4 +2209,46 @@ private void createLinkBucket(String linkVolume, String linkBucket,
     OzoneVolume ozoneVolume = objectStore.getVolume(linkVolume);
     ozoneVolume.createBucket(linkBucket, builder.build());
   }
+
+  @Test
+  public void testSnapshotRead() throws Exception {
+    // Init data
+    OzoneBucket bucket1 =
+        TestDataUtil.createVolumeAndBucket(cluster, bucketLayout);
+    Path volume1Path = new Path(OZONE_URI_DELIMITER, bucket1.getVolumeName());
+    Path bucket1Path = new Path(volume1Path, bucket1.getName());
+    Path file1 = new Path(bucket1Path, "key1");
+    Path file2 = new Path(bucket1Path, "key2");
+
+    ContractTestUtils.touch(fs, file1);
+    ContractTestUtils.touch(fs, file2);
+
+    OzoneBucket bucket2 =
+        TestDataUtil.createVolumeAndBucket(cluster, bucketLayout);
+    Path volume2Path = new Path(OZONE_URI_DELIMITER, bucket2.getVolumeName());
+    Path bucket2Path = new Path(volume2Path, bucket2.getName());
+
+    fs.mkdirs(bucket2Path);
+    Path snapPath1 = fs.createSnapshot(bucket1Path, "snap1");
+    Path snapPath2 = fs.createSnapshot(bucket2Path, "snap1");
+
+    Path file3 = new Path(bucket1Path, "key3");
+    ContractTestUtils.touch(fs, file3);
+
+    Path snapPath3 = fs.createSnapshot(bucket1Path, "snap2");
+
+    try {
+      FileStatus[] f1 = fs.listStatus(snapPath1);
+      FileStatus[] f2 = fs.listStatus(snapPath2);
+      FileStatus[] f3 = fs.listStatus(snapPath3);
+      Assert.assertEquals(2, f1.length);
+      Assert.assertEquals(0, f2.length);
+      Assert.assertEquals(3, f3.length);
+    } catch (Exception e) {
+      Assert.fail("Failed to read/list on snapshotPath, exception: " + e);
+    }
+
+  }
+

Review Comment:
   nit: please remove extra line.



##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestRootedOzoneFileSystem.java:
##########
@@ -2209,4 +2209,46 @@ private void createLinkBucket(String linkVolume, String linkBucket,
     OzoneVolume ozoneVolume = objectStore.getVolume(linkVolume);
     ozoneVolume.createBucket(linkBucket, builder.build());
   }
+
+  @Test
+  public void testSnapshotRead() throws Exception {
+    // Init data
+    OzoneBucket bucket1 =
+        TestDataUtil.createVolumeAndBucket(cluster, bucketLayout);
+    Path volume1Path = new Path(OZONE_URI_DELIMITER, bucket1.getVolumeName());
+    Path bucket1Path = new Path(volume1Path, bucket1.getName());
+    Path file1 = new Path(bucket1Path, "key1");
+    Path file2 = new Path(bucket1Path, "key2");
+
+    ContractTestUtils.touch(fs, file1);
+    ContractTestUtils.touch(fs, file2);
+
+    OzoneBucket bucket2 =
+        TestDataUtil.createVolumeAndBucket(cluster, bucketLayout);
+    Path volume2Path = new Path(OZONE_URI_DELIMITER, bucket2.getVolumeName());
+    Path bucket2Path = new Path(volume2Path, bucket2.getName());
+
+    fs.mkdirs(bucket2Path);
+    Path snapPath1 = fs.createSnapshot(bucket1Path, "snap1");
+    Path snapPath2 = fs.createSnapshot(bucket2Path, "snap1");
+
+    Path file3 = new Path(bucket1Path, "key3");
+    ContractTestUtils.touch(fs, file3);
+
+    Path snapPath3 = fs.createSnapshot(bucket1Path, "snap2");
+
+    try {
+      FileStatus[] f1 = fs.listStatus(snapPath1);
+      FileStatus[] f2 = fs.listStatus(snapPath2);
+      FileStatus[] f3 = fs.listStatus(snapPath3);
+      Assert.assertEquals(2, f1.length);
+      Assert.assertEquals(0, f2.length);
+      Assert.assertEquals(3, f3.length);
+    } catch (Exception e) {
+      Assert.fail("Failed to read/list on snapshotPath, exception: " + e);
+    }
+

Review Comment:
   nit: please remove extra line.



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -322,9 +328,10 @@ private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName)
    * @throws IOException
    */
   public static OmMetadataManagerImpl createSnapshotMetadataManager(
-      OzoneConfiguration conf, String snapshotDirName) throws IOException {
-    OmMetadataManagerImpl smm = new OmMetadataManagerImpl(conf,
-        snapshotDirName);
+      OzoneConfiguration conf, String snapshotDirName,
+      boolean isSnapshotInCache) throws IOException {
+    OmMetadataManagerImpl smm = new OmMetadataManagerImpl(conf, snapshotDirName,

Review Comment:
   nit: could be a one line.
   ```suggestion:
       return new OmMetadataManagerImpl(conf, snapshotDirName,
           isSnapshotInCache);
   ```
   
   Also why this needs to be a static function. Why can't we use constructor directly? It is misuse of factory.



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -302,14 +303,19 @@ protected OmMetadataManagerImpl() {
   }
 
   // metadata constructor for snapshots
-  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName)
-      throws IOException {
+  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName,
+      boolean isSnapshotInCache) throws IOException {
     lock = new OmReadOnlyLock();
     omEpoch = 0;
     String snapshotDir = OMStorage.getOmDbDir(conf) +
         OM_KEY_PREFIX + OM_SNAPSHOT_DIR;
-    setStore(loadDB(conf, new File(snapshotDir),
-        OM_DB_NAME + snapshotDirName, true));
+    File metaDir = new File(snapshotDir);
+    String dbName = OM_DB_NAME + snapshotDirName;
+    if (isSnapshotInCache) {
+      File checkpoint = Paths.get(metaDir.toPath().toString(), dbName).toFile();
+      RDBCheckpointManager.waitForCheckpointDirectoryExist(checkpoint);

Review Comment:
   I doubt if this will solve this problem because it just waits for 5 secs and swallows the exception. This either doesn't make sure that checkpoint directory will exists.
   
   Ideal solution for this should be wait and notify but because dir creation happens inside RocksDB we can't notify Ozone when it gets created..



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1098761122


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -322,9 +328,10 @@ private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName)
    * @throws IOException
    */
   public static OmMetadataManagerImpl createSnapshotMetadataManager(
-      OzoneConfiguration conf, String snapshotDirName) throws IOException {
-    OmMetadataManagerImpl smm = new OmMetadataManagerImpl(conf,
-        snapshotDirName);
+      OzoneConfiguration conf, String snapshotDirName,
+      boolean isSnapshotInCache) throws IOException {
+    OmMetadataManagerImpl smm = new OmMetadataManagerImpl(conf, snapshotDirName,

Review Comment:
   Done. removed the static method and used constructor directly



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] prashantpogde commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "prashantpogde (via GitHub)" <gi...@apache.org>.
prashantpogde commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1097767720


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -302,14 +303,19 @@ protected OmMetadataManagerImpl() {
   }
 
   // metadata constructor for snapshots
-  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName)
-      throws IOException {
+  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName,
+      boolean isSnapshotInCache) throws IOException {
     lock = new OmReadOnlyLock();
     omEpoch = 0;
     String snapshotDir = OMStorage.getOmDbDir(conf) +
         OM_KEY_PREFIX + OM_SNAPSHOT_DIR;
-    setStore(loadDB(conf, new File(snapshotDir),
-        OM_DB_NAME + snapshotDirName, true));
+    File metaDir = new File(snapshotDir);
+    String dbName = OM_DB_NAME + snapshotDirName;
+    if (isSnapshotInCache) {

Review Comment:
   This is a good catch @sadanand48!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] hemantk-12 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "hemantk-12 (via GitHub)" <gi...@apache.org>.
hemantk-12 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1099045421


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.java:
##########
@@ -440,10 +440,10 @@ private String addToBatch(Queue<DoubleBufferEntry<OMClientResponse>> buffer,
       // 1. It is first element in the response,
       // 2. Current request is createSnapshot request.
       // 3. Previous request was createSnapshot request.
-      if (response.isEmpty() ||
-          omResponse.getCreateSnapshotResponse() != null ||
-          (previousOmResponse != null &&
-              previousOmResponse.getCreateSnapshotResponse() != null)) {
+      if (response.isEmpty() || omResponse.getCreateSnapshotResponse()

Review Comment:
   I believe that warning can be ignored in this case. Even "Java protocol buffer methods accept or return nulls", wouldn't having a null check would be more appropriate? Because when it is not set or set to null, will surely return null.
   May be something like `(omResponse.getCreateSnapshotResponse() != null && omResponse.getCreateSnapshotResponse().hasSnapshotInfo())`. Tho I think just null check should work.
   
   I ran [test](https://github.com/apache/ozone/pull/4249/files#diff-195575b08642fad83fad37b0cbe2ba5fa39f9db550028ac540523de01ecff3a9R56) in IntelliJ with your changes and it failed. Can you please double check, run it locally on your machine and verify that it passes?
   ```
   2023-02-07 10:40:46,892 [main] ERROR ratis.OzoneManagerDoubleBuffer (ExitUtils.java:terminate(133)) - Terminating with exit status 2: During flush to DB encountered error in OMDoubleBuffer flush thread main
   java.lang.NullPointerException
   	at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.splitReadyBufferAtCreateSnapshot(OzoneManagerDoubleBuffer.java:444)
   	at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushCurrentBuffer(OzoneManagerDoubleBuffer.java:291)
   	at org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBuffer.testOzoneManagerDoubleBuffer(TestOzoneManagerDoubleBuffer.java:198)
   ```
   Also I don't see if this test was run as part of unit test workflow: https://github.com/apache/ozone/actions/runs/4114612552/jobs/7102217772 . Please correct me if it is supposed to run in some other workflow.
   
   One more point if `omResponse.getCreateSnapshotResponse() == null` is always true, test case-1 should  fail where double buffer has `Arrays.asList(omKeyCreateResponse, omBucketCreateResponse)`. It should get split into two but it doesn't not.



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -302,14 +303,19 @@ protected OmMetadataManagerImpl() {
   }
 
   // metadata constructor for snapshots
-  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName)
-      throws IOException {
+  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName,
+      boolean isSnapshotInCache) throws IOException {
     lock = new OmReadOnlyLock();
     omEpoch = 0;
     String snapshotDir = OMStorage.getOmDbDir(conf) +
         OM_KEY_PREFIX + OM_SNAPSHOT_DIR;
-    setStore(loadDB(conf, new File(snapshotDir),
-        OM_DB_NAME + snapshotDirName, true));
+    File metaDir = new File(snapshotDir);
+    String dbName = OM_DB_NAME + snapshotDirName;
+    if (isSnapshotInCache) {

Review Comment:
   It would be great if add this as a comment in code. Also mentioned "It is most likely DB entries will get flushed in this wait time.".



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -302,14 +303,19 @@ protected OmMetadataManagerImpl() {
   }
 
   // metadata constructor for snapshots
-  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName)
-      throws IOException {
+  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName,
+      boolean isSnapshotInCache) throws IOException {
     lock = new OmReadOnlyLock();
     omEpoch = 0;
     String snapshotDir = OMStorage.getOmDbDir(conf) +
         OM_KEY_PREFIX + OM_SNAPSHOT_DIR;
-    setStore(loadDB(conf, new File(snapshotDir),
-        OM_DB_NAME + snapshotDirName, true));
+    File metaDir = new File(snapshotDir);
+    String dbName = OM_DB_NAME + snapshotDirName;
+    if (isSnapshotInCache) {
+      File checkpoint = Paths.get(metaDir.toPath().toString(), dbName).toFile();
+      RDBCheckpointManager.waitForCheckpointDirectoryExist(checkpoint);

Review Comment:
   No sure if "checkpoint in OmCreateSnapshotCreateRequest#validateAndUpdateCache inside the bucket lock" would work since dir creation is also an async RocksDB operation. But it is worth a try.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1100478166


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -302,14 +303,19 @@ protected OmMetadataManagerImpl() {
   }
 
   // metadata constructor for snapshots
-  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName)
-      throws IOException {
+  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName,
+      boolean isSnapshotInCache) throws IOException {
     lock = new OmReadOnlyLock();
     omEpoch = 0;
     String snapshotDir = OMStorage.getOmDbDir(conf) +
         OM_KEY_PREFIX + OM_SNAPSHOT_DIR;
-    setStore(loadDB(conf, new File(snapshotDir),
-        OM_DB_NAME + snapshotDirName, true));
+    File metaDir = new File(snapshotDir);
+    String dbName = OM_DB_NAME + snapshotDirName;
+    if (isSnapshotInCache) {

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1100487942


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.java:
##########
@@ -440,10 +440,10 @@ private String addToBatch(Queue<DoubleBufferEntry<OMClientResponse>> buffer,
       // 1. It is first element in the response,
       // 2. Current request is createSnapshot request.
       // 3. Previous request was createSnapshot request.
-      if (response.isEmpty() ||
-          omResponse.getCreateSnapshotResponse() != null ||
-          (previousOmResponse != null &&
-              previousOmResponse.getCreateSnapshotResponse() != null)) {
+      if (response.isEmpty() || omResponse.getCreateSnapshotResponse()

Review Comment:
   > I think you can use omResponse.hasCreateSnapshotResponse(). That should remove all the confusion and concern.
   
   Done. using hasCreateSnapshotResponse() now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1098760350


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -302,14 +303,19 @@ protected OmMetadataManagerImpl() {
   }
 
   // metadata constructor for snapshots
-  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName)
-      throws IOException {
+  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName,
+      boolean isSnapshotInCache) throws IOException {
     lock = new OmReadOnlyLock();
     omEpoch = 0;
     String snapshotDir = OMStorage.getOmDbDir(conf) +
         OM_KEY_PREFIX + OM_SNAPSHOT_DIR;
-    setStore(loadDB(conf, new File(snapshotDir),
-        OM_DB_NAME + snapshotDirName, true));
+    File metaDir = new File(snapshotDir);
+    String dbName = OM_DB_NAME + snapshotDirName;
+    if (isSnapshotInCache) {
+      File checkpoint = Paths.get(metaDir.toPath().toString(), dbName).toFile();
+      RDBCheckpointManager.waitForCheckpointDirectoryExist(checkpoint);

Review Comment:
   Yup this is only to avoid reads failing right after a createSnapshot Request, Since we don't have a cache like feature in snapshot to prevent subsequent reads till entry is actually flushed to DB, I think all we can do is wait till entry gets flushed and if entry is in cache, it is most likely it will get flushed .



##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestRootedOzoneFileSystem.java:
##########
@@ -2209,4 +2209,46 @@ private void createLinkBucket(String linkVolume, String linkBucket,
     OzoneVolume ozoneVolume = objectStore.getVolume(linkVolume);
     ozoneVolume.createBucket(linkBucket, builder.build());
   }
+
+  @Test
+  public void testSnapshotRead() throws Exception {
+    // Init data
+    OzoneBucket bucket1 =
+        TestDataUtil.createVolumeAndBucket(cluster, bucketLayout);
+    Path volume1Path = new Path(OZONE_URI_DELIMITER, bucket1.getVolumeName());
+    Path bucket1Path = new Path(volume1Path, bucket1.getName());
+    Path file1 = new Path(bucket1Path, "key1");
+    Path file2 = new Path(bucket1Path, "key2");
+
+    ContractTestUtils.touch(fs, file1);
+    ContractTestUtils.touch(fs, file2);
+
+    OzoneBucket bucket2 =
+        TestDataUtil.createVolumeAndBucket(cluster, bucketLayout);
+    Path volume2Path = new Path(OZONE_URI_DELIMITER, bucket2.getVolumeName());
+    Path bucket2Path = new Path(volume2Path, bucket2.getName());
+
+    fs.mkdirs(bucket2Path);
+    Path snapPath1 = fs.createSnapshot(bucket1Path, "snap1");
+    Path snapPath2 = fs.createSnapshot(bucket2Path, "snap1");
+
+    Path file3 = new Path(bucket1Path, "key3");
+    ContractTestUtils.touch(fs, file3);
+
+    Path snapPath3 = fs.createSnapshot(bucket1Path, "snap2");
+
+    try {
+      FileStatus[] f1 = fs.listStatus(snapPath1);
+      FileStatus[] f2 = fs.listStatus(snapPath2);
+      FileStatus[] f3 = fs.listStatus(snapPath3);
+      Assert.assertEquals(2, f1.length);
+      Assert.assertEquals(0, f2.length);
+      Assert.assertEquals(3, f3.length);
+    } catch (Exception e) {
+      Assert.fail("Failed to read/list on snapshotPath, exception: " + e);
+    }
+
+  }
+

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1098331334


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.java:
##########
@@ -440,10 +440,10 @@ private String addToBatch(Queue<DoubleBufferEntry<OMClientResponse>> buffer,
       // 1. It is first element in the response,
       // 2. Current request is createSnapshot request.
       // 3. Previous request was createSnapshot request.
-      if (response.isEmpty() ||
-          omResponse.getCreateSnapshotResponse() != null ||
-          (previousOmResponse != null &&
-              previousOmResponse.getCreateSnapshotResponse() != null)) {
+      if (response.isEmpty() || omResponse.getCreateSnapshotResponse()

Review Comment:
   I made this change after the IDE warned me that `omResponse.getCreateSnapshotResponse() != null`  would always return true. Since it is coming from proto object `OMResponse` I guess [that would be the case.](https://developers.google.com/protocol-buffers/docs/reference/java-generated)
   
   > Note that no Java protocol buffer methods accept or return nulls unless otherwise specified.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1113303598


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotManager.java:
##########
@@ -94,6 +95,10 @@ public OmSnapshot load(@Nonnull String snapshotTableKey)
         // see if the snapshot exists
         snapshotInfo = getSnapshotInfo(snapshotTableKey);
 
+        boolean isSnapshotInCache =
+            ozoneManager.getMetadataManager().getSnapshotInfoTable()
+                .getCacheValue(new CacheKey<>(snapshotTableKey)) != null;

Review Comment:
   Is this accounting for the case where the snapshot entry is a tombstone in cache (`CacheValue` is `Optional.absent()` like [here](https://github.com/apache/ozone/blob/ddbe71d3f90ae14ecf29f51c2265f1f9c3171668/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyDeleteRequest.java#L158-L162)), which could happen when the snapshot eventually deleted from the snapshotInfo table? Doesn't look to be the case to me?
   
   Therefore, IMO in addition to checking null, we would need to check whether the value is `Optional.absent()`. If it is `Optional.absent()`, `isSnapshotInCache` should still be `false` as we won't want to wait for a checkpoint dir that is deleted. Or we would have to put checks before this logic is even reached.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1113291290


##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestRootedOzoneFileSystem.java:
##########
@@ -2213,4 +2213,43 @@ private void createLinkBucket(String linkVolume, String linkBucket,
     OzoneVolume ozoneVolume = objectStore.getVolume(linkVolume);
     ozoneVolume.createBucket(linkBucket, builder.build());
   }
+
+  @Test
+  public void testSnapshotRead() throws Exception {

Review Comment:
   Just one question, this new UT is not guaranteed to repro the issue, correct?
   
   In other words, without the fix, this UT would be flaky?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1113291290


##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestRootedOzoneFileSystem.java:
##########
@@ -2213,4 +2213,43 @@ private void createLinkBucket(String linkVolume, String linkBucket,
     OzoneVolume ozoneVolume = objectStore.getVolume(linkVolume);
     ozoneVolume.createBucket(linkBucket, builder.build());
   }
+
+  @Test
+  public void testSnapshotRead() throws Exception {

Review Comment:
   Just one question, this new UT is not guaranteed to repro the issue, correct?
   
   In other words, without the fix at least this UT will be flaky?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1113471375


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotManager.java:
##########
@@ -94,6 +95,10 @@ public OmSnapshot load(@Nonnull String snapshotTableKey)
         // see if the snapshot exists
         snapshotInfo = getSnapshotInfo(snapshotTableKey);
 
+        boolean isSnapshotInCache =
+            ozoneManager.getMetadataManager().getSnapshotInfoTable()
+                .getCacheValue(new CacheKey<>(snapshotTableKey)) != null;

Review Comment:
   @sadanand48 No revert needed here. The addendum only needs the new changes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] hemantk-12 commented on pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "hemantk-12 (via GitHub)" <gi...@apache.org>.
hemantk-12 commented on PR #4249:
URL: https://github.com/apache/ozone/pull/4249#issuecomment-1421481957

   > > I ran [test](https://github.com/apache/ozone/pull/4249/files#diff-195575b08642fad83fad37b0cbe2ba5fa39f9db550028ac540523de01ecff3a9R56) in IntelliJ with your changes and it failed.
   > 
   > I have fixed the test case in the latest commit , It also doesn’t fail in the latest CI run, Although will add the check to verify whether it is returning null, Thanks
   
   I think you can use `omResponse.hasCreateSnapshotResponse()`. That should remove all the confusion I guess.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] hemantk-12 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "hemantk-12 (via GitHub)" <gi...@apache.org>.
hemantk-12 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1100494494


##########
hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/ratis/TestOzoneManagerDoubleBuffer.java:
##########
@@ -109,9 +110,12 @@ public void setup() throws IOException {
     when(omBucketResponse.getTraceID()).thenReturn("bucketTraceId");
     when(omSnapshotResponse1.getTraceID()).thenReturn("snapshotTraceId-1");
     when(omSnapshotResponse2.getTraceID()).thenReturn("snapshotTraceId-2");
-
-    when(omKeyResponse.getCreateSnapshotResponse()).thenReturn(null);
-    when(omBucketResponse.getCreateSnapshotResponse()).thenReturn(null);
+    when(snapshotResponse1.hasSnapshotInfo()).thenReturn(true);
+    when(snapshotResponse2.hasSnapshotInfo()).thenReturn(true);

Review Comment:
   nit: I don't think line number 113, 114, 130 and 132 are needed anymore.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] hemantk-12 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "hemantk-12 (via GitHub)" <gi...@apache.org>.
hemantk-12 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1097841708


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -911,6 +918,22 @@ private boolean isKeyPresentInTableCache(String keyPrefix,
     return false;
   }
 
+
+  static boolean isSnapshotPresentInTableCache(String keyPrefix, Table table) {

Review Comment:
   1. The function name is wrong because `isSnapshotPresentInTableCache` seems specific to snapshot but we are passing table to make it generic. Either change the function to something generic or move it to `OmSnapshotManager` .
   1. Why do we need to iterator over the table instead of just using [`get`](https://github.com/apache/ozone/blob/074d225f2ab0c9658ffa35f021c6cba2116f40cc/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/Table.java#L85) or [getSkipCache](https://github.com/apache/ozone/blob/074d225f2ab0c9658ffa35f021c6cba2116f40cc/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/Table.java#L96)?
   1. Wouldn't it always return `true` because [entry is in cache](https://github.com/hemantk-12/ozone/blob/ccc814ee7f8a7cac70b6d64dcae12056e375199d/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/snapshot/OMSnapshotCreateRequest.java#L149)? Please correct me if I'm wrong.
   1. One more doubt if flushing the RocksDB operations queue actually creates checkpointing dir because it could be another async operation. Please double check that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1098763159


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java:
##########
@@ -302,14 +303,19 @@ protected OmMetadataManagerImpl() {
   }
 
   // metadata constructor for snapshots
-  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName)
-      throws IOException {
+  private OmMetadataManagerImpl(OzoneConfiguration conf, String snapshotDirName,
+      boolean isSnapshotInCache) throws IOException {
     lock = new OmReadOnlyLock();
     omEpoch = 0;
     String snapshotDir = OMStorage.getOmDbDir(conf) +
         OM_KEY_PREFIX + OM_SNAPSHOT_DIR;
-    setStore(loadDB(conf, new File(snapshotDir),
-        OM_DB_NAME + snapshotDirName, true));
+    File metaDir = new File(snapshotDir);
+    String dbName = OM_DB_NAME + snapshotDirName;
+    if (isSnapshotInCache) {

Review Comment:
   Yes , The check is only to prevent every snapshot read to perform a disk IO and check if a checkpoint dir exists. If entry is present in cache, we certainly know it will get flushed in a while and therefore wait.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1113291290


##########
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestRootedOzoneFileSystem.java:
##########
@@ -2213,4 +2213,43 @@ private void createLinkBucket(String linkVolume, String linkBucket,
     OzoneVolume ozoneVolume = objectStore.getVolume(linkVolume);
     ozoneVolume.createBucket(linkBucket, builder.build());
   }
+
+  @Test
+  public void testSnapshotRead() throws Exception {

Review Comment:
   Just one question, this new UT is not guaranteed to repro the issue, correct?
   
   In other words, without the fix this UT will be flaky?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #4249: HDDS-7906. [Snapshot] Wait for checkpoint creation if snapshot in cache and not committed to DB.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #4249:
URL: https://github.com/apache/ozone/pull/4249#discussion_r1113345785


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmSnapshotManager.java:
##########
@@ -94,6 +95,10 @@ public OmSnapshot load(@Nonnull String snapshotTableKey)
         // see if the snapshot exists
         snapshotInfo = getSnapshotInfo(snapshotTableKey);
 
+        boolean isSnapshotInCache =
+            ozoneManager.getMetadataManager().getSnapshotInfoTable()
+                .getCacheValue(new CacheKey<>(snapshotTableKey)) != null;

Review Comment:
   Thanks for finding this. Didn't get this, how is it covered by null check.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org