You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by "devmadhuu (via GitHub)" <gi...@apache.org> on 2023/09/04 10:37:12 UTC

[GitHub] [ozone] devmadhuu opened a new pull request, #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

devmadhuu opened a new pull request, #5244:
URL: https://github.com/apache/ozone/pull/5244

   ## What changes were proposed in this pull request?
   
   This PR is to fix flakiness in listStatus API.  It was observed that, cleanup() method in TestOzoneFileSystem (in CI integration (filesystem)), which is triggered after EACH test method is incredibly flaky. However when we add a call to wait for double buffer flush, it no longer seems flaky. 
   Issue was related to a possible race condition where table cache iterator is flushed already and isKeyDeleted check may not work as expected before putting entries in cacheKeyMap in listStatus API.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-9233
   
   ## How was this patch tested?
   
   This patch is tested with multiple job runs with 50 iterations of single test class to repro of flakiness of listStatus API which was flaky when called in deleteRootDir() method called from cleanup method. Here is the green [CI run](https://github.com/devmadhuu/ozone/actions/runs/6072078050) with multiple iterations of test run. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sumitagrawl commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "sumitagrawl (via GitHub)" <gi...@apache.org>.
sumitagrawl commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1317242505


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1540,8 +1542,14 @@ public List<OzoneFileStatus> listStatus(OmKeyArgs args, boolean recursive,
     countEntries = 0;
     // Convert results in cacheKeyMap to List
     for (OzoneFileStatus fileStatus : cacheKeyMap.values()) {
-      // No need to check if a key is deleted or not here, this is handled
-      // when adding entries to cacheKeyMap from DB.
+      // Here need to check if a key is deleted as cacheKeyMap will contain
+      // deleted entries as well. Adding deleted entries in cacheKeyMap is done
+      // as there is a possible race condition where table cache iterator is
+      // flushed already and isKeyDeleted check may not work as expected

Review Comment:
   the purpose of isKeyDeleted is not check only in cache while iterating through the Table records, so no need do doubleCheck, If required, we can rename method as isKeyDeletedInCache()



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on pull request #5244: HDDS-9233. listStatus() API is not correctly iterating cache.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on PR #5244:
URL: https://github.com/apache/ozone/pull/5244#issuecomment-1712268078

   > [Ran](https://github.com/adoroszlai/hadoop-ozone/actions/runs/6124275044) `TestOzoneFileSystem` 100x with the fix. There were only 3 failures of `testListStatusOnKeyNameContainDelimiter`, compared to 60% [failure](https://github.com/adoroszlai/hadoop-ozone/actions/runs/6125165663) rate (of various test cases) on `master`. So this is definitely a big improvement.
   > 
   > @sadanand48 @sumitagrawl please check if you're OK with the latest patch
   
   Thanks @adoroszlai . #5093 might address `testListStatusOnKeyNameContainDelimiter` flakiness.
   
   @devmadhuu Patch lgtm in its current state. I just have some nits regarding the code comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #5244: HDDS-9233. listStatus() API is not correctly iterating cache.

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on PR #5244:
URL: https://github.com/apache/ozone/pull/5244#issuecomment-1712229677

   [Ran](https://github.com/adoroszlai/hadoop-ozone/actions/runs/6124275044) `TestOzoneFileSystem` 100x with the fix.  There were only 3 failures of `testListStatusOnKeyNameContainDelimiter`, compared to 60% [failure](https://github.com/adoroszlai/hadoop-ozone/actions/runs/6125165663) rate on `master`.  So this is definitely a big improvement.
   
   @sadanand48 @sumitagrawl please check if you're OK with the latest patch


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sumitagrawl commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "sumitagrawl (via GitHub)" <gi...@apache.org>.
sumitagrawl commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1317242505


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1540,8 +1542,14 @@ public List<OzoneFileStatus> listStatus(OmKeyArgs args, boolean recursive,
     countEntries = 0;
     // Convert results in cacheKeyMap to List
     for (OzoneFileStatus fileStatus : cacheKeyMap.values()) {
-      // No need to check if a key is deleted or not here, this is handled
-      // when adding entries to cacheKeyMap from DB.
+      // Here need to check if a key is deleted as cacheKeyMap will contain
+      // deleted entries as well. Adding deleted entries in cacheKeyMap is done
+      // as there is a possible race condition where table cache iterator is
+      // flushed already and isKeyDeleted check may not work as expected

Review Comment:
   the purpose of isKeyDeleted is check only in cache while iterating through the Table records, so no need do doubleCheck, If required, we can rename method as isKeyDeletedInCache()



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1318110254


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneListStatusHelper.java:
##########
@@ -215,7 +215,11 @@ public Collection<OzoneFileStatus> listStatusFSO(OmKeyArgs args,
         HeapEntry entry = heapIterator.next();
         OzoneFileStatus status = entry.getStatus(prefixKey,
             scmBlockSize, volumeName, bucketName, replication);
-        map.putIfAbsent(entry.key, status);
+        // Since putIfAbsent doesn't work as expected in case of null value,
+        // so had to explicitly check using containsKey
+        if (!map.containsKey(entry.key)) {
+          map.putIfAbsent(entry.key, status);

Review Comment:
   is`putIfAbsent` needed when  we already have `containsKey` ? We can use map.put instead.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] devmadhuu commented on pull request #5244: HDDS-9233. listStatus() API is not correctly iterating cache.

Posted by "devmadhuu (via GitHub)" <gi...@apache.org>.
devmadhuu commented on PR #5244:
URL: https://github.com/apache/ozone/pull/5244#issuecomment-1711113346

   > Nit: "Investigate ..." sounds like a task, not a code change. Please update the PR title to describe the change itself.
   
   Done.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] devmadhuu commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "devmadhuu (via GitHub)" <gi...@apache.org>.
devmadhuu commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1316832592


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1540,8 +1542,14 @@ public List<OzoneFileStatus> listStatus(OmKeyArgs args, boolean recursive,
     countEntries = 0;
     // Convert results in cacheKeyMap to List
     for (OzoneFileStatus fileStatus : cacheKeyMap.values()) {
-      // No need to check if a key is deleted or not here, this is handled
-      // when adding entries to cacheKeyMap from DB.
+      // Here need to check if a key is deleted as cacheKeyMap will contain
+      // deleted entries as well. Adding deleted entries in cacheKeyMap is done
+      // as there is a possible race condition where table cache iterator is
+      // flushed already and isKeyDeleted check may not work as expected

Review Comment:
   @sadanand48  Thanks for your review. Pls check the [changes](https://github.com/apache/ozone/pull/5244/files#diff-bde0dade7dd5ddda419499f4f999d25d40fcec1412e0ce809c36ffd1be473f22R1430-R1431) in commit where we do not want to skip deleted keys, `getIteratorForKeyInTableCache` methods gets called by `org.apache.hadoop.ozone.om.KeyManagerImpl#listStatus(org.apache.hadoop.ozone.om.helpers.OmKeyArgs, boolean, java.lang.String, long, java.lang.String, boolean)` where we call cacheIterator on keyTable and then populate cacheKeyMap in `listStatusFindKeyInTableCache` with non-deleted keys and that populated cacheKeyMap is being used in `findKeyInDbWithIterator` method for wrapping the OmKeyInfo object in OzoneFileStatus by having a check if key is not deleted. Now here `isKeyDeleted` was used to check if key is deleted or not. This code can have a race condition where keyTable cacheValue may itself may be null as none of these methods are synchronized as a whole.  Below `omKeyInfoCacheValu
 e` may be null.
   ```
   CacheValue<OmKeyInfo> omKeyInfoCacheValue
           = keyTable.getCacheValue(new CacheKey(key));
   ```
   So in `listStatusFindKeyInTableCache` method itself when iterating cacheIter , we populate cacheKeyMap with deleted keys also.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #5244: HDDS-9233. listStatus() API is not correctly iterating cache.

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on PR #5244:
URL: https://github.com/apache/ozone/pull/5244#issuecomment-1712550337

   Thanks @devmadhuu for the patch, @sadanand48, @smengcl, @sumitagrawl for the review.  Merged, since this improves test results significantly.  Let's address any further comments in a follow-up.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] devmadhuu commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "devmadhuu (via GitHub)" <gi...@apache.org>.
devmadhuu commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1317984584


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1540,8 +1542,14 @@ public List<OzoneFileStatus> listStatus(OmKeyArgs args, boolean recursive,
     countEntries = 0;
     // Convert results in cacheKeyMap to List
     for (OzoneFileStatus fileStatus : cacheKeyMap.values()) {
-      // No need to check if a key is deleted or not here, this is handled
-      // when adding entries to cacheKeyMap from DB.
+      // Here need to check if a key is deleted as cacheKeyMap will contain
+      // deleted entries as well. Adding deleted entries in cacheKeyMap is done
+      // as there is a possible race condition where table cache iterator is
+      // flushed already and isKeyDeleted check may not work as expected

Review Comment:
   Yes @sadanand48 , you are right about above method call issue as well, however this has already been identified and being fixed in other PR where getFileStatus call flow is there 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] devmadhuu commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "devmadhuu (via GitHub)" <gi...@apache.org>.
devmadhuu commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1318232208


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneListStatusHelper.java:
##########
@@ -215,7 +215,11 @@ public Collection<OzoneFileStatus> listStatusFSO(OmKeyArgs args,
         HeapEntry entry = heapIterator.next();
         OzoneFileStatus status = entry.getStatus(prefixKey,
             scmBlockSize, volumeName, bucketName, replication);
-        map.putIfAbsent(entry.key, status);
+        // Since putIfAbsent doesn't work as expected in case of null value,
+        // so had to explicitly check using containsKey
+        if (!map.containsKey(entry.key)) {
+          map.putIfAbsent(entry.key, status);

Review Comment:
   > is`putIfAbsent` needed when we already have `containsKey` ? We can use map.put instead.
   
   Ok. done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai merged pull request #5244: HDDS-9233. listStatus() API is not correctly iterating cache.

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai merged PR #5244:
URL: https://github.com/apache/ozone/pull/5244


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1318110254


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneListStatusHelper.java:
##########
@@ -215,7 +215,11 @@ public Collection<OzoneFileStatus> listStatusFSO(OmKeyArgs args,
         HeapEntry entry = heapIterator.next();
         OzoneFileStatus status = entry.getStatus(prefixKey,
             scmBlockSize, volumeName, bucketName, replication);
-        map.putIfAbsent(entry.key, status);
+        // Since putIfAbsent doesn't work as expected in case of null value,
+        // so had to explicitly check using containsKey
+        if (!map.containsKey(entry.key)) {
+          map.putIfAbsent(entry.key, status);

Review Comment:
   is` putIfAbsent` needed when  we already have `containsKey` ? We can use map.put instead.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #5244: HDDS-9233. listStatus() API is not correctly iterating cache.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1320367698


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneListStatusHelper.java:
##########
@@ -215,7 +215,11 @@ public Collection<OzoneFileStatus> listStatusFSO(OmKeyArgs args,
         HeapEntry entry = heapIterator.next();
         OzoneFileStatus status = entry.getStatus(prefixKey,
             scmBlockSize, volumeName, bucketName, replication);
-        map.putIfAbsent(entry.key, status);
+        // Since putIfAbsent doesn't work as expected in case of null value,
+        // so had to explicitly check using containsKey

Review Comment:
   Improve comment readability
   
   ```suggestion
           // Caution: DO NOT use putIfAbsent. putIfAbsent undesirably overwrites
           // the value with `status` when the existing value in the map is null.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] devmadhuu commented on a diff in pull request #5244: HDDS-9233. listStatus() API is not correctly iterating cache.

Posted by "devmadhuu (via GitHub)" <gi...@apache.org>.
devmadhuu commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1319393052


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1426,7 +1426,9 @@ private void listStatusFindKeyInTableCache(
         }
         OzoneFileStatus fileStatus = new OzoneFileStatus(
             cacheOmKeyInfo, scmBlockSize, !OzoneFSUtils.isFile(cacheKey));
-        cacheKeyMap.put(cacheKey, fileStatus);
+        cacheKeyMap.putIfAbsent(cacheKey, fileStatus);
+      } else if (cacheOmKeyInfo == null) {

Review Comment:
   @sadanand48 As discussed, I am keeping all deleted key entries in cacheKeyMap and not going to filter based on startKey due to unpredictable nature of cache flush race condition, and because of filtering deleted entries from cache based on startKey, `org.apache.hadoop.ozone.om.TestKeyManagerImpl#testListStatusWithTableCacheRecursive` were failing pagination test consistently.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1317105641


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1426,7 +1426,9 @@ private void listStatusFindKeyInTableCache(
         }
         OzoneFileStatus fileStatus = new OzoneFileStatus(
             cacheOmKeyInfo, scmBlockSize, !OzoneFSUtils.isFile(cacheKey));
-        cacheKeyMap.put(cacheKey, fileStatus);
+        cacheKeyMap.putIfAbsent(cacheKey, fileStatus);
+      } else if (cacheOmKeyInfo == null) {

Review Comment:
   ``` java
   else if (cacheOmKeyInfo == null &&   && cacheKey.startsWith(startCacheKey)
             && cacheKey.compareTo(startCacheKey) >= 0) ) {
   ```
   I think we should also check the key prefix as cacheKeyMap is only constructed for key with given prefix.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] devmadhuu commented on pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "devmadhuu (via GitHub)" <gi...@apache.org>.
devmadhuu commented on PR #5244:
URL: https://github.com/apache/ozone/pull/5244#issuecomment-1705144241

   @smengcl @sumitagrawl Pls review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1315954826


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1540,8 +1542,14 @@ public List<OzoneFileStatus> listStatus(OmKeyArgs args, boolean recursive,
     countEntries = 0;
     // Convert results in cacheKeyMap to List
     for (OzoneFileStatus fileStatus : cacheKeyMap.values()) {
-      // No need to check if a key is deleted or not here, this is handled
-      // when adding entries to cacheKeyMap from DB.
+      // Here need to check if a key is deleted as cacheKeyMap will contain
+      // deleted entries as well. Adding deleted entries in cacheKeyMap is done
+      // as there is a possible race condition where table cache iterator is
+      // flushed already and isKeyDeleted check may not work as expected

Review Comment:
   Could you please outline the exact sequence of the suspected race condition here/description.
   1. If cache is flushed after obtaining the iterator reference of the table that has the deleted key , then how will the value be null , It should be the value of the deleted key right?
   2. Also even if OmKeyInfo is null, the fileStatus won't be null as we are always constructing a new `OzoneFileStatus` Object while adding to the cacheKeyMap, It should be fileStatus.getOmKeyInfo that should be null
   ` cacheKeyMap.put(entryInDb,
                         new OzoneFileStatus(omKeyInfo, 0, true));`
   
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1315954826


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1540,8 +1542,14 @@ public List<OzoneFileStatus> listStatus(OmKeyArgs args, boolean recursive,
     countEntries = 0;
     // Convert results in cacheKeyMap to List
     for (OzoneFileStatus fileStatus : cacheKeyMap.values()) {
-      // No need to check if a key is deleted or not here, this is handled
-      // when adding entries to cacheKeyMap from DB.
+      // Here need to check if a key is deleted as cacheKeyMap will contain
+      // deleted entries as well. Adding deleted entries in cacheKeyMap is done
+      // as there is a possible race condition where table cache iterator is
+      // flushed already and isKeyDeleted check may not work as expected

Review Comment:
   Could you please outline the exact sequence of the suspected race condition here/description.
   1. If cache is flushed after obtaining the iterator reference of the table that has the deleted key , then how will the value be null , It should be the value of the deleted key right?
   
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1318086380


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1540,8 +1542,14 @@ public List<OzoneFileStatus> listStatus(OmKeyArgs args, boolean recursive,
     countEntries = 0;
     // Convert results in cacheKeyMap to List
     for (OzoneFileStatus fileStatus : cacheKeyMap.values()) {
-      // No need to check if a key is deleted or not here, this is handled
-      // when adding entries to cacheKeyMap from DB.
+      // Here need to check if a key is deleted as cacheKeyMap will contain
+      // deleted entries as well. Adding deleted entries in cacheKeyMap is done
+      // as there is a possible race condition where table cache iterator is
+      // flushed already and isKeyDeleted check may not work as expected

Review Comment:
   Cool thanks



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on PR #5244:
URL: https://github.com/apache/ozone/pull/5244#issuecomment-1709861613

   Test failure seems related:
   
   ```
   [INFO] Running org.apache.hadoop.ozone.om.TestKeyManagerImpl
   Error:  Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 16.742 s <<< FAILURE! - in org.apache.hadoop.ozone.om.TestKeyManagerImpl
   Error:  org.apache.hadoop.ozone.om.TestKeyManagerImpl.testListStatusWithDeletedEntriesInCache  Time elapsed: 0.122 s  <<< FAILURE!
   org.opentest4j.AssertionFailedError: expected: <[key-10, key-12, key-16, key-2, key-22, key-24, key-26, key-28, key-30, key-32, key-34, key-36, key-38, key-4, key-40, key-42, key-44, key-46, key-48, key-50, key-52, key-54, key-56, key-58, key-6, key-62, key-66, key-70, key-72, key-74, key-76, key-78, key-8, key-80, key-82, key-84, key-86, key-88, key-90, key-92, key-94, key-96, key-98]> but was: <[key-10, key-12, key-16, key-2, key-22, key-26, key-30, key-34, key-38, key-40, key-44, key-48, key-52, key-56, key-6, key-62, key-66, key-70, key-74, key-78, key-80, key-84, key-88, key-92, key-96]>
   	at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
   	at org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
   	at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182)
   	at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177)
   	at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1141)
   	at org.apache.hadoop.ozone.om.TestKeyManagerImpl.testListStatusWithDeletedEntriesInCache(TestKeyManagerImpl.java:1180)
   ```
   
   https://github.com/devmadhuu/ozone/actions/runs/6107000532/job/16573141510#step:5:3771


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #5244: HDDS-9233. listStatus() API is not correctly iterating cache.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1320370450


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1626,14 +1635,18 @@ private void findKeyInDbWithIterator(boolean recursive, String startKey,
                 .getImmediateChild(entryKeyName, keyName);
             boolean isFile = OzoneFSUtils.isFile(immediateChild);
             if (isFile) {
-              if (!isKeyDeleted(entryInDb, keyTable)) {
+              // Since putIfAbsent doesn't work as expected in case of null
+              // value, so had to explicitly check using containsKey

Review Comment:
   The comment doesn't apply here either



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1611,9 +1619,10 @@ private void findKeyInDbWithIterator(boolean recursive, String startKey,
           String entryKeyName = omKeyInfo.getKeyName();
           if (recursive) {
             // for recursive list all the entries
-
-            if (!isKeyDeleted(entryInDb, keyTable)) {
-              cacheKeyMap.putIfAbsent(entryInDb, new OzoneFileStatus(omKeyInfo,
+            // Since putIfAbsent doesn't work as expected in case of null value,
+            // so had to explicitly check using containsKey

Review Comment:
   Same as https://github.com/apache/ozone/pull/5244/files#r1320367698



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #5244: HDDS-9233. listStatus() API is not correctly iterating cache.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1320374690


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1426,7 +1426,9 @@ private void listStatusFindKeyInTableCache(
         }
         OzoneFileStatus fileStatus = new OzoneFileStatus(
             cacheOmKeyInfo, scmBlockSize, !OzoneFSUtils.isFile(cacheKey));
-        cacheKeyMap.put(cacheKey, fileStatus);
+        cacheKeyMap.putIfAbsent(cacheKey, fileStatus);
+      } else if (cacheOmKeyInfo == null && !cacheKeyMap.containsKey(cacheKey)) {
+        cacheKeyMap.put(cacheKey, null);
       }

Review Comment:
   Would you add some comments explaining the change in this block as well?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #5244: HDDS-9233. listStatus() API is not correctly iterating cache.

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on PR #5244:
URL: https://github.com/apache/ozone/pull/5244#issuecomment-1712549966

   Full _integration (filesystem)_ test also [passed](https://github.com/adoroszlai/hadoop-ozone/actions/runs/6130990577) 10x.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] devmadhuu commented on pull request #5244: HDDS-9233. listStatus() API is not correctly iterating cache.

Posted by "devmadhuu (via GitHub)" <gi...@apache.org>.
devmadhuu commented on PR #5244:
URL: https://github.com/apache/ozone/pull/5244#issuecomment-1711113535

   > Test failure seems related:
   > 
   > ```
   > [INFO] Running org.apache.hadoop.ozone.om.TestKeyManagerImpl
   > Error:  Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 16.742 s <<< FAILURE! - in org.apache.hadoop.ozone.om.TestKeyManagerImpl
   > Error:  org.apache.hadoop.ozone.om.TestKeyManagerImpl.testListStatusWithDeletedEntriesInCache  Time elapsed: 0.122 s  <<< FAILURE!
   > org.opentest4j.AssertionFailedError: expected: <[key-10, key-12, key-16, key-2, key-22, key-24, key-26, key-28, key-30, key-32, key-34, key-36, key-38, key-4, key-40, key-42, key-44, key-46, key-48, key-50, key-52, key-54, key-56, key-58, key-6, key-62, key-66, key-70, key-72, key-74, key-76, key-78, key-8, key-80, key-82, key-84, key-86, key-88, key-90, key-92, key-94, key-96, key-98]> but was: <[key-10, key-12, key-16, key-2, key-22, key-26, key-30, key-34, key-38, key-40, key-44, key-48, key-52, key-56, key-6, key-62, key-66, key-70, key-74, key-78, key-80, key-84, key-88, key-92, key-96]>
   > 	at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
   > 	at org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
   > 	at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182)
   > 	at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177)
   > 	at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1141)
   > 	at org.apache.hadoop.ozone.om.TestKeyManagerImpl.testListStatusWithDeletedEntriesInCache(TestKeyManagerImpl.java:1180)
   > ```
   > 
   > https://github.com/devmadhuu/ozone/actions/runs/6107000532/job/16573141510#step:5:3771
   
   This is fixed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] devmadhuu commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "devmadhuu (via GitHub)" <gi...@apache.org>.
devmadhuu commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1318213731


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1389,11 +1389,14 @@ public void refresh(OmKeyInfo key) throws IOException {
     refreshPipeline(Arrays.asList(key));
   }
 
-  public static boolean isKeyDeleted(String key, Table keyTable) {
+  public static boolean isKeyDeleted(String key, Table keyTable)

Review Comment:
   > The purpose of method is only check in cache, not in table. And as per caller, its checking in cache while iterate the table keys.
   
   I have reverted the change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1317101959


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1540,8 +1542,14 @@ public List<OzoneFileStatus> listStatus(OmKeyArgs args, boolean recursive,
     countEntries = 0;
     // Convert results in cacheKeyMap to List
     for (OzoneFileStatus fileStatus : cacheKeyMap.values()) {
-      // No need to check if a key is deleted or not here, this is handled
-      // when adding entries to cacheKeyMap from DB.
+      // Here need to check if a key is deleted as cacheKeyMap will contain
+      // deleted entries as well. Adding deleted entries in cacheKeyMap is done
+      // as there is a possible race condition where table cache iterator is
+      // flushed already and isKeyDeleted check may not work as expected

Review Comment:
   Thanks @devmadhuu for explaining, missed the part where you added the null value. makes sense to avoid the isKeyDeleted  check. We should check other usages of `isKeyDeleted` if it causes similar issues elsewhere.
   
   Maybe the correct way to check isKeyDeleted would be to check the existence of the key in table if no value is present in cache.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] devmadhuu commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "devmadhuu (via GitHub)" <gi...@apache.org>.
devmadhuu commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1317984584


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1540,8 +1542,14 @@ public List<OzoneFileStatus> listStatus(OmKeyArgs args, boolean recursive,
     countEntries = 0;
     // Convert results in cacheKeyMap to List
     for (OzoneFileStatus fileStatus : cacheKeyMap.values()) {
-      // No need to check if a key is deleted or not here, this is handled
-      // when adding entries to cacheKeyMap from DB.
+      // Here need to check if a key is deleted as cacheKeyMap will contain
+      // deleted entries as well. Adding deleted entries in cacheKeyMap is done
+      // as there is a possible race condition where table cache iterator is
+      // flushed already and isKeyDeleted check may not work as expected

Review Comment:
   Yes @sadanand48 , you are right about above method call issue as well, however this has already been identified and being fixed in other [PR](https://github.com/apache/ozone/pull/5252) where getFileStatus call flow is there , so will see the usage of `isKeyDeleted` at other places once this and #5252 PR gets merged. If `isKeyDeleted` usage is incorrect, we are fixing using this PR and #5252  PR and later will remove this method if not being used.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "adoroszlai (via GitHub)" <gi...@apache.org>.
adoroszlai commented on PR #5244:
URL: https://github.com/apache/ozone/pull/5244#issuecomment-1709673464

   Nit: "Investigate ..." sounds like a task, not a code change.  Please update the PR title to describe the change itself.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] devmadhuu commented on a diff in pull request #5244: HDDS-9233. listStatus() API is not correctly iterating cache.

Posted by "devmadhuu (via GitHub)" <gi...@apache.org>.
devmadhuu commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1320490252


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneListStatusHelper.java:
##########
@@ -215,7 +215,11 @@ public Collection<OzoneFileStatus> listStatusFSO(OmKeyArgs args,
         HeapEntry entry = heapIterator.next();
         OzoneFileStatus status = entry.getStatus(prefixKey,
             scmBlockSize, volumeName, bucketName, replication);
-        map.putIfAbsent(entry.key, status);
+        // Since putIfAbsent doesn't work as expected in case of null value,
+        // so had to explicitly check using containsKey

Review Comment:
   Ok sure. Done.



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1626,14 +1635,18 @@ private void findKeyInDbWithIterator(boolean recursive, String startKey,
                 .getImmediateChild(entryKeyName, keyName);
             boolean isFile = OzoneFSUtils.isFile(immediateChild);
             if (isFile) {
-              if (!isKeyDeleted(entryInDb, keyTable)) {
+              // Since putIfAbsent doesn't work as expected in case of null
+              // value, so had to explicitly check using containsKey
+              if (!cacheKeyMap.containsKey(entryInDb)) {
                 cacheKeyMap.put(entryInDb,
                     new OzoneFileStatus(omKeyInfo, scmBlockSize, !isFile));
                 countEntries++;
               }
             } else {
               // if entry is a directory
-              if (!isKeyDeleted(entryInDb, keyTable)) {
+              // Since putIfAbsent doesn't work as expected in case of null
+              // value, so had to explicitly check using containsKey

Review Comment:
   Yes. Removed.



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1626,14 +1635,18 @@ private void findKeyInDbWithIterator(boolean recursive, String startKey,
                 .getImmediateChild(entryKeyName, keyName);
             boolean isFile = OzoneFSUtils.isFile(immediateChild);
             if (isFile) {
-              if (!isKeyDeleted(entryInDb, keyTable)) {
+              // Since putIfAbsent doesn't work as expected in case of null
+              // value, so had to explicitly check using containsKey

Review Comment:
   Yes. Removed.



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1611,9 +1619,10 @@ private void findKeyInDbWithIterator(boolean recursive, String startKey,
           String entryKeyName = omKeyInfo.getKeyName();
           if (recursive) {
             // for recursive list all the entries
-
-            if (!isKeyDeleted(entryInDb, keyTable)) {
-              cacheKeyMap.putIfAbsent(entryInDb, new OzoneFileStatus(omKeyInfo,
+            // Since putIfAbsent doesn't work as expected in case of null value,
+            // so had to explicitly check using containsKey

Review Comment:
   Yes. Removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1317464774


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1540,8 +1542,14 @@ public List<OzoneFileStatus> listStatus(OmKeyArgs args, boolean recursive,
     countEntries = 0;
     // Convert results in cacheKeyMap to List
     for (OzoneFileStatus fileStatus : cacheKeyMap.values()) {
-      // No need to check if a key is deleted or not here, this is handled
-      // when adding entries to cacheKeyMap from DB.
+      // Here need to check if a key is deleted as cacheKeyMap will contain
+      // deleted entries as well. Adding deleted entries in cacheKeyMap is done
+      // as there is a possible race condition where table cache iterator is
+      // flushed already and isKeyDeleted check may not work as expected

Review Comment:
   If the problem about cache flush after obtaining the iterator reference of the table is valid (which is the race condition mentioned here in the PR) , I can see this call as problematic.
   
   Let's take example of the caller `createFakeDirIfShould` , it returns a fake dir instance to getFileStatus call if the key exists and if intermediate paths don't. Here it decides whether it should create a fake dir or not based on whether a key beyond the prefix exists in the table. If it exists, we also check if it's deleted based on isKeyDeleted check. Now if it is deleted, we will find the value in cache and fake dir is not created and getfileStatus will return null. But if there is a cache flush, then we won't find any entry in cache and this check will return false and create a fake dir but the key itself doesn't exist.  This can cause getFileStatus to be flaky. If we are not fixing this method, we should probably evaluate its usage and address problems it can cause.
   ```java
    if (key.startsWith(targetKey)) {
               if (!Objects.equals(key, targetKey)
                   && !isKeyDeleted(key, keyTable)) {
                 LOG.debug("Fake dir {} required for {}", targetKey, key);
                 return createDirectoryKey(keyValue.getValue(), dirKey);
               }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1315954826


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1540,8 +1542,14 @@ public List<OzoneFileStatus> listStatus(OmKeyArgs args, boolean recursive,
     countEntries = 0;
     // Convert results in cacheKeyMap to List
     for (OzoneFileStatus fileStatus : cacheKeyMap.values()) {
-      // No need to check if a key is deleted or not here, this is handled
-      // when adding entries to cacheKeyMap from DB.
+      // Here need to check if a key is deleted as cacheKeyMap will contain
+      // deleted entries as well. Adding deleted entries in cacheKeyMap is done
+      // as there is a possible race condition where table cache iterator is
+      // flushed already and isKeyDeleted check may not work as expected

Review Comment:
   Could you please outline the exact sequence of the suspected race condition here/description.
   1. If cache is flushed after obtaining the iterator reference of the table that has the deleted key , then how will the value be null , It should be the value of the deleted key right?
   2.Also even if OmKeyInfo is null, the fileStatus won't be null as we are always constructing a new `OzoneFilesystem` Object while adding to the cacheKeyMap, It should be fileStatus.getOmKeyInfo that should be null
   ` cacheKeyMap.put(entryInDb,
                         new OzoneFileStatus(omKeyInfo, 0, true));`
   
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sumitagrawl commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "sumitagrawl (via GitHub)" <gi...@apache.org>.
sumitagrawl commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1317234437


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1389,11 +1389,14 @@ public void refresh(OmKeyInfo key) throws IOException {
     refreshPipeline(Arrays.asList(key));
   }
 
-  public static boolean isKeyDeleted(String key, Table keyTable) {
+  public static boolean isKeyDeleted(String key, Table keyTable)

Review Comment:
   The purpose of method is only check in cache,  not in table. And as per caller, its checking in cache while iterate the table keys.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1315954826


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1540,8 +1542,14 @@ public List<OzoneFileStatus> listStatus(OmKeyArgs args, boolean recursive,
     countEntries = 0;
     // Convert results in cacheKeyMap to List
     for (OzoneFileStatus fileStatus : cacheKeyMap.values()) {
-      // No need to check if a key is deleted or not here, this is handled
-      // when adding entries to cacheKeyMap from DB.
+      // Here need to check if a key is deleted as cacheKeyMap will contain
+      // deleted entries as well. Adding deleted entries in cacheKeyMap is done
+      // as there is a possible race condition where table cache iterator is
+      // flushed already and isKeyDeleted check may not work as expected

Review Comment:
   Could you please outline the exact sequence of the suspected race condition here/description.
   1. If cache is flushed after obtaining the iterator reference of the table that has the deleted key , then how will the value be null , It should be the value of the deleted key right?
   2.Also even if OmKeyInfo is null, the fileStatus won't be null as we are always constructing a new `OzoneFileStatus` Object while adding to the cacheKeyMap, It should be fileStatus.getOmKeyInfo that should be null
   ` cacheKeyMap.put(entryInDb,
                         new OzoneFileStatus(omKeyInfo, 0, true));`
   
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] devmadhuu commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "devmadhuu (via GitHub)" <gi...@apache.org>.
devmadhuu commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1317177251


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1540,8 +1542,14 @@ public List<OzoneFileStatus> listStatus(OmKeyArgs args, boolean recursive,
     countEntries = 0;
     // Convert results in cacheKeyMap to List
     for (OzoneFileStatus fileStatus : cacheKeyMap.values()) {
-      // No need to check if a key is deleted or not here, this is handled
-      // when adding entries to cacheKeyMap from DB.
+      // Here need to check if a key is deleted as cacheKeyMap will contain
+      // deleted entries as well. Adding deleted entries in cacheKeyMap is done
+      // as there is a possible race condition where table cache iterator is
+      // flushed already and isKeyDeleted check may not work as expected

Review Comment:
   > Thanks @devmadhuu for explaining, missed the part where you added the null value. makes sense to avoid the isKeyDeleted check. We should check other usages of `isKeyDeleted` if it causes similar issues elsewhere.
   > 
   > Maybe the correct way to check isKeyDeleted would be to check the existence of the key in table if no value is present in cache.
   
   Thanks @sadanand48 , I have handled the check inside isKeyDeleted method to check the existence of the key in table if no value is present in cache. Few other methods are calling this method.



##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1426,7 +1426,9 @@ private void listStatusFindKeyInTableCache(
         }
         OzoneFileStatus fileStatus = new OzoneFileStatus(
             cacheOmKeyInfo, scmBlockSize, !OzoneFSUtils.isFile(cacheKey));
-        cacheKeyMap.put(cacheKey, fileStatus);
+        cacheKeyMap.putIfAbsent(cacheKey, fileStatus);
+      } else if (cacheOmKeyInfo == null) {

Review Comment:
   > ```java
   > else if (cacheOmKeyInfo == null &&   && cacheKey.startsWith(startCacheKey)
   >           && cacheKey.compareTo(startCacheKey) >= 0) ) {
   > ```
   > 
   > I think we should also check the key prefix as cacheKeyMap is only constructed for key with given prefix.
   
   Thanks @sadanand48 . I have handled the suggested change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] devmadhuu commented on a diff in pull request #5244: HDDS-9233. Investigate whether listStatus() is correctly iterating cache.

Posted by "devmadhuu (via GitHub)" <gi...@apache.org>.
devmadhuu commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1317984584


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1540,8 +1542,14 @@ public List<OzoneFileStatus> listStatus(OmKeyArgs args, boolean recursive,
     countEntries = 0;
     // Convert results in cacheKeyMap to List
     for (OzoneFileStatus fileStatus : cacheKeyMap.values()) {
-      // No need to check if a key is deleted or not here, this is handled
-      // when adding entries to cacheKeyMap from DB.
+      // Here need to check if a key is deleted as cacheKeyMap will contain
+      // deleted entries as well. Adding deleted entries in cacheKeyMap is done
+      // as there is a possible race condition where table cache iterator is
+      // flushed already and isKeyDeleted check may not work as expected

Review Comment:
   Yes @sadanand48 , you are right about above method call issue as well, however this has already been identified and being fixed in other [PR](https://github.com/apache/ozone/pull/5252) where getFileStatus call flow is there 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sadanand48 commented on a diff in pull request #5244: HDDS-9233. listStatus() API is not correctly iterating cache.

Posted by "sadanand48 (via GitHub)" <gi...@apache.org>.
sadanand48 commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1319493345


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1426,7 +1426,9 @@ private void listStatusFindKeyInTableCache(
         }
         OzoneFileStatus fileStatus = new OzoneFileStatus(
             cacheOmKeyInfo, scmBlockSize, !OzoneFSUtils.isFile(cacheKey));
-        cacheKeyMap.put(cacheKey, fileStatus);
+        cacheKeyMap.putIfAbsent(cacheKey, fileStatus);
+      } else if (cacheOmKeyInfo == null) {

Review Comment:
   Sure , looks like we need to keep track of all deleted keys after the startKey but there is no way to seek in the map. Anyways we can still load all deleted keys as it  is just a cache and is limited by cache size. Thanks for looking into it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] devmadhuu commented on a diff in pull request #5244: HDDS-9233. listStatus() API is not correctly iterating cache.

Posted by "devmadhuu (via GitHub)" <gi...@apache.org>.
devmadhuu commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1320490311


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1426,7 +1426,9 @@ private void listStatusFindKeyInTableCache(
         }
         OzoneFileStatus fileStatus = new OzoneFileStatus(
             cacheOmKeyInfo, scmBlockSize, !OzoneFSUtils.isFile(cacheKey));
-        cacheKeyMap.put(cacheKey, fileStatus);
+        cacheKeyMap.putIfAbsent(cacheKey, fileStatus);
+      } else if (cacheOmKeyInfo == null && !cacheKeyMap.containsKey(cacheKey)) {
+        cacheKeyMap.put(cacheKey, null);
       }

Review Comment:
   Ok sure. Added the comments for reason of change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] smengcl commented on a diff in pull request #5244: HDDS-9233. listStatus() API is not correctly iterating cache.

Posted by "smengcl (via GitHub)" <gi...@apache.org>.
smengcl commented on code in PR #5244:
URL: https://github.com/apache/ozone/pull/5244#discussion_r1320369516


##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -1626,14 +1635,18 @@ private void findKeyInDbWithIterator(boolean recursive, String startKey,
                 .getImmediateChild(entryKeyName, keyName);
             boolean isFile = OzoneFSUtils.isFile(immediateChild);
             if (isFile) {
-              if (!isKeyDeleted(entryInDb, keyTable)) {
+              // Since putIfAbsent doesn't work as expected in case of null
+              // value, so had to explicitly check using containsKey
+              if (!cacheKeyMap.containsKey(entryInDb)) {
                 cacheKeyMap.put(entryInDb,
                     new OzoneFileStatus(omKeyInfo, scmBlockSize, !isFile));
                 countEntries++;
               }
             } else {
               // if entry is a directory
-              if (!isKeyDeleted(entryInDb, keyTable)) {
+              // Since putIfAbsent doesn't work as expected in case of null
+              // value, so had to explicitly check using containsKey

Review Comment:
   The comment doesn't apply here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org