Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2022/01/26 13:58:45 UTC

[GitHub] [hadoop] sodonnel commented on pull request #3928: HDFS-16438. Avoid holding read locks for a long time when scanDatanodeStorage

sodonnel commented on pull request #3928:
URL: https://github.com/apache/hadoop/pull/3928#issuecomment-1022221913


   I am not sure it is safe to drop the read lock while iterating over the blocks in a StorageInfo, as the blocks could change while the lock is not held.
   
   Looking at the code in DatanodeStorageInfo, we can add a block to the list, and additions always go to the head of the list. The worst case here is that the iterator skips a newly added block, but it would be caught by the rescan at the end, so that is not a big problem.
   
   However, we can also remove a block. Let's say we stop iterating at blockX and drop the lock.
   
   Then an IBR (incremental block report) comes in and gets processed, removing that block from the storage.
   
   The iterator will be holding a reference to that BlockInfo object as `current`, and it is expected to call next on it to get the next BlockInfo.
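   For reference, the storage's block iterator advances by reading the next pointer out of `current` itself. A simplified sketch of that shape (not the exact DatanodeStorageInfo.BlockIterator source, just an illustration using the same getNext / findStorageInfo calls shown below):
   
   ```
   // Simplified sketch only - the real iterator lives in DatanodeStorageInfo.
   // The point is that the next element is read from the node we are
   // currently parked on, via its triplets.
   class BlockIteratorSketch implements java.util.Iterator<BlockInfo> {
     private final DatanodeStorageInfo storage;
     private BlockInfo current;

     BlockIteratorSketch(DatanodeStorageInfo storage, BlockInfo head) {
       this.storage = storage;
       this.current = head;
     }

     @Override
     public boolean hasNext() {
       return current != null;
     }

     @Override
     public BlockInfo next() {
       BlockInfo res = current;
       // If `current` was unlinked while the lock was dropped, its next
       // pointer is already null, so iteration silently ends here.
       current = current.getNext(current.findStorageInfo(storage));
       return res;
     }
   }
   ```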
   
   The remove block code in BlockInfo.java looks like:
   
   ```  
   BlockInfo listRemove(BlockInfo head, DatanodeStorageInfo storage) {
       if (head == null) {
         return null;
       }
       int dnIndex = this.findStorageInfo(storage);
       if (dnIndex < 0) { // this block is not on the data-node list
         return head;
       }
   
       BlockInfo next = this.getNext(dnIndex);
       BlockInfo prev = this.getPrevious(dnIndex);
       this.setNext(dnIndex, null);
       this.setPrevious(dnIndex, null);
       if (prev != null) {
         prev.setNext(prev.findStorageInfo(storage), next);
       }
       if (next != null) {
         next.setPrevious(next.findStorageInfo(storage), prev);
       }
       if (this == head) { // removing the head
         head = next;
       }
       return head;
     }
   ```
   
   The removed block basically unlinks itself from the linked list and sets its own prev and next pointers to null. Then the iterator calls `getNext` on it:
   
   ```
     BlockInfo getNext(int index) {
       assert this.triplets != null : "BlockInfo is not initialized";
       assert index >= 0 && index * 3 + 2 < triplets.length : "Index is out of bound";
       BlockInfo info = (BlockInfo)triplets[index * 3 + 2];
       assert info == null || info.getClass().getName().startsWith(
           BlockInfo.class.getName()) :
           "BlockInfo is expected at " + (index * 3 + 2);
       return info;
     }
   ```
   
   In the case where the current BlockInfo has a null next pointer, the iterator will get null, think it has reached the end of the list, and wrongly exit early.
   
   This would be a rare bug to hit, as you would need to drop the lock and then have the next block to be consumed removed from the list, but it is one of those things that will happen from time to time and be very hard to explain.
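   To make the failure mode concrete, here is a small standalone sketch (a toy doubly linked node class, not the real BlockInfo/triplets code) showing how unlinking the node the iterator is parked on truncates the scan:
   
   ```
   import java.util.ArrayList;
   import java.util.List;

   public class DroppedLockIterationDemo {
     // Toy stand-in for BlockInfo's per-storage links (not the real triplets).
     static class Node {
       final String name;
       Node prev, next;
       Node(String name) { this.name = name; }
     }

     // Same shape as listRemove: relink the neighbours, then null our own links.
     static Node remove(Node head, Node n) {
       if (n.prev != null) { n.prev.next = n.next; }
       if (n.next != null) { n.next.prev = n.prev; }
       Node newHead = (n == head) ? n.next : head;
       n.prev = null;
       n.next = null;
       return newHead;
     }

     public static void main(String[] args) {
       // Build the list A -> B -> C -> D.
       Node a = new Node("A"), b = new Node("B"), c = new Node("C"), d = new Node("D");
       a.next = b; b.prev = a; b.next = c; c.prev = b; c.next = d; d.prev = c;
       Node head = a;

       List<String> seen = new ArrayList<>();
       Node current = head;
       seen.add(current.name);   // visit A
       current = current.next;   // iterator parked on B; "lock dropped" here

       // While the lock is dropped, an IBR removes B from the storage's list.
       head = remove(head, current);

       // Resume: B's next is now null, so the scan ends after B,
       // silently skipping C and D.
       while (current != null) {
         seen.add(current.name);
         current = current.next;
       }
       System.out.println(seen);  // [A, B] rather than [A, B, C, D]
     }
   }
   ```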
   
   5M blocks on a single storage is a lot. Dropping and re-locking at the storage level was supposed to handle DNs with millions of blocks in total, with each individual storage only holding a small number.
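   For context, by "dropping and re-locking at the storage level" I mean a pattern roughly like the sketch below (illustrative only; `readLock()`/`readUnlock()`, `getStorageInfos()` and `getBlockIterator()` stand in for whatever the real scan code calls):
   
   ```
   // Rough sketch of per-storage lock cycling, not the actual scan code.
   for (DatanodeStorageInfo storage : datanode.getStorageInfos()) {
     namesystem.readLock();
     try {
       Iterator<BlockInfo> it = storage.getBlockIterator();
       while (it.hasNext()) {
         process(it.next());   // hypothetical per-block check
       }
     } finally {
       namesystem.readUnlock();
     }
     // The lock is released between storages, so other operations get a
     // chance to run. But if one storage alone holds ~5M blocks, the inner
     // loop still pins the read lock for that whole scan, which would
     // explain the pause being seen.
   }
   ```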
   
   Does this pause occur once per storage at the start and end of decommission for each node?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


