You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/10/14 03:02:24 UTC

[GitHub] [ozone] JacksonYao287 commented on a diff in pull request #3811: HDDS-7300. Race condition between full data scan and block deletion

JacksonYao287 commented on code in PR #3811:
URL: https://github.com/apache/ozone/pull/3811#discussion_r995294754


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainerCheck.java:
##########
@@ -230,44 +232,98 @@ private void scanData(DataTransferThrottler throttler, Canceler canceler)
 
     onDiskContainerData.setDbFile(dbFile);
 
-    ContainerLayoutVersion layout = onDiskContainerData.getLayoutVersion();
-
     try (DBHandle db = BlockUtils.getDB(onDiskContainerData, checkConfig);
         BlockIterator<BlockData> kvIter = db.getStore().getBlockIterator(
             onDiskContainerData.getContainerID(),
             onDiskContainerData.getUnprefixedKeyFilter())) {
 
       while (kvIter.hasNext()) {
         BlockData block = kvIter.nextBlock();
-        for (ContainerProtos.ChunkInfo chunk : block.getChunks()) {
-          File chunkFile = layout.getChunkFile(onDiskContainerData,
-              block.getBlockID(), ChunkInfo.getFromProtoBuf(chunk));
-
-          if (!chunkFile.exists()) {
-            // concurrent mutation in Block DB? lookup the block again.
-            String blockKey =
-                onDiskContainerData.blockKey(block.getBlockID().getLocalID());
-            BlockData bdata = db.getStore()
-                    .getBlockDataTable()
-                    .get(blockKey);
-            // In EC, client may write empty putBlock in padding block nodes.
-            // So, we need to make sure, chunk length > 0, before declaring
-            // the missing chunk file.
-            if (bdata != null && bdata.getChunks().size() > 0 && bdata
-                .getChunks().get(0).getLen() > 0) {
-              throw new IOException(
-                  "Missing chunk file " + chunkFile.getAbsolutePath());
+
+        // If holding read lock for the entire duration, including wait() calls
+        // in DataTransferThrottler, would effectively make other threads
+        // throttled.
+        // Here try optimistically and retry with the container lock to
+        // make sure reading the latest record. If the record is just removed,
+        // the block should be skipped to scan.
+        try {
+          scanBlock(block, throttler, canceler);
+        } catch (IOException ex) {

Review Comment:
   there are two case we can catch IOException here, one is chunkFile not found , the other is verifyChecksum fail。so we need to check if ex is an instance of OzoneChecksumException. if yes, it means verifyChecksum error , we just throw it directly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org