Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2021/08/17 06:36:43 UTC

[GitHub] [ozone] szetszwo commented on a change in pull request #2538: HDDS-5619. Ozone data corruption issue on follower node.

szetszwo commented on a change in pull request #2538:
URL: https://github.com/apache/ozone/pull/2538#discussion_r690075053



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/ChunkUtils.java
##########
@@ -183,17 +194,45 @@ private static long writeDataToChannel(FileChannel channel, ChunkBuffer data,
   public static void readData(File file, ByteBuffer[] buffers,
       long offset, long len, HddsVolume volume)
       throws StorageContainerException {
+    readData(file, null, buffers, offset, len, volume);
+  }
+
+  public static void readData(File file, FileChannel channel,
+      ByteBuffer[] buffers, long offset, long len, HddsVolume volume)
+      throws StorageContainerException {
 
     final Path path = file.toPath();
     final long startTime = Time.monotonicNow();
     final long bytesRead;
+    final long endOffsetToRead = offset + len;
 
     try {
+      if (file.length() > 0) {
+        Preconditions.checkArgument(file.length() >= endOffsetToRead,
+            "file length should at least match the last offset to read");
+      }
       bytesRead = processFileExclusively(path, () -> {
-        try (FileChannel channel = open(path, READ_OPTIONS, NO_ATTRIBUTES);
-             FileLock ignored = channel.lock(offset, len, true)) {
+        // if the file handle is already cached, just use that for reads.
+        if (channel != null) {
+          FileLock lock = null;
+          try {
+            lock = channel.lock(offset, len, true);

Review comment:
      Similar to above, we may use try-with-resources here.
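  For illustration, the try-with-resources pattern for the lock can be sketched as a self-contained example (the class name, helper method, and temp-file setup below are made up for the sketch; only the `channel.lock(offset, len, true)` usage mirrors the code under review):
  ```java
  import java.io.IOException;
  import java.io.UncheckedIOException;
  import java.nio.ByteBuffer;
  import java.nio.channels.FileChannel;
  import java.nio.channels.FileLock;
  import java.nio.file.Files;
  import java.nio.file.Path;
  import java.nio.file.StandardOpenOption;

  public class LockedReadExample {
    // Reads into buffer at the given offset under a shared lock; the
    // lock is released automatically when the try block exits, even on
    // an exception, so no explicit releaseLock(..) is needed.
    static long readLocked(FileChannel channel, ByteBuffer buffer,
        long offset, long len) {
      try (FileLock ignored = channel.lock(offset, len, true)) {
        return channel.read(buffer, offset);
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }

    public static void main(String[] args) throws IOException {
      Path path = Files.createTempFile("chunk", ".bin");
      Files.write(path, new byte[] {1, 2, 3, 4, 5});
      try (FileChannel channel = FileChannel.open(path,
          StandardOpenOption.READ)) {
        ByteBuffer buf = ByteBuffer.allocate(3);
        // Positional read of 3 bytes starting at offset 1.
        System.out.println(readLocked(channel, buf, 1, 3));
      }
      Files.delete(path);
    }
  }
  ```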

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/ChunkUtils.java
##########
@@ -163,14 +161,27 @@ private static long writeDataToFile(File file, ChunkBuffer data,
 
   private static long writeDataToChannel(FileChannel channel, ChunkBuffer data,
       long offset) {
+    FileLock lock = null;
     try {
+      lock = channel.lock(offset, data.remaining(), false);

Review comment:
      We may use try-with-resources as below.  Then we can remove the releaseLock(..) method.
   ```
     private static long writeDataToChannel(FileChannel channel, ChunkBuffer data,
         long offset) {
       try(FileLock lock = channel.lock(offset, data.remaining(), false)) {
         channel.position(offset);
         return data.writeTo(channel);
       } catch (IOException e) {
         throw new UncheckedIOException(e);
       }
     }
   ```
   

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java
##########
@@ -648,25 +651,38 @@ private ByteString readStateMachineData(
       // readStateMachineData should only be called for "write" to Ratis.
       Preconditions.checkArgument(!HddsUtils.isReadOnly(requestProto));
       if (requestProto.getCmdType() == Type.WriteChunk) {
-        final CompletableFuture<ByteString> future = new CompletableFuture<>();
-        ByteString data = stateMachineDataCache.get(entry.getIndex());
+        final CompletableFuture<ByteString> readStateMachineFuture =
+            new CompletableFuture<>();
+        long index = entry.getIndex();
+        ByteString data = stateMachineDataCache.get(index);
         if (data != null) {
-          future.complete(data);
-          return future;
+          readStateMachineFuture.complete(data);
+          return readStateMachineFuture;
         }
 
-        CompletableFuture.supplyAsync(() -> {
-          try {
-            future.complete(
-                readStateMachineData(requestProto, entry.getTerm(),
-                    entry.getIndex()));
-          } catch (IOException e) {
-            metrics.incNumReadStateMachineFails();
-            future.completeExceptionally(e);
-          }
-          return future;
-        }, getChunkExecutor(requestProto.getWriteChunk()));
-        return future;
+        // Queue the readStateMachineData future behind the write chunk
+        // future for the same log index if one is in progress.
+        // Otherwise, issue the read chunk call on any chunk executor
+        // thread. The idea is to ensure that the read chunk only starts
+        // once the write chunk for the same log index finishes.
+        CompletableFuture<Message> response =
+            writeChunkFutureMap.computeIfPresent(index, (key, val) -> {

Review comment:
      computeIfPresent(..) will put a read future into the writeChunkFutureMap, which should only contain write futures.  We may change the code as below:
   ```
         if (requestProto.getCmdType() == Type.WriteChunk) {
           final long index = entry.getIndex();
           final ByteString data = stateMachineDataCache.get(index);
           if (data != null) {
             return CompletableFuture.completedFuture(data);
           }
   
           final CompletableFuture<Message> writeFuture
               = Optional.ofNullable(writeChunkFutureMap.get(index))
               .orElse(CompletableFuture.completedFuture(null));
   
          // Queue the readStateMachineData future behind the write chunk
          // future for the same log index if one is in progress.
          // Otherwise, issue the read chunk call on any chunk executor
          // thread. The idea is to ensure that the read chunk only starts
          // once the write chunk for the same log index finishes.
           return writeFuture.thenCompose(
               w -> getReadStateMachineFuture(requestProto, entry));
         } else {
   ```
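  As a standalone illustration of the ordering this buys, here is a minimal sketch of chaining a read behind an in-flight write future via thenCompose, without ever inserting a read future into the write map (the map, class, and method names below are invented for the example and are not the ContainerStateMachine API):
  ```java
  import java.util.Optional;
  import java.util.concurrent.CompletableFuture;
  import java.util.concurrent.ConcurrentHashMap;

  public class ReadAfterWriteExample {
    // Maps log index -> in-flight write future (illustrative stand-in
    // for writeChunkFutureMap).
    static final ConcurrentHashMap<Long, CompletableFuture<Void>>
        writeFutureMap = new ConcurrentHashMap<>();

    // Chains the read behind the write for the same index, if any.
    // The map itself is only read, never mutated, so it keeps holding
    // write futures exclusively.
    static CompletableFuture<String> read(long index) {
      CompletableFuture<Void> writeFuture =
          Optional.ofNullable(writeFutureMap.get(index))
              .orElse(CompletableFuture.completedFuture(null));
      return writeFuture.thenCompose(
          w -> CompletableFuture.completedFuture("data@" + index));
    }

    public static void main(String[] args) {
      CompletableFuture<Void> write = new CompletableFuture<>();
      writeFutureMap.put(7L, write);

      CompletableFuture<String> readFuture = read(7L);
      System.out.println(readFuture.isDone()); // false: queued behind write
      write.complete(null);
      System.out.println(readFuture.join());
    }
  }
  ```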
   

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java
##########
@@ -678,6 +694,19 @@ private ByteString readStateMachineData(
     }
   }
 
+  private void getReadStateMachineFuture(
+      CompletableFuture<ByteString> readStateMachineFuture,
+      ContainerCommandRequestProto requestProto, LogEntryProto entry) {
+    try {
+      readStateMachineFuture.complete(
+          readStateMachineData(requestProto, entry.getTerm(),
+              entry.getIndex()));
+    } catch (IOException e) {
+      metrics.incNumReadStateMachineFails();
+      readStateMachineFuture.completeExceptionally(e);
+    }
+  }

Review comment:
       We should make this async as below:
   ```
     private CompletableFuture<ByteString> getReadStateMachineFuture(
         ContainerCommandRequestProto request, LogEntryProto entry) {
       return CompletableFuture.supplyAsync(() -> {
         try {
           return readStateMachineData(request, entry.getTerm(),
               entry.getIndex());
         } catch (IOException e) {
           metrics.incNumReadStateMachineFails();
           throw new CompletionException(
               "Failed to request=" + request + ", entry=" + entry, e);
         }
       }, chunkExecutor);
     }
   ```
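  A runnable sketch of how a failure would surface to callers of such an async method (the readData stub below is hypothetical; it stands in for readStateMachineData, and the exception message format is only illustrative):
  ```java
  import java.io.IOException;
  import java.util.concurrent.CompletableFuture;
  import java.util.concurrent.CompletionException;
  import java.util.concurrent.ExecutorService;
  import java.util.concurrent.Executors;

  public class AsyncReadExample {
    // Hypothetical stand-in for readStateMachineData(..).
    static String readData(long index) throws IOException {
      if (index < 0) {
        throw new IOException("bad index " + index);
      }
      return "entry@" + index;
    }

    // Runs the blocking read on the executor; an IOException is wrapped
    // in CompletionException, completing the future exceptionally.
    static CompletableFuture<String> readAsync(long index,
        ExecutorService executor) {
      return CompletableFuture.supplyAsync(() -> {
        try {
          return readData(index);
        } catch (IOException e) {
          throw new CompletionException("Failed, index=" + index, e);
        }
      }, executor);
    }

    public static void main(String[] args) {
      ExecutorService executor = Executors.newSingleThreadExecutor();
      System.out.println(readAsync(5, executor).join());
      // handle(..) observes the failure instead of rethrowing on join.
      System.out.println(
          readAsync(-1, executor).handle((v, t) -> t != null).join());
      executor.shutdown();
    }
  }
  ```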
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
For additional commands, e-mail: issues-help@ozone.apache.org