You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/02/23 00:11:09 UTC

[GitHub] [ozone] umamaheswararao commented on a change in pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

umamaheswararao commented on a change in pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#discussion_r812391990



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECBlockOutputStreamEntry.java
##########
@@ -64,7 +64,7 @@
   private final long length;
 
   private ECBlockOutputStream[] blockOutputStreams;

Review comment:
       Looks like this file changes are just variable renames. I don't think it will make big difference using idx or index. Since this patch has logic changes a bit, I would suggest avoid nit changes, which will make diff big. If you want to do this changes, I would suggest to do in separate JIRA ( Probably repurpose the JIRA HDDS-6354)

##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 
-    int currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-        .getCurrentStreamIdx();
-    int currentChunkBufferRemainingLength =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx].remaining();
-    int currentChunkBufferLen =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx]
-            .position();
-    int maxLenToCurrChunkBuffer = Math.min(len, ecChunkSize);
-    int currentWriterChunkLenToWrite =
-        Math.min(currentChunkBufferRemainingLength, maxLenToCurrChunkBuffer);
-    int pos = handleDataWrite(currentStreamIdx, b, off,
-        currentWriterChunkLenToWrite,
-        currentChunkBufferLen + currentWriterChunkLenToWrite == ecChunkSize);
-    checkAndWriteParityCells(pos, false);
-    int remLen = len - currentWriterChunkLenToWrite;
-    int iters = remLen / ecChunkSize;
-    int lastCellSize = remLen % ecChunkSize;
-    off += currentWriterChunkLenToWrite;
-
-    while (iters > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off, ecChunkSize, true);
-      off += ecChunkSize;
-      iters--;
-      checkAndWriteParityCells(pos, iters > 0 || remLen > 0);
-    }
-
-    if (lastCellSize > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off,
-          lastCellSize, false);
-      checkAndWriteParityCells(pos, false);
+  private void handleWrite(byte[] b, int off, long len, boolean retry)

Review comment:
       Why len is defined as long here and casting it again?

##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -50,12 +50,11 @@
  * block output streams chunk by chunk.
  */
 public class ECKeyOutputStream extends KeyOutputStream {

Review comment:
       Let's stick to the actual refactor of write method and avoid unrelated changes than what JIRA states

##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 
-    int currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-        .getCurrentStreamIdx();
-    int currentChunkBufferRemainingLength =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx].remaining();
-    int currentChunkBufferLen =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx]
-            .position();
-    int maxLenToCurrChunkBuffer = Math.min(len, ecChunkSize);
-    int currentWriterChunkLenToWrite =
-        Math.min(currentChunkBufferRemainingLength, maxLenToCurrChunkBuffer);
-    int pos = handleDataWrite(currentStreamIdx, b, off,
-        currentWriterChunkLenToWrite,
-        currentChunkBufferLen + currentWriterChunkLenToWrite == ecChunkSize);
-    checkAndWriteParityCells(pos, false);
-    int remLen = len - currentWriterChunkLenToWrite;
-    int iters = remLen / ecChunkSize;
-    int lastCellSize = remLen % ecChunkSize;
-    off += currentWriterChunkLenToWrite;
-
-    while (iters > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off, ecChunkSize, true);
-      off += ecChunkSize;
-      iters--;
-      checkAndWriteParityCells(pos, iters > 0 || remLen > 0);
-    }
-
-    if (lastCellSize > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off,
-          lastCellSize, false);
-      checkAndWriteParityCells(pos, false);
+  private void handleWrite(byte[] b, int off, long len, boolean retry)
+      throws IOException {
+    while (len > 0) {
+      try {

Review comment:
       I am wondering, whether this allocateBlockINeeded() can be under currentStreamEntry().getRemaining()>=0 condition ? That can avoid at least couple of if checks every time.

##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 
-    int currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-        .getCurrentStreamIdx();
-    int currentChunkBufferRemainingLength =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx].remaining();
-    int currentChunkBufferLen =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx]
-            .position();
-    int maxLenToCurrChunkBuffer = Math.min(len, ecChunkSize);
-    int currentWriterChunkLenToWrite =
-        Math.min(currentChunkBufferRemainingLength, maxLenToCurrChunkBuffer);
-    int pos = handleDataWrite(currentStreamIdx, b, off,
-        currentWriterChunkLenToWrite,
-        currentChunkBufferLen + currentWriterChunkLenToWrite == ecChunkSize);
-    checkAndWriteParityCells(pos, false);
-    int remLen = len - currentWriterChunkLenToWrite;
-    int iters = remLen / ecChunkSize;
-    int lastCellSize = remLen % ecChunkSize;
-    off += currentWriterChunkLenToWrite;
-
-    while (iters > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off, ecChunkSize, true);
-      off += ecChunkSize;
-      iters--;
-      checkAndWriteParityCells(pos, iters > 0 || remLen > 0);
-    }
-
-    if (lastCellSize > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off,
-          lastCellSize, false);
-      checkAndWriteParityCells(pos, false);
+  private void handleWrite(byte[] b, int off, long len, boolean retry)
+      throws IOException {
+    while (len > 0) {
+      try {
+        blockOutputStreamEntryPool.allocateBlockIfNeeded();
+        int currentStreamIndex = blockOutputStreamEntryPool
+            .getCurrentStreamEntry().getCurrentStreamIndex();
+        int currentRem =
+            ecChunkBufferCache.dataBuffers[currentStreamIndex].remaining();
+        int expectedWriteLen = Math.min((int) len,
+            Math.min(currentRem, ecChunkSize));
+        long currentPos =
+            ecChunkBufferCache.dataBuffers[currentStreamIndex].position();
+        int pos =
+            handleDataWrite(currentStreamIndex, b, off, expectedWriteLen,
+                currentPos + expectedWriteLen == ecChunkSize);
+        checkAndWriteParityCells(pos);
+        long writtenLength = pos - currentPos;
+        len -= writtenLength;
+        off += writtenLength;
+      } catch (Exception e) {
+        markStreamClosed();
+        throw new IOException(e.getMessage());
+      }
     }
-    writeOffset += len;
   }
 
   private StripeWriteStatus rewriteStripeToNewBlockGroup(
-      int failedStripeDataSize, boolean allocateBlockIfFull, boolean close)
-      throws IOException {
-    long[] failedDataStripeChunkLens = new long[numDataBlks];
-    long[] failedParityStripeChunkLens = new long[numParityBlks];
+      int failedStripeDataSize, boolean close) throws IOException {
+    long[] failedDataStripeChunkLens = new long[numDataBlocks];

Review comment:
       Same as above let's avoid unrelated changes here. I am ok one or two, but too many var changes making diff bigger here.

##########
File path: hadoop-ozone/client/src/test/java/org/apache/hadoop/ozone/client/TestOzoneECClient.java
##########
@@ -573,8 +573,8 @@ public void testStripeWriteRetriesOn2Failures() throws IOException {
         nodesIndexesToMarkFailure);

Review comment:
       I find all of these changes are unrelated as well. I would suggest to repurpose other JIRA as cleanups task and do this changes. Hope that make sense to you.

##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 

Review comment:
       Please make handleWrite API inline with write itself. We have already have API handleDataWrites. Just for the sake the code to keep similar to KeyOutputStream, we don;t need to have the private APIs names same.

##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -540,26 +515,23 @@ public void close() throws IOException {
   }
 
   private void handleStripeFailure(int lastStripeSize,
-      boolean allocateBlockIfFull, boolean isClose)
-      throws IOException {
+      boolean isClose) throws IOException {

Review comment:
       After var removal, this lines may need format?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org