You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/02/21 12:16:29 UTC

[GitHub] [ozone] kaijchen opened a new pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

kaijchen opened a new pull request #3120:
URL: https://github.com/apache/ozone/pull/3120


   ## What changes were proposed in this pull request?
   
   EC: Refactor ECKeyOutputStream#write() to make it cleaner and similar to the KeyOutputStream.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-6358
   
   ## How was this patch tested?
   
   Unit tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kaijchen commented on a change in pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
kaijchen commented on a change in pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#discussion_r812532746



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 
-    int currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-        .getCurrentStreamIdx();
-    int currentChunkBufferRemainingLength =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx].remaining();
-    int currentChunkBufferLen =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx]
-            .position();
-    int maxLenToCurrChunkBuffer = Math.min(len, ecChunkSize);
-    int currentWriterChunkLenToWrite =
-        Math.min(currentChunkBufferRemainingLength, maxLenToCurrChunkBuffer);
-    int pos = handleDataWrite(currentStreamIdx, b, off,
-        currentWriterChunkLenToWrite,
-        currentChunkBufferLen + currentWriterChunkLenToWrite == ecChunkSize);
-    checkAndWriteParityCells(pos, false);
-    int remLen = len - currentWriterChunkLenToWrite;
-    int iters = remLen / ecChunkSize;
-    int lastCellSize = remLen % ecChunkSize;
-    off += currentWriterChunkLenToWrite;
-
-    while (iters > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off, ecChunkSize, true);
-      off += ecChunkSize;
-      iters--;
-      checkAndWriteParityCells(pos, iters > 0 || remLen > 0);
-    }
-
-    if (lastCellSize > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off,
-          lastCellSize, false);
-      checkAndWriteParityCells(pos, false);
+  private void handleWrite(byte[] b, int off, long len, boolean retry)

Review comment:
       If we make KeyOutputStream#handleWrite() protected. We can override it here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] umamaheswararao edited a comment on pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
umamaheswararao edited a comment on pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#issuecomment-1047903513


   Thanks for filing this @kaijchen
   Initially journey was started with this single [loop](https://github.com/apache/ozone/blob/f9a0782ae4efebabb55127df3954745a3c4a0889/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java) 
   However for easy trace tracking at which position write impacted, I have decided to split into 3 sections as you got that in summary comments. 
   I think now, we have evolved enough and bit stable compared to starting point where we need lot of debug help info to track down issues. ( When it has 3 sections I found helpful as the trace indicates which position of data sizes it's causing issues. We may need to add more debug info as well to improve debug-ability.)
   
   Overall patch looks good to me. I will pass one more round as this is in key path and provide my feedback if any.
   
   Thanks a lot for working on this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] umamaheswararao commented on a change in pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on a change in pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#discussion_r813574158



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 
-    int currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-        .getCurrentStreamIdx();
-    int currentChunkBufferRemainingLength =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx].remaining();
-    int currentChunkBufferLen =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx]
-            .position();
-    int maxLenToCurrChunkBuffer = Math.min(len, ecChunkSize);
-    int currentWriterChunkLenToWrite =
-        Math.min(currentChunkBufferRemainingLength, maxLenToCurrChunkBuffer);
-    int pos = handleDataWrite(currentStreamIdx, b, off,
-        currentWriterChunkLenToWrite,
-        currentChunkBufferLen + currentWriterChunkLenToWrite == ecChunkSize);
-    checkAndWriteParityCells(pos, false);
-    int remLen = len - currentWriterChunkLenToWrite;
-    int iters = remLen / ecChunkSize;
-    int lastCellSize = remLen % ecChunkSize;
-    off += currentWriterChunkLenToWrite;
-
-    while (iters > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off, ecChunkSize, true);
-      off += ecChunkSize;
-      iters--;
-      checkAndWriteParityCells(pos, iters > 0 || remLen > 0);
-    }
-
-    if (lastCellSize > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off,
-          lastCellSize, false);
-      checkAndWriteParityCells(pos, false);
+  private void handleWrite(byte[] b, int off, long len, boolean retry)
+      throws IOException {
+    while (len > 0) {
+      try {

Review comment:
       Few if checks might not contribute to significant improvement, but this is key path and saving such thing also helps. But doing that checks needs to modify BlockOutputStreamEntryPool, that should be done in master I feel as that is common. So, I am ok to keep it here now as it is.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] guihecheng commented on pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
guihecheng commented on pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#issuecomment-1047373846


   Hi @kaijchen the overall patch LGTM!
   It's better to have several lines of summary about the changes made here to be easier for reviewing, I believe that there are certain points to be mentioned that is more than cleanup.
   cc @umamaheswararao


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kaijchen commented on a change in pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
kaijchen commented on a change in pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#discussion_r812482884



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 
-    int currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-        .getCurrentStreamIdx();
-    int currentChunkBufferRemainingLength =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx].remaining();
-    int currentChunkBufferLen =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx]
-            .position();
-    int maxLenToCurrChunkBuffer = Math.min(len, ecChunkSize);
-    int currentWriterChunkLenToWrite =
-        Math.min(currentChunkBufferRemainingLength, maxLenToCurrChunkBuffer);
-    int pos = handleDataWrite(currentStreamIdx, b, off,
-        currentWriterChunkLenToWrite,
-        currentChunkBufferLen + currentWriterChunkLenToWrite == ecChunkSize);
-    checkAndWriteParityCells(pos, false);
-    int remLen = len - currentWriterChunkLenToWrite;
-    int iters = remLen / ecChunkSize;
-    int lastCellSize = remLen % ecChunkSize;
-    off += currentWriterChunkLenToWrite;
-
-    while (iters > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off, ecChunkSize, true);
-      off += ecChunkSize;
-      iters--;
-      checkAndWriteParityCells(pos, iters > 0 || remLen > 0);
-    }
-
-    if (lastCellSize > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off,
-          lastCellSize, false);
-      checkAndWriteParityCells(pos, false);
+  private void handleWrite(byte[] b, int off, long len, boolean retry)

Review comment:
       I think we can probably override KeyOutputStream#handleWrite() someday.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kaijchen commented on a change in pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
kaijchen commented on a change in pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#discussion_r812533378



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 
-    int currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-        .getCurrentStreamIdx();
-    int currentChunkBufferRemainingLength =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx].remaining();
-    int currentChunkBufferLen =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx]
-            .position();
-    int maxLenToCurrChunkBuffer = Math.min(len, ecChunkSize);
-    int currentWriterChunkLenToWrite =
-        Math.min(currentChunkBufferRemainingLength, maxLenToCurrChunkBuffer);
-    int pos = handleDataWrite(currentStreamIdx, b, off,
-        currentWriterChunkLenToWrite,
-        currentChunkBufferLen + currentWriterChunkLenToWrite == ecChunkSize);
-    checkAndWriteParityCells(pos, false);
-    int remLen = len - currentWriterChunkLenToWrite;
-    int iters = remLen / ecChunkSize;
-    int lastCellSize = remLen % ecChunkSize;
-    off += currentWriterChunkLenToWrite;
-
-    while (iters > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off, ecChunkSize, true);
-      off += ecChunkSize;
-      iters--;
-      checkAndWriteParityCells(pos, iters > 0 || remLen > 0);
-    }
-
-    if (lastCellSize > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off,
-          lastCellSize, false);
-      checkAndWriteParityCells(pos, false);
+  private void handleWrite(byte[] b, int off, long len, boolean retry)
+      throws IOException {
+    while (len > 0) {
+      try {

Review comment:
       Yes. But will there be noticeable performance difference? Because `KeyOutputStream` is doing the same thing.
   If we need to add this check, can we add it inside `allocateBlockIfNeeded()`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] umamaheswararao commented on a change in pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on a change in pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#discussion_r812391990



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECBlockOutputStreamEntry.java
##########
@@ -64,7 +64,7 @@
   private final long length;
 
   private ECBlockOutputStream[] blockOutputStreams;

Review comment:
       Looks like this file changes are just variable renames. I don't think it will make big difference using idx or index. Since this patch has logic changes a bit, I would suggest avoid nit changes, which will make diff big. If you want to do this changes, I would suggest to do in separate JIRA ( Probably repurpose the JIRA HDDS-6354)

##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 
-    int currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-        .getCurrentStreamIdx();
-    int currentChunkBufferRemainingLength =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx].remaining();
-    int currentChunkBufferLen =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx]
-            .position();
-    int maxLenToCurrChunkBuffer = Math.min(len, ecChunkSize);
-    int currentWriterChunkLenToWrite =
-        Math.min(currentChunkBufferRemainingLength, maxLenToCurrChunkBuffer);
-    int pos = handleDataWrite(currentStreamIdx, b, off,
-        currentWriterChunkLenToWrite,
-        currentChunkBufferLen + currentWriterChunkLenToWrite == ecChunkSize);
-    checkAndWriteParityCells(pos, false);
-    int remLen = len - currentWriterChunkLenToWrite;
-    int iters = remLen / ecChunkSize;
-    int lastCellSize = remLen % ecChunkSize;
-    off += currentWriterChunkLenToWrite;
-
-    while (iters > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off, ecChunkSize, true);
-      off += ecChunkSize;
-      iters--;
-      checkAndWriteParityCells(pos, iters > 0 || remLen > 0);
-    }
-
-    if (lastCellSize > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off,
-          lastCellSize, false);
-      checkAndWriteParityCells(pos, false);
+  private void handleWrite(byte[] b, int off, long len, boolean retry)

Review comment:
       Why len is defined as long here and casting it again?

##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -50,12 +50,11 @@
  * block output streams chunk by chunk.
  */
 public class ECKeyOutputStream extends KeyOutputStream {

Review comment:
       Let's stick to the actual refactor of write method and avoid unrelated changes than what JIRA states

##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 
-    int currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-        .getCurrentStreamIdx();
-    int currentChunkBufferRemainingLength =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx].remaining();
-    int currentChunkBufferLen =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx]
-            .position();
-    int maxLenToCurrChunkBuffer = Math.min(len, ecChunkSize);
-    int currentWriterChunkLenToWrite =
-        Math.min(currentChunkBufferRemainingLength, maxLenToCurrChunkBuffer);
-    int pos = handleDataWrite(currentStreamIdx, b, off,
-        currentWriterChunkLenToWrite,
-        currentChunkBufferLen + currentWriterChunkLenToWrite == ecChunkSize);
-    checkAndWriteParityCells(pos, false);
-    int remLen = len - currentWriterChunkLenToWrite;
-    int iters = remLen / ecChunkSize;
-    int lastCellSize = remLen % ecChunkSize;
-    off += currentWriterChunkLenToWrite;
-
-    while (iters > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off, ecChunkSize, true);
-      off += ecChunkSize;
-      iters--;
-      checkAndWriteParityCells(pos, iters > 0 || remLen > 0);
-    }
-
-    if (lastCellSize > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off,
-          lastCellSize, false);
-      checkAndWriteParityCells(pos, false);
+  private void handleWrite(byte[] b, int off, long len, boolean retry)
+      throws IOException {
+    while (len > 0) {
+      try {

Review comment:
       I am wondering, whether this allocateBlockINeeded() can be under currentStreamEntry().getRemaining()>=0 condition ? That can avoid at least couple of if checks every time.

##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 
-    int currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-        .getCurrentStreamIdx();
-    int currentChunkBufferRemainingLength =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx].remaining();
-    int currentChunkBufferLen =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx]
-            .position();
-    int maxLenToCurrChunkBuffer = Math.min(len, ecChunkSize);
-    int currentWriterChunkLenToWrite =
-        Math.min(currentChunkBufferRemainingLength, maxLenToCurrChunkBuffer);
-    int pos = handleDataWrite(currentStreamIdx, b, off,
-        currentWriterChunkLenToWrite,
-        currentChunkBufferLen + currentWriterChunkLenToWrite == ecChunkSize);
-    checkAndWriteParityCells(pos, false);
-    int remLen = len - currentWriterChunkLenToWrite;
-    int iters = remLen / ecChunkSize;
-    int lastCellSize = remLen % ecChunkSize;
-    off += currentWriterChunkLenToWrite;
-
-    while (iters > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off, ecChunkSize, true);
-      off += ecChunkSize;
-      iters--;
-      checkAndWriteParityCells(pos, iters > 0 || remLen > 0);
-    }
-
-    if (lastCellSize > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off,
-          lastCellSize, false);
-      checkAndWriteParityCells(pos, false);
+  private void handleWrite(byte[] b, int off, long len, boolean retry)
+      throws IOException {
+    while (len > 0) {
+      try {
+        blockOutputStreamEntryPool.allocateBlockIfNeeded();
+        int currentStreamIndex = blockOutputStreamEntryPool
+            .getCurrentStreamEntry().getCurrentStreamIndex();
+        int currentRem =
+            ecChunkBufferCache.dataBuffers[currentStreamIndex].remaining();
+        int expectedWriteLen = Math.min((int) len,
+            Math.min(currentRem, ecChunkSize));
+        long currentPos =
+            ecChunkBufferCache.dataBuffers[currentStreamIndex].position();
+        int pos =
+            handleDataWrite(currentStreamIndex, b, off, expectedWriteLen,
+                currentPos + expectedWriteLen == ecChunkSize);
+        checkAndWriteParityCells(pos);
+        long writtenLength = pos - currentPos;
+        len -= writtenLength;
+        off += writtenLength;
+      } catch (Exception e) {
+        markStreamClosed();
+        throw new IOException(e.getMessage());
+      }
     }
-    writeOffset += len;
   }
 
   private StripeWriteStatus rewriteStripeToNewBlockGroup(
-      int failedStripeDataSize, boolean allocateBlockIfFull, boolean close)
-      throws IOException {
-    long[] failedDataStripeChunkLens = new long[numDataBlks];
-    long[] failedParityStripeChunkLens = new long[numParityBlks];
+      int failedStripeDataSize, boolean close) throws IOException {
+    long[] failedDataStripeChunkLens = new long[numDataBlocks];

Review comment:
       Same as above let's avoid unrelated changes here. I am ok one or two, but too many var changes making diff bigger here.

##########
File path: hadoop-ozone/client/src/test/java/org/apache/hadoop/ozone/client/TestOzoneECClient.java
##########
@@ -573,8 +573,8 @@ public void testStripeWriteRetriesOn2Failures() throws IOException {
         nodesIndexesToMarkFailure);

Review comment:
       I find all of these changes are unrelated as well. I would suggest to repurpose other JIRA as cleanups task and do this changes. Hope that make sense to you.

##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 

Review comment:
       Please make handleWrite API inline with write itself. We have already have API handleDataWrites. Just for the sake the code to keep similar to KeyOutputStream, we don;t need to have the private APIs names same.

##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -540,26 +515,23 @@ public void close() throws IOException {
   }
 
   private void handleStripeFailure(int lastStripeSize,
-      boolean allocateBlockIfFull, boolean isClose)
-      throws IOException {
+      boolean isClose) throws IOException {

Review comment:
       After var removal, this lines may need format?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kaijchen commented on a change in pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
kaijchen commented on a change in pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#discussion_r812482884



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 
-    int currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-        .getCurrentStreamIdx();
-    int currentChunkBufferRemainingLength =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx].remaining();
-    int currentChunkBufferLen =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx]
-            .position();
-    int maxLenToCurrChunkBuffer = Math.min(len, ecChunkSize);
-    int currentWriterChunkLenToWrite =
-        Math.min(currentChunkBufferRemainingLength, maxLenToCurrChunkBuffer);
-    int pos = handleDataWrite(currentStreamIdx, b, off,
-        currentWriterChunkLenToWrite,
-        currentChunkBufferLen + currentWriterChunkLenToWrite == ecChunkSize);
-    checkAndWriteParityCells(pos, false);
-    int remLen = len - currentWriterChunkLenToWrite;
-    int iters = remLen / ecChunkSize;
-    int lastCellSize = remLen % ecChunkSize;
-    off += currentWriterChunkLenToWrite;
-
-    while (iters > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off, ecChunkSize, true);
-      off += ecChunkSize;
-      iters--;
-      checkAndWriteParityCells(pos, iters > 0 || remLen > 0);
-    }
-
-    if (lastCellSize > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off,
-          lastCellSize, false);
-      checkAndWriteParityCells(pos, false);
+  private void handleWrite(byte[] b, int off, long len, boolean retry)

Review comment:
       I think we can probably override KeyOutputStream#handleWrite() someday.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kaijchen commented on a change in pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
kaijchen commented on a change in pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#discussion_r812523344



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 

Review comment:
       Can we make KeyOutputStream#handleWrite() protected and override it?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kaijchen commented on pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
kaijchen commented on pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#issuecomment-1047378629


   @guihecheng Thanks for reviewing, I have edited the PR message to include a summary of changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kaijchen removed a comment on pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
kaijchen removed a comment on pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#issuecomment-1048416765


   Hi @umamaheswararao, thanks for reviewing. I have reverted most of the unrelated changes.
   Can we override `KeyOutputStream#handleWrite()` and remove the `write()` here?
   Because currently `ECKeyOutputStream#write()` is exactly same as `KeyOutputStream#write()`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] umamaheswararao commented on pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
umamaheswararao commented on pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#issuecomment-1047903513


   Thanks for filing this @kaijchen
   Initially journey was started with this single [loop | https://github.com/apache/ozone/blob/f9a0782ae4efebabb55127df3954745a3c4a0889/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java] 
   However for easy trace tracking at which position write impacted, I have decided to split into 3 sections as you got that in summary comments. 
   I think now, we have evolved enough and bit stable compared to starting point where we need lot of debug help info to track down issues. ( When it has 3 sections I found helpful as the trace indicates which position of data sizes it's causing issues. We may need to add more debug info as well to improve debug-ability.)
   
   Overall patch looks good to me. I will pass one more round as this is in key path and provide my feedback if any.
   
   Thanks a lot for working on this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kaijchen commented on a change in pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
kaijchen commented on a change in pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#discussion_r812482884



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 
-    int currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-        .getCurrentStreamIdx();
-    int currentChunkBufferRemainingLength =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx].remaining();
-    int currentChunkBufferLen =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx]
-            .position();
-    int maxLenToCurrChunkBuffer = Math.min(len, ecChunkSize);
-    int currentWriterChunkLenToWrite =
-        Math.min(currentChunkBufferRemainingLength, maxLenToCurrChunkBuffer);
-    int pos = handleDataWrite(currentStreamIdx, b, off,
-        currentWriterChunkLenToWrite,
-        currentChunkBufferLen + currentWriterChunkLenToWrite == ecChunkSize);
-    checkAndWriteParityCells(pos, false);
-    int remLen = len - currentWriterChunkLenToWrite;
-    int iters = remLen / ecChunkSize;
-    int lastCellSize = remLen % ecChunkSize;
-    off += currentWriterChunkLenToWrite;
-
-    while (iters > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off, ecChunkSize, true);
-      off += ecChunkSize;
-      iters--;
-      checkAndWriteParityCells(pos, iters > 0 || remLen > 0);
-    }
-
-    if (lastCellSize > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off,
-          lastCellSize, false);
-      checkAndWriteParityCells(pos, false);
+  private void handleWrite(byte[] b, int off, long len, boolean retry)

Review comment:
       I think we can probably override KeyOutputStream#handleWrite().




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kaijchen commented on pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
kaijchen commented on pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#issuecomment-1048416765


   Hi @umamaheswararao, thanks for reviewing. I have reverted most of the unrelated changes.
   Can we override `KeyOutputStream#handleWrite()` and remove the `write()` here?
   Because currently `ECKeyOutputStream#write()` is exactly same as `KeyOutputStream#write()`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] umamaheswararao merged pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
umamaheswararao merged pull request #3120:
URL: https://github.com/apache/ozone/pull/3120


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] umamaheswararao edited a comment on pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
umamaheswararao edited a comment on pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#issuecomment-1047903513


   Thanks for filing this @kaijchen
   Initially journey was started with this single [loop](https://github.com/apache/ozone/blob/f9a0782ae4efebabb55127df3954745a3c4a0889/hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java) 
   However for easy trace tracking at which position write impacted, I have decided to split into 3 sections as you got that in summary comments. 
   I think now, we have evolved enough and bit stable compared to starting point where we need lot of debug help info to track down issues. ( When it has 3 sections I found helpful as the trace indicates which position of data sizes it's causing issues. We may need to add more debug info as well to improve debug-ability.)
   
    I will review it soon and provide my feedback if any.
   
   Thanks a lot for working on this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kaijchen commented on pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
kaijchen commented on pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#issuecomment-1048541574


   Hi @umamaheswararao, thanks for the explanation and the review.
   I have updated the patch to resolve most of the comments. And there is one needs further discussion.
   I think we can let block allocation happen here instead of inside `handleDataWrite` or `handleParityWrite`.
   What do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kaijchen commented on a change in pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
kaijchen commented on a change in pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#discussion_r812532746



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 
-    int currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-        .getCurrentStreamIdx();
-    int currentChunkBufferRemainingLength =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx].remaining();
-    int currentChunkBufferLen =
-        ecChunkBufferCache.dataBuffers[currentStreamIdx]
-            .position();
-    int maxLenToCurrChunkBuffer = Math.min(len, ecChunkSize);
-    int currentWriterChunkLenToWrite =
-        Math.min(currentChunkBufferRemainingLength, maxLenToCurrChunkBuffer);
-    int pos = handleDataWrite(currentStreamIdx, b, off,
-        currentWriterChunkLenToWrite,
-        currentChunkBufferLen + currentWriterChunkLenToWrite == ecChunkSize);
-    checkAndWriteParityCells(pos, false);
-    int remLen = len - currentWriterChunkLenToWrite;
-    int iters = remLen / ecChunkSize;
-    int lastCellSize = remLen % ecChunkSize;
-    off += currentWriterChunkLenToWrite;
-
-    while (iters > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off, ecChunkSize, true);
-      off += ecChunkSize;
-      iters--;
-      checkAndWriteParityCells(pos, iters > 0 || remLen > 0);
-    }
-
-    if (lastCellSize > 0) {
-      currentStreamIdx = blockOutputStreamEntryPool.getCurrentStreamEntry()
-          .getCurrentStreamIdx();
-      pos = handleDataWrite(currentStreamIdx, b, off,
-          lastCellSize, false);
-      checkAndWriteParityCells(pos, false);
+  private void handleWrite(byte[] b, int off, long len, boolean retry)

Review comment:
       If we make KeyOutputStream#handleWrite() protected. We can override it here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] kaijchen commented on a change in pull request #3120: HDDS-6358. EC: Refactor ECKeyOutputStream#write()

Posted by GitBox <gi...@apache.org>.
kaijchen commented on a change in pull request #3120:
URL: https://github.com/apache/ozone/pull/3120#discussion_r812523344



##########
File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/io/ECKeyOutputStream.java
##########
@@ -157,57 +155,47 @@ public void write(byte[] b, int off, int len) throws IOException {
     if (len == 0) {
       return;
     }
-    blockOutputStreamEntryPool.allocateBlockIfNeeded();
+    handleWrite(b, off, len, false);
+    writeOffset += len;
+  }
 

Review comment:
       Can we make KeyOutputStream#handleWrite() protected and override it?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org