Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/12/10 20:14:25 UTC

[GitHub] [ozone] hanishakoneru opened a new pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

hanishakoneru opened a new pull request #1685:
URL: https://github.com/apache/ozone/pull/1685


   ## What changes were proposed in this pull request?
   
   When a ReadChunk operation is performed, all the data to be read from one chunk is read into a single ByteBuffer. 
   
   ```
   // ChunkUtils#readData()
   public static void readData(File file, ByteBuffer buf,
       long offset, long len, VolumeIOStats volumeIOStats)
       throws StorageContainerException {
     .....
     try {
       bytesRead = processFileExclusively(path, () -> {
         try (FileChannel channel = open(path, READ_OPTIONS, NO_ATTRIBUTES);
              FileLock ignored = channel.lock(offset, len, true)) {
           return channel.read(buf, offset);
         } catch (IOException e) {
           throw new UncheckedIOException(e);
         }
       });
     } catch (UncheckedIOException e) {
       throw wrapInStorageContainerException(e.getCause());
     }
     .....
     .....
   ```
   This Jira proposes to read the data from the channel into an array of ByteBuffers, each with a set capacity. This capacity can be configured by the client.
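   
   A minimal sketch of this approach, assuming a fixed per-buffer capacity (the method and parameter names are illustrative, not the final implementation). FileChannel supports scattering reads into a ByteBuffer[], so one call can fill several buffers:
   
   ```
   static ByteBuffer[] readIntoBuffers(FileChannel channel, long offset,
       long len, int bufferCapacity) throws IOException {
     // Allocate ceil(len / bufferCapacity) buffers covering the requested range.
     int bufferCount = (int) ((len + bufferCapacity - 1) / bufferCapacity);
     ByteBuffer[] buffers = new ByteBuffer[bufferCount];
     long remaining = len;
     for (int i = 0; i < bufferCount; i++) {
       buffers[i] = ByteBuffer.allocate((int) Math.min(bufferCapacity, remaining));
       remaining -= buffers[i].capacity();
     }
     channel.position(offset); // scattering reads start at the channel position
     long bytesRead = 0;
     while (bytesRead < len) {
       long n = channel.read(buffers); // scatter the data across the array
       if (n < 0) {
         break; // EOF before len bytes were available
       }
       bytesRead += n;
     }
     for (ByteBuffer b : buffers) {
       b.flip(); // make each buffer readable
     }
     return buffers;
   }
   ```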
   
   This would help optimize Ozone InputStreams in terms of cached memory. Currently, data in ChunkInputStream is cached until either the stream is closed or the chunk EOF is reached. This can lead to up to 4MB (the default ChunkSize) of data being cached in memory per ChunkInputStream.
   
   After the proposed change, we can optimize ChunkInputStream to release each ByteBuffer as soon as it has been read, instead of waiting until the whole chunk is read (HDDS-4553). Read I/O performance will not be affected, as the read from the DN still returns the requested length of data in one go. The only difference is that the data is returned in an array of ByteBuffers instead of a single ByteBuffer.
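   
   A sketch of that follow-up idea, with field names that mirror ChunkInputStream but are used here only for illustration (this is not the actual HDDS-4553 patch):
   
   ```
   // Once the reader has consumed the current buffer, drop the reference so it
   // can be garbage collected instead of holding the whole chunk in memory.
   if (!buffers.get(bufferIndex).hasRemaining()) {
     buffers.set(bufferIndex, null); // release the consumed buffer
     bufferIndex++;
   }
   ```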
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-4552
   
   ## How was this patch tested?
   
   Added an integration test.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] hanishakoneru commented on pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#issuecomment-797067588


   Thanks @adoroszlai 
   
   > Just an idea: we have a ChunkBuffer implementation which wraps a list of ByteBuffers, and ByteString#asReadOnlyByteBufferList provides exactly that list. I'm not sure it fully addresses all your needs.
   
   Yes, this PR uses the ChunkBuffer implementation that wraps a list of ByteBuffers.




[GitHub] [ozone] bshashikant commented on a change in pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#discussion_r548377345



##########
File path: hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/ChunkInputStream.java
##########
@@ -301,36 +348,38 @@ private synchronized void readChunkFromContainer(int len) throws IOException {
       startByteIndex = chunkPosition;
     } else {
       // Start reading the chunk from the last chunkPosition onwards.
-      startByteIndex = bufferOffset + bufferLength;
+      startByteIndex = bufferOffsetWrtChunkData + bufferLength;
     }
 
-    // bufferOffset and bufferLength are updated below, but if read fails
+    // bufferOffsetWrtChunkData and bufferLength are updated after the data
+    // is read from Container and put into the buffers, but if read fails
     // and is retried, we need the previous position.  Position is reset after
     // successful read in adjustBufferPosition()
     storePosition();
 
+    long adjustedBuffersOffset, adjustedBuffersLen;
     if (verifyChecksum) {
-      // Update the bufferOffset and bufferLength as per the checksum
-      // boundary requirement.
-      computeChecksumBoundaries(startByteIndex, len);
+      // Adjust the chunk offset and length to include required checksum
+      // boundaries
+      Pair<Long, Long> adjustedOffsetAndLength =
+          computeChecksumBoundaries(startByteIndex, len);
+      adjustedBuffersOffset = adjustedOffsetAndLength.getLeft();
+      adjustedBuffersLen = adjustedOffsetAndLength.getRight();
     } else {
       // Read from the startByteIndex
-      bufferOffset = startByteIndex;
-      bufferLength = len;
+      adjustedBuffersOffset = startByteIndex;
+      adjustedBuffersLen = len;
     }
 
     // Adjust the chunkInfo so that only the required bytes are read from
     // the chunk.
     final ChunkInfo adjustedChunkInfo = ChunkInfo.newBuilder(chunkInfo)
-        .setOffset(bufferOffset + chunkInfo.getOffset())
-        .setLen(bufferLength)
+        .setOffset(adjustedBuffersOffset + chunkInfo.getOffset())
+        .setLen(adjustedBuffersLen)
         .build();
 
-    ByteString byteString = readChunk(adjustedChunkInfo);
-
-    buffers = byteString.asReadOnlyByteBufferList();
-    bufferIndex = 0;
-    allocated = true;
+    byte[] chunkData = readChunk(adjustedChunkInfo).toByteArray();

Review comment:
       What if we read into small buffers on the server side itself and send them across as a list of ByteStrings to the client?
   Copying a big buffer on the client read path will slow down the read. We should probably do some benchmarking to understand the effects of all these options.






[GitHub] [ozone] hanishakoneru commented on pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#issuecomment-743514672


   Moved the read error handling (refreshing the pipeline) from BlockInputStream to ChunkInputStream. 
   Let's say a ChunkInputStream already has 1MB of data in its buffers when the DNs in the pipeline shut down. If the client tries to read 2MB from this ChunkInputStream, it can read 1MB from the existing buffers but gets a StorageContainerException when trying to read the next 1MB from the DN. In this case, the first 1MB that was already read would be discarded by BlockInputStream, yet the ChunkInputStream would have advanced its position by 1MB and would start reading from offset 1MB instead of 0 after acquiring a new client.
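   
   A minimal sketch of the retry behaviour this change needs to guarantee (the method names are illustrative, not the exact ChunkInputStream API):
   
   ```
   // Remember the stream position before a chunk read so a failed read can
   // resume from the same offset once the pipeline has been refreshed.
   long savedPosition = chunkStream.getPos();
   try {
     chunkStream.read(buf, 0, len);
   } catch (StorageContainerException e) {
     chunkStream.seek(savedPosition); // undo the partial advance
     refreshPipeline();               // acquire a new client
     chunkStream.read(buf, 0, len);   // retry from the saved offset
   }
   ```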
   
   cc. @adoroszlai 




[GitHub] [ozone] hanishakoneru commented on pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#issuecomment-781632359


   Changed the design to avoid buffer copying. Instead of copying the read chunk data into smaller buffers on the client side, the readChunk response now returns the data as a list of smaller ByteStrings. Please refer to the PR description for more details. 
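   
   A hedged sketch of how the server side can build such a list without copying, using the standard protobuf UnsafeByteOperations API (the helper name is illustrative, not the actual Ozone code):
   
   ```
   import com.google.protobuf.ByteString;
   import com.google.protobuf.UnsafeByteOperations;
   import java.nio.ByteBuffer;
   import java.util.ArrayList;
   import java.util.List;
   
   static List<ByteString> toByteStrings(List<ByteBuffer> buffers) {
     List<ByteString> result = new ArrayList<>(buffers.size());
     for (ByteBuffer buf : buffers) {
       // unsafeWrap shares the backing memory instead of copying it.
       result.add(UnsafeByteOperations.unsafeWrap(buf.asReadOnlyBuffer()));
     }
     return result;
   }
   ```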




[GitHub] [ozone] hanishakoneru commented on pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#issuecomment-749128717


   @bshashikant, can you please take a look at this PR when you get a chance. Thanks.




[GitHub] [ozone] adoroszlai commented on pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#issuecomment-796504948


   > ByteString concat could work. But the problem is ChunkBuffer wraps around ByteBuffers. And checksums are calculated on ChunkBuffers. We would have to change the whole ChunkBuffer model (or change Checksum computations to use ByteStrings instead).
   
   Just an idea: we have a `ChunkBuffer` implementation which wraps a list of `ByteBuffer`s, and `ByteString#asReadOnlyByteBufferList` provides exactly that list.  I'm not sure it fully addresses all your needs.
   
   > Also, are there any acceptance tests for new clients talking to old servers?
   
   Yes, the same `xcompat` suite tests this situation.




[GitHub] [ozone] hanishakoneru commented on pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#issuecomment-803171604


   Thanks for the review @adoroszlai. I will address them in HDDS-4553 (https://github.com/apache/ozone/pull/2062), which is a follow-up of this Jira.




[GitHub] [ozone] hanishakoneru commented on pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#issuecomment-801287718


   > Also, do we need to add this to ozone documentation somewhere?
   
   Sure, we can open a doc Jira to get this documented. Do you know where we can document this?




[GitHub] [ozone] hanishakoneru commented on pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#issuecomment-802321498


   Thank you @bshashikant. I will merge this shortly. 
   
   We currently do not have docs explaining the client read/write path. Do you propose we add docs for that? As this is an internal feature (not configurable), do we want to add it to the docs, or would the javadocs in the code suffice?




[GitHub] [ozone] hanishakoneru commented on a change in pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on a change in pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#discussion_r553025593



##########
File path: hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/ChunkInputStream.java
##########
@@ -301,36 +348,38 @@ private synchronized void readChunkFromContainer(int len) throws IOException {
       startByteIndex = chunkPosition;
     } else {
       // Start reading the chunk from the last chunkPosition onwards.
-      startByteIndex = bufferOffset + bufferLength;
+      startByteIndex = bufferOffsetWrtChunkData + bufferLength;
     }
 
-    // bufferOffset and bufferLength are updated below, but if read fails
+    // bufferOffsetWrtChunkData and bufferLength are updated after the data
+    // is read from Container and put into the buffers, but if read fails
     // and is retried, we need the previous position.  Position is reset after
     // successful read in adjustBufferPosition()
     storePosition();
 
+    long adjustedBuffersOffset, adjustedBuffersLen;
     if (verifyChecksum) {
-      // Update the bufferOffset and bufferLength as per the checksum
-      // boundary requirement.
-      computeChecksumBoundaries(startByteIndex, len);
+      // Adjust the chunk offset and length to include required checksum
+      // boundaries
+      Pair<Long, Long> adjustedOffsetAndLength =
+          computeChecksumBoundaries(startByteIndex, len);
+      adjustedBuffersOffset = adjustedOffsetAndLength.getLeft();
+      adjustedBuffersLen = adjustedOffsetAndLength.getRight();
     } else {
       // Read from the startByteIndex
-      bufferOffset = startByteIndex;
-      bufferLength = len;
+      adjustedBuffersOffset = startByteIndex;
+      adjustedBuffersLen = len;
     }
 
     // Adjust the chunkInfo so that only the required bytes are read from
     // the chunk.
     final ChunkInfo adjustedChunkInfo = ChunkInfo.newBuilder(chunkInfo)
-        .setOffset(bufferOffset + chunkInfo.getOffset())
-        .setLen(bufferLength)
+        .setOffset(adjustedBuffersOffset + chunkInfo.getOffset())
+        .setLen(adjustedBuffersLen)
         .build();
 
-    ByteString byteString = readChunk(adjustedChunkInfo);
-
-    buffers = byteString.asReadOnlyByteBufferList();
-    bufferIndex = 0;
-    allocated = true;
+    byte[] chunkData = readChunk(adjustedChunkInfo).toByteArray();

Review comment:
       ```
   What if we read in small buffers on the server side itself and send it across as a list of bytestrings to the client?
   ```
   This might work but would require a change in the DN-client protocol. We would have to analyze the compatibility issues and how to address them.
   
   ```
   In case this turns out to be unavoidable, we can also think about doing bytebuffer.compact(), which also does an intrinsic buffer copy to release the buffers, but the logic would be simpler
   ```
   I am not sure if there is much gain in doing this. The code changes in this PR were required because the logic was inaccurate. It was working because there was always only one ByteBuffer.






[GitHub] [ozone] bshashikant commented on a change in pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#discussion_r548377730



##########
File path: hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/ChunkInputStream.java
##########
@@ -301,36 +348,38 @@ private synchronized void readChunkFromContainer(int len) throws IOException {
       startByteIndex = chunkPosition;
     } else {
       // Start reading the chunk from the last chunkPosition onwards.
-      startByteIndex = bufferOffset + bufferLength;
+      startByteIndex = bufferOffsetWrtChunkData + bufferLength;
     }
 
-    // bufferOffset and bufferLength are updated below, but if read fails
+    // bufferOffsetWrtChunkData and bufferLength are updated after the data
+    // is read from Container and put into the buffers, but if read fails
     // and is retried, we need the previous position.  Position is reset after
     // successful read in adjustBufferPosition()
     storePosition();
 
+    long adjustedBuffersOffset, adjustedBuffersLen;
     if (verifyChecksum) {
-      // Update the bufferOffset and bufferLength as per the checksum
-      // boundary requirement.
-      computeChecksumBoundaries(startByteIndex, len);
+      // Adjust the chunk offset and length to include required checksum
+      // boundaries
+      Pair<Long, Long> adjustedOffsetAndLength =
+          computeChecksumBoundaries(startByteIndex, len);
+      adjustedBuffersOffset = adjustedOffsetAndLength.getLeft();
+      adjustedBuffersLen = adjustedOffsetAndLength.getRight();
     } else {
       // Read from the startByteIndex
-      bufferOffset = startByteIndex;
-      bufferLength = len;
+      adjustedBuffersOffset = startByteIndex;
+      adjustedBuffersLen = len;
     }
 
     // Adjust the chunkInfo so that only the required bytes are read from
     // the chunk.
     final ChunkInfo adjustedChunkInfo = ChunkInfo.newBuilder(chunkInfo)
-        .setOffset(bufferOffset + chunkInfo.getOffset())
-        .setLen(bufferLength)
+        .setOffset(adjustedBuffersOffset + chunkInfo.getOffset())
+        .setLen(adjustedBuffersLen)
         .build();
 
-    ByteString byteString = readChunk(adjustedChunkInfo);
-
-    buffers = byteString.asReadOnlyByteBufferList();
-    bufferIndex = 0;
-    allocated = true;
+    byte[] chunkData = readChunk(adjustedChunkInfo).toByteArray();

Review comment:
       In case this turns out to be unavoidable, we can also think about doing bytebuffer.compact(), which also does an intrinsic buffer copy to release the buffers, but the logic would be simpler.







[GitHub] [ozone] bshashikant commented on pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
bshashikant commented on pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#issuecomment-801052623


   Thanks @hanishakoneru. Can you please rebase? Also, do we need to add this to ozone documentation somewhere?




[GitHub] [ozone] hanishakoneru merged pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru merged pull request #1685:
URL: https://github.com/apache/ozone/pull/1685


   




[GitHub] [ozone] bshashikant commented on pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
bshashikant commented on pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#issuecomment-801795836


   Thanks @hanishakoneru. For documentation, you can refer to https://issues.apache.org/jira/browse/HDDS-4948 as an example.




[GitHub] [ozone] hanishakoneru commented on pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#issuecomment-796074810


   The existing xcompat acceptance tests added as part of HDDS-4731 should cover most of the testing required for this change.
   Thanks @adoroszlai for adding the compatibility tests. In these tests, an old client (v1.0.0) talks to a new server, testing read/write compatibility in different combinations. Please correct me if this is incorrect. 
   Also, are there any acceptance tests for new clients talking to old servers?




[GitHub] [ozone] bshashikant commented on a change in pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#discussion_r594065241



##########
File path: hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/ChunkInputStream.java
##########
@@ -87,7 +105,8 @@
   ChunkInputStream(ChunkInfo chunkInfo, BlockID blockId,
       XceiverClientFactory xceiverClientFactory,
       Supplier<Pipeline> pipelineSupplier,
-      boolean verifyChecksum, Token<? extends TokenIdentifier> token) {
+      boolean verifyChecksum,
+      Token<? extends TokenIdentifier> token) {

Review comment:
       Unintended change?

##########
File path: hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/OzoneClientConfig.java
##########
@@ -109,13 +109,13 @@
   private String checksumType = ChecksumType.CRC32.name();
 
   @Config(key = "bytes.per.checksum",
-      defaultValue = "1MB",
+      defaultValue = "256KB",

Review comment:
       Let's not change the default for now. We can change it once we do some tests and analyze performance.

##########
File path: hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/ChunkInputStream.java
##########
@@ -66,13 +76,21 @@
 
   // Index of the buffers corresponding to the current position of the buffers
   private int bufferIndex;
+  // bufferOffsets[i] stores the index of the first data byte in buffer i
+  // (buffers.get(i)) w.r.t first byte in the buffers.
+  // Let's each buffer has a capacity of 40 bytes. The bufferOffset for

Review comment:
       Let's -> "Let's say"






[GitHub] [ozone] hanishakoneru commented on pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#issuecomment-784584849


   > I think we should use the read default buffer size irrespective of whether checksum is disabled or not. It can be same as the checksum boundary by default.
   
   The problem with that would be while verifying the checksums. Let's say a chunk has a checksum boundary at every 256KB and we set the default read buffer size to 64KB. To calculate the checksum, we would need to combine 4 buffers of 64KB each and create a read-only buffer of 256KB to pass to Checksum.verifyChecksum. This would result in a buffer copy, which we were trying to avoid in the previous design.
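   
   A hedged illustration of the alignment argument, using java.util.zip.CRC32 as a stand-in for Ozone's Checksum class: when each buffer spans exactly one bytes-per-checksum window, every buffer can be verified on its own, with no stitching or copying.
   
   ```
   import java.nio.ByteBuffer;
   import java.util.List;
   import java.util.zip.CRC32;
   
   static void verify(List<ByteBuffer> buffers, long[] expectedChecksums) {
     for (int i = 0; i < buffers.size(); i++) {
       CRC32 crc = new CRC32();
       // duplicate() so verification does not disturb the buffer's position.
       crc.update(buffers.get(i).duplicate()); // one buffer == one checksum window
       if (crc.getValue() != expectedChecksums[i]) {
         throw new IllegalStateException("Checksum mismatch in buffer " + i);
       }
     }
   }
   ```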
   
   > Can we add few acceptance tests to test the compatibility?
   
   Yes I am working on adding more tests here.




[GitHub] [ozone] bshashikant commented on pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
bshashikant commented on pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#issuecomment-788858804


   > > I think we should use the read default buffer size irrespective of whether checksum is disabled or not. It can be same as the checksum boundary by default.
   > 
   > The problem with that would be while verifying the checksums. Let's say a chunk has checksum boundary at every 256KB and we set the default read buffer size to 64KB. To calculate checksum, we would need to combine 4 buffers of 64KB each and create a read only buffer of 256KB which will be passed to Checksum.verifyChecksum. This would result in buffer copy which we were trying to avoid in the previous design.
   
   @hanishakoneru, can we do byteString.concat in cases where bytes.per.checksum < read buffer size? It doesn't do a buffer copy, per the documentation here: https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/ByteString.
   "Concatenation is likewise supported without copying (long strings) by building a tree of pieces in RopeByteString".




[GitHub] [ozone] bshashikant commented on a change in pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#discussion_r553719956



##########
File path: hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/ChunkInputStream.java
##########
@@ -301,36 +348,38 @@ private synchronized void readChunkFromContainer(int len) throws IOException {
       startByteIndex = chunkPosition;
     } else {
       // Start reading the chunk from the last chunkPosition onwards.
-      startByteIndex = bufferOffset + bufferLength;
+      startByteIndex = bufferOffsetWrtChunkData + bufferLength;
     }
 
-    // bufferOffset and bufferLength are updated below, but if read fails
+    // bufferOffsetWrtChunkData and bufferLength are updated after the data
+    // is read from Container and put into the buffers, but if read fails
     // and is retried, we need the previous position.  Position is reset after
     // successful read in adjustBufferPosition()
     storePosition();
 
+    long adjustedBuffersOffset, adjustedBuffersLen;
     if (verifyChecksum) {
-      // Update the bufferOffset and bufferLength as per the checksum
-      // boundary requirement.
-      computeChecksumBoundaries(startByteIndex, len);
+      // Adjust the chunk offset and length to include required checksum
+      // boundaries
+      Pair<Long, Long> adjustedOffsetAndLength =
+          computeChecksumBoundaries(startByteIndex, len);
+      adjustedBuffersOffset = adjustedOffsetAndLength.getLeft();
+      adjustedBuffersLen = adjustedOffsetAndLength.getRight();
     } else {
       // Read from the startByteIndex
-      bufferOffset = startByteIndex;
-      bufferLength = len;
+      adjustedBuffersOffset = startByteIndex;
+      adjustedBuffersLen = len;
     }
 
     // Adjust the chunkInfo so that only the required bytes are read from
     // the chunk.
     final ChunkInfo adjustedChunkInfo = ChunkInfo.newBuilder(chunkInfo)
-        .setOffset(bufferOffset + chunkInfo.getOffset())
-        .setLen(bufferLength)
+        .setOffset(adjustedBuffersOffset + chunkInfo.getOffset())
+        .setLen(adjustedBuffersLen)
         .build();
 
-    ByteString byteString = readChunk(adjustedChunkInfo);
-
-    buffers = byteString.asReadOnlyByteBufferList();
-    bufferIndex = 0;
-    allocated = true;
+    byte[] chunkData = readChunk(adjustedChunkInfo).toByteArray();

Review comment:
       The basic problem we are trying to solve here is to minimize the memory overhead in the client. Adding an extra buffer copy overhead (with the patch) in order to solve it does not seem like a reasonable idea to me. Let's discuss in some more detail how to address this.










[GitHub] [ozone] hanishakoneru commented on a change in pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on a change in pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#discussion_r594642359



##########
File path: hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/OzoneClientConfig.java
##########
@@ -109,13 +109,13 @@
   private String checksumType = ChecksumType.CRC32.name();
 
   @Config(key = "bytes.per.checksum",
-      defaultValue = "1MB",
+      defaultValue = "256KB",

Review comment:
       Sure. Reverted back to 1MB. 








[GitHub] [ozone] bshashikant commented on a change in pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
bshashikant commented on a change in pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#discussion_r546567517



##########
File path: hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/BlockInputStream.java
##########
@@ -134,7 +135,12 @@ public synchronized void initialize() throws IOException {
     try {
       chunks = getChunkInfos();
     } catch (ContainerNotFoundException ioEx) {
-      refreshPipeline(ioEx);
+      LOG.info("Unable to read information for block {} from pipeline {}: {}." +

Review comment:
       LOG.info -> LOG.warn?

##########
File path: hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/ChunkInputStream.java
##########
@@ -301,36 +348,38 @@ private synchronized void readChunkFromContainer(int len) throws IOException {
       startByteIndex = chunkPosition;
     } else {
       // Start reading the chunk from the last chunkPosition onwards.
-      startByteIndex = bufferOffset + bufferLength;
+      startByteIndex = bufferOffsetWrtChunkData + bufferLength;
     }
 
-    // bufferOffset and bufferLength are updated below, but if read fails
+    // bufferOffsetWrtChunkData and bufferLength are updated after the data
+    // is read from Container and put into the buffers, but if read fails
     // and is retried, we need the previous position.  Position is reset after
     // successful read in adjustBufferPosition()
     storePosition();
 
+    long adjustedBuffersOffset, adjustedBuffersLen;
     if (verifyChecksum) {
-      // Update the bufferOffset and bufferLength as per the checksum
-      // boundary requirement.
-      computeChecksumBoundaries(startByteIndex, len);
+      // Adjust the chunk offset and length to include required checksum
+      // boundaries
+      Pair<Long, Long> adjustedOffsetAndLength =
+          computeChecksumBoundaries(startByteIndex, len);
+      adjustedBuffersOffset = adjustedOffsetAndLength.getLeft();
+      adjustedBuffersLen = adjustedOffsetAndLength.getRight();
     } else {
       // Read from the startByteIndex
-      bufferOffset = startByteIndex;
-      bufferLength = len;
+      adjustedBuffersOffset = startByteIndex;
+      adjustedBuffersLen = len;
     }
 
     // Adjust the chunkInfo so that only the required bytes are read from
     // the chunk.
     final ChunkInfo adjustedChunkInfo = ChunkInfo.newBuilder(chunkInfo)
-        .setOffset(bufferOffset + chunkInfo.getOffset())
-        .setLen(bufferLength)
+        .setOffset(adjustedBuffersOffset + chunkInfo.getOffset())
+        .setLen(adjustedBuffersLen)
         .build();
 
-    ByteString byteString = readChunk(adjustedChunkInfo);
-
-    buffers = byteString.asReadOnlyByteBufferList();
-    bufferIndex = 0;
-    allocated = true;
+    byte[] chunkData = readChunk(adjustedChunkInfo).toByteArray();

Review comment:
       This will do an additional buffer copy. Let's see if we can explore anything to avoid the copy here:
   https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/UnsafeByteOperations
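   
   A hedged sketch of the trade-off (the ByteString accessors are standard protobuf-java; the surrounding names are illustrative):
   
   ```
   static void illustrate(ByteString data) {
     byte[] copied = data.toByteArray();            // allocates and copies all bytes
     ByteBuffer view = data.asReadOnlyByteBuffer(); // zero-copy read-only view
     List<ByteBuffer> views = data.asReadOnlyByteBufferList(); // zero-copy list
   }
   ```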






[GitHub] [ozone] hanishakoneru commented on pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#issuecomment-796055514


   > can we do byteString.concat in cases where bytes.per. Checksum < read buffer size? It doesn't do buffer copy as per the documentation here: https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/ByteString.
   > "Concatenation is likewise supported without copying (long strings) by building a tree of pieces in RopeByteString".
   
   ByteString concat could work, but the problem is that ChunkBuffer wraps ByteBuffers, and checksums are calculated on ChunkBuffers. We would have to change the whole ChunkBuffer model (or change Checksum computations to use ByteStrings instead); I think the change would get very complicated. Also, with concatenated ByteStrings, we would have to keep track of position, limit, etc. separately to track checksum boundaries. 






[GitHub] [ozone] adoroszlai commented on a change in pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on a change in pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#discussion_r597616900



##########
File path: hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/ChunkInputStream.java
##########
@@ -364,7 +404,18 @@ protected ByteString readChunk(ChunkInfo readChunkInfo) throws IOException {
       throw new IOException("Unexpected OzoneException: " + e.toString(), e);
     }
 
-    return readChunkResponse.getData();
+    if (readChunkResponse.hasData()) {
+      return readChunkResponse.getData().asReadOnlyByteBufferList();
+    } else if (readChunkResponse.hasDataBuffers()) {
+      List<ByteString> buffersList = readChunkResponse.getDataBuffers()
+          .getBuffersList();
+      return buffersList.stream()
+          .map(ByteString::asReadOnlyByteBuffer)
+          .collect(Collectors.toList());

Review comment:
       I think we should avoid streams on the read/write path. These were earlier found to cause CPU usage hotspots; see e.g. HDDS-3702.
   
   (Also in a few other instances below.)
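   
   A plain-loop rewrite of the quoted pipeline might look like this (a sketch, not the committed fix):
   
   ```
   List<ByteString> buffersList =
       readChunkResponse.getDataBuffers().getBuffersList();
   List<ByteBuffer> buffers = new ArrayList<>(buffersList.size());
   for (ByteString buffer : buffersList) {
     buffers.add(buffer.asReadOnlyByteBuffer()); // plain loop instead of a stream
   }
   return buffers;
   ```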

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/FilePerChunkStrategy.java
##########
@@ -222,7 +228,33 @@ public ChunkBuffer readChunk(Container container, BlockID blockID,
     }
 
     long len = info.getLen();
-    ByteBuffer data = ByteBuffer.allocate((int) len);
+
+    long bufferCapacity = 0;
+    if (info.isReadDataIntoSingleBuffer()) {
+      // Older client - read all chunk data into one single buffer.
+      bufferCapacity = len;
+    } else {
+      // Set buffer capacity to checksum boundary size so that each buffer
+      // corresponds to one checksum. If checksum is NONE, then set buffer
+      // capacity to default (OZONE_CHUNK_READ_BUFFER_DEFAULT_SIZE_KEY = 64KB).
+      ChecksumData checksumData = info.getChecksumData();
+
+      if (checksumData != null) {
+        if (checksumData.getChecksumType() ==
+            ContainerProtos.ChecksumType.NONE) {
+          bufferCapacity = defaultReadBufferCapacity;
+        } else {
+          bufferCapacity = checksumData.getBytesPerChecksum();
+        }
+      }
+    }
+    // If the buffer capacity is 0, set all the data into one ByteBuffer
+    if (bufferCapacity == 0) {
+      bufferCapacity = len;
+    }

Review comment:
       This block seems to be duplicated from `FilePerBlock...`.  Can it be extracted?
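   
   One possible shape for the extraction, based on the duplicated block above (the method name and home class are illustrative):
   
   ```
   static long chooseBufferCapacity(ChunkInfo info, long len,
       long defaultReadBufferCapacity) {
     if (info.isReadDataIntoSingleBuffer()) {
       return len; // older client: read all chunk data into one buffer
     }
     ChecksumData checksumData = info.getChecksumData();
     long capacity = 0;
     if (checksumData != null) {
       capacity = checksumData.getChecksumType() ==
           ContainerProtos.ChecksumType.NONE
           ? defaultReadBufferCapacity
           : checksumData.getBytesPerChecksum();
     }
     // If the capacity is still 0, put all the data into one ByteBuffer.
     return capacity == 0 ? len : capacity;
   }
   ```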

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/FilePerChunkStrategy.java
##########
@@ -255,16 +287,16 @@ public ChunkBuffer readChunk(Container container, BlockID blockID,
         if (file.exists()) {
           long offset = info.getOffset() - chunkFileOffset;
           Preconditions.checkState(offset >= 0);
-          ChunkUtils.readData(file, data, offset, len, volumeIOStats);
-          return ChunkBuffer.wrap(data);
+          ChunkUtils.readData(file, dataBuffers, offset, len, volumeIOStats);
+          return ChunkBuffer.wrap(Lists.newArrayList(dataBuffers));
         }
       } catch (StorageContainerException ex) {
         //UNABLE TO FIND chunk is not a problem as we will try with the
         //next possible location
         if (ex.getResult() != UNABLE_TO_FIND_CHUNK) {
           throw ex;
         }
-        data.clear();
+        dataBuffers = null;

Review comment:
       If the read from the first location fails and we have to fall back to the temp chunk file, this would cause an exception.
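   
   A hedged sketch of one possible fix: reset the existing buffers for reuse instead of nulling the list, so the fall-back read still has buffers to fill.
   
   ```
   // Instead of `dataBuffers = null`, reset each buffer so the retry against
   // the next possible chunk-file location can reuse them (sketch only).
   for (ByteBuffer buf : dataBuffers) {
     buf.clear();
   }
   ```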






[GitHub] [ozone] hanishakoneru commented on a change in pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on a change in pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#discussion_r594640598



##########
File path: hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/ChunkInputStream.java
##########
@@ -87,7 +105,8 @@
   ChunkInputStream(ChunkInfo chunkInfo, BlockID blockId,
       XceiverClientFactory xceiverClientFactory,
       Supplier<Pipeline> pipelineSupplier,
-      boolean verifyChecksum, Token<? extends TokenIdentifier> token) {
+      boolean verifyChecksum,
+      Token<? extends TokenIdentifier> token) {

Review comment:
       Yup. Reverted.






[GitHub] [ozone] hanishakoneru commented on pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#issuecomment-783547199


   @bshashikant can you please take a look at the updated patch. 




[GitHub] [ozone] hanishakoneru commented on a change in pull request #1685: HDDS-4552. Read data from chunk into ByteBuffer[] instead of single ByteBuffer.

Posted by GitBox <gi...@apache.org>.
hanishakoneru commented on a change in pull request #1685:
URL: https://github.com/apache/ozone/pull/1685#discussion_r548257142



##########
File path: hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/ChunkInputStream.java
##########
@@ -301,36 +348,38 @@ private synchronized void readChunkFromContainer(int len) throws IOException {
       startByteIndex = chunkPosition;
     } else {
       // Start reading the chunk from the last chunkPosition onwards.
-      startByteIndex = bufferOffset + bufferLength;
+      startByteIndex = bufferOffsetWrtChunkData + bufferLength;
     }
 
-    // bufferOffset and bufferLength are updated below, but if read fails
+    // bufferOffsetWrtChunkData and bufferLength are updated after the data
+    // is read from Container and put into the buffers, but if read fails
     // and is retried, we need the previous position.  Position is reset after
     // successful read in adjustBufferPosition()
     storePosition();
 
+    long adjustedBuffersOffset, adjustedBuffersLen;
     if (verifyChecksum) {
-      // Update the bufferOffset and bufferLength as per the checksum
-      // boundary requirement.
-      computeChecksumBoundaries(startByteIndex, len);
+      // Adjust the chunk offset and length to include required checksum
+      // boundaries
+      Pair<Long, Long> adjustedOffsetAndLength =
+          computeChecksumBoundaries(startByteIndex, len);
+      adjustedBuffersOffset = adjustedOffsetAndLength.getLeft();
+      adjustedBuffersLen = adjustedOffsetAndLength.getRight();
     } else {
       // Read from the startByteIndex
-      bufferOffset = startByteIndex;
-      bufferLength = len;
+      adjustedBuffersOffset = startByteIndex;
+      adjustedBuffersLen = len;
     }
 
     // Adjust the chunkInfo so that only the required bytes are read from
     // the chunk.
     final ChunkInfo adjustedChunkInfo = ChunkInfo.newBuilder(chunkInfo)
-        .setOffset(bufferOffset + chunkInfo.getOffset())
-        .setLen(bufferLength)
+        .setOffset(adjustedBuffersOffset + chunkInfo.getOffset())
+        .setLen(adjustedBuffersLen)
         .build();
 
-    ByteString byteString = readChunk(adjustedChunkInfo);
-
-    buffers = byteString.asReadOnlyByteBufferList();
-    bufferIndex = 0;
-    allocated = true;
+    byte[] chunkData = readChunk(adjustedChunkInfo).toByteArray();

Review comment:
       We get a ByteString from the response, but the returned ByteString does not carry the underlying buffer boundary information. Hence ByteString#asReadOnlyByteBufferList() will return only one ByteBuffer with all the data, irrespective of the backing arrays used to construct the ByteString. 
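   
   A hedged demonstration of that boundary loss (standard protobuf-java; round-tripping through a byte[] simulates wire parsing collapsing a concatenation into one literal ByteString):
   
   ```
   import com.google.protobuf.ByteString;
   import java.nio.ByteBuffer;
   import java.util.List;
   
   public class BoundaryDemo {
     public static void main(String[] args) {
       ByteString first = ByteString.copyFrom(new byte[64 * 1024]);
       ByteString second = ByteString.copyFrom(new byte[64 * 1024]);
       // Serializing and re-reading loses the two-piece structure.
       ByteString parsed =
           ByteString.copyFrom(first.concat(second).toByteArray());
       List<ByteBuffer> views = parsed.asReadOnlyByteBufferList();
       System.out.println(views.size()); // 1: buffer boundaries are gone
     }
   }
   ```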



