Posted to issues@ozone.apache.org by "Hanisha Koneru (Jira)" <ji...@apache.org> on 2020/12/04 23:50:00 UTC

[jira] [Created] (HDDS-4552) Read data from chunk into ByteBuffer[] instead of single ByteBuffer

Hanisha Koneru created HDDS-4552:
------------------------------------

             Summary: Read data from chunk into ByteBuffer[] instead of single ByteBuffer
                 Key: HDDS-4552
                 URL: https://issues.apache.org/jira/browse/HDDS-4552
             Project: Hadoop Distributed Data Store
          Issue Type: Improvement
            Reporter: Hanisha Koneru


A ReadChunk operation currently reads all the requested data from a chunk into a single ByteBuffer:
{code:java}
// ChunkUtils#readData()
public static void readData(File file, ByteBuffer buf,
    long offset, long len, VolumeIOStats volumeIOStats)
    throws StorageContainerException {
  .....
  try {
    bytesRead = processFileExclusively(path, () -> {
      try (FileChannel channel = open(path, READ_OPTIONS, NO_ATTRIBUTES);
           FileLock ignored = channel.lock(offset, len, true)) {

        return channel.read(buf, offset);
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    });
  } catch (UncheckedIOException e) {
    throw wrapInStorageContainerException(e.getCause());
  }
  .....
  .....
{code}
This Jira proposes reading the data from the channel into an array of ByteBuffers, each with a fixed capacity. That capacity could be made configurable.
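
A minimal sketch of what such a read could look like, assuming plain java.nio and nothing from the Ozone code base. The class name ChunkBufferSketch, the helper allocate() and the parameter bufferCapacity are hypothetical names used only for illustration. Locking, metrics and the StorageContainerException wrapping from the original method are left out. Note that FileChannel has no positional variant of the scattering read, so the sketch sets the channel position before reading.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration; not the actual ChunkUtils implementation.
class ChunkBufferSketch {

  // Allocates enough buffers to hold len bytes. Every buffer has
  // bufferCapacity bytes except possibly the last, which holds the rest.
  static ByteBuffer[] allocate(long len, int bufferCapacity) {
    if (len < 0 || bufferCapacity <= 0) {
      throw new IllegalArgumentException("invalid len or bufferCapacity");
    }
    int count = (int) ((len + bufferCapacity - 1) / bufferCapacity);
    ByteBuffer[] buffers = new ByteBuffer[count];
    long remaining = len;
    for (int i = 0; i < count; i++) {
      int size = (int) Math.min(bufferCapacity, remaining);
      buffers[i] = ByteBuffer.allocate(size);
      remaining -= size;
    }
    return buffers;
  }

  // Fills the buffers in order, starting at the given file offset, and
  // flips them so they are ready to be read. Returns the bytes read.
  static long readData(Path path, ByteBuffer[] buffers, long offset)
      throws IOException {
    long expected = 0;
    for (ByteBuffer b : buffers) {
      expected += b.remaining();
    }
    long total = 0;
    try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
      channel.position(offset);
      // A scattering read may return fewer bytes than requested, so loop
      // until the buffers are full or the end of the file is reached.
      while (total < expected) {
        long n = channel.read(buffers);
        if (n < 0) {
          break;
        }
        total += n;
      }
    }
    for (ByteBuffer b : buffers) {
      b.flip();
    }
    return total;
  }
}
```

With a requested length of 4MB and a bufferCapacity of 1MB, allocate() would return four buffers, and a single readData() call would still fill all of them in one pass over the file.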

This would help reduce the amount of memory that Ozone InputStreams keep cached. Currently, data in ChunkInputStream is cached until either the stream is closed or the chunk EOF is reached. This can leave up to 4MB (the default ChunkSize) of data cached in memory per ChunkInputStream.

After the proposed change, ChunkInputStream can be optimized to release each ByteBuffer as soon as it has been read, instead of waiting until the whole chunk has been read. Read I/O performance will not be affected, as the read from the DN still returns the requested length of data in one go. The only difference is that the data is returned in an array of ByteBuffers instead of a single ByteBuffer.
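
The release-as-you-read idea can be sketched as follows. This is an assumption about how a consumer might behave, not the ChunkInputStream code. ReleasingBufferReader and retainedBuffers() are hypothetical names. Here "release" simply means dropping the reference so that the buffer becomes eligible for garbage collection.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration; not the actual ChunkInputStream implementation.
class ReleasingBufferReader {
  private final ByteBuffer[] buffers;
  private int index = 0;

  ReleasingBufferReader(ByteBuffer[] buffers) {
    // Copy the array so releasing a slot does not affect the caller's array.
    this.buffers = buffers.clone();
  }

  // Copies up to len bytes into dst and returns the number copied,
  // or -1 once every buffer has been consumed.
  int read(byte[] dst, int off, int len) {
    if (len == 0) {
      return 0;
    }
    int copied = 0;
    while (copied < len && index < buffers.length) {
      ByteBuffer current = buffers[index];
      int n = Math.min(len - copied, current.remaining());
      current.get(dst, off + copied, n);
      copied += n;
      if (!current.hasRemaining()) {
        // Release this buffer as soon as it is fully read, instead of
        // holding it until the whole chunk has been consumed.
        buffers[index] = null;
        index++;
      }
    }
    return copied == 0 ? -1 : copied;
  }

  // Number of buffers still held in memory.
  int retainedBuffers() {
    int count = 0;
    for (ByteBuffer b : buffers) {
      if (b != null) {
        count++;
      }
    }
    return count;
  }
}
```

With a single 4MB ByteBuffer, nothing can be freed until the last byte is read. With an array of smaller buffers, the memory held shrinks by one buffer each time the reader crosses a buffer boundary.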


