Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2020/04/21 05:40:00 UTC
[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #843: HDDS-3223. Improve s3g read 1GB object efficiency by 100 times

bharatviswa504 commented on a change in pull request #843:
URL: https://github.com/apache/hadoop-ozone/pull/843#discussion_r411883841



##########
File path: hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/io/S3WrapperInputStream.java
##########
@@ -76,4 +79,87 @@ public long getPos() throws IOException {
   public boolean seekToNewSource(long targetPos) throws IOException {
     return false;
   }
+
+  /**
+   * Copies some or all bytes from a large (over 2GB) <code>InputStream</code>
+   * to an <code>OutputStream</code>, optionally skipping input bytes.
+   * <p>
+   * Copy the method from IOUtils of commons-io to reimplement skip by seek
+   * rather than read. The reason why IOUtils of commons-io implement skip
+   * by read can be found at
+   * <a href="https://issues.apache.org/jira/browse/IO-203">IO-203</a>.
+   * </p>
+   * <p>
+   * This method buffers the input internally, so there is no need to use a
+   * <code>BufferedInputStream</code>.
+   * </p>
+   * The buffer size is given by {@link #DEFAULT_BUFFER_SIZE}.
+   *
+   * @param output the <code>OutputStream</code> to write to
+   * @param inputOffset : number of bytes to skip from input before copying
+   * -ve values are ignored
+   * @param length : number of bytes to copy. -ve means all
+   * @return the number of bytes copied
+   * @throws NullPointerException if the input or output is null
+   * @throws IOException          if an I/O error occurs
+   */
+  public long copyLarge(final OutputStream output, final long inputOffset,
+      final long length) throws IOException {
+    return copyLarge(output, inputOffset, length,
+        new byte[DEFAULT_BUFFER_SIZE]);
+  }
+
+  /**
+   * Copies some or all bytes from a large (over 2GB) <code>InputStream</code>
+   * to an <code>OutputStream</code>, optionally skipping input bytes.
+   * <p>
+   * Copy the method from IOUtils of commons-io to reimplement skip by seek
+   * rather than read. The reason why IOUtils of commons-io implement skip
+   * by read can be found at
+   * <a href="https://issues.apache.org/jira/browse/IO-203">IO-203</a>.
+   * </p>
+   * <p>
+   * This method uses the provided buffer, so there is no need to use a
+   * <code>BufferedInputStream</code>.
+   * </p>
+   *
+   * @param output the <code>OutputStream</code> to write to
+   * @param inputOffset : number of bytes to skip from input before copying
+   * -ve values are ignored
+   * @param length : number of bytes to copy. -ve means all

Review comment:
       The Javadoc says a negative length means "copy all", but I don't see that case handled in the code.
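
       For reference, a minimal sketch of how the negative-length case could be handled, modeled on the read loop of commons-io's IOUtils.copyLarge. This is not the code in this PR: the class name CopyLengthSketch and the static signature are hypothetical, and the seek-based inputOffset handling is left out so that only the length logic is shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, for illustration only; not part of S3WrapperInputStream.
class CopyLengthSketch {

  /**
   * Copies up to length bytes from input to output.
   * A negative length means "copy until end of stream".
   */
  static long copyLarge(InputStream input, OutputStream output,
      long length, byte[] buffer) throws IOException {
    if (length == 0) {
      return 0;
    }
    final int bufferLength = buffer.length;
    int bytesToRead = bufferLength;
    // Only cap the first read when a positive limit is smaller than the buffer.
    if (length > 0 && length < bufferLength) {
      bytesToRead = (int) length;
    }
    int read;
    long totalRead = 0;
    while (bytesToRead > 0
        && (read = input.read(buffer, 0, bytesToRead)) != -1) {
      output.write(buffer, 0, read);
      totalRead += read;
      // With a negative length the limit is never applied, so the loop
      // keeps reading full buffers until read() returns -1.
      if (length > 0) {
        bytesToRead = (int) Math.min(length - totalRead, bufferLength);
      }
    }
    return totalRead;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = new byte[10000];
    ByteArrayOutputStream all = new ByteArrayOutputStream();
    long copiedAll = copyLarge(new ByteArrayInputStream(data), all,
        -1, new byte[4096]);
    ByteArrayOutputStream part = new ByteArrayOutputStream();
    long copiedPart = copyLarge(new ByteArrayInputStream(data), part,
        100, new byte[4096]);
    System.out.println("negative length copied " + copiedAll + " bytes");
    System.out.println("length 100 copied " + copiedPart + " bytes");
  }
}
```

       The key point is the `length > 0` guard: the remaining-bytes computation is skipped for negative values, which is what makes "-ve means all" hold. Without that guard, `length - totalRead` goes negative and the loop stops after the first read.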




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
