Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2021/03/06 15:48:01 UTC

[GitHub] [drill] luocooong commented on a change in pull request #2186: DRILL-7874: Ensure DrillFSDataInputStream.read populates byte array of the requested length

luocooong commented on a change in pull request #2186:
URL: https://github.com/apache/drill/pull/2186#discussion_r588896282



##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/DrillFSDataInputStream.java
##########
@@ -223,12 +224,28 @@ public int read(byte[] b, int off, int len) throws IOException {
     public int read(byte[] b) throws IOException {
       operatorStats.startWait();
       try {
-        return is.read(b);
+        return readBytes(b, 0, b.length);
       } finally {
         operatorStats.stopWait();
       }
     }
 
+    /**
+     * Reads up to {@code len} bytes of data from the input stream into an array of bytes.
+     * This method guarantees that, regardless of the underlying stream implementation,
+     * the byte array will be populated with either {@code len} bytes or, if fewer
+     * bytes remain in the stream, with all remaining bytes.
+     */
+    private int readBytes(byte[] b, int off, int len) throws IOException {

Review comment:
       Am I right that this new solution replaces the removed `openDecompressedInputStream` for the following case?
   ```
   Some parsers, particularly those that read raw bytes, generate errors when passed Hadoop ZipCompressed InputStreams.
   ```
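
For list readers without the full diff: the body of the `readBytes` helper is cut off in the excerpt above, so here is a minimal, hypothetical sketch of the usual "read until full" loop that its Javadoc describes. The class and method names (`ReadFullyExample`, `readFully`) are illustrative only and are not the code in the PR.

```java
import java.io.IOException;
import java.io.InputStream;

// Illustrative only, not the code from the PR. Shows the usual "read until full"
// loop: keep calling read() until either len bytes have been copied into the
// array or the stream ends.
final class ReadFullyExample {

  static int readFully(InputStream in, byte[] b, int off, int len) throws IOException {
    int total = 0;
    while (total < len) {
      int count = in.read(b, off + total, len - total);
      if (count < 0) {
        // End of stream: return what was read, or -1 if nothing was read at all,
        // matching the InputStream.read() contract.
        return total == 0 ? -1 : total;
      }
      total += count;
    }
    return total;
  }
}
```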

##########
File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/DrillFileSystem.java
##########
@@ -788,51 +785,18 @@ public void removeXAttr(final Path path, final String name) throws IOException {
 
   /**
    * Returns an InputStream from a Hadoop path. If the data is compressed, this method will return a compressed
-   * InputStream depending on the codec.  Note that if the results of this method are sent to a third party parser
-   * that works with bytes or individual characters directly, you should use the openDecompressedInputStream method.
+   * InputStream depending on the codec.
    * @param path Input file path
    * @return InputStream of opened file path
    * @throws IOException If the file is unreachable, unavailable or otherwise unreadable
    */
   public InputStream openPossiblyCompressedStream(Path path) throws IOException {
-    CompressionCodec codec = codecFactory.getCodec(path); // infers from file ext.
+    CompressionCodec codec = getCodec(path); // infers from file ext.
+    InputStream inputStream = open(path);

Review comment:
       @vvysotskyi Hello. Would this be a better way to reduce memory usage when `codec` is not null?
   ```java
   if (codec != null) {
     return codec.createInputStream(open(path));
   } else {
     return open(path);
   }
   ```
   instead of
   ```java
   CompressionCodec codec = getCodec(path);
   InputStream inputStream = open(path);  // 1st
   if (codec != null) {
     inputStream = codec.createInputStream(inputStream);  // 2nd
   }
   ...
   ```
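
To make the suggestion above concrete, a self-contained sketch of that shape using the standard Hadoop `CompressionCodecFactory`/`CompressionCodec` APIs. The class and method names (`OpenCompressedSketch`, `openPossiblyCompressed`) are hypothetical, and the sketch omits the additional Drill stream wrapping that the `...` in the second snippet stands for.

```java
import java.io.IOException;
import java.io.InputStream;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

// Hypothetical sketch of the suggested shape, not the actual Drill patch.
// The raw stream is opened inside each branch, so no local reference to it is
// kept once a codec wraps it.
final class OpenCompressedSketch {

  static InputStream openPossiblyCompressed(FileSystem fs, CompressionCodecFactory codecFactory, Path path)
      throws IOException {
    CompressionCodec codec = codecFactory.getCodec(path); // inferred from the file extension
    if (codec != null) {
      return codec.createInputStream(fs.open(path));      // compressed: wrap the raw stream directly
    }
    return fs.open(path);                                  // uncompressed: return the raw stream as-is
  }
}
```

In both variants the file is opened exactly once; the suggested form mainly avoids holding onto the raw stream reference after it has been wrapped.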




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org