Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2022/09/07 20:55:14 UTC

[GitHub] [hadoop] mukund-thakur commented on a diff in pull request #4862: HADOOP-18439. Fix VectoredIO for LocalFileSystem when checksum is enabled.

mukund-thakur commented on code in PR #4862:
URL: https://github.com/apache/hadoop/pull/4862#discussion_r965289294


##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFileSystem.java:
##########
@@ -396,11 +407,33 @@ static ByteBuffer checkBytes(ByteBuffer sumsBytes,
       return data;
     }
 
+    /**
+     * Validates range parameters.
+     * In case of CheckSum FS, we already have calculated
+     * fileLength so failing fast here.
+     * @param ranges requested ranges.
+     * @param fileLen length of file.
+     * @throws EOFException end of file exception.
+     */
+    private void validateRangeRequest(List<? extends FileRange> ranges, long fileLen) throws EOFException {
+      for (FileRange range : ranges) {
+        VectoredReadUtils.validateRangeRequest(range);
+        if (range.getOffset() + range.getLength() > fileLen) {
+          LOG.warn("Requested range [{}, {}) is beyond EOF for path {}",
+                  range.getOffset(), range.getLength(), file);
+          throw new EOFException("Requested range [" + range.getOffset() + ", "
+                  + range.getLength() + ") is beyond EOF for path " + file);
+        }
+      }
+    }
+
     @Override
     public void readVectored(List<? extends FileRange> ranges,
                              IntFunction<ByteBuffer> allocate) throws IOException {
+      long length = fs.getFileStatus(file).getLen();

Review Comment:
   Are you talking about FSDataBoundedInputStream#getFileLength()? It is a private method in that inner class, so it can't (and shouldn't) be used by ChecksumFSInputChecker. Should we create the same method here, with just one line?
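
   To make the suggestion concrete, here is a minimal, self-contained sketch of what such a small helper could look like. It assumes the helper caches the length after the first lookup. `LengthSource`, `Range`, and `Checker` are hypothetical stand-ins invented for this illustration, not Hadoop classes, and the lambda stands in for whatever call actually supplies the file length.

```java
import java.io.EOFException;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class RangeCheckSketch {

  /** Hypothetical stand-in for the length lookup, e.g. a file status call. */
  interface LengthSource {
    long fetchLength() throws IOException;
  }

  /** Hypothetical stand-in for a requested range: offset plus length. */
  static final class Range {
    final long offset;
    final int length;

    Range(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }
  }

  /** Hypothetical stand-in for the checker that owns the helper. */
  static final class Checker {
    private final LengthSource source;
    private long fileLen = -1L; // -1 means "not fetched yet"

    Checker(LengthSource source) {
      this.source = source;
    }

    /** The small helper: look the length up once, then reuse it. */
    private long getFileLength() throws IOException {
      if (fileLen == -1L) {
        fileLen = source.fetchLength();
      }
      return fileLen;
    }

    /** Fails fast when any range ends past the known file length. */
    void validate(List<Range> ranges) throws IOException {
      long len = getFileLength();
      for (Range r : ranges) {
        if (r.offset < 0 || r.length < 0) {
          throw new IllegalArgumentException("Negative offset or length");
        }
        long end = r.offset + r.length; // exclusive end of the range
        if (end > len) {
          throw new EOFException("Requested range [" + r.offset + ", " + end
              + ") is beyond EOF; file length is " + len);
        }
      }
    }
  }

  public static void main(String[] args) throws IOException {
    Checker checker = new Checker(() -> 100L); // assumed 100-byte file
    checker.validate(Arrays.asList(new Range(0, 100), new Range(90, 10)));
    try {
      checker.validate(Arrays.asList(new Range(90, 11)));
    } catch (EOFException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

   Repeated validations reuse the cached length rather than asking the underlying file system again. The sketch also prints the exclusive end offset in the `[start, end)` message, whereas the diff above passes the range length as the second value.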



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
