You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2021/08/25 07:45:43 UTC

[GitHub] [hbase] anoopsjohn commented on a change in pull request #3596: HBASE-23584 cache filestatus in storefileinfo

anoopsjohn commented on a change in pull request #3596:
URL: https://github.com/apache/hbase/pull/3596#discussion_r695459214



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileInfo.java
##########
@@ -163,22 +168,57 @@ private StoreFileInfo(final Configuration conf, final FileSystem fs, final FileS
       }
       LOG.trace("{} is a {} reference to {}", p, reference.getFileRegion(), referencePath);
     } else if (isHFile(p)) {
-      // HFile
-      if (fileStatus != null) {
-        this.createdTimestamp = fileStatus.getModificationTime();
-        this.size = fileStatus.getLen();
-      } else {
-        FileStatus fStatus = fs.getFileStatus(initialPath);
-        this.createdTimestamp = fStatus.getModificationTime();
-        this.size = fStatus.getLen();
-      }
+      // If a fileStatus, use it. It is pointed at the hfile, not at a link or reference.
+      this.cachedFileStatus = fileStatus != null? fileStatus: fs.getFileStatus(p);
+      this.createdTimestamp = this.cachedFileStatus.getModificationTime();
+      this.size = this.cachedFileStatus.getLen();
       this.reference = null;
       this.link = null;
     } else {
       throw new IOException("path=" + p + " doesn't look like a valid StoreFile");
     }
   }
 
+  /**
+   * @return A new FNFE with <param>fnfe</param> as cause but including info if reference or link.
+   */
+  private FileNotFoundException decorateFileNotFoundException(FileNotFoundException fnfe) {
+    FileNotFoundException newFnfe = new FileNotFoundException(toString());
+    newFnfe.initCause(fnfe);
+    return newFnfe;
+  }
+
+  /**
+   * @return FileStatus for the linked or referenced file or if not a reference, then the hfile
+   */
+  private FileStatus loadFileStatus(final FileSystem fs) throws IOException {

Review comment:
       Can avoid passing 'fs' as we can refer that here from this.fs

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileInfo.java
##########
@@ -45,11 +43,18 @@
 import org.slf4j.LoggerFactory;
 
 /**
- * Describe a StoreFile (hfile, reference, link)
+ * Describe a StoreFile (hfile, reference, link).
  */
 @InterfaceAudience.Private
 public class StoreFileInfo {
   private static final Logger LOG = LoggerFactory.getLogger(StoreFileInfo.class);
+  /**
+   * Cached fileStatus for file or if a link or a reference, then the fileStatus of the referred-to
+   * file. We cache filestatus rather than ask NN each time we need it. We set it after construction

Review comment:
       nit:  "Rather than ask **FS** each time " ?  Its not NN always :-) 

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileInfo.java
##########
@@ -45,11 +43,18 @@
 import org.slf4j.LoggerFactory;
 
 /**
- * Describe a StoreFile (hfile, reference, link)
+ * Describe a StoreFile (hfile, reference, link).
  */
 @InterfaceAudience.Private
 public class StoreFileInfo {
   private static final Logger LOG = LoggerFactory.getLogger(StoreFileInfo.class);
+  /**
+   * Cached fileStatus for file or if a link or a reference, then the fileStatus of the referred-to
+   * file. We cache filestatus rather than ask NN each time we need it. We set it after construction

Review comment:
       Minor :  Also can we move this member variable below the static members so that all instance members together? That is something we normally follow as coding style

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileInfo.java
##########
@@ -306,16 +344,13 @@ ReaderContext createReaderContext(boolean doDropBehind, long readahead, ReaderTy
         // Intercept the exception so can insert more info about the Reference; otherwise
         // exception just complains about some random file -- operator doesn't realize it
         // other end of a Reference
-        FileNotFoundException newFnfe = new FileNotFoundException(toString());
-        newFnfe.initCause(fnfe);
-        throw newFnfe;
+        throw decorateFileNotFoundException(fnfe);
       }
-      status = fs.getFileStatus(referencePath);
     } else {
       in = new FSDataInputStreamWrapper(fs, this.getPath(), doDropBehind, readahead);
-      status = fs.getFileStatus(initialPath);
     }
-    long length = status.getLen();
+    getFileStatus();

Review comment:
       +1

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileInfo.java
##########
@@ -163,22 +168,57 @@ private StoreFileInfo(final Configuration conf, final FileSystem fs, final FileS
       }
       LOG.trace("{} is a {} reference to {}", p, reference.getFileRegion(), referencePath);
     } else if (isHFile(p)) {
-      // HFile
-      if (fileStatus != null) {
-        this.createdTimestamp = fileStatus.getModificationTime();
-        this.size = fileStatus.getLen();
-      } else {
-        FileStatus fStatus = fs.getFileStatus(initialPath);
-        this.createdTimestamp = fStatus.getModificationTime();
-        this.size = fStatus.getLen();
-      }
+      // If a fileStatus, use it. It is pointed at the hfile, not at a link or reference.
+      this.cachedFileStatus = fileStatus != null? fileStatus: fs.getFileStatus(p);

Review comment:
       So only in case of this is an HFile (not a split ref file and link ), we will keep the cachedFileStatus.
   Not sure in such cases what status it will be passing.   Should be status of the ref file or link file not the referred one or linked one right.. Checking deeply that we wont break anything in that regard with this patch.  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@hbase.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org