Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/01/10 20:46:29 UTC

[GitHub] [pinot] snleee commented on a change in pull request #7969: refactor segment loading logic in BaseTableDataManager to decouple it with local segment directory

snleee commented on a change in pull request #7969:
URL: https://github.com/apache/pinot/pull/7969#discussion_r781531660



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java
##########
@@ -277,49 +282,36 @@ public void addSegmentError(String segmentName, SegmentErrorInfo segmentErrorInf
   public void reloadSegment(String segmentName, IndexLoadingConfig indexLoadingConfig, SegmentZKMetadata zkMetadata,
       SegmentMetadata localMetadata, @Nullable Schema schema, boolean forceDownload)
       throws Exception {
-    File indexDir = localMetadata.getIndexDir();
-    Preconditions.checkState(indexDir.isDirectory(), "Index directory: %s is not a directory", indexDir);
-
-    File parentFile = indexDir.getParentFile();
-    File segmentBackupDir =
-        new File(parentFile, indexDir.getName() + CommonConstants.Segment.SEGMENT_BACKUP_DIR_SUFFIX);
-
+    File indexDir = getSegmentDataDir(segmentName);
     try {
-      // First rename index directory to segment backup directory so that original segment have all file descriptors
-      // point to the segment backup directory to ensure original segment serves queries properly
+      // Create backup directory to handle failure of segment reloading.

Review comment:
       Please expand the comment to say that we are actually renaming `indexDir` to `indexBackupDir`.

##########
File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/BaseTableDataManager.java
##########
@@ -277,49 +282,36 @@ public void addSegmentError(String segmentName, SegmentErrorInfo segmentErrorInf
   public void reloadSegment(String segmentName, IndexLoadingConfig indexLoadingConfig, SegmentZKMetadata zkMetadata,
       SegmentMetadata localMetadata, @Nullable Schema schema, boolean forceDownload)
       throws Exception {
-    File indexDir = localMetadata.getIndexDir();
-    Preconditions.checkState(indexDir.isDirectory(), "Index directory: %s is not a directory", indexDir);
-
-    File parentFile = indexDir.getParentFile();
-    File segmentBackupDir =
-        new File(parentFile, indexDir.getName() + CommonConstants.Segment.SEGMENT_BACKUP_DIR_SUFFIX);
-
+    File indexDir = getSegmentDataDir(segmentName);
     try {
-      // First rename index directory to segment backup directory so that original segment have all file descriptors
-      // point to the segment backup directory to ensure original segment serves queries properly
+      // Create backup directory to handle failure of segment reloading.
+      createBackup(indexDir);
 
-      // Rename index directory to segment backup directory (atomic)
-      Preconditions.checkState(indexDir.renameTo(segmentBackupDir),
-          "Failed to rename index directory: %s to segment backup directory: %s", indexDir, segmentBackupDir);
-
-      // Download from remote or copy from local backup directory into index directory,
-      // and then continue to load the segment from index directory.
+      // Download segment from deep store if CRC changes or forced to download;
+      // otherwise, copy backup directory back to the original index directory.
+      // And then continue to load the segment from the index directory.
       boolean shouldDownload = forceDownload || !hasSameCRC(zkMetadata, localMetadata);
       if (shouldDownload && allowDownload(segmentName, zkMetadata)) {
         if (forceDownload) {
           LOGGER.info("Segment: {} of table: {} is forced to download", segmentName, _tableNameWithType);
         } else {
-          LOGGER.info("Download segment:{} of table: {} as local crc: {} mismatches remote crc: {}", segmentName,
+          LOGGER.info("Download segment:{} of table: {} as crc changes from: {} to: {}", segmentName,
               _tableNameWithType, localMetadata.getCrc(), zkMetadata.getCrc());
         }
         indexDir = downloadSegment(segmentName, zkMetadata);
       } else {
-        LOGGER.info("Reload the local copy of segment: {} of table: {}", segmentName, _tableNameWithType);
-        FileUtils.copyDirectory(segmentBackupDir, indexDir);
+        LOGGER.info("Reload existing segment: {} of table: {}", segmentName, _tableNameWithType);
+        try (SegmentDirectory segmentDirectory = initSegmentDirectory(segmentName, indexLoadingConfig)) {

Review comment:
       Let's say 
   
   indexDir = original segments dir (/segments/table/segment_name)
   backupDir = backup dir (/segments/table/segment_name.segment.bak)
   
   The previous logic was:
   ```
   1. copy backup dir -> index dir
   2. load index dir
   ```
   
   The current logic looks like this:
   
   ```
   1. copy indexDir -> indexDir (but indexDir is empty, so `copyTo()` falls back to the backup, which makes this `backupDir -> indexDir`)
   2. load index dir
   ```
   
   I get the point that the logic is the same; however, I would prefer that you use the `indexDir` and `indexBackupDir` variables directly in the `reload()` function for much better readability. I had to read all the sub-function details to understand the existing logic. What do you think?
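
   To illustrate the readability point, here is a minimal sketch of the rename-then-copy flow with both directories named explicitly. It is not the PR's code: the `ReloadSketch` class, the `prepareForReload` and `copyDirectory` helpers, and the `.segment.bak` suffix (borrowed from the example path above) are all hypothetical, and it uses only `java.nio.file` rather than Pinot or Commons IO utilities.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

// Hypothetical illustration only; not part of BaseTableDataManager.
final class ReloadSketch {
  // Assumed suffix, taken from the example path in the comment above.
  static final String BACKUP_SUFFIX = ".segment.bak";

  /**
   * Renames indexDir to indexBackupDir, then copies the backup back into indexDir
   * so that the segment can be loaded from indexDir. Returns the backup directory.
   */
  static Path prepareForReload(Path indexDir) throws IOException {
    Path indexBackupDir = indexDir.resolveSibling(indexDir.getFileName() + BACKUP_SUFFIX);

    // 1. Rename indexDir -> indexBackupDir (requested as an atomic move).
    Files.move(indexDir, indexBackupDir, StandardCopyOption.ATOMIC_MOVE);

    // 2. Copy indexBackupDir -> indexDir; the caller then loads from indexDir.
    copyDirectory(indexBackupDir, indexDir);
    return indexBackupDir;
  }

  /** Recursively copies source into target, creating directories as needed. */
  static void copyDirectory(Path source, Path target) throws IOException {
    try (Stream<Path> paths = Files.walk(source)) {
      // Files.walk visits a directory before its entries, so parents exist first.
      for (Path path : (Iterable<Path>) paths::iterator) {
        Path dest = target.resolve(source.relativize(path).toString());
        if (Files.isDirectory(path)) {
          Files.createDirectories(dest);
        } else {
          Files.copy(path, dest, StandardCopyOption.REPLACE_EXISTING);
        }
      }
    }
  }
}
```

   With both variables visible in one place, a reader can follow the `indexDir -> indexBackupDir -> indexDir` round trip without opening any sub-functions, which is what I am asking for in `reload()`.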




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


