You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by "jasonk000 (via GitHub)" <gi...@apache.org> on 2023/05/13 03:35:09 UTC

[GitHub] [druid] jasonk000 commented on a diff in pull request #11878: Load cache files in parallel to speed up historical startup

jasonk000 commented on code in PR #11878:
URL: https://github.com/apache/druid/pull/11878#discussion_r1192909390


##########
server/src/main/java/org/apache/druid/server/coordination/SegmentLoadDropHandler.java:
##########
@@ -222,40 +224,37 @@ private void loadLocalCache()
       }
     }
 
-    List<DataSegment> cachedSegments = new ArrayList<>();
+    List<DataSegment> cachedSegments = Collections.synchronizedList(new ArrayList<>());
     File[] segmentsToLoad = baseDir.listFiles();
-    int ignored = 0;
-    for (int i = 0; i < segmentsToLoad.length; i++) {
-      File file = segmentsToLoad[i];
-      log.info("Loading segment cache file [%d/%d][%s].", i + 1, segmentsToLoad.length, file);
-      try {
-        final DataSegment segment = jsonMapper.readValue(file, DataSegment.class);
-
-        if (!segment.getId().toString().equals(file.getName())) {
-          log.warn("Ignoring cache file[%s] for segment[%s].", file.getPath(), segment.getId());
-          ignored++;
-        } else if (segmentCacheManager.isSegmentCached(segment)) {
-          cachedSegments.add(segment);
-        } else {
-          log.warn("Unable to find cache file for %s. Deleting lookup entry", segment.getId());
-
-          File segmentInfoCacheFile = new File(baseDir, segment.getId().toString());
-          if (!segmentInfoCacheFile.delete()) {
-            log.warn("Unable to delete segmentInfoCacheFile[%s]", segmentInfoCacheFile);
-          }
-        }
-      }
-      catch (Exception e) {
-        log.makeAlert(e, "Failed to load segment from segmentInfo file")
-           .addData("file", file)
-           .emit();
+    AtomicInteger ignored = new AtomicInteger();
+    AtomicInteger count = new AtomicInteger();
+    ExecutorService loadingExecutor = Execs.multiThreaded(config.getNumCacheLoadThreads(), "Segment-CacheFile-Startup-%s");

Review Comment:
   This pattern is a little confusing to me. It seems to me it would be simpler to create an executor and submit all `segmentsToLoad` tasks, instead of explicitly managing the queue. Was there a reason I missed?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org