Posted to dev@jackrabbit.apache.org by GitBox <gi...@apache.org> on 2022/11/30 08:28:46 UTC

[GitHub] [jackrabbit-oak] amit-jain commented on a diff in pull request #771: OAK-9975: [DSGC] Report cummulative size of referenced blobs during Mark phase

amit-jain commented on code in PR #771:
URL: https://github.com/apache/jackrabbit-oak/pull/771#discussion_r1035663780


##########
oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/MarkSweepGarbageCollector.java:
##########
@@ -405,13 +405,35 @@ protected void mark(GarbageCollectorFileState fs) throws IOException, DataStoreE
 
         // Mark all used references
         iterateNodeTree(fs, false);
-
+        
+        // Get size
+        sizeBlobStoreReferences(fs, stats);
+        
         // Move the marked references file to the data store meta area if applicable
         GarbageCollectionType.get(blobStore).addMarked(blobStore, fs, repoId, uniqueSuffix);
 
         LOG.debug("Ending mark phase of the garbage collector");
     }
 
+    private static void sizeBlobStoreReferences(GarbageCollectorFileState fs, GarbageCollectionOperationStats stats)
+        throws IOException {
+        try (LineIterator lineIterator = new LineIterator(new FileReader(fs.getMarkedRefs()))) {
+            lineIterator.forEachRemaining(line -> {
+                String id = line.split(DELIM)[0];

Review Comment:
   This method reads the same file that the Mark phase uses, where it's essential to bail out on an error. Here it's not as essential, but I'm not sure it's a good idea to report an inaccurate size. In the rare case that this happens (I haven't seen that kind of corruption in practice), I think it's OK to throw an error.
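
   To illustrate the suggestion, here is a minimal, self-contained sketch of a fail-fast size pass. It is not the PR's code: the class name, the `totalSize` helper, the `,` delimiter, and the assumption that each reference id ends in `#<length>` are all hypothetical and chosen only for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class MarkedRefsSizeSketch {
    // Hypothetical field delimiter; the real DELIM constant may differ.
    static final String DELIM = ",";

    // Sums the sizes of all referenced blobs and throws on the first
    // malformed line instead of silently reporting an inaccurate total.
    // Assumes (for illustration only) that each id ends in "#<length>".
    static long totalSize(Reader in) throws IOException {
        long total = 0;
        int lineNo = 0;
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lineNo++;
                // Take the text before the first delimiter as the id.
                int d = line.indexOf(DELIM);
                String id = d < 0 ? line : line.substring(0, d);
                int hash = id.lastIndexOf('#');
                if (hash < 0 || hash == id.length() - 1) {
                    throw new IOException(
                        "Malformed reference at line " + lineNo + ": " + line);
                }
                try {
                    total += Long.parseLong(id.substring(hash + 1));
                } catch (NumberFormatException e) {
                    throw new IOException(
                        "Malformed size at line " + lineNo + ": " + line, e);
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        String refs = "abc#10,/a/b\ndef#32,/c\n";
        System.out.println(totalSize(new StringReader(refs))); // prints 42
    }
}
```

   With this shape, a corrupted line surfaces as an `IOException` that the caller can handle the same way as any other failure while reading the marked references file.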


