Posted to users@jackrabbit.apache.org by Piercarlo Slavazza <pi...@gmail.com> on 2022/05/06 13:09:39 UTC

How to run Garbage Collection of BlobStore in Jackrabbit Oak?

Hi,

Using Oak version 1.42 (with Java 17), I created a segment NodeStore 
backed by a FileBlobStore, like this:

FileBlobStore fileBlobStore = new FileBlobStore("…");
SegmentGCOptions gcOptions =
    SegmentGCOptions.defaultGCOptions().setEstimationDisabled(true);
FileStore fileStore = FileStoreBuilder
    .fileStoreBuilder(new File("…"))
    .withBlobStore(fileBlobStore)
    .withGCOptions(gcOptions)
    .build();
SegmentNodeStore nodeStore = SegmentNodeStoreBuilders.builder(fileStore).build();
Repository repository = new Jcr(new Oak(nodeStore)).createRepository();

Then I create a blob and later delete it.

The file associated with the blob is still on the file system and, as I 
understand it, a garbage collection run is needed to actually have it 
removed.

What's the proper way to run the Garbage Collector?

From reading the documentation - and taking into account that the whole 
thing is NOT run within an OSGi container - it seems that you should 
call MarkSweepGarbageCollector#collectGarbage(false). However, after 
that call the blob is still on the file system.

To make the problem clear and reproducible, I wrote a tiny program and 
put it in a GitHub project 
<https://github.com/PiercarloSlavazza/oak-garbage-collection-test>.

By debugging step by step I found that in the "mark" phase (that is, when 
the GC looks for blobs that are still "in use"), the deleted blob's 
reference is still there, i.e. it is still present in the 
BinaryReferencesIndex. Therefore the blob is marked and, being both 
marked and available, it is not even considered a "candidate" for 
sweeping.
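
To illustrate the behaviour described above, here is a minimal toy model in plain Java. It is not Oak code: the class MarkSweepSketch and the method sweepCandidates are hypothetical names made up for this sketch, and it only assumes that sweep candidates are "available blobs minus marked blobs". Under that assumption, a stale reference left in the index is enough to keep a deleted blob from ever becoming a candidate.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model only: hypothetical names, not part of the Oak API.
public class MarkSweepSketch {

    /** Returns the blob ids a sweep would delete: available minus marked. */
    static Set<String> sweepCandidates(Set<String> available, Set<String> marked) {
        Set<String> candidates = new LinkedHashSet<>(available);
        candidates.removeAll(marked);
        return candidates;
    }

    public static void main(String[] args) {
        // One blob file is present in the blob store.
        Set<String> available = Set.of("blob-1");

        // The node was deleted, but the reference index still lists the blob.
        Set<String> staleReferences = Set.of("blob-1");
        System.out.println(sweepCandidates(available, staleReferences)); // prints []

        // Once the stale reference is gone, the blob becomes a candidate.
        Set<String> noReferences = Set.of();
        System.out.println(sweepCandidates(available, noReferences)); // prints [blob-1]
    }
}
```

In this model, the first call matches what I see in the debugger: the blob is both available and marked, so nothing is swept.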

I suspect this could have something to do with the way I add the blob. 
The code follows; please refer to the GitHub project above for full context:

// Create an nt:file node with an empty jcr:data placeholder via the JCR API.
Node rootFolder = session.getRootNode();
Node fileNode = rootFolder.addNode(temporaryFile.getName(), "nt:file");
fileNode.addMixin("mix:referenceable");
Node fileContentNode = fileNode.addNode("jcr:content", "nt:resource");
fileContentNode.setProperty("jcr:data", "");
session.save();

// Create the blob and set it on the same node directly through the NodeStore API.
Blob blob = nodeStore.createBlob(FileUtils.openInputStream(temporaryFile));
NodeBuilder rootBuilder = nodeStore.getRoot().builder();
NodeBuilder fileContentNodeBuilder = getNodeBuilder(fileContentNode, rootBuilder);
fileContentNodeBuilder.setProperty("jcr:data", blob);
nodeStore.merge(rootBuilder, EmptyHook.INSTANCE, CommitInfo.EMPTY);
session.save();

Any help will be greatly appreciated.

If you have a solid solution/fix for my code, please consider posting it 
on Stack Overflow as an answer to this question 
<https://stackoverflow.com/questions/72132489>.

Thanks,
Regards
Piercarlo