Posted to oak-issues@jackrabbit.apache.org by "Piercarlo Slavazza (Jira)" <ji...@apache.org> on 2022/05/07 13:15:00 UTC

[jira] [Commented] (OAK-9765) Garbage Collection does not remove blobs file from the file system

    [ https://issues.apache.org/jira/browse/OAK-9765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17533263#comment-17533263 ] 

Piercarlo Slavazza commented on OAK-9765:
-----------------------------------------

I would like to add comments derived from some more tests that I did:
 * I had this (naive?) idea of retrieving the Blobs in the NodeStore via a SQL2 query that searches for nodes that have a {{jcr:data}} property: this way, the Blobs in use are properly marked, and the unreferenced ones are eventually swept by the GC
 * I also found another issue: the {{MarkSweepGarbageCollector}} raises an error when there are no Blobs marked; this seems wrong to me… maybe I am missing some piece of the whole thing?

I added these patches to the GitHub repo linked above.

If someone could confirm the validity of my findings/solution, I would be happy to write a patch and a PR (in that case, I would need some help in understanding the best way to write unit tests).
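
To make the second point above concrete, here is a minimal in-memory sketch of the mark-and-sweep idea in Java. It is only an illustration under stated assumptions: the class {{BlobGcSketch}} and its methods {{mark}} and {{sweep}} are hypothetical names, blobs are reduced to plain string ids, and nothing here uses the Oak or JCR API or reflects how {{MarkSweepGarbageCollector}} is actually implemented. It shows one possible reading of the expected behaviour: an empty set of marked Blobs is valid input and simply means that every stored Blob is a candidate for removal.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, in-memory illustration of mark-and-sweep over blob ids.
// It is NOT Oak's MarkSweepGarbageCollector and uses no Oak or JCR API.
final class BlobGcSketch {

    /** Mark phase: collect the ids of all blobs still referenced by some node. */
    static Set<String> mark(List<List<String>> blobIdsPerNode) {
        Set<String> marked = new HashSet<>();
        for (List<String> ids : blobIdsPerNode) {
            marked.addAll(ids);
        }
        return marked;
    }

    /**
     * Sweep phase: remove every stored blob id that was not marked and
     * return the removed ids. An empty mark set is treated as valid input:
     * it means no blob is referenced any more, so all stored blobs go.
     */
    static Set<String> sweep(Set<String> stored, Set<String> marked) {
        Set<String> removed = new TreeSet<>(stored);
        removed.removeAll(marked);   // keep only the unreferenced ids
        stored.removeAll(removed);   // drop them from the (simulated) store
        return removed;
    }

    public static void main(String[] args) {
        Set<String> stored = new TreeSet<>(List.of("blob-a", "blob-b"));
        // The only node that referenced the blobs has been removed,
        // so the mark phase finds nothing.
        Set<String> marked = mark(Collections.emptyList());
        Set<String> removed = sweep(stored, marked);
        System.out.println("removed=" + removed + " remaining=" + stored);
        // prints: removed=[blob-a, blob-b] remaining=[]
    }
}
```

In this sketch the sweep never fails on an empty mark set; whether the real collector should behave the same way is exactly the open question raised above.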

> Garbage Collection does not remove blobs file from the file system
> ------------------------------------------------------------------
>
>                 Key: OAK-9765
>                 URL: https://issues.apache.org/jira/browse/OAK-9765
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>    Affects Versions: 1.42.0
>            Reporter: Piercarlo Slavazza
>            Priority: Blocker
>
> Using a NodeStore backed by a FileStore, with a blob store of type FileBlobStore:
>  # (having configured GC with estimation _disabled_)
>  # a file is added as a blob
>  # then the node where the blob is referenced is _removed_
>  # then the GC is run
>  # expected behaviour: the node is no longer accessible, _and_ no chunk of the blob is present on the file system
>  # actual behaviour: the node is no longer accessible BUT all the chunks are still present on the file system
> Steps to reproduce: execute the (really tiny) main in [https://github.com/PiercarloSlavazza/oak-garbage-collection-test/] (instructions in the readme)



--
This message was sent by Atlassian Jira
(v8.20.7#820007)