Posted to oak-issues@jackrabbit.apache.org by "Thomas Mueller (JIRA)" <ji...@apache.org> on 2013/12/04 13:22:36 UTC

[jira] [Commented] (OAK-377) Data store garbage collection

    [ https://issues.apache.org/jira/browse/OAK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838849#comment-13838849 ] 

Thomas Mueller commented on OAK-377:
------------------------------------

> Does it make sense to make the DSGC specific to MKs?

Yes, that makes sense. In Jackrabbit 2.x, the garbage collection algorithm traversed the tree over all nodes, but this turned out to be much slower than reading all nodes in the order they are stored in the backend storage. I will investigate how fast this is for the MongoMK case, as it is relatively simple to implement. If it is still too slow, a special data structure or index could be added, but that would require some changes in the MongoMK (to keep that data structure up to date), which I would like to avoid because it adds complexity.
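
For illustration, here is a minimal sketch of such a mark-and-sweep collection that reads all nodes in the order they are stored in the backend, instead of traversing the tree. The interfaces and method names below (NodeScanner, Node, BlobStore) are placeholders for this sketch, not the actual Oak or MongoMK APIs:

    import java.util.HashSet;
    import java.util.Set;

    // Sketch of data store garbage collection that scans all nodes in the
    // order they are stored in the backend, instead of traversing the tree.
    // NodeScanner, Node and BlobStore are hypothetical placeholder interfaces.
    public class DataStoreGcSketch {

        public void collectGarbage(NodeScanner nodes, BlobStore blobs) {
            long markStart = System.currentTimeMillis();

            // Mark phase: one pass over all nodes in storage order,
            // collecting the ids of all referenced binaries.
            Set<String> referenced = new HashSet<String>();
            for (Node node : nodes.allNodesInStorageOrder()) {
                for (String blobId : node.getBinaryReferenceIds()) {
                    referenced.add(blobId);
                }
            }

            // Sweep phase: delete binaries that are not referenced and were
            // last modified before the mark phase started, so binaries added
            // while the scan was running are kept.
            for (String blobId : blobs.getAllBlobIds()) {
                if (!referenced.contains(blobId)
                        && blobs.getLastModified(blobId) < markStart) {
                    blobs.deleteBlob(blobId);
                }
            }
        }

        // Placeholder interfaces, for illustration only.
        interface NodeScanner { Iterable<Node> allNodesInStorageOrder(); }
        interface Node { Iterable<String> getBinaryReferenceIds(); }
        interface BlobStore {
            Iterable<String> getAllBlobIds();
            long getLastModified(String blobId);
            void deleteBlob(String blobId);
        }
    }

The timestamp check in the sweep phase is only one possible way to avoid deleting binaries added while the scan is running; a real implementation would have to handle concurrent writes more carefully.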

> Data store garbage collection
> -----------------------------
>
>                 Key: OAK-377
>                 URL: https://issues.apache.org/jira/browse/OAK-377
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, mk
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>             Fix For: 0.13
>
>
> Unused binaries in the data store need to be garbage collected.
> There is a partial implementation in oak-mk, but it is currently not run: it does not run automatically, and I think there is no way to trigger it manually.
> Also, we might want to investigate faster garbage collection algorithms: young generation garbage collection, or garbage collection using reference counting (for example, using an index of references to the data store).
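
For illustration, here is a minimal sketch of the reference-counting variant mentioned in the description above: an index keeping a count of how many nodes reference each data store entry, so that a binary can be deleted as soon as its count drops to zero, without scanning the whole repository. The class and method names are hypothetical placeholders, not Oak APIs:

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical reference-counting index for data store binaries.
    // BlobStore and all method names are placeholders for illustration.
    public class BlobReferenceIndex {

        private final Map<String, Integer> refCounts = new HashMap<String, Integer>();

        // Called whenever a node referencing the given binary is added.
        public synchronized void addReference(String blobId) {
            Integer count = refCounts.get(blobId);
            refCounts.put(blobId, count == null ? 1 : count + 1);
        }

        // Called whenever a node referencing the given binary is removed.
        // When the count reaches zero, the binary is garbage and can be
        // deleted immediately, without a full mark-and-sweep pass.
        public synchronized void removeReference(String blobId, BlobStore blobs) {
            Integer count = refCounts.get(blobId);
            int remaining = (count == null ? 0 : count) - 1;
            if (remaining <= 0) {
                refCounts.remove(blobId);
                blobs.deleteBlob(blobId);
            } else {
                refCounts.put(blobId, remaining);
            }
        }

        // Placeholder interface, for illustration only.
        interface BlobStore {
            void deleteBlob(String blobId);
        }
    }

The trade-off noted in the comment above applies here: the MongoMK would have to keep this index up to date on every change, which adds complexity.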



--
This message was sent by Atlassian JIRA
(v6.1#6144)