Posted to oak-issues@jackrabbit.apache.org by "Amit Jain (JIRA)" <ji...@apache.org> on 2014/02/04 06:24:09 UTC

[jira] [Commented] (OAK-377) Data store garbage collection

    [ https://issues.apache.org/jira/browse/OAK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890396#comment-13890396 ] 

Amit Jain commented on OAK-377:
-------------------------------

I have an implementation ready for an external mark-and-sweep blob garbage collector using DocumentMK. The GC iterates over all nodes to identify the referenced blobs, computes the set difference between the blobs in the blob store and that referenced set, and then deletes the unreferenced blobs. It uses the existing BlobReferenceIterator to iterate the tree.
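
To illustrate the idea only (this is not the attached patch), here is a minimal in-memory sketch of mark and sweep as a set difference. The class name MarkSweepSketch and the use of plain sets of string ids are assumptions made for the example; the actual collector works against the node store and blob store and keeps its intermediate state in files.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class, not part of Oak.
public class MarkSweepSketch {

    /** Mark phase: collect every blob id that is still referenced from the tree. */
    public static Set<String> mark(Iterator<String> referencedBlobIds) {
        Set<String> marked = new TreeSet<>();
        while (referencedBlobIds.hasNext()) {
            marked.add(referencedBlobIds.next());
        }
        return marked;
    }

    /**
     * Sweep phase: garbage = (ids in the blob store) minus (marked ids).
     * Removes the garbage from the store and returns the removed ids.
     */
    public static Set<String> sweep(Set<String> marked, Set<String> blobStore) {
        Set<String> garbage = new TreeSet<>(blobStore);
        garbage.removeAll(marked);      // set difference
        blobStore.removeAll(garbage);   // delete unreferenced blobs
        return garbage;
    }

    public static void main(String[] args) {
        // Stand-ins for the blob store contents and the referenced blob ids.
        Set<String> store = new TreeSet<>(Arrays.asList("a", "b", "c", "d"));
        Set<String> marked = mark(Arrays.asList("a", "c").iterator());
        Set<String> deleted = sweep(marked, store);
        System.out.println("deleted=" + deleted + " remaining=" + store);
    }
}
```

In this toy run, blobs "b" and "d" are not referenced, so they are the ones removed from the store.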

High-level change log:
* Interface added - BlobGarbageCollector 
** public void garbageCollect(NodeStore nodeStore) throws Exception;
* GC Implementation - MarkSweepGarbageCollector
** public void garbageCollect(NodeStore nodeStore) throws Exception;
** protected void markAndSweep() throws Exception;
** protected void mark() throws Exception;
** protected void sweep() throws Exception;
* Helper Class - GarbageCollectorFileState
* Added the following methods to the GarbageCollectableBlobStore:
** Iterator<String> getAllChunkIds(long maxLastModifiedTime) throws Exception;
** boolean deleteChunk(String chunkId) throws Exception;
** Iterator<String> resolveChunks(String blobId) throws IOException;
* Added resolveChunks() implementation to AbstractBlobStore
* Added implementations for deleteChunk() and getAllChunkIds() to the following BlobStore implementations:
** FileBlobStore
** MemoryBlobStore
** DbBlobStore
** MongoBlobStore
** RDBBlobStore
** CloudBlobStore - OAK-1157
** DataStoreBlobStore - OAK-1157
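
As a rough sketch of what the two new blob store methods might look like for a simple in-memory store (not the code in the patch): the class InMemoryChunkStoreSketch, its put() helper and the timestamp map are hypothetical, and the meaning of maxLastModifiedTime (return only chunks last modified at or before that time, presumably so that recently added blobs are not swept) is an assumption rather than something stated above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class, not one of the Oak BlobStore implementations.
public class InMemoryChunkStoreSketch {

    // chunk id -> last-modified time in milliseconds (hypothetical bookkeeping)
    private final Map<String, Long> lastModified = new TreeMap<>();

    /** Hypothetical helper to register a chunk with its last-modified time. */
    public void put(String chunkId, long modifiedAt) {
        lastModified.put(chunkId, modifiedAt);
    }

    /** Assumed semantics: ids of all chunks last modified at or before the cutoff. */
    public Iterator<String> getAllChunkIds(long maxLastModifiedTime) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastModified.entrySet()) {
            if (e.getValue() <= maxLastModifiedTime) {
                ids.add(e.getKey());
            }
        }
        return ids.iterator();
    }

    /** Returns true if the chunk existed and was removed. */
    public boolean deleteChunk(String chunkId) {
        return lastModified.remove(chunkId) != null;
    }
}
```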

> Data store garbage collection
> -----------------------------
>
>                 Key: OAK-377
>                 URL: https://issues.apache.org/jira/browse/OAK-377
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, mk
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>             Fix For: 0.16
>
>
> Unused binaries in the data store need to be garbage collected.
> There is a partial implementation in oak-mk, however it is currently not run (not run automatically, and I think there is no way to run it manually).
> Also, we might want to investigate faster garbage collection algorithms: young generation garbage collection, or garbage collection using reference counting (for example using an index of references to the data store).



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)