Posted to oak-issues@jackrabbit.apache.org by "Vikas Saurabh (JIRA)" <ji...@apache.org> on 2017/05/16 16:21:04 UTC

[jira] [Comment Edited] (OAK-6227) There should be a way to retrieve oldest timestamp to keep from nodestores

    [ https://issues.apache.org/jira/browse/OAK-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012656#comment-16012656 ] 

Vikas Saurabh edited comment on OAK-6227 at 5/16/17 4:20 PM:
-------------------------------------------------------------

[~mduerig],
bq. at least in the TarMK case the checkpoints already have a property created. See LockBasedScheduler.CPCreator#call
Nice. So for TarMK we can already get that data today.

For DocumentMK, the checkpoint data ({{rv, expiry, <custom props>}}) is currently stored as sibling entries. The AsyncIndexUpdate checkpoint already passes along a {{created}} property, so maybe we can set {{created}} in the DocumentMK checkpoint call itself and remove it from AsyncIndexUpdate. [~chetanm], [~mreutegg], thoughts?

About rollback: [~mduerig],
bq. Regarding rollback: journal.log now includes time stamps. So everybody who knows what he is doing can already roll the repository back.
Do you mean that it's OK to delete blobs that could still be referenced from a past revision still listed in journal.log (since the person editing the journal already knows that it might be a tricky operation)?
Or do you mean that the implementation of {{getOldestSafeTimestamp}} should read journal.log to get the first timestamp?
(I'm guessing you meant the latter.)
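
To make the second reading concrete, here is a rough sketch of how I picture it, not a proposal for the actual implementation. It assumes the nodestore can already hand us the {{created}} times of live checkpoints and the timestamp of the oldest revision still listed in journal.log; the class name, method name and parameters below are all made up for illustration and are not existing Oak API.

```java
import java.util.List;
import java.util.OptionalLong;

/** Hypothetical helper for illustration only; not part of Oak. */
final class OldestSafeTimestamp {

    private OldestSafeTimestamp() {
    }

    /**
     * Returns the oldest timestamp (epoch millis) that must be kept.
     * Blobs deleted before the returned value would not be reachable from
     * any of the given checkpoints or journal entries.
     *
     * @param checkpointCreated  assumed input: creation times of live checkpoints
     * @param oldestJournalEntry assumed input: timestamp of the oldest revision
     *                           still listed in journal.log, if available
     * @param now                current time, used when nothing older is retained
     */
    static long compute(List<Long> checkpointCreated,
                        OptionalLong oldestJournalEntry,
                        long now) {
        long oldest = now;
        // Every live checkpoint pins the repository state at its creation time.
        for (long created : checkpointCreated) {
            oldest = Math.min(oldest, created);
        }
        // A rollback via journal edit could go back to the oldest listed revision.
        if (oldestJournalEntry.isPresent()) {
            oldest = Math.min(oldest, oldestJournalEntry.getAsLong());
        }
        return oldest;
    }
}
```

With this shape, a store that cannot supply journal timestamps would pass {{OptionalLong.empty()}} and only checkpoints would bound the result.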


was (Author: catholicon):
@

> There should be a way to retrieve oldest timestamp to keep from nodestores
> --------------------------------------------------------------------------
>
>                 Key: OAK-6227
>                 URL: https://issues.apache.org/jira/browse/OAK-6227
>             Project: Jackrabbit Oak
>          Issue Type: Sub-task
>          Components: core
>            Reporter: Vikas Saurabh
>            Assignee: Vikas Saurabh
>              Labels: datastore, performance
>             Fix For: 1.8
>
>
> For implementing OAK-2808 (eager/unsafe blob garbage collection approach), we need a way for nodestores to expose the last safe timestamp such that blobs deleted before that timestamp can be eagerly collected (uniqueness of a blob, and that it won't be resurrected elsewhere, is assumed to be guaranteed elsewhere, e.g. OakDirectory's blobs have randomly generated bytes as content).
> What we want to ensure in this task is that the garbage collection shouldn't collect stuff that could still be retrieved back - for example checkpoints.
> [~chetanm] suggested that it might be overkill to have this API in NodeStore - but maybe it's ok to expose it in the NodeStore mbean (where the impl-specific mbeans know the implementation details of the nodestore needed to expose such data).
> The mbean just needs to expose the safe oldest timestamp (UTC epoch!?).
> Another thing that is potentially done in repositories (albeit not really supported afaik) is rolling back the repository head state by, say, an offline journal edit.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)