Posted to oak-issues@jackrabbit.apache.org by "Michael Dürig (JIRA)" <ji...@apache.org> on 2016/01/09 00:39:39 UTC

[jira] [Commented] (OAK-3348) Cross gc sessions might introduce references to pre-compacted segments

    [ https://issues.apache.org/jira/browse/OAK-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090206#comment-15090206 ] 

Michael Dürig commented on OAK-3348:
------------------------------------

At https://github.com/mduerig/jackrabbit-oak/commits/OAK-3348 I started implementing a POC of the above approach for 2):

* Prevent back references by flushing segment node builders into two sets of segments: free and merged. A segment is free if it has been created by a builder and references only free segments. Otherwise the segment is merged.

* When rebasing a builder during merge:
** Link to records in free segments and mark those segments as merged.
** Clone all records in cross gc merged segments before linking to them. (Optimally there would be no such records, i.e. all references would point into free segments.) Note: if this builder contains references to records in segments of other builders, those segments would also become merged, along with all segments referencing them.
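
To make the free/merged bookkeeping above concrete, here is a minimal Java sketch. It is not the code on the branch and does not use Oak's API: Segment, State, flush, markMerged and rebase are hypothetical names chosen for illustration, gc generations are modelled as plain integers, and cloning is reduced to creating an empty copy in the current generation rather than a real deep copy of records.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model for illustration only; none of these types exist in Oak.
final class CrossGcRebaseSketch {
    enum State { FREE, MERGED }

    static final class Segment {
        final String id;
        final int gcGeneration;                        // generation the segment was written in
        final Set<Segment> references = new HashSet<>();
        State state;

        Segment(String id, int gcGeneration, State state) {
            this.id = id;
            this.gcGeneration = gcGeneration;
            this.state = state;
        }
    }

    // Flushing a builder: the new segment is free only if everything it references is free.
    static Segment flush(String id, int gcGeneration, Segment... referenced) {
        State state = State.FREE;
        for (Segment r : referenced) {
            if (r.state != State.FREE) {
                state = State.MERGED;
            }
        }
        Segment segment = new Segment(id, gcGeneration, state);
        segment.references.addAll(Arrays.asList(referenced));
        return segment;
    }

    // Marks a free segment as merged and propagates that state to the segments it
    // references and to every segment in 'all' that references it.
    static void markMerged(Segment start, Collection<Segment> all) {
        Deque<Segment> pending = new ArrayDeque<>();
        pending.push(start);
        while (!pending.isEmpty()) {
            Segment s = pending.pop();
            if (s.state == State.MERGED) {
                continue;                              // already handled
            }
            s.state = State.MERGED;
            pending.addAll(s.references);
            for (Segment other : all) {
                if (other.references.contains(s)) {
                    pending.push(other);
                }
            }
        }
    }

    // Rebase step for one referenced segment; returns the segment to link to.
    static Segment rebase(Segment target, int currentGeneration, Collection<Segment> all) {
        if (target.state == State.FREE) {
            markMerged(target, all);                   // link, then mark as merged
            return target;
        }
        if (target.gcGeneration < currentGeneration) {
            // Cross gc merged segment: link to a clone instead (simplified to an empty copy).
            return new Segment(target.id + "-clone", currentGeneration, State.MERGED);
        }
        return target;                                 // merged, same generation: link directly
    }
}
```

In this model a builder that only ever references free segments is linked in place during rebase, while a reference to a merged segment from an older generation is redirected to a clone, so the rebased tree never points back into pre-compacted segments.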

I structured the commits such that it should be relatively easy to follow. See FIXME tags for what is still missing and what needs cleaning up.

cc [~frm], [~alex.parvulescu]



> Cross gc sessions might introduce references to pre-compacted segments
> ----------------------------------------------------------------------
>
>                 Key: OAK-3348
>                 URL: https://issues.apache.org/jira/browse/OAK-3348
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segmentmk
>            Reporter: Michael Dürig
>            Assignee: Michael Dürig
>              Labels: candidate_oak_1_0, candidate_oak_1_2, cleanup, compaction, gc
>             Fix For: 1.4
>
>         Attachments: OAK-3348-1.patch, OAK-3348-2.patch, OAK-3348.patch, cross-gc-refs.pdf, image.png
>
>
> I suspect that certain write operations during compaction can cause references from compacted segments to pre-compacted ones. This would effectively prevent the pre-compacted segments from getting evicted in subsequent cleanup phases. 
> The scenario is as follows:
> * A session is opened and a lot of content is written to it such that the update limit is exceeded. This causes the changes to be written to disk. 
> * Revision gc runs causing a new, compacted root node state to be written to disk.
> * The session saves its changes. This causes rebasing of its changes onto the current root (the compacted one). At this point any node that has been added will be added again in the sub-tree rooted at the current root. Such nodes however might have been written to disk *before* revision gc ran and might thus be contained in pre-compacted segments. As I suspect the node-add operation in the rebasing process does *not* create a deep copy of such nodes but rather creates a *reference* to them, a reference to a pre-compacted segment is introduced here.
> Going forward we need to validate the above hypothesis, assess its impact and, if necessary, come up with a solution.


