Posted to oak-issues@jackrabbit.apache.org by "Alex Parvulescu (JIRA)" <ji...@apache.org> on 2016/06/09 09:50:21 UTC
[jira] [Commented] (OAK-3797) SegmentTracker#collectBlobReferences
should retain fewer SegmentId instances
[ https://issues.apache.org/jira/browse/OAK-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15322264#comment-15322264 ]
Alex Parvulescu commented on OAK-3797:
--------------------------------------
We've recently seen an issue with the queue mechanism where it would overflow the max capacity of the queue: {{Blob garbage collection failed: Sorry, deque too big}}.
I'm submitting a patch for review that severely reduces the working set of the queue by enforcing de-duplication of record ids early, at {{queue#add}} time instead of at processing time. This makes an important difference, as the queue size could explode when each unprocessed segment's references are added even though they may already be in the queue, only to be skipped later because they have been processed in the meantime.
As a bonus, I turned the {{Queue<SegmentId>}} into a {{Queue<UUID>}}.
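To illustrate the idea for reviewers, here is a minimal sketch of de-duplicating at add time. It is not the code from the attached patch: the class name {{DedupQueue}} and its methods are hypothetical, and it assumes a plain {{HashSet<UUID>}} is an acceptable way to remember which ids have already been enqueued.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;
import java.util.UUID;

/** Hypothetical sketch of add-time de-duplication; not the Oak implementation. */
final class DedupQueue {

    // Ids that still need to be processed.
    private final Queue<UUID> pending = new ArrayDeque<>();

    // Every id ever enqueued, whether still pending or already processed.
    private final Set<UUID> seen = new HashSet<>();

    /** Enqueues the id only if it has never been enqueued before. */
    boolean add(UUID id) {
        if (seen.add(id)) {
            pending.add(id);
            return true;
        }
        return false; // duplicate: rejected here instead of skipped at processing time
    }

    /** Returns the next id to process, or null when nothing is pending. */
    UUID poll() {
        return pending.poll();
    }

    /** Number of ids still waiting to be processed. */
    int size() {
        return pending.size();
    }
}
```

With this shape the queue holds at most one entry per distinct id, so its size is bounded by the number of distinct segments rather than by the total number of references encountered.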
Patch is for {{segmentmk}} because that's where the ongoing problem is, but it should also be applied to {{segment-tar}}.
[~mduerig] [~amitjain] feedback appreciated!
> SegmentTracker#collectBlobReferences should retain fewer SegmentId instances
> ----------------------------------------------------------------------------
>
> Key: OAK-3797
> URL: https://issues.apache.org/jira/browse/OAK-3797
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: segment-tar, segmentmk
> Reporter: Michael Dürig
> Assignee: Alex Parvulescu
> Labels: datastore, gc
> Fix For: 1.6
>
> Attachments: OAK-3797-segmentmk.patch
>
>
> {{SegmentTracker#collectBlobReferences}} currently keeps an internal queue of not yet processed {{SegmentId}} instances. This potentially impacts the system, as those instances are also tracked in the segment tracker's segment id tables. I think we should improve the implementation to not retain so many {{SegmentId}} instances and rely on arrays of {{msb}}, {{lsb}} instead.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)