Posted to oak-issues@jackrabbit.apache.org by "Wim Symons (JIRA)" <ji...@apache.org> on 2019/03/26 14:28:00 UTC

[jira] [Commented] (OAK-8170) oak-run datastorecheck and online consistency check falsely report missing blobs

    [ https://issues.apache.org/jira/browse/OAK-8170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16801779#comment-16801779 ] 

Wim Symons commented on OAK-8170:
---------------------------------

Hi [~amitjain],

Thanks for your update. The innards of Oak are still a mystery to me.

So my question still stands: do (you think) we actually have missing blobs in our repository (from an old node revision)?

And if so, are we perhaps doing things in the wrong order? Our current order is:

t1 - backup repository using AEM online backup

t2 - check consistency of backup segment store (oak-run check)

t3 - check consistency of backup segment store vs datastore (oak-run datastorecheck)

t4 - run online tail/full compaction

t5 - run DSGC on a shared S3 datastore

And we repeat this daily.

Should we change this?

With regard to the documentation, the last line on [https://jackrabbit.apache.org/oak/docs/plugins/blobstore.html] reads:

{noformat}
The details on how to execute the command and the different parameters can be checked in the readme for the oak-run module.{noformat}
That is enough, but maybe the information about the head state should be added to the explanation of the --verbose option on [https://github.com/apache/jackrabbit-oak/tree/trunk/oak-run#oak-datastore-check]?

 

Hope to hear from you soon.

Kind regards

Wim

 

> oak-run datastorecheck and online consistency check falsely report missing blobs
> --------------------------------------------------------------------------------
>
>                 Key: OAK-8170
>                 URL: https://issues.apache.org/jira/browse/OAK-8170
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar
>    Affects Versions: 1.8.9
>            Reporter: Wim Symons
>            Priority: Major
>         Attachments: output.txt
>
>
> Hi,
> We found that oak-run datastorecheck falsely reports missing blobs when run without the --verbose option.
> Even the online datastore consistency check falsely reports the same missing blobs.
> This is because the standard blob reference collector in oak-run datastorecheck looks at *all* compaction generations in the segment store instead of only the last one.
> After running an offline compaction, and thus keeping only 1 generation, the correct number of blob references and missing blobs is reported by oak-run datastorecheck.
> The bug on the 1.8 branch comes from org.apache.jackrabbit.oak.plugins.blob.BlobReferenceRetriever#collectReferences (line 429) and by following that you arrive at org.apache.jackrabbit.oak.segment.file.FileStore#tarFiles (line 1013) stating:
> tarFiles.collectBlobReferences(collector,
>  newOldReclaimer(lastCompactionType, getGcGeneration(), gcOptions.getRetainedGenerations()));
> I'm not familiar enough with this source code, so I won't attempt adding a patch.
> I did double-check trunk and saw the same line of code there: org.apache.jackrabbit.oak.segment.file.GarbageCollector#collectBlobReferences (line 324).
> I attached a text file with the outputs of the commands I ran.
> We currently use Oak 1.8.9 with AEM 6.4.3.0 and oak-blob-cloud 1.8.9 from the 1.8.3 AEM S3 connector.
> Regards
> Wim
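
To make the hypothesis in the description above concrete, here is a minimal toy model in Java. It is not Oak code: the Segment class, the collectReferences and missing helpers, and the blob ids are all hypothetical, and the data store contents are assumed purely for illustration. It only shows how collecting references from every retained generation can report a blob as missing that the latest generation no longer references.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class GenerationSketch {

    /** Hypothetical stand-in for a segment: a compaction generation plus the blob ids it references. */
    static final class Segment {
        final int generation;
        final Set<String> blobIds;

        Segment(int generation, Set<String> blobIds) {
            this.generation = generation;
            this.blobIds = blobIds;
        }
    }

    /** Collects the blob ids referenced by segments whose generation is at least minGeneration. */
    static Set<String> collectReferences(List<Segment> segments, int minGeneration) {
        Set<String> refs = new TreeSet<>();
        for (Segment s : segments) {
            if (s.generation >= minGeneration) {
                refs.addAll(s.blobIds);
            }
        }
        return refs;
    }

    /** Returns the referenced ids that are absent from the (assumed) data store contents. */
    static Set<String> missing(Set<String> refs, Set<String> dataStore) {
        Set<String> result = new TreeSet<>(refs);
        result.removeAll(dataStore);
        return result;
    }

    public static void main(String[] args) {
        // Assumption: generation 1 is an older retained generation, generation 2 is the latest.
        List<Segment> segments = List.of(
                new Segment(1, Set.of("blob-a", "blob-old")),
                new Segment(2, Set.of("blob-a", "blob-new")));
        // Assumption: the data store no longer holds "blob-old".
        Set<String> dataStore = Set.of("blob-a", "blob-new");

        System.out.println("all generations: " + missing(collectReferences(segments, 1), dataStore));
        System.out.println("latest only:     " + missing(collectReferences(segments, 2), dataStore));
    }
}
```

With these assumed inputs, the first call reports blob-old as missing and the second reports nothing. Whether Oak's real collector and DSGC behave this way is exactly the open question in this issue.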



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)