Posted to oak-issues@jackrabbit.apache.org by "Tomek Rękawek (JIRA)" <ji...@apache.org> on 2018/05/14 08:00:00 UTC

[jira] [Updated] (OAK-7339) Fix all sidegrades breaking with UnsupportedOperationException on MissingBlobStore by introducing LoopbackBlobStore

     [ https://issues.apache.org/jira/browse/OAK-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tomek Rękawek updated OAK-7339:
-------------------------------
    Fix Version/s: 1.8.4

> Fix all sidegrades breaking with UnsupportedOperationException on MissingBlobStore by introducing LoopbackBlobStore
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: OAK-7339
>                 URL: https://issues.apache.org/jira/browse/OAK-7339
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: upgrade
>    Affects Versions: 1.6.0, 1.8.0
>            Reporter: Arek Kita
>            Assignee: Tomek Rękawek
>            Priority: Major
>              Labels: candidate_oak_1_2, candidate_oak_1_4, candidate_oak_1_6, candidate_oak_1_8
>             Fix For: 1.9.0, 1.10, 1.8.4
>
>         Attachments: OAK-7339-jenkins-xml-encoding-issue.patch, OAK-7339.patch
>
>
> h4. Problem
> In some edge cases, when the binary under the same path ({{/content/asset1}}) is modified in 2 independent checkpoints (A and B), a sidegrade run without providing a DataStore might fail with the following error:
> {noformat:title=An exception thrown by oak-upgrade tool}
> Caused by: java.lang.UnsupportedOperationException: null
>     at org.apache.jackrabbit.oak.upgrade.cli.blob.MissingBlobStore.getInputStream(MissingBlobStore.java:62)
>     at org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:47)
>     at org.apache.jackrabbit.oak.plugins.segment.SegmentBlob.getNewStream(SegmentBlob.java:276)
>     at org.apache.jackrabbit.oak.plugins.segment.SegmentBlob.getNewStream(SegmentBlob.java:86)
>     at org.apache.jackrabbit.oak.plugins.memory.AbstractBlob$1.openStream(AbstractBlob.java:44)
>     at com.google.common.io.ByteSource.contentEquals(ByteSource.java:344)
>     at org.apache.jackrabbit.oak.plugins.memory.AbstractBlob.equal(AbstractBlob.java:67)
>     at org.apache.jackrabbit.oak.plugins.segment.SegmentBlob.equals(SegmentBlob.java:227)
>     at com.google.common.base.Objects.equal(Objects.java:60)
>     at org.apache.jackrabbit.oak.plugins.memory.AbstractPropertyState.equal(AbstractPropertyState.java:59)
>     at org.apache.jackrabbit.oak.plugins.segment.SegmentPropertyState.equals(SegmentPropertyState.java:242)
>     at org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareProperties(SegmentNodeState.java:617)
>     at org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:511)
> (the same nested methods)
>     at org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:604)
>     at org.apache.jackrabbit.oak.upgrade.PersistingDiff.diff(PersistingDiff.java:139)
>     at org.apache.jackrabbit.oak.upgrade.PersistingDiff.childNodeChanged(PersistingDiff.java:191)
>     at org.apache.jackrabbit.oak.plugins.segment.MapRecord$3.childNodeChanged(MapRecord.java:440)
>     at org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:483)
>     at org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:432)
>     at org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:604)
>     at org.apache.jackrabbit.oak.upgrade.PersistingDiff.diff(PersistingDiff.java:139)
>     at org.apache.jackrabbit.oak.upgrade.PersistingDiff.applyDiffOnNodeState(PersistingDiff.java:106)
>     at org.apache.jackrabbit.oak.upgrade.RepositorySidegrade.copyDiffToTarget(RepositorySidegrade.java:403)
>     at org.apache.jackrabbit.oak.upgrade.RepositorySidegrade.migrateWithCheckpoints(RepositorySidegrade.java:347)
> {noformat}
>  
> h4. Abstract of proposed solution
> The idea is simple: instead of failing in:
> {code:java}
> public InputStream getInputStream(String blobId) throws IOException;
> {code}
> or
> {code:java}
> public int readBlob(String blobId, long pos, byte[] buff, int off, int length) throws IOException;
> {code}
> let's introduce a BlobStore implementation that acts like a *localhost* (loopback) interface: whatever is sent to it is sent straight back.
> h4. How it works
> It works the same way as a *localhost* interface: when a *{{blobId}}* is requested, the *{{blobId}}* itself is served as the binary content instead *of throwing* {{UnsupportedOperationException}}.
> This allows migrations to proceed quickly when binaries need to be compared in order to decide whether checkpoints have to be rewritten or copied from scratch.
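> The loopback behaviour described above could be sketched roughly as follows (a minimal illustration only; the class and method names here are assumptions based on the BlobStore signatures quoted above, not the actual OAK-7339 patch):
> {code:java}
> import java.io.ByteArrayInputStream;
> import java.io.InputStream;
> import java.nio.charset.StandardCharsets;
>
> class LoopbackSketch {
>     // Instead of throwing UnsupportedOperationException like
>     // MissingBlobStore, echo the blob id itself back as the content.
>     static InputStream getInputStream(String blobId) {
>         return new ByteArrayInputStream(blobId.getBytes(StandardCharsets.UTF_8));
>     }
> }
> {code}
> Because two references to the same binary carry the same {{blobId}}, comparing the echoed streams yields the same equal/not-equal answer as comparing the ids, which is all the checkpoint diff needs.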
> h4. Pros
>  * simplifies common sidegrade migration use cases: you no longer need to include your DataStore (which slows the migration down unnecessarily) on the command line just because the migration would otherwise fail
>  * speeds up the migration, as it doesn't require referencing a BlobStore implementation in cases where binary references are only copied along with the NodeStore's inlined binaries
>  * checkpoints are always copied (no need for the {{--skip-checkpoints}} option anymore), which means no full re-indexing happens on the migrated repository after migration
> h4. Cons and risks
>  * *Low risk:* the user gets no visible indication that a DataStore is actually needed for a specific migration (i.e. whether they are running into this specific edge case)
>  * *Medium risk:* NodeStore storage overhead for checkpoints if the binaries compared across checkpoints have different blob IDs (for example, produced by different hash algorithms: SHA-256 vs SHA-512). *This will cause the comparison to evaluate as not equal, and the node will be rewritten for the checkpoint.*
>  * *Low risk:* {{readBlob}} currently accepts requests that copy fewer bytes than the caller asked for (this can happen if {{getBlobLength}} is not called before {{readBlob}} for some reason and the original length of the binary, as stored in the real DataStore, is kept in a cache somewhere). The API reports how many bytes were actually read anyway, and recommends that callers check this value.
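> The partial-read point in the last bullet can be illustrated with a loopback-style {{readBlob}} that honours the return-value contract (again a hypothetical sketch, not the actual patch; the EOF convention of returning -1 is an assumption modelled on {{InputStream.read}}):
> {code:java}
> import java.nio.charset.StandardCharsets;
>
> class LoopbackReadSketch {
>     // Copy whatever portion of the echoed blob id is available at 'pos'
>     // and report the number of bytes actually read.
>     static int readBlob(String blobId, long pos, byte[] buff, int off, int length) {
>         byte[] data = blobId.getBytes(StandardCharsets.UTF_8);
>         if (pos >= data.length) {
>             return -1; // assumption: -1 signals end of blob, like InputStream.read
>         }
>         int toCopy = Math.min(length, data.length - (int) pos);
>         System.arraycopy(data, (int) pos, buff, off, toCopy);
>         return toCopy; // may be less than 'length'; callers must check this
>     }
> }
> {code}
> A caller expecting the cached (real DataStore) length would ask for more bytes than the echoed id contains; checking the returned count is what keeps such callers correct.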
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)