Posted to oak-issues@jackrabbit.apache.org by "Tomek Rękawek (JIRA)" <ji...@apache.org> on 2015/08/27 13:54:46 UTC

[jira] [Commented] (OAK-3148) Online migration process for the binaries

    [ https://issues.apache.org/jira/browse/OAK-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716524#comment-14716524 ] 

Tomek Rękawek commented on OAK-3148:
------------------------------------

I prepared a patch with this feature. I tested it with the recent AEM and it took 10 minutes to migrate blobs from {{FileDataStore}} to {{FileBlobStore}}. More information can be found in the following migration scenario for an OSGi-based installation:

*Requirements*
* An OSGi-based Oak installation (e.g. Sling or AEM).
* The node store must be configured to use an external blob store ({{customBlobStore=true}}).

*1. Enabling SplitBlobStore*
Steps:
# Add the {{split.blobstore=old}} OSGi property to the configuration of the source blob store.
# Configure the destination blob store and add the {{split.blobstore=new}} property to its OSGi configuration.
# Create a configuration for the {{org.apache.jackrabbit.oak.spi.blob.osgi.SplitBlobStoreService}}.
#* It may be empty or contain just one parameter:
{code}
repository.home=crx-quickstart/repository
{code}
#* The directory is used to save the {{migrated_blobs.txt}} file.
# (optional) Restart the instance

After the instance starts, the {{SplitBlobStoreService}} waits until both blob stores marked with the {{split.blobstore}} property (the {{old}} one and the {{new}} one) are available. It then binds them and registers the {{SplitBlobStore}} in OSGi. Meanwhile, the {{NodeStoreService}} ignores blob stores configured with the {{split.blobstore}} property and waits until the {{SplitBlobStore}} is available.

From this point on, all new blobs are saved in the new blob store. Binaries in the old blob store remain available for reading.
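
The read and write routing described here (and in the acceptance criteria quoted below) can be illustrated with a minimal in-memory sketch. All type names in it ({{SimpleBlobStore}}, {{InMemoryStore}}, {{SplitStore}}) are hypothetical and are not Oak's API; the sketch only assumes that writes go to the new store and that reads try the new store first and then fall back to the old one.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Simplified illustration only; these types are hypothetical and are not Oak's API.
class SplitStoreSketch {

    /** Minimal blob store abstraction (hypothetical). */
    interface SimpleBlobStore {
        void put(String blobId, byte[] data);

        byte[] get(String blobId); // returns null when the blob is not present
    }

    /** In-memory store standing in for the old and the new blob store. */
    static class InMemoryStore implements SimpleBlobStore {
        private final Map<String, byte[]> blobs = new HashMap<>();

        @Override
        public void put(String blobId, byte[] data) {
            blobs.put(blobId, data);
        }

        @Override
        public byte[] get(String blobId) {
            return blobs.get(blobId);
        }
    }

    /** Writes go to the new store; reads try the new store first, then the old one. */
    static class SplitStore implements SimpleBlobStore {
        private final SimpleBlobStore oldStore;
        private final SimpleBlobStore newStore;

        SplitStore(SimpleBlobStore oldStore, SimpleBlobStore newStore) {
            this.oldStore = oldStore;
            this.newStore = newStore;
        }

        @Override
        public void put(String blobId, byte[] data) {
            newStore.put(blobId, data); // the old store is never written to
        }

        @Override
        public byte[] get(String blobId) {
            byte[] data = newStore.get(blobId);
            return data != null ? data : oldStore.get(blobId);
        }
    }

    public static void main(String[] args) {
        InMemoryStore oldStore = new InMemoryStore();
        InMemoryStore newStore = new InMemoryStore();
        oldStore.put("legacy", "old data".getBytes(StandardCharsets.UTF_8));

        SplitStore split = new SplitStore(oldStore, newStore);
        split.put("fresh", "new data".getBytes(StandardCharsets.UTF_8));

        // The legacy blob is still readable through the split store.
        System.out.println(new String(split.get("legacy"), StandardCharsets.UTF_8));
        // The fresh blob exists only in the new store.
        System.out.println(newStore.get("fresh") != null && oldStore.get("fresh") == null);
    }
}
```

Because the old store is only ever read, it can be removed once every blob has been copied, which is what steps 2 and 3 rely on.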

The {{split.blobstore}} property support was added to {{FileBlobStore}}, {{AbstractDataStoreService}} (handling all Jackrabbit data stores), {{DocumentNodeStoreService}} and {{SegmentNodeStoreService}}.

*2. Migration*
Steps:
# Find the {{BlobMigration}} JMX bean in the Felix console.
# Run the {{startBlobMigration(false)}} operation.

The migration can be stopped with {{stopBlobMigration()}} and later resumed with {{startBlobMigration(true)}}. The current stats are also available via JMX:

* last processed path,
* number of migrated nodes.
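
The start, stop and resume behaviour can be pictured with the following minimal sketch. It is not the code from the patch: the class {{BlobMigrationSketch}}, its fields and the {{maxNodes}} parameter are hypothetical. {{maxNodes}} only simulates a stop request arriving after a given number of nodes, so the example stays single-threaded. The sketch also assumes that a non-resuming start begins again from the first path and resets the counter, and it leaves out updating and committing the node's blob reference.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Simplified illustration only; the class, its fields and maxNodes are hypothetical, not Oak's API.
class BlobMigrationSketch {
    private final TreeMap<String, String> blobIdByPath; // node path -> blob id, in path order
    private final Map<String, byte[]> oldStore;
    private final Map<String, byte[]> newStore;

    private String lastProcessedPath; // null until the first node has been processed
    private int migratedNodes;

    BlobMigrationSketch(Map<String, String> blobIdByPath,
                        Map<String, byte[]> oldStore,
                        Map<String, byte[]> newStore) {
        this.blobIdByPath = new TreeMap<>(blobIdByPath);
        this.oldStore = oldStore;
        this.newStore = newStore;
    }

    /**
     * Loose analogue of startBlobMigration(resume). The maxNodes argument
     * stands in for stopBlobMigration() being called after that many nodes.
     */
    void start(boolean resume, int maxNodes) {
        if (!resume) {
            // Assumption: a fresh start forgets the previous position and counter.
            lastProcessedPath = null;
            migratedNodes = 0;
        }
        NavigableMap<String, String> remaining = lastProcessedPath == null
                ? blobIdByPath
                : blobIdByPath.tailMap(lastProcessedPath, false);

        int processedThisRun = 0;
        for (Map.Entry<String, String> entry : remaining.entrySet()) {
            if (processedThisRun >= maxNodes) {
                break; // simulated stop request
            }
            String blobId = entry.getValue();
            if (!newStore.containsKey(blobId)) {
                newStore.put(blobId, oldStore.get(blobId)); // copy the binary
            }
            lastProcessedPath = entry.getKey();
            migratedNodes++;
            processedThisRun++;
        }
    }

    String getLastProcessedPath() {
        return lastProcessedPath;
    }

    int getMigratedNodes() {
        return migratedNodes;
    }

    public static void main(String[] args) {
        Map<String, String> paths = new HashMap<>();
        paths.put("/a", "b1");
        paths.put("/b", "b2");
        paths.put("/c", "b3");
        Map<String, byte[]> oldStore = new HashMap<>();
        oldStore.put("b1", new byte[] {1});
        oldStore.put("b2", new byte[] {2});
        oldStore.put("b3", new byte[] {3});
        Map<String, byte[]> newStore = new HashMap<>();

        BlobMigrationSketch migration = new BlobMigrationSketch(paths, oldStore, newStore);
        migration.start(false, 2);                // start, then "stop" after two nodes
        System.out.println(migration.getLastProcessedPath() + " " + migration.getMigratedNodes());
        migration.start(true, Integer.MAX_VALUE); // resume from the last processed path
        System.out.println(migration.getLastProcessedPath() + " " + migration.getMigratedNodes());
    }
}
```

Keeping the last processed path is what makes a resume cheap in this sketch: the second run continues after that path instead of walking the whole tree again.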

*3. Switching to the new blob store*
When the migration is finished, it's possible to completely switch to the new blob store:
# Remove the configuration for the old blob store.
# Remove the configuration for the {{SplitBlobStoreService}}.
# Remove the {{split.blobstore=new}} OSGi property from the new blob store, so that it can be found by the {{NodeStoreService}}.
# (optional) Restart the instance, so there are no JCR sessions bound to the old {{NodeState}}.

Migration is complete!

> Online migration process for the binaries
> -----------------------------------------
>
>                 Key: OAK-3148
>                 URL: https://issues.apache.org/jira/browse/OAK-3148
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: blob, upgrade
>            Reporter: Tomek Rękawek
>            Priority: Minor
>
> For clients that want to migrate their blob stores, let's add a new feature that allows copying them in the background.
> AC:
> # SplitBlobStore
> ## Administrator can configure Oak to use the {{SplitBlobStore}} that references the source (old) and the destination (new) blob store.
> ## Data stores can be used as well via the {{DataStoreBlobStore}}.
> ## On the read operation, if the requested blob exists on the new store, SplitBlobStore will return it.
> ## Otherwise, SplitBlobStore will try to read the blob from the old store.
> ## All write requests will be directed to the new blob store.
> # Copy process
> ## Administrator can start, stop and resume the copy process using JMX commands.
> ## Administrator can see the progress in JMX and in the logs.
> ## The process will read the {{SplitBlobStore}} configuration and copy the binaries from source to destination.
> ## Once a binary is moved, its reference in the {{NodeStore}} is updated and committed.
> ## Only the head revision has to be updated.
> The idea is that after all binaries are copied, the old revisions will be gradually removed by the compaction mechanisms and then the binaries will be removed from the source store by the blob garbage collector. Future improvements are possible, e.g. invoking the compaction and GC manually.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)