Posted to oak-commits@jackrabbit.apache.org by ch...@apache.org on 2017/03/23 10:26:56 UTC

svn commit: r1788213 - in /jackrabbit/oak/trunk/oak-doc/src/site: ./ markdown/ markdown/nodestore/ markdown/nodestore/document/

Author: chetanm
Date: Thu Mar 23 10:26:55 2017
New Revision: 1788213

URL: http://svn.apache.org/viewvc?rev=1788213&view=rev
Log:
OAK-5918 - Document enhancements in DocumentNodeStore in 1.6
OAK-4180 - Use another NodeStore as a local cache for a remote Document store

Document Secondary NodeStore feature

Added:
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-read.png   (with props)
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-read.puml
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-write.png   (with props)
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-write.puml
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store.md
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store.png   (with props)
Modified:
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/documentmk.md
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/osgi_config.md
    jackrabbit/oak/trunk/oak-doc/src/site/site.xml

Added: jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-read.png
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-read.png?rev=1788213&view=auto
==============================================================================
Binary file - no diff available.

Propchange: jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-read.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-read.puml
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-read.puml?rev=1788213&view=auto
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-read.puml (added)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-read.puml Thu Mar 23 10:26:55 2017
@@ -0,0 +1,43 @@
+@startuml
+
+title Read flow for Secondary NodeStore
+autonumber
+hide footbox
+
+participant "NodeStore\nClient" as NS
+participant "Document\nNodeStore" as DNS
+database "Cache"
+database "Secondary Store" as SS
+database Mongo
+
+NS -> DNS : Read /a/b@r1
+
+DNS -> SS : Read /a/b@r1
+alt Found in Secondary Store
+  SS -> DNS
+  note over Cache
+     Secondary Store has nodes under
+     /a/b for revisions <= r1
+  end note
+  DNS -> NS : Read done from \nSecondary Store
+  note left
+      Further reads
+      of children under
+      /a/b would be done
+      from Secondary Store
+  end note
+else Not in Secondary Store
+  SS -> DNS :
+  DNS -> Cache : Try to read from cache \n else load from Mongo
+  alt Found in cache
+   Cache -> DNS    : /a/b@r1 already in cache
+   DNS -> NS       : Read done \n from memory cache
+   else Not found in cache
+    Cache -> Mongo : Read 2:/a/b from Mongo
+    Mongo -> Cache : Return 2:/a/b
+    Cache -> DNS   : Construct /a/b@r1\n and cache it
+    DNS -> NS      : Read done \n from Mongo
+    end
+end
+
+@enduml
\ No newline at end of file

Added: jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-write.png
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-write.png?rev=1788213&view=auto
==============================================================================
Binary file - no diff available.

Propchange: jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-write.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-write.puml
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-write.puml?rev=1788213&view=auto
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-write.puml (added)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store-write.puml Thu Mar 23 10:26:55 2017
@@ -0,0 +1,41 @@
+@startuml
+
+title Write or Update flow for Secondary NodeStore
+autonumber
+
+
+participant "NodeStore\nClient" as NS
+participant "Document\nNodeStore" as DNS
+database "Cache"
+database "Secondary Store" as SS
+database Mongo
+
+NS -> DNS : Write /a/b@r1
+
+DNS -> Mongo : Write /a/b@r1
+DNS -> Cache : Cache /a/b@r1
+DNS -> NS    : Commit completed
+
+...
+autonumber 1
+== Local Change Event==
+DNS -> SS    : Content changed event /a/b
+SS ->  SS    : Commit changes \n to local store and \n update head revision
+
+...
+autonumber 1
+== External Change Event==
+DNS -> Mongo : Background read for head revision
+DNS -> Mongo : Read Journal entry for changes \n done from other cluster nodes
+Mongo -> DNS : Changed paths
+DNS -> SS    : Content changed event for external changes
+SS ->  SS    : Commit changes \n to local store and \n update head revision
+
+...
+autonumber 1
+== Startup Sync==
+DNS -> SS    : Inform observer about current head revision at startup
+SS  -> DNS   : Perform diff between local store \n head revision and current head revision
+SS ->  SS    : Apply diff \n to local store and \n update head revision
+
+@enduml
\ No newline at end of file

Added: jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store.md
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store.md?rev=1788213&view=auto
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store.md (added)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store.md Thu Mar 23 10:26:55 2017
@@ -0,0 +1,179 @@
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+  -->
+  
+# <a name="secondary-node-store"></a> Secondary NodeStore
+ 
+* [Secondary NodeStore](#secondary-node-store)
+   * [Read Flow](#read-flow)
+   * [Write Flow](#write-flow)
+       * [Local Changes](#write-flow-local-changes)
+       * [External Changes](#write-flow-external-changes)
+       * [Startup Sync](#write-flow-startup-sync)
+   * [Setup](#usage)
+   * [Setup Considerations](#setup-considerations)
+   * [Administration](#administration)
+       * [Maintenance](#secondary-store-maintenance)
+       * [New Cluster Member](#secondary-store-cluster)
+         
+`@since Oak 1.6`
+`Experimental Feature`
+ 
+Compared to SegmentNodeStore, DocumentNodeStore has higher read latency for data that is not present in the cache.
+This is due to the multiple round trips required to serve a hierarchical read over remote storage.
+For example, reading the content of path _/content/assets/nature/sunrise.jpg_ requires around 4 remote calls if the
+path content is not present in the local cache.
+[Persistent Cache](../persistent-cache.html) improves this by caching much more content off heap
+than the limited in-memory cache can hold.
+ 
+With the new [Secondary NodeStore][OAK-1312] support it is now possible to configure a SegmentNodeStore as a secondary
+store that keeps content under a certain set of paths locally.
+The SegmentNodeStore acts as a local copy of the remote repository (the secondary store), much like a local git repo,
+and is updated from the primary store (remote Mongo storage) via observation.
+Writes are still routed to the primary store, but reads can be served from the local secondary store.
+  
+![Secondary Store Setup](secondary-store.png)
+
+In the above setup 2 Oak cluster nodes connect to the same Mongo server. In each Oak instance a SegmentNodeStore is
+configured as a secondary store. This store is updated by an observer.
+
+**Experimental Feature**
+
+This feature is currently experimental. The following feature item is still pending:
+
+* [OAK-5352][OAK-5352] - Support for maintenance task for secondary NodeStore
+
+## <a name="read-flow"></a> Read Flow
+
+Reading /a/b at revision r1 happens as shown below:
+
+![Secondary Store Read Flow](secondary-store-read.png)  
+
+
+Key points here
+
+* Secondary NodeStore can be configured to only store content under certain paths
+* A read is first attempted from any configured secondary NodeStore.
+    1. Secondary NodeStore checks whether it stores content for that path. Note that it can be configured with path
+       inclusions
+    2. It then checks whether its root revision is later than the revision at which the read is being requested
+    3. If the NodeState at the given path and revision is found then it is returned
+* If the read is not possible from secondary NodeStore then it is done from the in-memory cache, which may in turn
+   read from remote Mongo in case of a cache miss
+* If the read from Secondary NodeStore (which is based on SegmentNodeStore) succeeds then further child reads from
+   that path are handled directly by SegmentNodeStore, bypassing DocumentNodeStore altogether. So if /a/b@r1 is
+   found in secondary then a read for /a/b/c@r1 is handled directly by the secondary store
+    
+Note that even if the root revision of the secondary store lags behind the current head, a read for /a/b can still
+be handled by the secondary store if /a has not been modified recently. So those parts of the repo which have not been
+recently modified would most likely be served from Secondary NodeStore, avoiding remote calls to Mongo.
+ 
+## <a name="write-flow"></a> Write Flow
+
+Updates to the secondary store happen in 3 ways
+
+![Secondary Store Write Flow](secondary-store-write.png)  
+
+Key points here
+
+* Writes done by the NodeStore caller, i.e. the JCR layer, are always done on the primary store, i.e. Mongo
+* Secondary NodeStore is updated via Oak Observation support and NodeState diff
+* Secondary NodeStore can be configured with a path filter, in which case it is only interested in
+  changes for the configured paths
+
+
+### <a name="write-flow-local-changes"></a> Local Changes
+
+For local changes done on that cluster node the writes are applied as part of the Observation call in which
+DocumentNodeStore sends a content change callback to all registered observers. Here Secondary NodeStore registers
+itself as an Observer and listens for such callbacks.
+
+So upon any local change it gets a callback with the latest root state. There it performs a diff between
+the local head revision and the new head revision and applies the changes to the local store.
+
+### <a name="write-flow-external-changes"></a> External Changes
+
+DocumentNodeStore periodically performs [background reads](../documentmk.html#bg-read) to pick up changes from other
+cluster nodes. Such a change is then pushed to registered observers as an external change. Secondary NodeStore uses
+the same flow as for local changes to update its state.
+
+This diff-based update mechanism only reads content from remote for the changed paths, and further only for
+paths in which secondary NodeStore is interested.
+
+### <a name="write-flow-startup-sync"></a> Startup Sync
+
+If the cluster node is shut down and later started, then at startup the secondary NodeStore tries to synchronize
+its state with remote storage again by performing a diff between the local head revision and the remote head revision.
+This is done asynchronously and does not block the startup.
+
+
+## <a name="usage"></a> Setup 
+
+To enable this feature the following OSGi configurations are required
+
+**1. Configure SegmentNodeStore in secondary role**
+
+Create an OSGi config file `org.apache.jackrabbit.oak.segment.SegmentNodeStoreFactory-secondary.config` with the
+following content
+
+    role="secondary"
+    
+This creates a SegmentNodeStore in secondary role which uses the default `segmentstore-secondary` directory to store
+the segment files. Refer to [config options](../../osgi_config.html#config-SegmentNodeStoreService) for more details.
+Note that all the options for `SegmentNodeStoreService` are also applicable to `SegmentNodeStoreFactory`.
+
+**2. Configure SecondaryStoreCacheService (optional)**
+
+By default the secondary NodeStore is activated based on the previous config alone. However it can be tuned further
+by creating an OSGi config file `org.apache.jackrabbit.oak.plugins.document.secondary.SecondaryStoreCacheService.config`
+
+    includedPaths=[ \
+      "/libs", \
+      "/apps", \
+      "/content"
+      ]
+
+The above config enables secondary NodeStore for the paths `/libs`, `/apps` and `/content`.
+
+## <a name="setup-considerations"></a> Setup Considerations
+
+While enabling the secondary NodeStore feature the following aspects need to be considered
+
+* SegmentNodeStore used as secondary NodeStore competes for system resources like memory with the
+  in-memory caches of DocumentNodeStore and Lucene index files. So the system must have sufficient memory
+  for all these 3 components
+* SegmentNodeStore can be copied from any existing cluster node to a new node.
+* If this is being enabled for an existing setup then the initial sync would take some time. So take that into
+  account while planning to enable this feature
+* For best performance include those paths of the repository which are accessed by end users, especially those
+  content paths where the read to write ratio is high.
+
+## <a name="administration"></a> Administration
+
+### <a name="secondary-store-maintenance"></a> Maintenance
+
+Certain maintenance tasks like [online RevisionGC](../segment/overview.html#garbage-collection) need to be enabled
+for the secondary NodeStore, i.e. SegmentNodeStore. (This feature is currently pending [OAK-5352][OAK-5352]).
+
+This ensures that older revisions get garbage collected.
+
+### <a name="secondary-store-cluster"></a> New Cluster Member
+
+If a new Oak server joins the cluster then it should be set up by cloning the secondary NodeStore from an existing
+cluster member, otherwise the initial sync would take a long time to complete.
+
+[OAK-5352]: https://issues.apache.org/jira/browse/OAK-5352
+[OAK-1312]: https://issues.apache.org/jira/browse/OAK-1312

Added: jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store.png
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store.png?rev=1788213&view=auto
==============================================================================
Binary file - no diff available.

Propchange: jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/document/secondary-store.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Modified: jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/documentmk.md
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/documentmk.md?rev=1788213&r1=1788212&r2=1788213&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/documentmk.md (original)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/nodestore/documentmk.md Thu Mar 23 10:26:55 2017
@@ -51,6 +51,7 @@ The document storage optionally uses the
 ## <a name="new-1.6"></a> New in 1.6
 
 * [Node Bundling](#node-bundling)
+* [Secondary Store](#secondary-store)
 
 
 ## <a name="backend-implementations"></a> Backend implementations
@@ -617,6 +618,12 @@ example unlocks an upgrade to 1.8 with a
 Please note that unlocking an upgrade is only possible when all cluster nodes
 are inactive, otherwise the command will refuse to change the format version.
 
+## <a name="secondary-store"></a> Secondary Store
+
+`@since Oak 1.6`
+
+Refer to [Secondary Store](document/secondary-store.html)
+
 [1]: http://docs.mongodb.org/manual/core/read-preference/
 [2]: http://docs.mongodb.org/manual/core/write-concern/
 [3]: http://docs.mongodb.org/manual/reference/connection-string/#read-preference-options

Modified: jackrabbit/oak/trunk/oak-doc/src/site/markdown/osgi_config.md
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/osgi_config.md?rev=1788213&r1=1788212&r2=1788213&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/osgi_config.md (original)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/osgi_config.md Thu Mar 23 10:26:55 2017
@@ -49,7 +49,7 @@ This implementation is the newest and th
 The second and last configuration, identified by `org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStoreService`, refers to the old implementation of the Node Store provided by the `oak-segment` bundle.
 This implementation has been deprecated, will not receive any further improvements and should not be used, if possible.
 
-##### org.apache.jackrabbit.oak.segment.SegmentNodeStoreService
+##### <a name="config-SegmentNodeStoreService"></a> org.apache.jackrabbit.oak.segment.SegmentNodeStoreService
 
 repository.home (string) - repository
 : A path on the file system where repository data will be stored.

Modified: jackrabbit/oak/trunk/oak-doc/src/site/site.xml
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/site.xml?rev=1788213&r1=1788212&r2=1788213&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/site.xml (original)
+++ jackrabbit/oak/trunk/oak-doc/src/site/site.xml Thu Mar 23 10:26:55 2017
@@ -41,6 +41,7 @@ under the License.
       <item href="nodestore/overview.html" name="Node Storage" collapse="false">
         <item href="nodestore/documentmk.html" name="Document NodeStore" collapse="false">
           <item href="nodestore/document/node-bundling.html" name="Node Bundling" />
+          <item href="nodestore/document/secondary-store.html" name="Secondary Store" />
           <item href="nodestore/persistent-cache.html" name="Persistent Cache" />
           <item href="clustering.html" name="Clustering" />
         </item>