Posted to oak-commits@jackrabbit.apache.org by am...@apache.org on 2016/07/21 06:24:56 UTC

svn commit: r1753641 - /jackrabbit/oak/trunk/oak-doc/src/site/markdown/plugins/blobstore.md

Author: amitj
Date: Thu Jul 21 06:24:56 2016
New Revision: 1753641

URL: http://svn.apache.org/viewvc?rev=1753641&view=rev
Log:
OAK-301: Document Oak

Documentation for blob GC #checkConsistency and #getGlobalGCStatus

Modified:
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/plugins/blobstore.md

Modified: jackrabbit/oak/trunk/oak-doc/src/site/markdown/plugins/blobstore.md
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/plugins/blobstore.md?rev=1753641&r1=1753640&r2=1753641&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/plugins/blobstore.md (original)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/plugins/blobstore.md Thu Jul 21 06:24:56 2016
@@ -144,6 +144,9 @@ The garbage collection can be triggered
 
 * `MarkSweepGarbageCollector#collectGarbage()` (Oak 1.0.x)
 * `MarkSweepGarbageCollector#collectGarbage(false)` (Oak 1.2.x)
+* If the MBeans are registered in the MBeanServer then the following can also be used to trigger GC:
+    * `BlobGC#startBlobGC()`, which takes a boolean `markOnly` parameter indicating whether to run only the mark phase or the complete GC
+
  
 #### Shared DataStore Blob Garbage Collection (Since 1.2.0)
 
@@ -175,6 +178,105 @@ The shared DataStore garbage collection
 * SharedS3DataStore - Extends the S3DataStore to enable sharing of the data store with
                         multiple repositories                        
  
+##### Checking GC status for Shared DataStore Garbage Collection
+
+The status of the GC operations on all the repositories connected to the DataStore can be checked by calling:
+
+* `MarkSweepGarbageCollector#getStats()` which returns a list of `GarbageCollectionRepoStats` objects having the following fields:
+    * repositoryId - The repositoryId of the repository
+    * local - Indicates whether the repositoryId belongs to the local instance where the operation ran
+    * startTime - Start time of the mark operation on the repository
+    * endTime - End time of the mark operation on the repository
+    * length - Size of the references file created
+    * numLines - Number of references available
+* If the MBeans are registered in the MBeanServer then the following can also be used to retrieve the status:
+    * `BlobGC#getGlobalMarkStats()` (the method called in the code example below), which returns a `TabularData` with one row per repository.
+    
+This operation can also be used to determine when the 'Mark' phase has completed successfully on all the repositories, as part of the steps to automate GC in a Shared DataStore configuration.
+Checking that the references file is available for all repositories should be a sufficient condition.
+If the server running Oak has remote JMX connections enabled, the following code example can be used to connect remotely and check whether the mark phase has concluded on all repository instances.
+
+
+```java
+import java.util.Hashtable;
+
+import javax.management.MBeanServerConnection;
+import javax.management.MBeanServerInvocationHandler;
+import javax.management.ObjectName;
+import javax.management.openmbean.CompositeData;
+import javax.management.openmbean.TabularData;
+import javax.management.remote.JMXConnectorFactory;
+import javax.management.remote.JMXServiceURL;
+import org.apache.jackrabbit.oak.plugins.blob.BlobGCMBean;
+
+/**
+ * Checks the status of the mark operation on all instances sharing the DataStore.
+ */
+public class GetGCStats {
+
+    public static void main(String[] args) throws Exception {
+        String userid = "<user>";
+        String password = "<password>";
+        String serverUrl = "service:jmx:rmi:///jndi/rmi://<host:port>/jmxrmi";
+        String OBJECT_NAME = "org.apache.jackrabbit.oak:name=Document node store blob garbage collection,type=BlobGarbageCollection";
+        String[] buffer = new String[] {userid, password};
+        Hashtable<String, String[]> attributes = new Hashtable<String, String[]>();
+        attributes.put("jmx.remote.credentials", buffer);
+        MBeanServerConnection server = JMXConnectorFactory
+            .connect(new JMXServiceURL(serverUrl), attributes).getMBeanServerConnection();
+        ObjectName name = new ObjectName(OBJECT_NAME);
+        BlobGCMBean gcBean = MBeanServerInvocationHandler
+            .newProxyInstance(server, name, BlobGCMBean.class, false);
+
+        boolean markDone = checkMarkDone("GlobalMarkStats", gcBean.getGlobalMarkStats());
+        System.out.println("Mark done on all instances - " + markDone);
+    }
+
+    public static boolean checkMarkDone(String operation, TabularData data) {
+        System.out.println("-----Operation " + operation + "--------------");
+
+        boolean markDoneOnOthers = true;
+        try {
+            System.out.println("Number of instances " + data.size());
+
+            for (Object o : data.values()) {
+                CompositeData row = (CompositeData) o;
+                String repositoryId = row.get("repositoryId").toString();
+                System.out.println("Repository  " + repositoryId);
+
+                if ((!row.containsKey("markEndTime")
+                        || row.get("markEndTime") == null
+                        || row.get("markEndTime").toString().length() == 0)) {
+                    markDoneOnOthers = false;
+                    System.out.println("Mark not done on repository : " + repositoryId);
+                }
+            }
+        } catch (Exception e) {
+            markDoneOnOthers = false; // a failed check must not be reported as success
+            System.out.println("-----Error during operation " + operation + "--------------" + e.getMessage());
+        }
+        System.out.println("-----Completed " + operation + "--------------");
+
+        return markDoneOnOthers;
+    }
+}
+```
+
+#### Consistency Check
+The data store consistency check will report any data store binaries that are missing but are still referenced. The consistency check can be triggered by:
+
+* `MarkSweepGarbageCollector#checkConsistency` 
+* If the MBeans are registered in the MBeanServer then the following can also be used:
+    * `BlobGCMBean#checkConsistency`
+
+After the consistency check is complete, a message will show the number of binaries reported as missing. If the number is greater than 0, check the logs configured for `org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector` for more details on the missing binaries. 
+
+Below is an example of how the missing binaries are reported in the logs:
+>
+> 11:32:39.673 INFO [main] MarkSweepGarbageCollector.java:600 Consistency check found [1] missing blobs
+> 11:32:39.673 WARN [main] MarkSweepGarbageCollector.java:602 Consistency check failure in the the blob store : DataStore backed BlobStore [org.apache.jackrabbit.oak.plugins.blob.datastore.OakFileDataStore], check missing candidates in file /tmp/gcworkdir-1467352959243/gccand-1467352959243
+
+
 
 [1]: http://serverfault.com/questions/52861/how-does-dropbox-version-upload-large-files
 [2]: http://wiki.apache.org/jackrabbit/DataStore
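
The mark-status check in the `GetGCStats` example added by this commit can be tried without a running repository. The following is a minimal sketch, not part of the commit: it builds a `TabularData` by hand using only the JDK's `javax.management.openmbean` classes and applies the same `markEndTime` test. The class name `MarkStatusSketch`, the helper `sampleStats`, the type names and the sample rows are all hypothetical; only the `repositoryId` and `markEndTime` keys are taken from the example in the diff.

```java
import java.util.ArrayList;
import java.util.List;

import javax.management.openmbean.CompositeData;
import javax.management.openmbean.CompositeDataSupport;
import javax.management.openmbean.CompositeType;
import javax.management.openmbean.OpenDataException;
import javax.management.openmbean.OpenType;
import javax.management.openmbean.SimpleType;
import javax.management.openmbean.TabularData;
import javax.management.openmbean.TabularDataSupport;
import javax.management.openmbean.TabularType;

// Hypothetical helper class; it does not talk to Oak, it only mimics the table shape.
class MarkStatusSketch {

    // Keys read by the check; taken from the GetGCStats example in the diff.
    private static final String[] FIELDS = {"repositoryId", "markEndTime"};

    /** Builds a sample table from {repositoryId, markEndTime} pairs (made-up data). */
    static TabularData sampleStats(String[][] rows) {
        try {
            CompositeType rowType = new CompositeType(
                "SampleMarkStats", "Sample mark stats row", FIELDS, FIELDS,
                new OpenType<?>[] {SimpleType.STRING, SimpleType.STRING});
            TabularType tableType = new TabularType(
                "SampleMarkStatsTable", "Sample mark stats", rowType,
                new String[] {"repositoryId"});
            TabularData table = new TabularDataSupport(tableType);
            for (String[] row : rows) {
                table.put(new CompositeDataSupport(
                    rowType, FIELDS, new Object[] {row[0], row[1]}));
            }
            return table;
        } catch (OpenDataException e) {
            throw new IllegalStateException("Could not build sample data", e);
        }
    }

    /** Returns the ids of repositories whose row has a missing or empty markEndTime. */
    static List<String> pendingRepositories(TabularData data) {
        List<String> pending = new ArrayList<>();
        for (Object o : data.values()) {
            CompositeData row = (CompositeData) o;
            Object end = row.containsKey("markEndTime") ? row.get("markEndTime") : null;
            if (end == null || end.toString().isEmpty()) {
                pending.add(String.valueOf(row.get("repositoryId")));
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        // repo-b has an empty markEndTime, so it is reported as still pending.
        TabularData stats = sampleStats(new String[][] {
            {"repo-a", "2016-07-21 06:00:00"},
            {"repo-b", ""}
        });
        System.out.println("Mark pending on: " + pendingRepositories(stats));
    }
}
```

Returning the list of pending repository ids, rather than a single boolean, makes it easier for an automation script to report which instance is holding up the sweep phase; an empty list corresponds to "mark done on all instances".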