Posted to oak-commits@jackrabbit.apache.org by md...@apache.org on 2014/06/27 17:29:28 UTC

svn commit: r1606126 - /jackrabbit/oak/trunk/oak-run/README.md

Author: mduerig
Date: Fri Jun 27 15:29:28 2014
New Revision: 1606126

URL: http://svn.apache.org/r1606126
Log:
OAK-1806: Benchmark for blob upload and search longevity
Document the scalability suite in the readme. Credits to Ami Jain for the patch

Modified:
    jackrabbit/oak/trunk/oak-run/README.md

Modified: jackrabbit/oak/trunk/oak-run/README.md
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-run/README.md?rev=1606126&r1=1606125&r2=1606126&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-run/README.md (original)
+++ jackrabbit/oak/trunk/oak-run/README.md Fri Jun 27 15:29:28 2014
@@ -364,3 +364,188 @@ executing. The relevant background threa
 As you can see, the `run()` method of the background task gets invoked
 repeatedly. Such threads will automatically close once all test iterations
 are done, before the `afterSuite()` method is called.
+
+Scalability mode
+--------------
+
+The scalability mode is used for executing scalability suites that measure the
+performance of their associated tests. It can be invoked like this:
+
+    $ java -jar oak-run-*.jar scalability [options] [suites] [fixtures]
+
+The following scalability options (with default values) are currently supported:
+
+    --host localhost       - MongoDB host
+    --port 27017           - MongoDB port
+    --db <name>            - MongoDB database (default is a generated name)
+    --dropDBAfterTest true - Whether to drop the MongoDB database after the test
+    --base target          - Path to the base file (Tar and H2 setup)
+    --mmap <64bit?>        - TarMK memory mapping (the default on 64 bit JVMs)
+    --cache 100            - cache size (in MB)
+    --csvFile <file>       - Optional csv file to report the benchmark results
+    --rdbjdbcuri           - JDBC URL for RDB persistence (defaults to local file-based H2)
+    --rdbjdbcuser          - JDBC username (defaults to "")
+    --rdbjdbcpasswd        - JDBC password (defaults to "")
+
+These options are passed to the various suites and repository fixtures
+that need them. For example, the MongoDB address information is used by the
+MongoMK- and SegmentMK-based repository fixtures. The cache setting
+controls the KernelNodeState cache size in MongoMK and the default H2 MK, and the
+segment cache size in SegmentMK.
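+
+For example, a run against a Mongo-backed fixture might look like the following
+(the host, port, database name and cache size shown are illustrative values):
+
+    $ java -jar oak-run-*.jar scalability --host localhost --port 27017 \
+        --db scalability-test --cache 200 ScalabilityBlobSearchSuite Oak-Mongo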
+
+You can use extra JVM options like `-Xmx` settings to better control the
+scalability suite test environment. It's also possible to attach the JVM to a
+profiler to better understand benchmark results. For example, I'm
+using `-agentlib:hprof=cpu=samples,depth=100` as a basic profiling
+tool, whose results can be processed with `perl analyze-hprof.pl
+java.hprof.txt` to produce a somewhat easier-to-read top-down and
+bottom-up summaries of how the execution time is distributed across
+the benchmarked codebase.
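+
+For instance, a profiled run could be started like this (the heap size and the
+suite/fixture choice are illustrative):
+
+    $ java -Xmx4g -agentlib:hprof=cpu=samples,depth=100 \
+        -jar oak-run-*.jar scalability ScalabilityBlobSearchSuite Oak-Tar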
+
+The scalability suite creates the relevant repository load before starting the tests. 
+Each test case tries to benchmark and profile a specific aspect of the repository.
+
+Each scalability suite is configured to run a number of related tests which require the 
+same base load to be available in the repository.
+Either an entire suite or individual tests within a suite can be run.
+If only a suite name is specified, e.g. `ScalabilityBlobSearchSuite`, all the tests
+configured for that suite are executed. To execute particular tests within a
+suite, append them to the suite name in the form `suite:test1,test2`, e.g.
+`ScalabilityBlobSearchSuite:FormatSearcher,NodeTypeSearcher`. You can specify one or more
+suites on the scalability command line, and oak-run will execute each suite in sequence.
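+
+For example, the following command runs only the two named searcher tests of the
+blob search suite against the Tar fixture:
+
+    $ java -jar oak-run-*.jar scalability \
+        ScalabilityBlobSearchSuite:FormatSearcher,NodeTypeSearcher Oak-Tar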
+
+Finally the scalability runner supports the following repository fixtures:
+
+| Fixture       | Description                                           |
+|---------------|-------------------------------------------------------|
+| Oak-Memory    | Oak with default in-memory storage                    |
+| Oak-MemoryNS  | Oak with default in-memory NodeStore                  |
+| Oak-MemoryMK  | Oak with default in-memory MicroKernel                |
+| Oak-Mongo     | Oak with the default Mongo backend                    |
+| Oak-Mongo-FDS | Oak with the default Mongo backend and FileDataStore  |
+| Oak-MongoNS   | Oak with the Mongo NodeStore                          |
+| Oak-MongoMK   | Oak with the Mongo MicroKernel                        |
+| Oak-Tar       | Oak with the Tar backend (aka Segment NodeStore)      |
+| Oak-H2        | Oak with the MK using embedded H2 database            |
+| Oak-RDB       | Oak with the DocumentMK/RDB persistence               |
+
+(Note that for Oak-RDB, the required JDBC drivers either need to be embedded
+into oak-run, or be specified separately in the class path.)
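+
+As a sketch, a run with an external JDBC driver on the class path could look like the
+following (this assumes the standard oak-run main class `org.apache.jackrabbit.oak.run.Main`;
+the driver jar name and the JDBC URL are illustrative):
+
+    $ java -cp oak-run-*.jar:postgresql.jar org.apache.jackrabbit.oak.run.Main scalability \
+        --rdbjdbcuri jdbc:postgresql://localhost/oak --rdbjdbcuser oak --rdbjdbcpasswd oak \
+        ScalabilityBlobSearchSuite Oak-RDB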
+
+Once started, the scalability runner will execute each listed suite against all the listed 
+repository fixtures. After starting up the repository and preparing the test environment, 
+the scalability suite executes all the configured tests to warm up caches before measurements 
+are started. Then each configured test within the suite is run and the number of
+milliseconds used by each execution is recorded. Once done, the following statistics are 
+computed and reported:
+
+| Column      | Description                                           |
+|-------------|-------------------------------------------------------|
+| min         | minimum time (in ms) taken by a test run              |
+| 10%         | time (in ms) within which the fastest 10% of test runs completed |
+| 50%         | time (in ms) taken by the median test run             |
+| 90%         | time (in ms) within which the fastest 90% of test runs completed |
+| max         | maximum time (in ms) taken by a test run              |
+| N           | total number of test runs in one minute (or more)     |
+
+Also, for each test, the execution times are reported for each iteration/load configured.
+
+| Column      | Description                                           |
+|-------------|-------------------------------------------------------|
+| Load        | time (in ms) taken by the test run for the given load/iteration |
+
+The latter is the more useful of these numbers, as it shows how the individual execution
+times scale with each load.
+
+How to add a new scalability suite
+--------------------------
+The scalability code is
+located under `org.apache.jackrabbit.oak.scalability` in the oak-run
+component. 
+
+To add a new scalability suite, you'll need to implement the `ScalabilitySuite`
+interface and add an instance of the new suite, together with its test benchmarks,
+to the `allSuites` array in the `ScalabilityRunner` class in the
+`org.apache.jackrabbit.oak.scalability` package.
+To implement a test benchmark, extend the `ScalabilityBenchmark`
+abstract class and implement its `execute()` method.
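+
+A minimal benchmark could look like the sketch below (the class name is illustrative,
+and the `execute()` parameters shown, i.e. the repository, the credentials and the
+execution context, are an assumption; check the abstract class for the exact signature):
+
+    class MyFormatSearcher extends ScalabilityBenchmark {
+        @Override
+        public void execute(Repository repository, Credentials credentials,
+            ExecutionContext context) throws Exception {
+            // the traversal below is a placeholder for the operation being measured
+            Session session = repository.login(credentials);
+            try {
+                session.getRootNode().getNodes();
+            } finally {
+                session.logout();
+            }
+        }
+    }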
+
+The best way to implement the `ScalabilitySuite` interface is to extend the
+`ScalabilityAbstractSuite` base class that takes care of most of the benchmarking
+details. The outline of such a suite is:
+
+    class MyTestSuite extends ScalabilityAbstractSuite {
+        @Override
+        protected void beforeSuite() throws Exception {
+            // optional, run once before all the iterations,
+            // not included in the performance measurements
+        }
+        @Override
+        protected void beforeIteration(ExecutionContext context) throws Exception {
+            // optional, typically used to create additional load
+            // for each iteration.
+            // This method will be called before each test iteration begins
+        }
+
+        @Override
+        protected void executeBenchmark(ScalabilityBenchmark benchmark,
+            ExecutionContext context) throws Exception {
+            // required, executes the specified benchmark
+        }
+        
+        @Override
+        protected void afterIteration() throws Exception {
+            // optional, executed after runIteration(),
+            // but not included in the performance measurements
+        }
+        @Override
+        protected void afterSuite() throws Exception {
+            // optional, run once after all the iterations are complete,
+            // not included in the performance measurements
+        }
+    }
+
+The rough outline of how the individual suite will be run is:
+
+    test.beforeSuite();
+    for (iteration...) {
+        test.beforeIteration();
+        for (benchmarks...) {
+            recordStartTime();
+            test.executeBenchmark();
+            recordEndTime();
+        }
+        test.afterIteration();
+    }
+    test.afterSuite(); 
+
+You can pass context information to the test benchmarks through the `ExecutionContext`
+object passed as a parameter to the `beforeIteration()` and `executeBenchmark()` methods.
+`ExecutionContext` exposes the two methods `getMap()` and `setMap()` which can be used to
+pass context information.
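+
+For example (a sketch, assuming `getMap()` returns a mutable map; the key name and
+value are illustrative):
+
+    // in the suite's beforeIteration(): stash data the benchmarks will need
+    context.getMap().put("searchRoot", "/content/dam");
+
+    // in a benchmark's execute(): read it back
+    String searchRoot = (String) context.getMap().get("searchRoot");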
+
+You can use the `loginWriter()` and `loginReader()` methods to create admin
+and anonymous sessions. There's no need to log out of those sessions (unless doing
+so is relevant to the test) as they will automatically be closed after
+the suite is complete and the `afterSuite()` method has been called.
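+
+For instance, a suite could create per-iteration content with an admin session
+(a sketch; the node names and count are illustrative):
+
+    @Override
+    protected void beforeIteration(ExecutionContext context) throws Exception {
+        // create some content for this iteration with an admin session;
+        // the parent name is kept unique so repeated iterations don't clash
+        Session session = loginWriter();
+        Node parent = session.getRootNode().addNode("testload" + System.currentTimeMillis());
+        for (int i = 0; i < 100; i++) {
+            parent.addNode("node" + i);
+        }
+        session.save();
+    }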
+
+Similarly, you can use the `addBackgroundJob(Runnable)` method to add
+background tasks that will be run concurrently while the test benchmark is
+executing. The relevant background thread works like this:
+
+    while (running) {
+        runnable.run();
+        Thread.yield();
+    }
+
+As you can see, the `run()` method of the background task gets invoked
+repeatedly. Such threads will automatically close once all test iterations
+are done, before the `afterSuite()` method is called.
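+
+For example, a suite could add read load in the background like this (a sketch;
+the read operations are illustrative):
+
+    // e.g. in beforeSuite(): keep a reader busy while the benchmarks execute
+    final Session reader = loginReader();   // anonymous session, closed after the suite
+    addBackgroundJob(new Runnable() {
+        @Override
+        public void run() {
+            try {
+                reader.refresh(false);            // pick up the latest repository state
+                reader.getRootNode().getNodes();  // illustrative read operation
+            } catch (RepositoryException e) {
+                throw new RuntimeException(e);
+            }
+        }
+    });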
+
+`ScalabilityAbstractSuite` defines some system properties which are used to control the
+suites extending from it:
+
+    -Dincrements=10,100,1000,1000     - defines the varying loads for each test iteration
+    -Dprofile=true                    - collects and prints profiling data
+    -Ddebug=true                      - outputs any intermediate results during the suite run
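+
+For example (the property values and the suite/fixture choice are illustrative):
+
+    $ java -Dincrements=10,100,1000 -Dprofile=true \
+        -jar oak-run-*.jar scalability ScalabilityBlobSearchSuite Oak-Tar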
\ No newline at end of file