You are viewing a plain text version of this content. The canonical link for it is here.
Posted to oak-commits@jackrabbit.apache.org by ch...@apache.org on 2017/07/25 08:29:08 UTC

svn commit: r1802899 - /jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/oak-run-indexing.md

Author: chetanm
Date: Tue Jul 25 08:29:08 2017
New Revision: 1802899

URL: http://svn.apache.org/viewvc?rev=1802899&view=rev
Log:
OAK-6471 - Support adding or updating index definitions via oak-run

Update docs

Modified:
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/oak-run-indexing.md

Modified: jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/oak-run-indexing.md
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/oak-run-indexing.md?rev=1802899&r1=1802898&r2=1802899&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/oak-run-indexing.md (original)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/oak-run-indexing.md Tue Jul 25 08:29:08 2017
@@ -34,6 +34,8 @@
         * [B - Online indexing](#online-indexing)
             * [Step 1 - Text PreExtraction](#online-indexing-pre-extract)
             * [Step 2 - Perform reindexing](#online-indexing-perform-reindex)
+        * [Updating or Adding New Index Definitions](#index-definition-updates)
+        * [JSON File Format](#json-file-format)
         * [Tika Setup](#tika-setup)
 
 `@since Oak 1.7.0`
@@ -150,6 +152,7 @@ Here following options can be used
 * `--index-paths` - This command requires an explicit set of index paths which need to be indexed (required)
 * `--checkpoint` - The checkpoint up to which the index is updated, when indexing in read only mode. For
   testing purpose, it can be set to 'head' to indicate that the head state should be used. (required)
+* `-index-definitions-file` - json file file path which contains updated index definitions
   
 If the index does not support fulltext indexing then you can omit providing BlobStore details
   
@@ -199,7 +202,116 @@ In this step we configure oak-run to con
 checkpoint creation, indexing and import
 
     java -jar oak-run*.jar index --reindex --index-paths=/oak:index/lucene --read-write --fds-path=/path/to/datastore /path/to/segmentstore
+
+### <a name="index-definition-updates"></a> Updating or Adding New Index Definitions
+
+`@since Oak 1.7.5`
+
+Index tooling support updating and adding new index definitions to existing setups. This can be done by passing 
+in path of a json file which contains index definitions
+
+    java -jar oak-run*.jar index index --reindex --index-paths=/oak:index/newAssetIndex \
+    --index-definitions-file=index-definitions.json \
+    --fds-path=/path/to/datastore /path/to/segmentstore  
+   
+Where index-definitions.json has following structure
+
+    {
+      "/oak:index/newAssetIndex": {
+        "evaluatePathRestrictions": true,
+        "compatVersion": 2,
+        "type": "lucene",
+        "async": "async",
+        "jcr:primaryType": "oak:QueryIndexDefinition",
+        "indexRules": {
+          "jcr:primaryType": "nt:unstructured",
+          "dam:Asset": {
+            "jcr:primaryType": "nt:unstructured",
+            "properties": {
+              "jcr:primaryType": "nt:unstructured",
+              "valid": {
+                "name": "valid",
+                "propertyIndex": true,
+                "jcr:primaryType": "nt:unstructured",
+                "notNullCheckEnabled": true
+              },
+              "mimetype": {
+                "name": "mimetype",
+                "analyzed": true,
+                "jcr:primaryType": "nt:unstructured"
+              }
+            }
+          }
+        }
+      }
+    }
     
+Some points to note about this json file
+* Each key of top level object refers to the index path
+* The value of each such key refers to complete index definition 
+* If the index path is not present in existing repository then it would result in a new index being created
+* In case of new index it must be ensured that parent path structure must already exist in repository. 
+  So if a new index is being created at `/content/en/oak:index/contentIndex` then path upto  `/content/en/oak:index`
+  should already exist in repository
+
+You can also use the json file generated from [Oakutils](http://oakutils.appspot.com/generate/index). It needs to be 
+modified to confirm to above structure i.e. enclose the whole definition under the intended index path key.
+
+In general the index definitions does not need any special encoding of values as Index definitions in Oak use
+only String, Long and Double types mostly. However if the index refers to binary config like Tika config then
+the binary data would need to encoded. Refer to next section for more details.
+    
+This option is supported in both online and out-of-band indexing.
+
+For more details refer to [OAK-6471][OAK-6471]
+    
+### <a name="json-file-format"></a> JSON File Format
+
+Some of the standard types used in Oak are not supported directly by JSON like names, blobs etc. Those would need to be 
+encoded in a specific format.
+
+Below are the encoding rules
+
+LONG
+: No encoding required
+: _"compatVersion": 2_
+
+BOOLEAN
+: No encoding required
+: _"propertyIndex": true,_
+
+DOUBLE
+: No encoding required
+: _"weight": 1.5_ 
+
+STRING
+: Prefix the value with `str:`
+: Generally the value need not be encoded. Encoding is only required if the string starts with 3 letters and then colon
+: _"pathPropertyName": "str:jcr:path"_  
+
+DATE
+: Prefix the value with `dat:`. The value is ISO8601 formatted date string
+: _"created": "dat:2017-07-20T13:23:21.196+05:30"_  
+
+NAME
+: Prefix the value with `nam:`.
+: For `jcr:primaryType` and `jcr:mixins` no encoding is required. Any property with these names would be converted to
+  NAME type
+: _"nodetype": "nam:nt:base"_ 
+
+PATH
+: Prefix the value with `pat:`
+: _"imagePath": "pat:/content/assets/book.jpg"_  
+
+URI
+: Prefix the value with `uri:`
+: _"serverURI": "uri:http://foo"_  
+
+BINARY
+: By default the binary values are encoded as Base64 string if the binary is less than 1 MB size. The encoded value is 
+  prefixed with `:blobId:`
+: _"jcr:data": ":blobId:axygz="_  
+
 
 ### <a name="tika-setup"></a> Tika Setup
 
@@ -215,5 +327,4 @@ Then modify the index command like below
     java -cp oak-run.jar:tika-app-1.15.jar org.apache.jackrabbit.oak.run.Main index
     
 
-
-
+[OAK-6471]: https://issues.apache.org/jira/browse/OAK-6471
\ No newline at end of file