You are viewing a plain text version of this content. The canonical link for it is here.
Posted to oak-commits@jackrabbit.apache.org by th...@apache.org on 2017/03/29 11:47:20 UTC

svn commit: r1789340 - /jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/indexing.md

Author: thomasm
Date: Wed Mar 29 11:47:19 2017
New Revision: 1789340

URL: http://svn.apache.org/viewvc?rev=1789340&view=rev
Log:
OAK-5946 - Document reindexing

Modified:
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/indexing.md

Modified: jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/indexing.md
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/indexing.md?rev=1789340&r1=1789339&r2=1789340&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/indexing.md (original)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/indexing.md Wed Mar 29 11:47:19 2017
@@ -423,48 +423,68 @@ NRT indexing expose a few configuration
 
 ## <a name="reindexing"></a> Reindexing
 
-Reindexing of existing indexes is required in the following scenarios:
-
-* Incompatible changes in the index definition - 
-  Needed after adding a property to an index definition, 
-  if content nodes with this property are already present.
-* Corrupted Index - If the index is corrupt, and `AsyncIndexUpdate` run fails 
-  with an exception pointing to index being corrupt.
-  
-Reindexing does not resolve other problems, such that queries not returning data. 
-For such cases, it is _not_ recommended to reindex (also because this can be very slow and use a lot of temporary disk space).
+Reindexing rarely solves problems. 
+Specially, it does not typically make queries return the expected result. 
+For such cases, it is _not_ recommended to reindex,
+also because reindex can be very slow (sometimes multiple days), 
+and use a lot of temporary disk space.
+Note that removing checkpoints, and removing the hidden `:async` node
+will  cause a full reindex, so doing this is not recommended either.
 If queries don't return the right data, then possibly the index is [not yet up-to-date][OAK-5159],
 or the query is incorrect, or included/excluded path settings are wrong (for Lucene indexes). 
 Instead of reindexing, it is suggested to first check the log file, 
-modify the query so it uses a different index or traversal and run the query again.
-One case where reindexing can help is if the query engine picks a very slow index for some queries because the counter index 
-[got out of sync after adding and removing lots of nodes many times (fixed in recent version)][OAK-4065].
-For this case, it is recommended to verify the contents of the counter index first,
-and upgrade Oak before reindexing.
+modify the query so it uses a different index or traversal, and run the query again.
+
+Reindexing of existing indexes is required in the following scenarios:
 
-Also, note that with Oak 1.6, for Lucene indexes, changes in the index definition are only effective 
-[post reindexing](lucene.html#stored-index-definition).
+* A: In case a _property_ index configuration was changed,
+  such that the index is used for queries, but doesn't contain some of the nodes.
+  Nodes that existed _before_ the index configuration was changed, are not indexed.
+  A workaround is to change ('touch') the affected nodes.
+* B: Prior to Oak 1.6, in case a _Lucene_ index definition was changed (same as A).
+  In Oak 1.6 and newer, queries will use the old index definition
+  until the index is [reindexed](lucene.html#stored-index-definition).
+* C: Prior to Oak 1.6, in case the query engine picks a very slow index 
+  for some queries because the counter index 
+  [got out of sync after adding and removing lots of nodes many times][OAK-4065].
+  For this case, it is recommended to verify the contents of the counter index first,
+  and upgrade Oak before reindexing.
+  Only the `counter` index needs to be reindexed in this case.
+  The workaround (to avoid reindexing) is to manually tweak index configurations
+  using manually set `entryCount` in the index configuration.
+* D: In case a binary of a Lucene index is missing, for example
+  because the binary is not available in the datastore.
+  This can happen in case the datastore is misconfigured
+  such that garbage collection removed a binary that is still required.
+  In such cases, other binaries might be missing as well;
+  it is best to traverse all nodes of the repository to ensure this is not the case.
+* E: In case a binary of a Lucene index is corrupt.
+  If the index is corrupt, an `AsyncIndexUpdate` run will fail 
+  with an exception pointing to the index being corrupt.
+  In such a case, first verify that the following procedure doesn't resolve
+  the issue: stop Oak, remove the local copy of the Lucene index (directory `index`),
+  and restart. If the index is still corrupt after this, then reindexing is needed.
+  In such cases, please file an Oak issue.
 
 To reindex, set the `reindex` property to `true` in the respective index definition:
 
     /oak:index/userIndex
-      - jcr:primaryType = "oak:QueryIndexDefinition"
-      - async = ['async']
       - reindex = true
       
-Once changes are saved, the index is reindexed. For synchronous indexes, 
-the reindexing is done as part of save (or commit) itself. 
-While for asynchronous indexes, reindex starts with the next async indexing cycle. 
+Once changes are saved, the index is reindexed. 
+For asynchronous indexes, reindex starts with the next async indexing cycle.
+For synchronous indexes, the reindexing is done as part of save (or commit) itself.
+For a (synchronous) property index, 
+as an alternative you can use the `PropertyIndexAsyncReindexMBean`; 
+see the [reindeinxing property indexes](property-index.html#reindexing) section for more details on that.
+
 Once reindexing starts, the following log entries can be seen in the log:
 
     [async-index-update-async] o.a.j.o.p.i.IndexUpdate Reindexing will be performed for following indexes: [/oak:index/userIndex]
     [async-index-update-async] o.a.j.o.p.i.IndexUpdate Reindexing Traversed #100000 /home/user/admin 
     [async-index-update-async] o.a.j.o.p.i.AsyncIndexUpdate [async] Reindexing completed for indexes: [/oak:index/userIndex*(4407016)] in 30 min 
 
-In both cases, once reindexing is complete, the `reindex` flag is removed.
-
-For a property index, you can also make use of the `PropertyIndexAsyncReindexMBean`. 
-See also the [reindeinxing property indexes](property-index.html#reindexing) section for more details on that.
+Once reindexing is complete, the `reindex` flag is set to `false`.
 
 
 [OAK-5159]: https://issues.apache.org/jira/browse/OAK-5159