You are viewing a plain text version of this content. The canonical link for it is here.
Posted to oak-commits@jackrabbit.apache.org by th...@apache.org on 2016/03/30 12:17:36 UTC

svn commit: r1737090 - /jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/query-engine.md

Author: thomasm
Date: Wed Mar 30 10:17:36 2016
New Revision: 1737090

URL: http://svn.apache.org/viewvc?rev=1737090&view=rev
Log:
OAK-301 Document Oak - more query docs

Modified:
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/query-engine.md

Modified: jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/query-engine.md
URL: http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/query-engine.md?rev=1737090&r1=1737089&r2=1737090&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/query-engine.md (original)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/query/query-engine.md Wed Mar 30 10:17:36 2016
@@ -30,7 +30,24 @@ Query Indices are defined under the `oak
 
 For NodeIterator.getSize(), some versions of Jackrabbit 2.x returned the estimated (raw)
 Lucene result set size, including nodes that are not accessible. 
-Oak does not do this; it either returns the correct result size, or -1.
+By default, Oak does not do this; it either returns the correct result size, or -1.
+
+Oak 1.2.x and newer supports a compatibility flag so that it works in the same way
+as Jackrabbit 2.x, by returning an estimate. See also OAK-2926.
+This is best configured as described in OAK-2977:
+When using Apache Sling, since Oak 1.3.x, 
+add the following line to the file `conf/sling.properties`,
+and then restart the application:
+
+    oak.query.fastResultSize=true
+
+Please note this only works with the Lucene `compatVersion=2` right now.
+Example code to show how this work (where `test` is a common word in the index):
+
+    String query = "//element(*, cq:Page)[jcr:contains(., 'test')]";
+    Query query = queryManager.createQuery(qs, "xpath");
+    QueryResult result = query.execute();
+    long size = result.getRows().getSize();
 
 #### Quoting
 
@@ -299,7 +316,30 @@ traversed at the given subtree.
 
 Usually, data is read from the index and repository while traversing over the query 
 result. There are exceptions however, where all data is read in memory when the query
-is executed: when using a full-text index, and when using an "order by" clause.
+is executed. The most common case is when using an `order by` clause and 
+the index can not provide a sorted result.
+There are other cases where paths of the results read so far are kept in memory, 
+in order to not return duplicate results. 
+This is the case when using `or` conditions such that two indexes are used 
+(internally a `union` query is executed).
+
+If you enable debug logging for the module `org.apache.jackrabbit.oak.query`, you may see this:
+
+    cost for nodeType is 1638354.0
+    cost for property is 2.0
+    cost for traverse is 3451100.0
+    
+This means the cost for the nodetype index _would_ be about 1638354.0, 
+the cost for the property index _would_ be about 2, 
+and the cost for traversal _would_ be about 3451100.0. 
+It doesn't say traversal is actually used, it just lists the expected costs. 
+The query engine will then pick the index with the lowest expected cost, 
+which is (in the case above) "property".
+
+The expected cost for traversal is, with Oak 1.0.x, really just a guess looking at 
+the length of the path. With Oak 1.2 and newer, the "counter" index is used 
+(see mainly OAK-1907). There is an known issue with this, if you add and remove a lot 
+of nodes in a loop, you could end up with a too-low cost, see OAK-4065.
 
 ### The Node Type Index