Posted to oak-commits@jackrabbit.apache.org by th...@apache.org on 2021/10/06 14:39:47 UTC

[jackrabbit-oak] branch trunk updated: OAK-301: Document Oak

This is an automated email from the ASF dual-hosted git repository.

thomasm pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/jackrabbit-oak.git


The following commit(s) were added to refs/heads/trunk by this push:
     new bdc7dc7  OAK-301: Document Oak
bdc7dc7 is described below

commit bdc7dc7258a942819b5a60dd2ee548af62f1b7ec
Author: thomasm <th...@apache.org>
AuthorDate: Wed Oct 6 16:39:35 2021 +0200

    OAK-301: Document Oak
---
 oak-doc/src/site/markdown/query/query-engine.md | 36 +++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/oak-doc/src/site/markdown/query/query-engine.md b/oak-doc/src/site/markdown/query/query-engine.md
index d5b8b7e..0591bed 100644
--- a/oak-doc/src/site/markdown/query/query-engine.md
+++ b/oak-doc/src/site/markdown/query/query-engine.md
@@ -33,6 +33,7 @@ grep "^#.*$" src/site/markdown/query/query-engine.md | sed 's/#/    /g' | sed 's
         * [Quoting](#Quoting)
         * [Equality for Path Constraints](#Equality_for_Path_Constraints)
     * [Slow Queries and Read Limits](#Slow_Queries_and_Read_Limits)
+    * [Keyset Pagination](#Keyset_Pagination)
     * [Full-Text Queries](#Full-Text_Queries)
     * [Excerpts and Highlighting](#Excerpts_and_Highlighting)
     * [Native Queries](#Native_Queries)
@@ -307,6 +308,41 @@ For XPath queries, such conversion to `union` is always made,
 and for SQL-2 queries such a conversion is only made if the `union` query has a lower expected cost.
 When using `or` in combination with the same property, as in `a=1 or a=2`, then no conversion to `union` is made.
 
+### Keyset Pagination
+
+It is best to limit the result size to at most a few hundred entries.
+To read a large result, keyset pagination should be used.
+Note that "offset" with large values (more than a few hundred) should be avoided, as it can lead to performance and memory issues.
+Keyset pagination refers to ordering the result set by a key column, and then paginating using this column.
+It requires an ordered index on the key column. Example:
+
+    /jcr:root/content//element(*, nt:file)
+    [@jcr:lastModified >= $lastEntry]
+    order by @jcr:lastModified, @jcr:path
+
+For the first query, set `$lastEntry` to 0, and for subsequent queries,
+use the last modified time of the last result.
+
+An ordered index is needed for these queries to work efficiently, e.g.:
+
+    /oak:index/fileIndex
+      - type = lucene
+      - compatVersion = 2
+      - async = async
+      - includedPaths = [ "/content" ]
+      - queryPaths = [ "/content" ]
+      + indexRules
+        + nt:file
+          + properties
+            + jcrLastModified
+              - name = jcr:lastModified
+              - propertyIndex = true
+              - ordered = true
+
+Notice that multiple entries with the same modified date might exist.
+If your application requires that the same node is only processed once,
+then additional logic is required to skip over the entries already seen (for the same modified date).
+
 ### Full-Text Queries
 
 The full-text syntax supported by Jackrabbit Oak is a superset of the JCR specification.
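
The Keyset Pagination section added by this commit notes that entries sharing the same modified date need additional logic so that each node is processed only once. A minimal sketch of one way to do that follows, assuming each page of results has already been fetched and mapped to simple rows. The names `KeysetCursor`, `Row`, and `accept` are hypothetical and are not part of Oak or the JCR API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper that tracks the keyset position between queries. */
public class KeysetCursor {

    /** One result row: last modified time (e.g. in milliseconds) and node path. */
    public static final class Row {
        public final long lastModified;
        public final String path;

        public Row(long lastModified, String path) {
            this.lastModified = lastModified;
            this.path = path;
        }
    }

    // Value to bind to $lastEntry in the next query; 0 for the first query.
    private long lastEntry = 0;

    // Paths already processed whose lastModified equals lastEntry.
    private final Set<String> seenAtLastEntry = new HashSet<>();

    public long lastEntry() {
        return lastEntry;
    }

    /**
     * Filters one page, assumed to be ordered by lastModified and then path,
     * drops rows that were already processed, and advances the cursor.
     */
    public List<Row> accept(List<Row> page) {
        List<Row> fresh = new ArrayList<>();
        for (Row row : page) {
            if (row.lastModified == lastEntry) {
                // Same timestamp as the cursor: skip paths seen earlier.
                if (!seenAtLastEntry.add(row.path)) {
                    continue;
                }
            } else if (row.lastModified > lastEntry) {
                // Newer timestamp: move the cursor and reset the seen set.
                lastEntry = row.lastModified;
                seenAtLastEntry.clear();
                seenAtLastEntry.add(row.path);
            } else {
                // Defensive: the ">=" condition should never return such rows.
                continue;
            }
            fresh.add(row);
        }
        return fresh;
    }
}
```

With this sketch, the caller would bind `lastEntry()` to `$lastEntry` before each query and process only the rows returned by `accept`. If a result limit is used, it has to be larger than the number of nodes sharing a single modified date; otherwise a page could consist entirely of rows that were already seen, and the cursor would not advance.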