
Posted to dev@jackrabbit.apache.org by GitBox <gi...@apache.org> on 2022/06/28 16:54:23 UTC

[GitHub] [jackrabbit-oak] joerghoh opened a new pull request, #608: describe the process of executing a query

joerghoh opened a new pull request, #608:
URL: https://github.com/apache/jackrabbit-oak/pull/608

   Describe the execution of a query


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [jackrabbit-oak] thomasmueller commented on a diff in pull request #608: describe the process of executing a query

Posted by GitBox <gi...@apache.org>.

thomasmueller commented on code in PR #608:
URL: https://github.com/apache/jackrabbit-oak/pull/608#discussion_r909256196


##########
oak-doc/src/site/markdown/query/query-engine.md:
##########
@@ -122,6 +127,23 @@ so the method should be reasonably fast (not read any data itself, or at least n
 
 If an index implementation can not query the data, it has to return `Infinity` (`Double.POSITIVE_INFINITY`).
 
+#### Identifying Nodes
+
+If an index is selected, the query is executed against the index. The translation from the JCR Query syntax into the query language supported by the index includes as many constraints as possible which are supported by the index. Depending on the index definition this can mean that not all constraints can be resolved by the index itself. 
+In this case the Query Engine tries to let the index handle as much constraints as possible and later executes all remaining constraints on its own, accessing the node store and doing all necessary operations there, which can result in a traversal. This means that despite the use of an index an additional traversal is required.
+
+If no matching index is determined in the previous step, the Query Engine executes this query solely based on a traversal.
+
+#### Ordering
+If a query requests an ordered result set, the Query Engine tries to get an already ordered result from the index; in case the index definition does not support the requested ordering or in case of a traversal, the Query Engine must execute the ordering itself. To achieve this the entire result set is read into memory and then sorted which consumes memory and takes time.
+
+#### Iterating the result set

Review Comment:
   Iterating the Result Set (title case)




[GitHub] [jackrabbit-oak] joerghoh merged pull request #608: describe the process of executing a query

Posted by GitBox <gi...@apache.org>.

joerghoh merged PR #608:
URL: https://github.com/apache/jackrabbit-oak/pull/608



[GitHub] [jackrabbit-oak] joerghoh commented on a diff in pull request #608: describe the process of executing a query

Posted by GitBox <gi...@apache.org>.

joerghoh commented on code in PR #608:
URL: https://github.com/apache/jackrabbit-oak/pull/608#discussion_r909294378


##########
oak-doc/src/site/markdown/query/query-engine.md:
##########
@@ -122,6 +127,23 @@ so the method should be reasonably fast (not read any data itself, or at least n
 
 If an index implementation can not query the data, it has to return `Infinity` (`Double.POSITIVE_INFINITY`).
 
+#### Identifying Nodes
+
+If an index is selected, the query is executed against the index. The translation from the JCR Query syntax into the query language supported by the index includes as many constraints as possible which are supported by the index. Depending on the index definition this can mean that not all constraints can be resolved by the index itself. 
+In this case the Query Engine tries to let the index handle as much constraints as possible and later executes all remaining constraints on its own, accessing the node store and doing all necessary operations there, which can result in a traversal. This means that despite the use of an index an additional traversal is required.
+
+If no matching index is determined in the previous step, the Query Engine executes this query solely based on a traversal.
+
+#### Ordering
+If a query requests an ordered result set, the Query Engine tries to get an already ordered result from the index; in case the index definition does not support the requested ordering or in case of a traversal, the Query Engine must execute the ordering itself. To achieve this the entire result set is read into memory and then sorted which consumes memory and takes time.
+
+#### Iterating the result set

Review Comment:
   done




[GitHub] [jackrabbit-oak] fabriziofortino commented on a diff in pull request #608: describe the process of executing a query

Posted by GitBox <gi...@apache.org>.

fabriziofortino commented on code in PR #608:
URL: https://github.com/apache/jackrabbit-oak/pull/608#discussion_r911028940


##########
oak-doc/src/site/markdown/query/query-engine.md:
##########
@@ -122,6 +127,23 @@ so the method should be reasonably fast (not read any data itself, or at least n
 
 If an index implementation can not query the data, it has to return `Infinity` (`Double.POSITIVE_INFINITY`).
 
+#### Identifying Nodes
+
+If an index is selected, the query is executed against the index. The translation from the JCR Query syntax into the query language supported by the index includes as many constraints as possible which are supported by the index. Depending on the index definition this can mean that not all constraints can be resolved by the index itself. 
+In this case the Query Engine tries to let the index handle as much constraints as possible and later executes all remaining constraints on its own, accessing the node store and doing all necessary operations there, which can result in a traversal. This means that despite the use of an index an additional traversal is required.
+
+If no matching index is determined in the previous step, the Query Engine executes this query solely based on a traversal.
+
+#### Ordering
+If a query requests an ordered result set, the Query Engine tries to get an already ordered result from the index; in case the index definition does not support the requested ordering or in case of a traversal, the Query Engine must execute the ordering itself. To achieve this the entire result set is read into memory and then sorted which consumes memory and takes time.
+
+#### Iterating the Result Set
+Query results are implemented as lazy iterators, and the result set is only read if needed. When the next result is requested, the result iterator seeks the potential nodes to find the next node matching the query. 
+During this seek process the Query Engine reads and filters the potential nodes until if finds a match. Even if the query is handled completely by an index, the Query Engine needs to check if the requesting session is allowed to read the nodes.
+
+That means that during this final step every potential node must be loaded from the node store, thus counting towards the read limit (see [Slow Queries and Read Limits](#slow-queries-and-read-limits)).

Review Comment:
   typo
   
   ```suggestion
   Query results are implemented as lazy iterators, and the result set is only read if needed. When the next result is requested, the result iterator seeks the potential nodes to find the next node matching the query. 
   During this seek process the Query Engine reads and filters the potential nodes until it finds a match. Even if the query is handled completely by an index, the Query Engine needs to check if the requesting session is allowed to read the nodes.
   
   That means that during this final step every potential node must be loaded from the node store, thus counting towards the read limit (see [Slow Queries and Read Limits](#slow-queries-and-read-limits)).
   ```
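
For illustration only, the lazy iteration described in the suggestion above can be sketched in Python. This is not Oak's API: `lazy_results`, `load_node`, `matches`, `can_read`, and `ReadCounter` are hypothetical names, and the dictionary stands in for the node store. The sketch assumes that every candidate is loaded before it is filtered, so rejected candidates still count as reads.

```python
from typing import Callable, Dict, Iterable, Iterator


class ReadCounter:
    """Counts candidate nodes loaded from the (hypothetical) node store."""

    def __init__(self) -> None:
        self.reads = 0


def lazy_results(
    candidates: Iterable[str],
    load_node: Callable[[str], Dict[str, str]],
    matches: Callable[[Dict[str, str]], bool],
    can_read: Callable[[str], bool],
    counter: ReadCounter,
) -> Iterator[str]:
    """Yield matching, readable paths one at a time.

    Nothing is loaded until the caller asks for the next result.
    """
    for path in candidates:
        node = load_node(path)   # every candidate is loaded ...
        counter.reads += 1       # ... and counts towards the read limit
        if can_read(path) and matches(node):
            yield path


# Toy node store: path -> properties (assumed data, for the demo only).
store = {
    "/a": {"type": "page"},
    "/b": {"type": "asset"},
    "/c": {"type": "page"},
    "/d": {"type": "page"},
}

counter = ReadCounter()
results = lazy_results(
    candidates=iter(store),                  # e.g. paths returned by an index
    load_node=store.__getitem__,
    matches=lambda node: node["type"] == "page",
    can_read=lambda path: path != "/a",      # the session may not read /a
    counter=counter,
)

reads_before = counter.reads   # 0: creating the iterator reads nothing
first = next(results)          # "/c": /a and /b were loaded and rejected
reads_after_first = counter.reads  # 3
```

Fetching a single result here costs three reads, because the two rejected candidates had to be loaded before the first match was found.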




[GitHub] [jackrabbit-oak] klcodanr commented on a diff in pull request #608: describe the process of executing a query

Posted by GitBox <gi...@apache.org>.

klcodanr commented on code in PR #608:
URL: https://github.com/apache/jackrabbit-oak/pull/608#discussion_r911016520


##########
oak-doc/src/site/markdown/query/query-engine.md:
##########
@@ -122,6 +127,23 @@ so the method should be reasonably fast (not read any data itself, or at least n
 
 If an index implementation can not query the data, it has to return `Infinity` (`Double.POSITIVE_INFINITY`).
 
+#### Identifying Nodes
+
+If an index is selected, the query is executed against the index. The translation from the JCR Query syntax into the query language supported by the index includes as many constraints as possible which are supported by the index. Depending on the index definition this can mean that not all constraints can be resolved by the index itself. 
+In this case the Query Engine tries to let the index handle as much constraints as possible and later executes all remaining constraints on its own, accessing the node store and doing all necessary operations there, which can result in a traversal. This means that despite the use of an index an additional traversal is required.

Review Comment:
   Not 100% sure about the format of the anchor, but I think this ties identification and iteration together.
   
   ```suggestion
   In this case, the Query Engine tries to let the index handle as many constraints as possible and later executes all remaining constraints while [Iterating the Result Set](#iterating-the-result-set) by retrieving the nodes from the node store and evaluating the nodes against the constraints. Each retrieval, including that of non-matching nodes, is counted as a traversal. This means that despite the use of an index, an additional traversal can be required if not all constraints in a query are executed against the index.
   ```



##########
oak-doc/src/site/markdown/query/query-engine.md:
##########
@@ -122,6 +127,23 @@ so the method should be reasonably fast (not read any data itself, or at least n
 
 If an index implementation can not query the data, it has to return `Infinity` (`Double.POSITIVE_INFINITY`).
 
+#### Identifying Nodes
+
+If an index is selected, the query is executed against the index. The translation from the JCR Query syntax into the query language supported by the index includes as many constraints as possible which are supported by the index. Depending on the index definition this can mean that not all constraints can be resolved by the index itself. 
+In this case the Query Engine tries to let the index handle as much constraints as possible and later executes all remaining constraints on its own, accessing the node store and doing all necessary operations there, which can result in a traversal. This means that despite the use of an index an additional traversal is required.
+
+If no matching index is determined in the previous step, the Query Engine executes this query solely based on a traversal.

Review Comment:
   ```suggestion
   If no matching index is determined in the previous step, the Query Engine executes this query solely by traversing the node store and evaluating the nodes against the constraints.
   ```



##########
oak-doc/src/site/markdown/query/query-engine.md:
##########
@@ -122,6 +127,23 @@ so the method should be reasonably fast (not read any data itself, or at least n
 
 If an index implementation can not query the data, it has to return `Infinity` (`Double.POSITIVE_INFINITY`).
 
+#### Identifying Nodes
+
+If an index is selected, the query is executed against the index. The translation from the JCR Query syntax into the query language supported by the index includes as many constraints as possible which are supported by the index. Depending on the index definition this can mean that not all constraints can be resolved by the index itself. 
+In this case the Query Engine tries to let the index handle as much constraints as possible and later executes all remaining constraints on its own, accessing the node store and doing all necessary operations there, which can result in a traversal. This means that despite the use of an index an additional traversal is required.
+
+If no matching index is determined in the previous step, the Query Engine executes this query solely based on a traversal.
+
+#### Ordering
+If a query requests an ordered result set, the Query Engine tries to get an already ordered result from the index; in case the index definition does not support the requested ordering or in case of a traversal, the Query Engine must execute the ordering itself. To achieve this the entire result set is read into memory and then sorted which consumes memory and takes time.

Review Comment:
   ```suggestion
   If a query requests an ordered result set, the Query Engine tries to get an already ordered result from the index; if the index definition does not support the requested ordering, or in case of a traversal, the Query Engine must execute the ordering itself. To achieve this, the entire result set is read into memory and then sorted. This consumes memory, takes time, and requires the Query Engine to read the full result set even where a limit setting would otherwise limit the number of results traversed.
   ```
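
The interaction between ordering and a limit can be sketched as follows. This is a simplified Python model, not Oak code: `query`, `counting`, and the `index_sorted` flag are hypothetical, and the list of integers stands in for the rows an index or traversal would return.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List, Optional, Tuple


def counting(rows: Iterable[int], counter: Dict[str, int]) -> Iterator[int]:
    """Pass rows through unchanged while counting how many were read."""
    for row in rows:
        counter["reads"] += 1
        yield row


def query(
    rows: Iterable[int],
    limit: int,
    order_key: Optional[Callable[[int], int]] = None,
    index_sorted: bool = False,
) -> Tuple[List[int], int]:
    """Return (first `limit` rows, number of rows read)."""
    counter = {"reads": 0}
    stream: Iterator[int] = counting(rows, counter)
    if order_key is not None and not index_sorted:
        # The engine has to sort by itself: sorted() drains the whole stream.
        stream = iter(sorted(stream, key=order_key))
    page = list(islice(stream, limit))  # the limit is applied after sorting
    return page, counter["reads"]


rows = [5, 3, 9, 1, 7]

unordered = query(rows, limit=2)
# ([5, 3], 2): the limit stops reading early

engine_sorted = query(rows, limit=2, order_key=lambda r: r)
# ([1, 3], 5): all rows are read before the limit applies

from_index = query(sorted(rows), limit=2, order_key=lambda r: r, index_sorted=True)
# ([1, 3], 2): the index already delivers the rows in order
```

In this model the limit only saves reads when no sort is needed or when the rows arrive already ordered.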


