Posted to dev@jackrabbit.apache.org by "Henry Kuijpers (Jira)" <ji...@apache.org> on 2022/03/10 17:31:00 UTC

[jira] [Updated] (JCR-4770) Query read limit should be overridable through query option

     [ https://issues.apache.org/jira/browse/JCR-4770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henry Kuijpers updated JCR-4770:
--------------------------------
    Description: 
When executing a query, it can happen that the query yields so many results (for example, in the case of migration scripts) that it causes a failure: not while executing the query, but while iterating through its results.

We have a few migration scripts in our codebase that need to migrate content (such as CMS components, pages, ...). We also have, especially on production, quite a lot of content. Such scripts can easily find 100,000+ nodes and thus produce a result set that is bigger than the "query read limit".

This limit can currently be configured at the system level, either through a system property or through OSGi configuration; QueryEngineSettingsService takes care of that.
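For context, the current system-wide configuration looks roughly like this (a sketch; the property name `oak.queryLimitReads` and the OSGi PID are taken from Oak's QueryEngineSettings and may differ between versions):

```
# JVM system property, read at startup:
-Doak.queryLimitReads=100000

# or OSGi configuration, PID org.apache.jackrabbit.oak.query.QueryEngineSettingsService:
queryLimitReads=100000
```

Either way, the value applies globally to every query on the instance, which is exactly the problem described here.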

Raising this limit raises it for the entire system, for every query that is executed. It would be ideal if we could configure this limit on the query level, for example through an option (like the existing options for traversal and for index tag selection). I would propose to add an option:

"select * from ... option (readlimit 999999)" 

which would take precedence over the limit that is active in QueryEngineSettingsService. It would then be the responsibility of the developer writing the query to specify an appropriate overridden limit (or not to specify one at all, of course).
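To illustrate the proposal, here is a minimal sketch of how the statement could be composed on the caller side. Note that the `readlimit` option does not exist yet; its name and syntax are assumptions taken from this proposal, modelled on the existing `option(traversal ...)` / `option(index tag ...)` clauses:

```java
// Hypothetical sketch: the "readlimit" query option is proposed, not implemented.
public class ReadLimitOption {

    // Appends the proposed option clause to a JCR-SQL2 statement, mirroring
    // the syntax of Oak's existing query options.
    public static String withReadLimit(String statement, long readLimit) {
        return statement + " option (readlimit " + readLimit + ")";
    }

    public static void main(String[] args) {
        String stmt = withReadLimit(
                "select * from [cq:Page] as a where isdescendantnode(a, '/content')",
                999999L);
        // The resulting statement would be passed unchanged to
        // QueryManager.createQuery(stmt, Query.JCR_SQL2) and would override
        // the system-wide read limit for this query only.
        System.out.println(stmt);
    }
}
```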

Stacktrace of such a failing query, currently:

{code:java}
10.03.2022 16:55:00.032 *WARN* [qtp881876674-5121] org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextPathCursor Index-Traversed 100000 nodes with filter Filter(query=select [jcr:path], [jcr:score], * from [cq:Page] as a where isdescendantnode(a, '/content') /* xpath: /jcr:root/content//element(*, cq:Page) */, path=/content//*)
10.03.2022 16:55:00.228 *ERROR* [qtp881876674-5121] com.day.crx.delite.impl.servlets.QueryServlet Exception while searching
org.apache.jackrabbit.oak.query.RuntimeNodeTraversalException: The query read or traversed more than 100000 nodes. To avoid affecting other tasks, processing was stopped.
	at org.apache.jackrabbit.oak.query.FilterIterators.checkReadLimit(FilterIterators.java:70) [org.apache.jackrabbit.oak-core:1.22.9]
	at org.apache.jackrabbit.oak.plugins.index.Cursors.checkReadLimit(Cursors.java:67) [org.apache.jackrabbit.oak-core:1.22.9]
	at org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextPathCursor$1.next(FulltextIndex.java:411) [org.apache.jackrabbit.oak-lucene:1.22.9]
	at org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextPathCursor$1.next(FulltextIndex.java:392) [org.apache.jackrabbit.oak-lucene:1.22.9]
	at com.google.common.collect.Iterators$7.computeNext(Iterators.java:646)
	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
	at org.apache.jackrabbit.oak.plugins.index.Cursors$PathCursor.hasNext(Cursors.java:216) [org.apache.jackrabbit.oak-core:1.22.9]
	at org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextPathCursor.hasNext(FulltextIndex.java:432) [org.apache.jackrabbit.oak-lucene:1.22.9]
	at org.apache.jackrabbit.oak.query.ast.SelectorImpl.nextInternal(SelectorImpl.java:515) [org.apache.jackrabbit.oak-core:1.22.9]
	at org.apache.jackrabbit.oak.query.ast.SelectorImpl.next(SelectorImpl.java:508) [org.apache.jackrabbit.oak-core:1.22.9]
	at org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.fetchNext(QueryImpl.java:876) [org.apache.jackrabbit.oak-core:1.22.9]
	at org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.hasNext(QueryImpl.java:903) [org.apache.jackrabbit.oak-core:1.22.9]
	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.fetch(QueryResultImpl.java:103) [org.apache.jackrabbit.oak-jcr:1.22.9]
	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.next(QueryResultImpl.java:128) [org.apache.jackrabbit.oak-jcr:1.22.9]
	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.next(QueryResultImpl.java:83) [org.apache.jackrabbit.oak-jcr:1.22.9]
	at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate$SynchronizedIterator.next(SessionDelegate.java:702) [org.apache.jackrabbit.oak-jcr:1.22.9]
	at org.apache.jackrabbit.oak.jcr.query.PrefetchIterator.next(PrefetchIterator.java:88) [org.apache.jackrabbit.oak-jcr:1.22.9]
	at org.apache.jackrabbit.commons.iterator.RangeIteratorAdapter.next(RangeIteratorAdapter.java:152) [org.apache.jackrabbit.jackrabbit-jcr-commons:2.20.2]
	at org.apache.jackrabbit.commons.iterator.RangeIteratorDecorator.next(RangeIteratorDecorator.java:92) [org.apache.jackrabbit.jackrabbit-jcr-commons:2.20.2]
	at org.apache.jackrabbit.commons.iterator.RowIteratorAdapter.nextRow(RowIteratorAdapter.java:76) [org.apache.jackrabbit.jackrabbit-jcr-commons:2.20.2]
{code}


  was:
When executing a query, it could happen that a query yields so many results (for example in the case of migration scripts), that it's causing a failure. Not while executing the query, but while iterating through the results of the query.

We have a few migration scripts in our codebase that need to migrate content (such as CMS components, pages, ...). We also have, especially on production, quite a lot of content. Such scripts can easily find 100.000+ nodes and thus produce a resultset that is bigger than the "query read limit".

This limit can currently be configured on system-level, either through a system property, or through OSGi configuration. QueryEngineSettingsService takes care of that. 

Raising this limit means raising the limit for the entire system. For every query that is executed. It would be ideal if we could configure this limit on the query level, for example through an option (like the options for traversal and for index tag selection). I would propose to add an option:

"select * from ... option (readlimit 999999)" 

which would take precedence over the limit that is active in QueryEngineSettingsService. Then, it would be the responsibility of the developer who creates the query to specify the correct overridden limit (or not specify a limit at all, of course).


> Query read limit should be overridable through query option 
> ------------------------------------------------------------
>
>                 Key: JCR-4770
>                 URL: https://issues.apache.org/jira/browse/JCR-4770
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: query, sql
>            Reporter: Henry Kuijpers
>            Priority: Major
>
> When executing a query, it can happen that the query yields so many results (for example, in the case of migration scripts) that it causes a failure: not while executing the query, but while iterating through its results.
> We have a few migration scripts in our codebase that need to migrate content (such as CMS components, pages, ...). We also have, especially on production, quite a lot of content. Such scripts can easily find 100,000+ nodes and thus produce a result set that is bigger than the "query read limit".
> This limit can currently be configured at the system level, either through a system property or through OSGi configuration; QueryEngineSettingsService takes care of that.
> Raising this limit raises it for the entire system, for every query that is executed. It would be ideal if we could configure this limit on the query level, for example through an option (like the existing options for traversal and for index tag selection). I would propose to add an option:
> "select * from ... option (readlimit 999999)" 
> which would take precedence over the limit that is active in QueryEngineSettingsService. It would then be the responsibility of the developer writing the query to specify an appropriate overridden limit (or not to specify one at all, of course).
> Stacktrace of such a failing query, currently:
> {code:java}
> 10.03.2022 16:55:00.032 *WARN* [qtp881876674-5121] org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextPathCursor Index-Traversed 100000 nodes with filter Filter(query=select [jcr:path], [jcr:score], * from [cq:Page] as a where isdescendantnode(a, '/content') /* xpath: /jcr:root/content//element(*, cq:Page) */, path=/content//*)
> 10.03.2022 16:55:00.228 *ERROR* [qtp881876674-5121] com.day.crx.delite.impl.servlets.QueryServlet Exception while searching
> org.apache.jackrabbit.oak.query.RuntimeNodeTraversalException: The query read or traversed more than 100000 nodes. To avoid affecting other tasks, processing was stopped.
> 	at org.apache.jackrabbit.oak.query.FilterIterators.checkReadLimit(FilterIterators.java:70) [org.apache.jackrabbit.oak-core:1.22.9]
> 	at org.apache.jackrabbit.oak.plugins.index.Cursors.checkReadLimit(Cursors.java:67) [org.apache.jackrabbit.oak-core:1.22.9]
> 	at org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextPathCursor$1.next(FulltextIndex.java:411) [org.apache.jackrabbit.oak-lucene:1.22.9]
> 	at org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextPathCursor$1.next(FulltextIndex.java:392) [org.apache.jackrabbit.oak-lucene:1.22.9]
> 	at com.google.common.collect.Iterators$7.computeNext(Iterators.java:646)
> 	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> 	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> 	at org.apache.jackrabbit.oak.plugins.index.Cursors$PathCursor.hasNext(Cursors.java:216) [org.apache.jackrabbit.oak-core:1.22.9]
> 	at org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextPathCursor.hasNext(FulltextIndex.java:432) [org.apache.jackrabbit.oak-lucene:1.22.9]
> 	at org.apache.jackrabbit.oak.query.ast.SelectorImpl.nextInternal(SelectorImpl.java:515) [org.apache.jackrabbit.oak-core:1.22.9]
> 	at org.apache.jackrabbit.oak.query.ast.SelectorImpl.next(SelectorImpl.java:508) [org.apache.jackrabbit.oak-core:1.22.9]
> 	at org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.fetchNext(QueryImpl.java:876) [org.apache.jackrabbit.oak-core:1.22.9]
> 	at org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.hasNext(QueryImpl.java:903) [org.apache.jackrabbit.oak-core:1.22.9]
> 	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.fetch(QueryResultImpl.java:103) [org.apache.jackrabbit.oak-jcr:1.22.9]
> 	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.next(QueryResultImpl.java:128) [org.apache.jackrabbit.oak-jcr:1.22.9]
> 	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.next(QueryResultImpl.java:83) [org.apache.jackrabbit.oak-jcr:1.22.9]
> 	at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate$SynchronizedIterator.next(SessionDelegate.java:702) [org.apache.jackrabbit.oak-jcr:1.22.9]
> 	at org.apache.jackrabbit.oak.jcr.query.PrefetchIterator.next(PrefetchIterator.java:88) [org.apache.jackrabbit.oak-jcr:1.22.9]
> 	at org.apache.jackrabbit.commons.iterator.RangeIteratorAdapter.next(RangeIteratorAdapter.java:152) [org.apache.jackrabbit.jackrabbit-jcr-commons:2.20.2]
> 	at org.apache.jackrabbit.commons.iterator.RangeIteratorDecorator.next(RangeIteratorDecorator.java:92) [org.apache.jackrabbit.jackrabbit-jcr-commons:2.20.2]
> 	at org.apache.jackrabbit.commons.iterator.RowIteratorAdapter.nextRow(RowIteratorAdapter.java:76) [org.apache.jackrabbit.jackrabbit-jcr-commons:2.20.2]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)