You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@jackrabbit.apache.org by Jason Bailey <Ja...@sas.com> on 2017/08/28 17:02:32 UTC

slowness accessing lucene results

Hi all,

I'm tracking down a problem I'm currently experiencing when my queries use the lucene index on oak 1.2.18.  Here are the figures that I get from the acs commons tool that breaks down queries.

"executeTime": 0
"getNodesTime": 3279
"count": 2949
"countTime": 349807
"totalTime": 353086

In essence, it's taking me about 6 minutes to iterate through less than 3000 results. Does anyone have any ideas on root cause?

-Jason

Re: slowness accessing lucene results

Posted by Chetan Mehrotra <ch...@gmail.com>.
I have backported the fix for OAK-6333 to 1.2 branch also. Would be
part of Oak 1.2.28
Chetan Mehrotra


On Wed, Aug 30, 2017 at 8:18 AM, Jason Bailey <Ja...@sas.com> wrote:
> Yes, that's how I patched the problem. You've answered my question by explaining that the index value is not the correct value and that issue is being addressed.
>
> Thank you.
> Jason
>
> -----Original Message-----
> From: Chetan Mehrotra [mailto:chetan.mehrotra@gmail.com]
> Sent: Wednesday, August 30, 2017 10:56 AM
> To: users@jackrabbit.apache.org
> Subject: Re: slowness accessing lucene results
>
> EXTERNAL
>
>> "cost for lucene-property[/oak:index/lucene] is 1001.0", "property
>> cost for cqMaster is 1584.0",
>
> Problem here is global fultext index /oak:index/lucene is getting picked up instead of cqMaster. This has been fixed with OAK-6333 (not backported to 1.2 yet). As a workaround you can set 'entryCount'
> property to some high value such that this lucene index is not preferred Chetan Mehrotra
>
>
> On Wed, Aug 30, 2017 at 7:05 PM, Jason Bailey <Ja...@sas.com> wrote:
>> Hi Chetan,
>>
>> For the lucene index provider we have the following configured
>> Enabled:
>>   Copy on Read
>>   Copy on Write
>>   Open Index asynchronously
>>   Prefetch Index Files
>>
>> Thread pool is 20
>> Extracted text cache size is 0
>> Extracted text cache expiry 300
>> Boolean clause limit 1024
>>
>> Additional notes:
>> The timing is coming from a backup of our prod environment that I'm using for testing. Running the query in our production environment actually results in a response time that is closer to an hour. When we use a query that does not use Lucene and utilizes another index like cqMaster it is magnitudes faster, in the range of 400 milliseconds to count.
>>
>> I'm investigating the usage of ACL now.
>>
>> The following is the text from the explain query
>>
>> {
>>     "statement": "SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL",
>>     "language": "JCR-SQL2",
>>     "explain": {
>>         "logs": [
>>             "cost using filter Filter(query=explain SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL, path=/content//*, property=[cq:master=[is not null]])",
>>             "cost for aggregate lucene is Infinity",
>>             "Evaluating plan with index definition Lucene Index : cqTag(/oak:index/cqTagLucene)",
>>             "Evaluating plan with index definition Lucene Index : workflow(/oak:index/workflowDataLucene)",
>>             "Evaluating plan with index definition Lucene Index : authorizables(/oak:index/authorizables)",
>>             "Evaluating plan with index definition Lucene Index : /oak:index/damAssetLucene",
>>             "Evaluating plan with index definition Lucene Index : tags(/oak:index/ntBaseLucene)",
>>             "Evaluating plan with index definition Lucene Index : /oak:index/lucene",
>>             "Evaluating plan with index definition Lucene Index : cq:Page(/oak:index/cqPageLucene)",
>>             "cost for lucene-property[/oak:index/lucene] is 1001.0",
>>             "cost for reference is Infinity",
>>             "getPlans(Filter, List<OrderEntry>, NodeState)",
>>             "getPlans() - filter: Filter(query=explain SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL, path=/content//*, property=[cq:master=[is not null]]) - ",
>>             "getPlans() - sortOrder: null - ",
>>             "getPlans() - rootState: { jcr:primaryType = rep:root, jcr:mixinTypes = [rep:RepoAccessControllable, rep:AccessControllable], sling:target = /index.html, sling:resourceType = sling:redirect, :async : { async-temp = [fe9031ca-3935-4a24-b9bd-f6a6e4d8d035, f3ec0fb4-6f32-488d-b751-be3179dd0a40], async = fe9031ca-3935-4a24-b9bd-f6a6e4d8d035, async-LastIndexedTo = 2017-08-24T14:57:29.853-04:00 }, tmp = { ... }, home = { ... }, oak:index = { ... }, conf = { ... }, jcr:system = { ... }, etc = { ... }, system = { ... }, content = { ... }, bin : { jcr:primaryType = nt:folder, jcr:mixinTypes = [mix:versionable], jcr:createdBy = admin, jcr:versionHistory = e2777f7a-8fc5-4f23-9fc0-4478f1f063f6, jcr:predecessors = [], jcr:created = 2016-01-18T15:28:50.055-05:00, jcr:baseVersion = 777eb8f9-e606-4f29-bd6a-205994c9cd64, jcr:isCheckedOut = false, jcr:uuid = 40ed38c9-cb13-41dc-8602-0249e2c1b4c2 }, rep:policy = { ... }, libs = { ... }, rep:repoPolicy = { ... }, var = { ... }, apps = { ... } } - ",
>>             "cost for ordered is Infinity",
>>             "cost for nodeType is Infinity",
>>             "property cost for cqMaster is 1584.0",
>>             "cost for property is 1584.0",
>>             "cost for traverse is 1.0588708E7"
>>         ],
>>         "plan": "[nt:base] as [nt:base] /* lucene:lucene(/oak:index/lucene) :ancestors:/content where (isdescendantnode([nt:base], [/content])) and ([nt:base].[cq:master] is not null) */"
>>     },
>>     "heuristics": {
>>         "executeTime": 0,
>>         "getNodesTime": 3279,
>>         "count": 2949,
>>         "countTime": 349807,
>>         "totalTime": 353086
>>     }
>> }
>>
>> -Jason
>>
>> -----Original Message-----
>> From: Chetan Mehrotra [mailto:chetan.mehrotra@gmail.com]
>> Sent: Wednesday, August 30, 2017 1:03 AM
>> To: users@jackrabbit.apache.org
>> Subject: Re: slowness accessing lucene results
>>
>> EXTERNAL
>>
>> The query can be slow due to multiple reasons
>>
>> 1. Copy-On-Write support is not enabled
>>
>> 2. Some of the JCR query constraints may not be getting mapped to
>> Lucene index i.e. index definition does not have full coverage of
>> query constraints causing quite a few paths to get filtered out. So
>> actual number of rows accessed may be large but only few of them
>> qualify all criteria at query engine level. This can be confirmed by
>> query explanation output
>>
>> 3. Or lots of filtering happening due to access control. Does the performance remains same with admin user?
>>
>> Chetan Mehrotra
>>
>>
>> On Mon, Aug 28, 2017 at 10:02 AM, Jason Bailey <Ja...@sas.com> wrote:
>>> Hi all,
>>>
>>> I'm tracking down a problem I'm currently experiencing when my queries use the lucene index on oak 1.2.18.  Here are the figures that I get from the acs commons tool that breaks down queries.
>>>
>>> "executeTime": 0
>>> "getNodesTime": 3279
>>> "count": 2949
>>> "countTime": 349807
>>> "totalTime": 353086
>>>
>>> In essence, it's taking me about 6 minutes to iterate through less than 3000 results. Does anyone have any ideas on root cause?
>>>
>>> -Jason

RE: slowness accessing lucene results

Posted by Jason Bailey <Ja...@sas.com>.
Yes, that's how I patched the problem. You've answered my question by explaining that the index value is not the correct value and that issue is being addressed.

Thank you.
Jason 

-----Original Message-----
From: Chetan Mehrotra [mailto:chetan.mehrotra@gmail.com] 
Sent: Wednesday, August 30, 2017 10:56 AM
To: users@jackrabbit.apache.org
Subject: Re: slowness accessing lucene results

EXTERNAL

> "cost for lucene-property[/oak:index/lucene] is 1001.0", "property 
> cost for cqMaster is 1584.0",

Problem here is global fultext index /oak:index/lucene is getting picked up instead of cqMaster. This has been fixed with OAK-6333 (not backported to 1.2 yet). As a workaround you can set 'entryCount'
property to some high value such that this lucene index is not preferred Chetan Mehrotra


On Wed, Aug 30, 2017 at 7:05 PM, Jason Bailey <Ja...@sas.com> wrote:
> Hi Chetan,
>
> For the lucene index provider we have the following configured
> Enabled:
>   Copy on Read
>   Copy on Write
>   Open Index asynchronously
>   Prefetch Index Files
>
> Thread pool is 20
> Extracted text cache size is 0
> Extracted text cache expiry 300
> Boolean clause limit 1024
>
> Additional notes:
> The timing is coming from a backup of our prod environment that I'm using for testing. Running the query in our production environment actually results in a response time that is closer to an hour. When we use a query that does not use Lucene and utilizes another index like cqMaster it is magnitudes faster, in the range of 400 milliseconds to count.
>
> I'm investigating the usage of ACL now.
>
> The following is the text from the explain query
>
> {
>     "statement": "SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL",
>     "language": "JCR-SQL2",
>     "explain": {
>         "logs": [
>             "cost using filter Filter(query=explain SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL, path=/content//*, property=[cq:master=[is not null]])",
>             "cost for aggregate lucene is Infinity",
>             "Evaluating plan with index definition Lucene Index : cqTag(/oak:index/cqTagLucene)",
>             "Evaluating plan with index definition Lucene Index : workflow(/oak:index/workflowDataLucene)",
>             "Evaluating plan with index definition Lucene Index : authorizables(/oak:index/authorizables)",
>             "Evaluating plan with index definition Lucene Index : /oak:index/damAssetLucene",
>             "Evaluating plan with index definition Lucene Index : tags(/oak:index/ntBaseLucene)",
>             "Evaluating plan with index definition Lucene Index : /oak:index/lucene",
>             "Evaluating plan with index definition Lucene Index : cq:Page(/oak:index/cqPageLucene)",
>             "cost for lucene-property[/oak:index/lucene] is 1001.0",
>             "cost for reference is Infinity",
>             "getPlans(Filter, List<OrderEntry>, NodeState)",
>             "getPlans() - filter: Filter(query=explain SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL, path=/content//*, property=[cq:master=[is not null]]) - ",
>             "getPlans() - sortOrder: null - ",
>             "getPlans() - rootState: { jcr:primaryType = rep:root, jcr:mixinTypes = [rep:RepoAccessControllable, rep:AccessControllable], sling:target = /index.html, sling:resourceType = sling:redirect, :async : { async-temp = [fe9031ca-3935-4a24-b9bd-f6a6e4d8d035, f3ec0fb4-6f32-488d-b751-be3179dd0a40], async = fe9031ca-3935-4a24-b9bd-f6a6e4d8d035, async-LastIndexedTo = 2017-08-24T14:57:29.853-04:00 }, tmp = { ... }, home = { ... }, oak:index = { ... }, conf = { ... }, jcr:system = { ... }, etc = { ... }, system = { ... }, content = { ... }, bin : { jcr:primaryType = nt:folder, jcr:mixinTypes = [mix:versionable], jcr:createdBy = admin, jcr:versionHistory = e2777f7a-8fc5-4f23-9fc0-4478f1f063f6, jcr:predecessors = [], jcr:created = 2016-01-18T15:28:50.055-05:00, jcr:baseVersion = 777eb8f9-e606-4f29-bd6a-205994c9cd64, jcr:isCheckedOut = false, jcr:uuid = 40ed38c9-cb13-41dc-8602-0249e2c1b4c2 }, rep:policy = { ... }, libs = { ... }, rep:repoPolicy = { ... }, var = { ... }, apps = { ... } } - ",
>             "cost for ordered is Infinity",
>             "cost for nodeType is Infinity",
>             "property cost for cqMaster is 1584.0",
>             "cost for property is 1584.0",
>             "cost for traverse is 1.0588708E7"
>         ],
>         "plan": "[nt:base] as [nt:base] /* lucene:lucene(/oak:index/lucene) :ancestors:/content where (isdescendantnode([nt:base], [/content])) and ([nt:base].[cq:master] is not null) */"
>     },
>     "heuristics": {
>         "executeTime": 0,
>         "getNodesTime": 3279,
>         "count": 2949,
>         "countTime": 349807,
>         "totalTime": 353086
>     }
> }
>
> -Jason
>
> -----Original Message-----
> From: Chetan Mehrotra [mailto:chetan.mehrotra@gmail.com]
> Sent: Wednesday, August 30, 2017 1:03 AM
> To: users@jackrabbit.apache.org
> Subject: Re: slowness accessing lucene results
>
> EXTERNAL
>
> The query can be slow due to multiple reasons
>
> 1. Copy-On-Write support is not enabled
>
> 2. Some of the JCR query constraints may not be getting mapped to 
> Lucene index i.e. index definition does not have full coverage of 
> query constraints causing quite a few paths to get filtered out. So 
> actual number of rows accessed may be large but only few of them 
> qualify all criteria at query engine level. This can be confirmed by 
> query explanation output
>
> 3. Or lots of filtering happening due to access control. Does the performance remains same with admin user?
>
> Chetan Mehrotra
>
>
> On Mon, Aug 28, 2017 at 10:02 AM, Jason Bailey <Ja...@sas.com> wrote:
>> Hi all,
>>
>> I'm tracking down a problem I'm currently experiencing when my queries use the lucene index on oak 1.2.18.  Here are the figures that I get from the acs commons tool that breaks down queries.
>>
>> "executeTime": 0
>> "getNodesTime": 3279
>> "count": 2949
>> "countTime": 349807
>> "totalTime": 353086
>>
>> In essence, it's taking me about 6 minutes to iterate through less than 3000 results. Does anyone have any ideas on root cause?
>>
>> -Jason

Re: slowness accessing lucene results

Posted by Chetan Mehrotra <ch...@gmail.com>.
> "cost for lucene-property[/oak:index/lucene] is 1001.0",
> "property cost for cqMaster is 1584.0",

Problem here is global fultext index /oak:index/lucene is getting
picked up instead of cqMaster. This has been fixed with OAK-6333 (not
backported to 1.2 yet). As a workaround you can set 'entryCount'
property to some high value such that this lucene index is not
preferred
Chetan Mehrotra


On Wed, Aug 30, 2017 at 7:05 PM, Jason Bailey <Ja...@sas.com> wrote:
> Hi Chetan,
>
> For the lucene index provider we have the following configured
> Enabled:
>   Copy on Read
>   Copy on Write
>   Open Index asynchronously
>   Prefetch Index Files
>
> Thread pool is 20
> Extracted text cache size is 0
> Extracted text cache expiry 300
> Boolean clause limit 1024
>
> Additional notes:
> The timing is coming from a backup of our prod environment that I'm using for testing. Running the query in our production environment actually results in a response time that is closer to an hour. When we use a query that does not use Lucene and utilizes another index like cqMaster it is magnitudes faster, in the range of 400 milliseconds to count.
>
> I'm investigating the usage of ACL now.
>
> The following is the text from the explain query
>
> {
>     "statement": "SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL",
>     "language": "JCR-SQL2",
>     "explain": {
>         "logs": [
>             "cost using filter Filter(query=explain SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL, path=/content//*, property=[cq:master=[is not null]])",
>             "cost for aggregate lucene is Infinity",
>             "Evaluating plan with index definition Lucene Index : cqTag(/oak:index/cqTagLucene)",
>             "Evaluating plan with index definition Lucene Index : workflow(/oak:index/workflowDataLucene)",
>             "Evaluating plan with index definition Lucene Index : authorizables(/oak:index/authorizables)",
>             "Evaluating plan with index definition Lucene Index : /oak:index/damAssetLucene",
>             "Evaluating plan with index definition Lucene Index : tags(/oak:index/ntBaseLucene)",
>             "Evaluating plan with index definition Lucene Index : /oak:index/lucene",
>             "Evaluating plan with index definition Lucene Index : cq:Page(/oak:index/cqPageLucene)",
>             "cost for lucene-property[/oak:index/lucene] is 1001.0",
>             "cost for reference is Infinity",
>             "getPlans(Filter, List<OrderEntry>, NodeState)",
>             "getPlans() - filter: Filter(query=explain SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL, path=/content//*, property=[cq:master=[is not null]]) - ",
>             "getPlans() - sortOrder: null - ",
>             "getPlans() - rootState: { jcr:primaryType = rep:root, jcr:mixinTypes = [rep:RepoAccessControllable, rep:AccessControllable], sling:target = /index.html, sling:resourceType = sling:redirect, :async : { async-temp = [fe9031ca-3935-4a24-b9bd-f6a6e4d8d035, f3ec0fb4-6f32-488d-b751-be3179dd0a40], async = fe9031ca-3935-4a24-b9bd-f6a6e4d8d035, async-LastIndexedTo = 2017-08-24T14:57:29.853-04:00 }, tmp = { ... }, home = { ... }, oak:index = { ... }, conf = { ... }, jcr:system = { ... }, etc = { ... }, system = { ... }, content = { ... }, bin : { jcr:primaryType = nt:folder, jcr:mixinTypes = [mix:versionable], jcr:createdBy = admin, jcr:versionHistory = e2777f7a-8fc5-4f23-9fc0-4478f1f063f6, jcr:predecessors = [], jcr:created = 2016-01-18T15:28:50.055-05:00, jcr:baseVersion = 777eb8f9-e606-4f29-bd6a-205994c9cd64, jcr:isCheckedOut = false, jcr:uuid = 40ed38c9-cb13-41dc-8602-0249e2c1b4c2 }, rep:policy = { ... }, libs = { ... }, rep:repoPolicy = { ... }, var = { ... }, apps = { ... } } - ",
>             "cost for ordered is Infinity",
>             "cost for nodeType is Infinity",
>             "property cost for cqMaster is 1584.0",
>             "cost for property is 1584.0",
>             "cost for traverse is 1.0588708E7"
>         ],
>         "plan": "[nt:base] as [nt:base] /* lucene:lucene(/oak:index/lucene) :ancestors:/content where (isdescendantnode([nt:base], [/content])) and ([nt:base].[cq:master] is not null) */"
>     },
>     "heuristics": {
>         "executeTime": 0,
>         "getNodesTime": 3279,
>         "count": 2949,
>         "countTime": 349807,
>         "totalTime": 353086
>     }
> }
>
> -Jason
>
> -----Original Message-----
> From: Chetan Mehrotra [mailto:chetan.mehrotra@gmail.com]
> Sent: Wednesday, August 30, 2017 1:03 AM
> To: users@jackrabbit.apache.org
> Subject: Re: slowness accessing lucene results
>
> EXTERNAL
>
> The query can be slow due to multiple reasons
>
> 1. Copy-On-Write support is not enabled
>
> 2. Some of the JCR query constraints may not be getting mapped to Lucene index i.e. index definition does not have full coverage of query constraints causing quite a few paths to get filtered out. So actual number of rows accessed may be large but only few of them qualify all criteria at query engine level. This can be confirmed by query explanation output
>
> 3. Or lots of filtering happening due to access control. Does the performance remains same with admin user?
>
> Chetan Mehrotra
>
>
> On Mon, Aug 28, 2017 at 10:02 AM, Jason Bailey <Ja...@sas.com> wrote:
>> Hi all,
>>
>> I'm tracking down a problem I'm currently experiencing when my queries use the lucene index on oak 1.2.18.  Here are the figures that I get from the acs commons tool that breaks down queries.
>>
>> "executeTime": 0
>> "getNodesTime": 3279
>> "count": 2949
>> "countTime": 349807
>> "totalTime": 353086
>>
>> In essence, it's taking me about 6 minutes to iterate through less than 3000 results. Does anyone have any ideas on root cause?
>>
>> -Jason

RE: slowness accessing lucene results

Posted by Jason Bailey <Ja...@sas.com>.
Hi Chetan,

For the lucene index provider we have the following configured
Enabled:
  Copy on Read
  Copy on Write
  Open Index asynchronously
  Prefetch Index Files

Thread pool is 20
Extracted text cache size is 0
Extracted text cache expiry 300
Boolean clause limit 1024

Additional notes:
The timing is coming from a backup of our prod environment that I'm using for testing. Running the query in our production environment actually results in a response time that is closer to an hour. When we use a query that does not use Lucene and utilizes another index like cqMaster it is magnitudes faster, in the range of 400 milliseconds to count. 

I'm investigating the usage of ACL now.

The following is the text from the explain query

{
    "statement": "SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL",
    "language": "JCR-SQL2",
    "explain": {
        "logs": [
            "cost using filter Filter(query=explain SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL, path=/content//*, property=[cq:master=[is not null]])",
            "cost for aggregate lucene is Infinity",
            "Evaluating plan with index definition Lucene Index : cqTag(/oak:index/cqTagLucene)",
            "Evaluating plan with index definition Lucene Index : workflow(/oak:index/workflowDataLucene)",
            "Evaluating plan with index definition Lucene Index : authorizables(/oak:index/authorizables)",
            "Evaluating plan with index definition Lucene Index : /oak:index/damAssetLucene",
            "Evaluating plan with index definition Lucene Index : tags(/oak:index/ntBaseLucene)",
            "Evaluating plan with index definition Lucene Index : /oak:index/lucene",
            "Evaluating plan with index definition Lucene Index : cq:Page(/oak:index/cqPageLucene)",
            "cost for lucene-property[/oak:index/lucene] is 1001.0",
            "cost for reference is Infinity",
            "getPlans(Filter, List<OrderEntry>, NodeState)",
            "getPlans() - filter: Filter(query=explain SELECT * FROM [nt:base] WHERE ISDESCENDANTNODE([/content]) AND [cq:master] IS NOT NULL, path=/content//*, property=[cq:master=[is not null]]) - ",
            "getPlans() - sortOrder: null - ",
            "getPlans() - rootState: { jcr:primaryType = rep:root, jcr:mixinTypes = [rep:RepoAccessControllable, rep:AccessControllable], sling:target = /index.html, sling:resourceType = sling:redirect, :async : { async-temp = [fe9031ca-3935-4a24-b9bd-f6a6e4d8d035, f3ec0fb4-6f32-488d-b751-be3179dd0a40], async = fe9031ca-3935-4a24-b9bd-f6a6e4d8d035, async-LastIndexedTo = 2017-08-24T14:57:29.853-04:00 }, tmp = { ... }, home = { ... }, oak:index = { ... }, conf = { ... }, jcr:system = { ... }, etc = { ... }, system = { ... }, content = { ... }, bin : { jcr:primaryType = nt:folder, jcr:mixinTypes = [mix:versionable], jcr:createdBy = admin, jcr:versionHistory = e2777f7a-8fc5-4f23-9fc0-4478f1f063f6, jcr:predecessors = [], jcr:created = 2016-01-18T15:28:50.055-05:00, jcr:baseVersion = 777eb8f9-e606-4f29-bd6a-205994c9cd64, jcr:isCheckedOut = false, jcr:uuid = 40ed38c9-cb13-41dc-8602-0249e2c1b4c2 }, rep:policy = { ... }, libs = { ... }, rep:repoPolicy = { ... }, var = { ... }, apps = { ... } } - ",
            "cost for ordered is Infinity",
            "cost for nodeType is Infinity",
            "property cost for cqMaster is 1584.0",
            "cost for property is 1584.0",
            "cost for traverse is 1.0588708E7"
        ],
        "plan": "[nt:base] as [nt:base] /* lucene:lucene(/oak:index/lucene) :ancestors:/content where (isdescendantnode([nt:base], [/content])) and ([nt:base].[cq:master] is not null) */"
    },
    "heuristics": {
        "executeTime": 0,
        "getNodesTime": 3279,
        "count": 2949,
        "countTime": 349807,
        "totalTime": 353086
    }
}

-Jason

-----Original Message-----
From: Chetan Mehrotra [mailto:chetan.mehrotra@gmail.com] 
Sent: Wednesday, August 30, 2017 1:03 AM
To: users@jackrabbit.apache.org
Subject: Re: slowness accessing lucene results

EXTERNAL

The query can be slow due to multiple reasons

1. Copy-On-Write support is not enabled

2. Some of the JCR query constraints may not be getting mapped to Lucene index i.e. index definition does not have full coverage of query constraints causing quite a few paths to get filtered out. So actual number of rows accessed may be large but only few of them qualify all criteria at query engine level. This can be confirmed by query explanation output

3. Or lots of filtering happening due to access control. Does the performance remains same with admin user?

Chetan Mehrotra


On Mon, Aug 28, 2017 at 10:02 AM, Jason Bailey <Ja...@sas.com> wrote:
> Hi all,
>
> I'm tracking down a problem I'm currently experiencing when my queries use the lucene index on oak 1.2.18.  Here are the figures that I get from the acs commons tool that breaks down queries.
>
> "executeTime": 0
> "getNodesTime": 3279
> "count": 2949
> "countTime": 349807
> "totalTime": 353086
>
> In essence, it's taking me about 6 minutes to iterate through less than 3000 results. Does anyone have any ideas on root cause?
>
> -Jason

Re: slowness accessing lucene results

Posted by Chetan Mehrotra <ch...@gmail.com>.
The query can be slow due to multiple reasons

1. Copy-On-Write support is not enabled

2. Some of the JCR query constraints may not be getting mapped to
Lucene index i.e. index definition does not have full coverage of
query constraints causing quite a few paths to get filtered out. So
actual number of rows accessed may be large but only few of them
qualify all criteria at query engine level. This can be confirmed by
query explanation output

3. Or lots of filtering happening due to access control. Does the
performance remains same with admin user?

Chetan Mehrotra


On Mon, Aug 28, 2017 at 10:02 AM, Jason Bailey <Ja...@sas.com> wrote:
> Hi all,
>
> I'm tracking down a problem I'm currently experiencing when my queries use the lucene index on oak 1.2.18.  Here are the figures that I get from the acs commons tool that breaks down queries.
>
> "executeTime": 0
> "getNodesTime": 3279
> "count": 2949
> "countTime": 349807
> "totalTime": 353086
>
> In essence, it's taking me about 6 minutes to iterate through less than 3000 results. Does anyone have any ideas on root cause?
>
> -Jason