Posted to dev@jackrabbit.apache.org by "Ard Schrijvers (JIRA)" <ji...@apache.org> on 2007/10/30 12:20:50 UTC

[jira] Created: (JCR-1196) Queries for DescendantSelfAxisWeight/ChildAxisQuery are currently very heavy and become slow pretty quickly

Queries for DescendantSelfAxisWeight/ChildAxisQuery are currently very heavy and become slow pretty quickly
-----------------------------------------------------------------------------------------------------------

                 Key: JCR-1196
                 URL: https://issues.apache.org/jira/browse/JCR-1196
             Project: Jackrabbit
          Issue Type: Improvement
          Components: query
    Affects Versions: 1.3.3
            Reporter: Ard Schrijvers
             Fix For: 1.4


A query like 

/documents/en/news//*[@modificationDate] order by @modificationDate

becomes very slow when there are many nodes (> 1,000) under /documents/en/news. I think the bottleneck is in something like recursive filters in Lucene. First of all, I'll try to gather some statistics about the performance and describe the bottleneck. After that, a solution must be found, where we need to keep in mind that

1) these queries run faster and scale better (obviously)
2) moving a node must stay a cheap operation

Also see:

http://www.nabble.com/Search-performance--%3A-MultiIndex-tf4695559.html#a13421949
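
The tension between the two requirements above can be illustrated with a toy model. The sketch below is not Jackrabbit or Lucene code; the class and method names are hypothetical. It assumes each node stores only a pointer to its parent, so a move rewrites one entry, while a descendant check has to walk up the hierarchy, which is the kind of repeated lookup that gets expensive over many hits.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy hierarchy stored as child -> parent pointers (illustration only, not Jackrabbit code). */
final class HierarchySketch {
    private final Map<String, String> parentOf = new HashMap<>();

    /** Registers a node under a parent; the root is registered with a null parent. */
    void add(String node, String parent) {
        parentOf.put(node, parent);
    }

    /** Moving a node rewrites a single pointer, however many descendants it has. */
    void move(String node, String newParent) {
        parentOf.put(node, newParent);
    }

    /** Walks the parent pointers upward; the cost grows with the depth of the node. */
    boolean isDescendantOf(String node, String ancestor) {
        String current = parentOf.get(node);
        while (current != null) {
            if (current.equals(ancestor)) {
                return true;
            }
            current = parentOf.get(current);
        }
        return false;
    }

    public static void main(String[] args) {
        HierarchySketch h = new HierarchySketch();
        h.add("documents", null);
        h.add("en", "documents");
        h.add("news", "en");
        h.add("archive", "en");
        h.add("item1", "news");
        System.out.println(h.isDescendantOf("item1", "news")); // prints true
        h.move("item1", "archive");
        System.out.println(h.isDescendantOf("item1", "news")); // prints false
    }
}
```

Storing full paths instead would make the descendant check a simple prefix comparison, but a move would then have to rewrite every descendant, which conflicts with requirement 2.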




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (JCR-1196) Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ard Schrijvers resolved JCR-1196.
---------------------------------

       Resolution: Fixed
    Fix Version/s: 1.4.1
         Assignee: Ard Schrijvers

I am closing this issue as Fixed, since the fix to the hierarchical cache in Jackrabbit 1.4 solves the largest part of the problem. Furthermore, the JIRA issue is polluted with a lot of comments unrelated to it.

I will file a new JIRA issue, 'Optimize first execution queries for DescendantSelfAxisWeight/ChildAxisQuery', because in 1.4 consecutive executions are fast. I will target version 2.0, because I do not think we can improve the performance of first executions in the current architecture. With NGP, and perhaps by storing path information in the index, we might be able to improve the performance of the first execution. I assume NGP won't be implemented before 2.0.

> Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery
> ------------------------------------------------------------
>
>                 Key: JCR-1196
>                 URL: https://issues.apache.org/jira/browse/JCR-1196
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: jackrabbit-core, query
>            Reporter: Ard Schrijvers
>            Assignee: Ard Schrijvers
>             Fix For: 1.4.1
>
>         Attachments: jcr-repository-xml-dump.xml.bz2
>
>
> A query like 
> /documents/en/news//*[@modificationDate] order by @modificationDate
> becomes very slow when there are many nodes (> 1,000) under /documents/en/news. I think the bottleneck is in something like recursive filters in Lucene. First of all, I'll try to gather some statistics about the performance and describe the bottleneck. After that, a solution must be found, where we need to keep in mind that
> 1) these queries run faster and scale better (obviously)
> 2) moving a node must stay a cheap operation
> Also see:
> http://www.nabble.com/Search-performance--%3A-MultiIndex-tf4695559.html#a13421949



[jira] Reopened: (JCR-1196) Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting reopened JCR-1196:
--------------------------------


Reopening to close as Invalid instead of Resolved (see Ard's comment above).



[jira] Updated: (JCR-1196) Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery

Posted by "Martin Zdila (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Zdila updated JCR-1196:
------------------------------

    Attachment:     (was: jcr-repository-xml-dump.xml.tar.bz)



[jira] Updated: (JCR-1196) Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery

Posted by "Martin Zdila (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Zdila updated JCR-1196:
------------------------------

    Attachment: jcr-repository-xml-dump.xml.tar.bz

dump of /jcr:root/gfr:devices



[jira] Updated: (JCR-1196) Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery

Posted by "Martin Zdila (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Zdila updated JCR-1196:
------------------------------

    Comment: was deleted



[jira] Updated: (JCR-1196) Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated JCR-1196:
-------------------------------

          Component/s: jackrabbit-core
        Fix Version/s:     (was: 1.4)
    Affects Version/s:     (was: 1.3.3)
              Summary: Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery  (was: Queries for DescendantSelfAxisWeight/ChildAxisQuery are currently very heavy and become slow pretty quickly)



[jira] Commented: (JCR-1196) Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561005#action_12561005 ] 

Ard Schrijvers commented on JCR-1196:
-------------------------------------

t3 is 453 sec for ni.hasNext()!! 

This is indeed a different issue AFAICS. At the moment I am occupied with other work, but I'll try to investigate the issue this evening or tomorrow evening. I'll get back to you on this one and try to reproduce your numbers. Thanks so far for investigating the issue.

By the way: did you implement a custom AccessManager?



[jira] Commented: (JCR-1196) Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery

Posted by "Martin Zdila (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560946#action_12560946 ] 

Martin Zdila commented on JCR-1196:
-----------------------------------

I used Jackrabbit 1.4 and measured only consecutive calls for the above tests. I can run other tests if you tell me which ones.



[jira] Commented: (JCR-1196) Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561441#action_12561441 ] 

Ard Schrijvers commented on JCR-1196:
-------------------------------------

Hello Martin, 

I reproduced your problems, but they are not related to this JIRA issue. I'll post a mail to the user list describing the set of problems you have and the (easy) solutions.



[jira] Commented: (JCR-1196) Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery

Posted by "Christoph Kiehl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560259#action_12560259 ] 

Christoph Kiehl commented on JCR-1196:
--------------------------------------

Description for slow ChildAxisQuery by Martin Zdila (JCR-1324):

I have the following structure in my repository:

jcr:root
        gfr:devices
                gfr:device
                        gfr:capabilityMap

There are approx. 4000 gfr:device nodes. Each gfr:device has exactly one gfr:capabilityMap. Each gfr:capabilityMap has 20 properties on average.

Here are some interesting results:

1.
((QueryImpl) query).setLimit(30);
((QueryImpl) query).setOffset(anyLimit);

1.1
executing the query //gfr:capabilityMap and fetching the nodes takes approx. 20-80 ms

1.2
executing the query /jcr:root/gfr:devices/gfr:device/gfr:capabilityMap and fetching the nodes takes approx. 2000 ms

Why does this take longer when the only difference is a more specific path? I would expect an even shorter execution time, not a longer one.

2.
now without the proprietary limit/offset

2.1
//gfr:capabilityMap
approx. 150-200 ms

2.2
/jcr:root/gfr:devices/gfr:device/gfr:capabilityMap
approx. 14000 ms!!!



[jira] Commented: (JCR-1196) Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560972#action_12560972 ] 

Ard Schrijvers commented on JCR-1196:
-------------------------------------

With the fixed caching I cannot imagine that 

>2.2
>/jcr:root/gfr:devices/gfr:device/gfr:capabilityMap
>cca 14000ms!!!

takes 14000 ms *after* the first execution. Can you attach your test case? Are you sure you are measuring the time *after* the first execution of the query? Can you tell me how large the result set is (getSize()) when not using a limit/offset? If the result set is on the order of 10^6 nodes, you might have performance issues when using paths in your query, even with the fixed cache. This is a known issue.



[jira] Commented: (JCR-1196) Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560945#action_12560945 ] 

Ard Schrijvers commented on JCR-1196:
-------------------------------------

Were the above numbers measured with Jackrabbit trunk, 1.4, or an older version? In JCR-1213 we fixed the caching issue regarding DescendantSelfAxisWeight/ChildAxisQuery. Obviously, the effect of the caching is only measurable in consecutive calls. So perhaps Martin Zdila could run a new test with Jackrabbit 1.4 and look specifically at the time taken by consecutive calls. We are aware that the first call takes longer (though a single slower query at startup is perhaps acceptable).

Shouldn't we close this issue, since JCR-1213 is fixed, and create a new one: 'first execution for queries for DescendantSelfAxisWeight/ChildAxisQuery are slow when the number of hits is large and the parent nodes are divided over many different lucene indexes'? We largely resolved this issue by fixing the cache, IMO. WDYT?
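
The point that caching only shows up in consecutive calls can be sketched with a small stand-alone harness. Everything below is hypothetical and uses no Jackrabbit classes: resolve() stands in for an expensive hierarchy lookup, and the miss counter shows that only the first call per key does the real work. The timings printed will vary by machine.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical harness separating the first (cold) call from later (warm) calls. */
final class WarmCallSketch {
    private final Map<String, Integer> cache = new HashMap<>();
    private int misses = 0;

    /** Stand-in for an expensive lookup; only the first call per key computes the value. */
    int resolve(String key) {
        Integer cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        misses++;
        int value = expensive(key);
        cache.put(key, value);
        return value;
    }

    /** Artificial work so that the cold call is measurably slower. */
    private static int expensive(String key) {
        int sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += (key.hashCode() ^ i) & 7;
        }
        return sum;
    }

    int misses() {
        return misses;
    }

    /** Runs the task the given number of times and returns the elapsed nanoseconds per run. */
    static long[] timeRuns(Runnable task, int runs) {
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsed[i] = System.nanoTime() - start;
        }
        return elapsed;
    }

    public static void main(String[] args) {
        WarmCallSketch sketch = new WarmCallSketch();
        long[] elapsed = timeRuns(() -> sketch.resolve("/documents/en/news"), 3);
        System.out.println("first call (ns):  " + elapsed[0]);
        System.out.println("second call (ns): " + elapsed[1]);
        System.out.println("third call (ns):  " + elapsed[2]);
        System.out.println("cache misses: " + sketch.misses()); // prints cache misses: 1
    }
}
```

When reporting numbers for an issue like this one, stating separately the time of the first run and of the later runs makes it clear whether the cache is involved.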



[jira] Resolved: (JCR-1196) Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JCR-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved JCR-1196.
--------------------------------

       Resolution: Invalid
    Fix Version/s:     (was: 1.4.1)



[jira] Commented: (JCR-1196) Optimize queries for DescendantSelfAxisWeight/ChildAxisQuery

Posted by "Martin Zdila (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JCR-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560997#action_12560997 ] 

Martin Zdila commented on JCR-1196:
-----------------------------------

test method:
-----------------

// Uses javax.jcr.NodeIterator, javax.jcr.query.Query and javax.jcr.query.QueryResult;
// "session" is an open javax.jcr.Session.
long time = System.currentTimeMillis();

final String queryString = ...;
final Query query = session.getWorkspace().getQueryManager().createQuery(queryString, Query.XPATH);
final QueryResult queryResult = query.execute();

// t1: creating and executing the query
System.out.println("t1:" + (System.currentTimeMillis() - time));
time = System.currentTimeMillis();

final NodeIterator ni = queryResult.getNodes();

System.out.println("size:" + ni.getSize());

// t2: obtaining the iterator and its size
System.out.println("t2:" + (System.currentTimeMillis() - time));
time = System.currentTimeMillis();

ni.hasNext(); // this single call consumes pretty much time (t3)!

// t3: the first hasNext() call
System.out.println("t3:" + (System.currentTimeMillis() - time));
time = System.currentTimeMillis();

// t4: iterating over at most the first 1000 nodes
for (int i = 0; ni.hasNext() && i < 1000; i++) {
	ni.next();
}

System.out.println("t4:" + (System.currentTimeMillis() - time));


results:
--------
Measured on the second call of the test method (not the first, which takes longer).
Note that only the first test measures t3, because it takes a very long time, which I don't have. BTW, isn't that another issue?

//gfr:capabilityMap
t1:406
size:11208
t2:0
t3:453864
t4:12

/jcr:root/gfr:devices/gfr:device/gfr:capabilityMap
t1:35358
size:11208

/jcr:root/gfr:devices/gfr:device
t1:169
size:11208
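
The repeated System.currentTimeMillis() bookkeeping in the test method could be factored into a small helper, which makes it harder to attribute time to the wrong phase. The sketch below is one possible shape and not part of the original test: PhaseTimer and its methods are hypothetical names, and the phases in main() are stand-ins for query.execute(), getNodes() and hasNext(). It takes a Callable so that calls throwing checked exceptions, as the JCR API methods do, can be passed in.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;

/** Hypothetical helper that records the elapsed milliseconds of named phases in order. */
final class PhaseTimer {
    private final Map<String, Long> millis = new LinkedHashMap<>();

    /** Runs one phase, stores its duration under the label, and returns the phase result. */
    <T> T time(String label, Callable<T> phase) throws Exception {
        long start = System.nanoTime();
        try {
            return phase.call();
        } finally {
            millis.put(label, (System.nanoTime() - start) / 1_000_000);
        }
    }

    /** Durations in the order in which the phases were run. */
    Map<String, Long> results() {
        return millis;
    }

    public static void main(String[] args) throws Exception {
        PhaseTimer timer = new PhaseTimer();
        // Stand-ins for the real phases: execute the query, get the iterator, first hasNext().
        List<String> result = timer.time("t1", () -> Arrays.asList("a", "b", "c"));
        Iterator<String> it = timer.time("t2", result::iterator);
        boolean hasFirst = timer.time("t3", it::hasNext);
        System.out.println("has first node: " + hasFirst);
        timer.results().forEach((label, ms) -> System.out.println(label + ":" + ms));
    }
}
```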



