You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@jackrabbit.apache.org by "Clemens Wyss (JIRA)" <ji...@apache.org> on 2010/12/03 11:53:12 UTC

[jira] Created: (JCR-2830) JCR-SQL2 : Query on large node-set is (too) slow even when offset and limit is used

JCR-SQL2 : Query on large node-set is (too) slow even when offset and limit is used
-----------------------------------------------------------------------------------

                 Key: JCR-2830
                 URL: https://issues.apache.org/jira/browse/JCR-2830
             Project: Jackrabbit Content Repository
          Issue Type: Improvement
    Affects Versions: 2.3.0
         Environment: + Win7 (64bit)
+ JR built from latest greatest sources
+ repo with many nodes of same node type > 77'000
            Reporter: Clemens Wyss


Given a node-set of approx 77'000 entries a SQL2-query limited to 10 nodes takes approx 37 to 59sec (!) whereas the corresponding SQL returns in less than  1sec.

Query q = session.getWorkspace().getQueryManager().createQuery( "select * from [task]", Query.SQL2 );
q.setOffset( 0 ); // or any other offset
q.setLimit( 10 ); 
returnValue = q.execute();

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (JCR-2830) JCR-SQL2 : Query on large node-set is (too) slow even when offset and limit is used

Posted by "Serge Huber (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-2830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12967112#action_12967112 ] 

Serge Huber commented on JCR-2830:
----------------------------------

Does anyone know how SQL-1 does it ? It seems from the code that it lazy loads the results, but how can it do so if sorting is needed ?

Best regards,
  Serge Huber.

> JCR-SQL2 : Query on large node-set is (too) slow even when offset and limit is used
> -----------------------------------------------------------------------------------
>
>                 Key: JCR-2830
>                 URL: https://issues.apache.org/jira/browse/JCR-2830
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>    Affects Versions: 2.3.0
>         Environment: + Win7 (64bit)
> + JR built from latest greatest sources
> + repo with many nodes of same node type > 77'000
>            Reporter: Clemens Wyss
>
> Given a node-set of approx 77'000 entries a SQL2-query limited to 10 nodes takes approx 37 to 59sec (!) whereas the corresponding SQL returns in less than  1sec.
> Query q = session.getWorkspace().getQueryManager().createQuery( "select * from [task]", Query.SQL2 );
> q.setOffset( 0 ); // or any other offset
> q.setLimit( 10 ); 
> returnValue = q.execute();

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (JCR-2830) JCR-SQL2 : Query on large node-set is (too) slow even when offset and limit is used

Posted by "angela (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JCR-2830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

angela updated JCR-2830:
------------------------

    Component/s: query

> JCR-SQL2 : Query on large node-set is (too) slow even when offset and limit is used
> -----------------------------------------------------------------------------------
>
>                 Key: JCR-2830
>                 URL: https://issues.apache.org/jira/browse/JCR-2830
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: query
>    Affects Versions: 2.3.0
>         Environment: + Win7 (64bit)
> + JR built from latest greatest sources
> + repo with many nodes of same node type > 77'000
>            Reporter: Clemens Wyss
>
> Given a node-set of approx 77'000 entries a SQL2-query limited to 10 nodes takes approx 37 to 59sec (!) whereas the corresponding SQL returns in less than  1sec.
> Query q = session.getWorkspace().getQueryManager().createQuery( "select * from [task]", Query.SQL2 );
> q.setOffset( 0 ); // or any other offset
> q.setLimit( 10 ); 
> returnValue = q.execute();

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (JCR-2830) JCR-SQL2 : Query on large node-set is (too) slow even when offset and limit is used

Posted by "Alex Parvulescu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JCR-2830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu updated JCR-2830:
---------------------------------

    Attachment: JCR-2830.patch

Attaching a patch for this issue.

You'll find that performance is WAY better when doing simple queries with just an offset and a limit.

just as an idea, for me on a repo that had ~110k nodes, it went from 12s to ~150ms. but that is extremely contextual, so it is better test on your setup, and give feedback on the patch.

This should be a bit better memory wise too, as the old SortedLuceneQueryHits used to keep the entire list of docs from lucene, not taking advantage of the buffering that is was already doing.

Also added some tests for the offset&limit changes.

> JCR-SQL2 : Query on large node-set is (too) slow even when offset and limit is used
> -----------------------------------------------------------------------------------
>
>                 Key: JCR-2830
>                 URL: https://issues.apache.org/jira/browse/JCR-2830
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: query
>    Affects Versions: 2.3.0
>         Environment: + Win7 (64bit)
> + JR built from latest greatest sources
> + repo with many nodes of same node type > 77'000
>            Reporter: Clemens Wyss
>         Attachments: JCR-2830.patch
>
>
> Given a node-set of approx 77'000 entries a SQL2-query limited to 10 nodes takes approx 37 to 59sec (!) whereas the corresponding SQL returns in less than  1sec.
> Query q = session.getWorkspace().getQueryManager().createQuery( "select * from [task]", Query.SQL2 );
> q.setOffset( 0 ); // or any other offset
> q.setLimit( 10 ); 
> returnValue = q.execute();

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JCR-2830) JCR-SQL2 : Query on large node-set is (too) slow even when offset and limit is used

Posted by "Alex Parvulescu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-2830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084600#comment-13084600 ] 

Alex Parvulescu commented on JCR-2830:
--------------------------------------

yes, the patch is old, and will soon be rendered obsolete by JCR-2959

> JCR-SQL2 : Query on large node-set is (too) slow even when offset and limit is used
> -----------------------------------------------------------------------------------
>
>                 Key: JCR-2830
>                 URL: https://issues.apache.org/jira/browse/JCR-2830
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: query
>    Affects Versions: 2.3.0
>         Environment: + Win7 (64bit)
> + JR built from latest greatest sources
> + repo with many nodes of same node type > 77'000
>            Reporter: Clemens Wyss
>         Attachments: JCR-2830.patch
>
>
> Given a node-set of approx 77'000 entries a SQL2-query limited to 10 nodes takes approx 37 to 59sec (!) whereas the corresponding SQL returns in less than  1sec.
> Query q = session.getWorkspace().getQueryManager().createQuery( "select * from [task]", Query.SQL2 );
> q.setOffset( 0 ); // or any other offset
> q.setLimit( 10 ); 
> returnValue = q.execute();

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JCR-2830) JCR-SQL2 : Query on large node-set is (too) slow even when offset and limit is used

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-2830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084562#comment-13084562 ] 

Jukka Zitting commented on JCR-2830:
------------------------------------

Note that we need to take also ACL processing into account when evaluating the offset and limit settings. Also, the offset and limit settings only make sense when combined with an ORDER BY statement, which is why we need to implement JCR-2959 first before the proposed approach has a chance of working correctly.

> JCR-SQL2 : Query on large node-set is (too) slow even when offset and limit is used
> -----------------------------------------------------------------------------------
>
>                 Key: JCR-2830
>                 URL: https://issues.apache.org/jira/browse/JCR-2830
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: query
>    Affects Versions: 2.3.0
>         Environment: + Win7 (64bit)
> + JR built from latest greatest sources
> + repo with many nodes of same node type > 77'000
>            Reporter: Clemens Wyss
>         Attachments: JCR-2830.patch
>
>
> Given a node-set of approx 77'000 entries a SQL2-query limited to 10 nodes takes approx 37 to 59sec (!) whereas the corresponding SQL returns in less than  1sec.
> Query q = session.getWorkspace().getQueryManager().createQuery( "select * from [task]", Query.SQL2 );
> q.setOffset( 0 ); // or any other offset
> q.setLimit( 10 ); 
> returnValue = q.execute();

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira