Posted to dev@jackrabbit.apache.org by "Alex Parvulescu (JIRA)" <ji...@apache.org> on 2013/02/08 09:55:15 UTC

[jira] [Updated] (JCR-3513) Slower range query execution

     [ https://issues.apache.org/jira/browse/JCR-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu updated JCR-3513:
---------------------------------

    Attachment: JCR-3513.patch

> We went with the second solution and removed the method org.apache.jackrabbit.core.query.lucene.RangeQuery.rewrite(IndexReader).

I'm going to propose a different solution. There is a way to force Lucene to rewrite without using a filter: you need to call #setRewriteMethod with 'CONSTANT_SCORE_BOOLEAN_QUERY_REWRITE'. This way I think it would fall back to the previous behavior (no more filters).

This rewrite behavior affects all MultiTermQuery implementations, and the RangeQuery seems to be the last one that is still using Lucene's default.
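
To make the two strategies concrete, here is a rough sketch in plain Java. It is a toy model only, not Lucene or Jackrabbit code: the class RangeRewriteSketch and all of its methods are hypothetical names made up for illustration. In Lucene itself the switch described above would be the #setRewriteMethod call with MultiTermQuery.CONSTANT_SCORE_BOOLEAN_QUERY_REWRITE. The sketch only shows that a boolean-style rewrite expands the range into one clause per matching term, while a filter-style evaluation fills a doc-id bit set directly. Both arrive at the same hits.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy inverted index for one field (term -> doc ids). Hypothetical, not Lucene code. */
final class RangeRewriteSketch {

    // Sorted term dictionary, e.g. the values of a date property.
    private final NavigableMap<String, int[]> postings = new TreeMap<>();
    private final int maxDoc;

    RangeRewriteSketch(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    void add(String term, int... docIds) {
        postings.put(term, docIds);
    }

    /** Boolean-style rewrite: every term inside [lower, upper] becomes its own clause. */
    List<String> rewriteToClauses(String lower, String upper) {
        return new ArrayList<>(postings.subMap(lower, true, upper, true).keySet());
    }

    /** Evaluates the expanded disjunction: a document matches if any clause matches. */
    BitSet matchClauses(List<String> clauses) {
        BitSet hits = new BitSet(maxDoc);
        for (String term : clauses) {
            for (int doc : postings.get(term)) {
                hits.set(doc);
            }
        }
        return hits;
    }

    /** Filter-style evaluation: walk the term range once and set doc-id bits directly. */
    BitSet matchAsFilter(String lower, String upper) {
        BitSet hits = new BitSet(maxDoc);
        for (int[] docs : postings.subMap(lower, true, upper, true).values()) {
            for (int doc : docs) {
                hits.set(doc);
            }
        }
        return hits;
    }
}
```

The sketch says nothing about which strategy is faster in a real index; that is exactly what needs to be measured against the setup described in the issue.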

I'm attaching a patch shortly (it is against trunk, but it should apply without problems on the 2.4 code).

Tom, I'd appreciate it if you could give it a go in your setup :)
                
> Slower range query execution
> ----------------------------
>
>                 Key: JCR-3513
>                 URL: https://issues.apache.org/jira/browse/JCR-3513
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>    Affects Versions: 2.4.3
>            Reporter: Tom Quellenberg
>            Assignee: Alex Parvulescu
>         Attachments: JCR-3513.patch
>
>
> After switching from Jackrabbit 1.6.4 to 2.4.3 we experienced extremely slow query executions. Range queries on date fields are often 10 times slower than before.
> Our repositories store more than 1 million documents, all of which contain, for example, a creation date. Typical queries look like this:
> //element(*, sophora-nt:story)[@sophora:creationDate > ...]
> Jackrabbit has its own RangeQuery implementation, which is used when Lucene throws a TooManyBooleanClauses exception (and in some other situations, too). This worked well in Jackrabbit 1.6. Newer versions use a different Lucene library that never throws TooManyBooleanClauses exceptions. Instead, it has its own fallback for situations where a BooleanQuery does not work. This fallback, which uses a MultiTermQueryWrapperFilter, seems to us much slower than the fallback implementation in Jackrabbit (does anybody know the reason?). The situation is the same in Jackrabbit 2.6.0 (with Lucene 3.6.0).
> We patched org.apache.jackrabbit.core.query.lucene.RangeQuery to never use org.apache.lucene.search.TermRangeQuery but to always use the Jackrabbit implementation. This makes query execution as fast as in older Jackrabbit versions.
> Do other people experience this problem? Are there any drawbacks to always using the Jackrabbit implementation for range queries?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira