Posted to users@jackrabbit.apache.org by Ulrich <Fo...@gombers.de> on 2013/06/11 17:18:14 UTC

JCR-JQOM: QueryObjectModelFactory.OR - combined query fails with Lucene error: TooManyClauses: maxClauseCount is set to 1024

The repository contains nt:file nodes with an optional mixin property of type
DATE. I need to find all nodes that either lack this property or whose stored
date is before a given date.
My code is:
try {
        // get all files in subtree
        Constraint getSubtree =
                qomf.descendantNode(nodeTypeSelector.getSelectorName(), subtree);

        // if they have a property defined indicating that they have already
        // been virus-scanned
        Constraint needRescan = qomf.and(getSubtree,
                qomf.propertyExistence(nodeTypeSelector.getSelectorName(),
                        lastScannedDateProperty));
        // but scandate is outdated
        Comparison compareDate = qomf.comparison(lastScannedDateOperand,
                QueryObjectModelFactory.JCR_OPERATOR_LESS_THAN,
                lastScannedDateValue);
        needRescan = qomf.and(needRescan, compareDate);

        // OR
        // if nodes have not been scanned yet
        Constraint needInitalScan = qomf.and(getSubtree,
                qomf.not(qomf.propertyExistence(nodeTypeSelector.getSelectorName(),
                        lastScannedDateProperty)));

        QueryObjectModel qom = qomf.createQuery(nodeTypeSelector,
                qomf.or(needInitalScan, needRescan), null, null);
        QueryResult queryResult = qom.execute();
        nodeIterator = queryResult.getNodes();
} catch (InvalidQueryException e) {
....
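
The combined constraint above has the shape (S AND NOT E) OR (S AND E AND C),
where S is "node is in the subtree", E is "the date property exists" and C is
"the date is before the cutoff". A minimal sketch, using plain booleans instead
of JCR constraints, checks that this equals the factored form
S AND (NOT E OR (E AND C)). The class and method names are hypothetical. In QOM
terms the factored form would presumably be
qomf.and(getSubtree, qomf.or(qomf.not(exists), qomf.and(exists, compareDate)));
whether that rewrite avoids the TooManyClauses error has not been tested here.

```java
final class ConstraintShape {
    // s: node is in the subtree
    // e: the last-scanned date property exists
    // c: the last-scanned date is before the cutoff

    // Shape of the combined query: or(needInitalScan, needRescan)
    static boolean original(boolean s, boolean e, boolean c) {
        boolean needInitialScan = s && !e;
        boolean needRescan = (s && e) && c;
        return needInitialScan || needRescan;
    }

    // Same condition with the subtree test applied only once
    static boolean factored(boolean s, boolean e, boolean c) {
        return s && (!e || (e && c));
    }

    public static void main(String[] args) {
        boolean[] values = {false, true};
        for (boolean s : values) {
            for (boolean e : values) {
                for (boolean c : values) {
                    if (original(s, e, c) != factored(s, e, c)) {
                        throw new AssertionError("mismatch for " + s + ", " + e + ", " + c);
                    }
                }
            }
        }
        System.out.println("equivalent for all 8 combinations");
    }
}
```

This only shows that the two forms select the same nodes; it says nothing about
how Jackrabbit translates either form into Lucene clauses.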

The query fails for a large subtree with a Lucene error:
        org.apache.lucene.search.BooleanQuery$TooManyClauses: maxClauseCount is set to 1024

If I split it into two separate queries:
   QueryObjectModel qomInitialScan =
           qomf.createQuery(nodeTypeSelector, needInitalScan, null, null);
   QueryObjectModel qomRescan =
           qomf.createQuery(nodeTypeSelector, needRescan, null, null);

it works fine in my test environment. I could live with this; combining the two
result lists takes a little effort, but that's OK. Nevertheless, it looks
suboptimal, and I would like to ask whether there is a way to improve it.
I might also run into trouble when running the queries on a larger system, so
I'd like to know the reason for the error; I have no clue. I found a nice
explanation here: "http://dalelane.co.uk/blog/?p=2081", and this page:
"http://blogs.adobe.com/dmcmahon/2012/03/14/crx2-2-booleanquerytoomanyclauses-maxclausecount-is-set-to-1024-running-sql2-query/".
The issue described on the latter page should have been solved by
"http://mail-archives.apache.org/mod_mbox/jackrabbit-dev/201203.mbox/%3C1346600686.11884.1330696077248.JavaMail.tomcat@hel.zones.apache.org%3E"
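
For the split-query workaround, combining the two result lists could be
sketched roughly as follows. This is a minimal sketch that assumes the node
paths of each query result have already been collected into lists of strings
(for example via Node.getPath() while iterating); the class name, method name
and sample paths are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class ScanCandidates {
    // Returns the union of both path lists in first-seen order.
    // The two queries should be disjoint (a node either has the property or
    // it does not), so the set only acts as a guard against duplicates.
    static List<String> merge(List<String> needInitialScan, List<String> needRescan) {
        Set<String> merged = new LinkedHashSet<>(needInitialScan);
        merged.addAll(needRescan);
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        // Hypothetical paths standing in for the two query results
        List<String> initial = Arrays.asList("/files/a.txt", "/files/b.txt");
        List<String> rescan = Arrays.asList("/files/c.txt", "/files/a.txt");
        System.out.println(merge(initial, rescan));
    }
}
```

Working with paths rather than Node objects keeps the merge independent of the
session; the nodes can be looked up again by path afterwards if needed.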

So any comment is appreciated.
Ulrich