Posted to issues@calcite.apache.org by "Gian Merlino (JIRA)" <ji...@apache.org> on 2019/07/07 15:50:00 UTC

[jira] [Created] (CALCITE-3178) RexSimplify.simplifyOrTerms slow with large OR filters

Gian Merlino created CALCITE-3178:
-------------------------------------

             Summary: RexSimplify.simplifyOrTerms slow with large OR filters
                 Key: CALCITE-3178
                 URL: https://issues.apache.org/jira/browse/CALCITE-3178
             Project: Calcite
          Issue Type: Improvement
    Affects Versions: 1.19.0
            Reporter: Gian Merlino


RexSimplify.simplifyOrTerms is slow when an OR has a large number of subpredicates. Once for each subpredicate within the OR, it calls {{simplify.predicates.union}} and adds the freshly-unioned result to {{simplify.predicates}}. The most time-consuming part of this seems to be {{RexUtil.predicateConstants}}, which re-examines each previously-added entry every time, so the total work is O(N^2) in the number of subpredicates within the OR.
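
As a rough illustration of the scaling, here is a toy model in plain Java. It is not Calcite's code: the class and method names are hypothetical, and an "examination" is only a counter increment standing in for whatever work {{RexUtil.predicateConstants}} does per entry. It compares rescanning every accumulated entry after each addition with touching only the new entry.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; names are hypothetical and nothing here calls Calcite.
class OrRescanCost {
  /** After adding each term, re-examine every term accumulated so far. */
  static long rescanEverything(int termCount) {
    List<Integer> accumulated = new ArrayList<>();
    long examined = 0;
    for (int i = 0; i < termCount; i++) {
      accumulated.add(i);
      for (int ignored : accumulated) {
        examined++; // stands in for the per-entry work of a full rescan
      }
    }
    return examined; // termCount * (termCount + 1) / 2
  }

  /** Examine only the newly added term at each step. */
  static long examineOnlyNew(int termCount) {
    long examined = 0;
    for (int i = 0; i < termCount; i++) {
      examined++;
    }
    return examined; // termCount
  }

  public static void main(String[] args) {
    int termCount = 14_000;
    System.out.println("full rescan: " + rescanEverything(termCount));
    System.out.println("new only:    " + examineOnlyNew(termCount));
  }
}
```

For 14,000 terms the full-rescan model performs about 98 million examinations versus 14,000 for the incremental one, which is the kind of gap that would explain planning time growing so sharply with the size of the OR.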

I discovered this when someone tried to run a query with a 14,000-element IN filter, and planning took about 45 seconds. In Druid, we always convert INs to ORs, never allowing Calcite's subquery conversion to happen. This is because as far as native Druid queries are concerned, a huge OR is going to be more efficient than a join against a constant subquery.

I'm not sure what the best way is to fix this. The only thing that comes to mind immediately is the "quick fix" of limiting how many OR elements RexSimplify might attempt to simplify at once. The same limit might be needed for AND as well, but I haven't looked into that one.
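
A minimal sketch of what such a limit could look like, written as standalone Java rather than as a patch to RexSimplify. Everything here is an assumption for illustration: the class name, the {{maxTerms}} parameter, and the pluggable simplifier are hypothetical, and the demo simplifier just removes duplicate terms.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical guard; not part of Calcite's API.
class BoundedOrSimplifier {
  /** Stand-in simplifier for the demo: drops duplicate terms, keeps order. */
  static final UnaryOperator<List<String>> DEDUPE =
      terms -> new ArrayList<>(new LinkedHashSet<>(terms));

  /**
   * Applies the simplifier only when the OR has at most maxTerms terms.
   * Larger ORs are returned unchanged, trading a possibly less simplified
   * expression for bounded planning time.
   */
  static <T> List<T> simplifyWithLimit(
      List<T> terms, int maxTerms, UnaryOperator<List<T>> simplifier) {
    if (terms.size() > maxTerms) {
      return new ArrayList<>(terms); // too large: skip simplification
    }
    return simplifier.apply(terms);
  }

  public static void main(String[] args) {
    List<String> terms = new ArrayList<>();
    terms.add("x = 1");
    terms.add("x = 1");
    terms.add("x = 2");
    System.out.println(simplifyWithLimit(terms, 10, DEDUPE)); // simplified
    System.out.println(simplifyWithLimit(terms, 2, DEDUPE));  // left as-is
  }
}
```

The downside of this approach is that queries just over the threshold lose simplification entirely, so the threshold would probably need to be configurable.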


