Posted to dev@calcite.apache.org by Julian Hyde <jh...@apache.org> on 2017/07/01 01:24:44 UTC

Re: Wrapping an input JdbcRel forces the planner into an infinite loop

It's definitely a bug in your code that you don't remove the limit
that you intended to.
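
A minimal sketch of why leaving the limit in place matters, using toy
classes rather than Calcite's own (SortNode, copyKeepingFetch and
countFirings are hypothetical names made up for illustration). It assumes
the rule fires whenever a fetch is present, which is a simplification of
the matches() method quoted below. In Calcite itself the fix would
presumably be the longer Sort.copy overload that takes offset and fetch
explicitly, with null passed for fetch.

```java
/** Toy model of the rule loop; these are NOT Calcite classes. */
public class LimitPushdownSketch {

    /** Stand-in for a Sort node: an input name plus an optional fetch. */
    public static final class SortNode {
        final String input;
        final Integer fetch; // null means "no LIMIT"

        public SortNode(String input, Integer fetch) {
            this.input = input;
            this.fetch = fetch;
        }

        /** Like the short copy: swaps the input but keeps the old fetch. */
        public SortNode copyKeepingFetch(String newInput) {
            return new SortNode(newInput, fetch);
        }

        /** Like the long copy: the caller states the new fetch explicitly. */
        public SortNode copy(String newInput, Integer newFetch) {
            return new SortNode(newInput, newFetch);
        }
    }

    /** Simplified match condition: fire while a fetch is present. */
    public static boolean matches(SortNode sort) {
        return sort.fetch != null;
    }

    /** Counts rule firings, giving up once maxFirings is reached. */
    public static int countFirings(SortNode start, boolean clearFetch, int maxFirings) {
        SortNode current = start;
        int firings = 0;
        while (matches(current) && firings < maxFirings) {
            String pushed = "ScanWithLimit(" + current.fetch + ")";
            current = clearFetch
                ? current.copy(pushed, null)        // limit removed from the sort
                : current.copyKeepingFetch(pushed); // limit still on the sort
            firings++;
        }
        return firings;
    }

    public static void main(String[] args) {
        SortNode sort = new SortNode("JdbcTableScan", 3);
        System.out.println(countFirings(sort, false, 1000)); // runs into the cap
        System.out.println(countFirings(sort, true, 1000));  // stops after one firing
    }
}
```

With the fetch retained, only the artificial cap stops the loop. With the
fetch cleared, the rule has nothing left to match after its first firing.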

Is there a bug in the planner too? Maybe. Maybe it should recognize
that what you have created is IDENTICAL to what it had before, and
therefore deduce that the rule had no effect, and not queue up another
firing of the rule. That would be impressive, but I thought the
Volcano planner already did that. But maybe if you didn't implement
the necessary methods in GelbanaScanWithLimit to determine identity
(probably need to override explainTerms, or extend TableScan, or
something) then every GelbanaScanWithLimit will look distinct from
every other one, and the planner is doing as much as it can. Which
means bug #2 is in your code too.
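
To illustrate the identity point, here is a toy sketch (again not Calcite
code; DescribedScan, OpaqueScan and Registry are hypothetical names). It
assumes the planner deduplicates nodes by a digest string, and that a
node's digest is derived from the terms it describes itself with, which
is my understanding of what explainTerms feeds into.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of digest-based deduplication; these are NOT Calcite classes. */
public class DigestIdentitySketch {

    /** A node that describes itself fully: digest built from input and limit. */
    public static final class DescribedScan {
        private final String input;
        private final int limit;

        public DescribedScan(String input, int limit) {
            this.input = input;
            this.limit = limit;
        }

        public String digest() {
            return "ScanWithLimit(input=" + input + ", limit=" + limit + ")";
        }
    }

    /** A node that never describes itself, so every instance has a unique digest. */
    public static final class OpaqueScan {
        private static int nextId = 0;
        private final int id = nextId++;

        public String digest() {
            return "ScanWithLimit#" + id;
        }
    }

    /** Minimal registry: reports whether a digest has been seen before. */
    public static final class Registry {
        private final Map<String, Object> byDigest = new HashMap<>();

        /** Returns true only if no node with this digest was registered yet. */
        public boolean registerIsNew(String digest, Object node) {
            return byDigest.putIfAbsent(digest, node) == null;
        }

        public int size() {
            return byDigest.size();
        }
    }

    public static void main(String[] args) {
        Registry described = new Registry();
        DescribedScan a = new DescribedScan("CTS", 3);
        DescribedScan b = new DescribedScan("CTS", 3);
        System.out.println(described.registerIsNew(a.digest(), a)); // true
        System.out.println(described.registerIsNew(b.digest(), b)); // false: recognized as identical

        Registry opaque = new Registry();
        OpaqueScan c = new OpaqueScan();
        OpaqueScan d = new OpaqueScan();
        System.out.println(opaque.registerIsNew(c.digest(), c)); // true
        System.out.println(opaque.registerIsNew(d.digest(), d)); // true: looks like a brand-new node
    }
}
```

In the second case the registry has no way to tell that the two nodes are
equivalent, so each one looks like new work for the planner.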

Julian


On Fri, Jun 30, 2017 at 2:02 PM, Muhammad Gelbana <m....@gmail.com> wrote:
> Thanks Atri, but I guess it isn't a bug. The code doesn't get stuck anywhere;
> the loop just keeps going without reaching a condition that would break it.
>
> Thanks Julian, your solution solved my problem. But would you please
> explain to me why this is a mistake?
>
> Assuming it's because I called transform for the sort node without changing
> any of its semantics, shouldn't changing the child nodes be considered a
> change?
>
> *---------------------*
> *Muhammad Gelbana*
>
> On Fri, Jun 30, 2017 at 6:36 PM, Julian Hyde <jh...@apache.org> wrote:
>
>> The 3-argument Sort.copy that you call retains the old offset and fetch
>> fields. So you are producing another LogicalSort that has a not-null fetch.
>>
>>
>> > On Jun 30, 2017, at 9:33 AM, Atri Sharma <at...@gmail.com> wrote:
>> >
>> > Did you try attaching a debugger to see where the code is hanging?
>> >
>> > My guess is that the code flow is hanging in applyRules in HepPlanner.
>> > The iterator is not moving over the plan hence is stuck in an infinite
>> > loop.
>> >
>> > This is a known bug in HepPlanner. I will create a JIRA case for this.
>> >
>> > Please add your case there.
>> >
>> > Regards,
>> >
>> > Atri
>> >
>> > On Fri, Jun 30, 2017 at 9:06 PM, Muhammad Gelbana <m....@gmail.com> wrote:
>> >> Well, it's not exactly an infinite loop, let me explain.
>> >>
>> >> First of all, this is the loop (I'm using Drill v1.9, which uses Calcite
>> >> v1.4):
>> >> https://github.com/apache/calcite/blob/branch-1.4/core/src/main/java/org/apache/calcite/plan/hep/HepPlanner.java#L389
>> >>
>> >> The nMatches variable keeps increasing without breaking the loop.
>> >> Theoretically it should eventually break the loop, but I can't accept this
>> >> as a solution because it would take minutes just to plan a query! There
>> >> must be a more efficient way to break the loop.
>> >>
>> >> I *assume* that since Calcite v1.4 doesn't support unparsing OFFSET and FETCH
>> >> clauses, Drill tries to apply this pagination using a *DrillLimitRel* node.
>> >> But this implies pulling the whole set of data from the JDBC source, then
>> >> filtering it within Drill, which is a huge waste for huge datasets if you
>> >> ask me. Please correct me if I'm wrong.
>> >>
>> >> *My query:* SELECT CT_ID FROM gelbana.SLS.CTS LIMIT 3
>> >>
>> >> Which is planned as a
>> >> *LogicalSort* -> *LogicalProject* -> *JdbcTableScan*
>> >>
>> >> *LogicalSort* is then converted into a Drill specific node which is
>> >> *DrillLimitRel*.
>> >>
>> >> *LogicalSort* is the node that holds the fetch, offset and sorting
>> >> information. I'm trying to push down the fetch value (only if it's a literal
>> >> and there is no offset specified) to a custom scan node. Then I can pass
>> >> the fetch value to the JDBC statement
>> >> <https://docs.oracle.com/javase/8/docs/api/java/sql/Statement.html#setMaxRows-int->
>> >> and achieve the limit I need.
>> >>
>> >> I'm trying to do so by wrapping the *JdbcTableScan* and another custom JDBC
>> >> scan node (i.e. *GelbanaJdbcJoin*), within a new kind of JdbcRel
>> >> implementation. This implementation exposes the same methods a *JdbcRel*
>> >> would, and the implementation of most of these methods just calls the
>> >> equivalent method of the wrapped JdbcRel node. The code is below.
>> >>
>> >> What happens is that the previously mentioned loop keeps going on and on
>> >> without breaking. I would appreciate it if someone could tell me where I
>> >> messed up.
>> >>
>> >> *My rule*
>> >> public class GelbanaLimitRule extends RelOptRule {
>> >>
>> >>    public GelbanaLimitRule() {
>> >>        super(operand(LogicalSort.class, operand(LogicalProject.class,
>> >>            operand(JdbcRel.class, any()))), "GelbanaPushdownLimit");
>> >>    }
>> >>
>> >>    @Override
>> >>    public boolean matches(RelOptRuleCall call) {
>> >>        LogicalSort limit = (LogicalSort) call.rels[0];
>> >>        RelNode input = call.rels[2];
>> >>
>> >>        boolean jdbcInputCheck = input.getClass() == JdbcTableScan.class
>> >>            || input.getClass() == GelbanaJdbcJoin.class;
>> >>        return jdbcInputCheck && limit.fetch != null
>> >>            && limit.fetch.getClass() == RexLiteral.class && limit.offset == null;
>> >>    }
>> >>
>> >>    @Override
>> >>    public void onMatch(RelOptRuleCall call) {
>> >>        LogicalSort limit = (LogicalSort) call.rels[0];
>> >>        LogicalProject project = (LogicalProject) call.rels[1];
>> >>        JdbcRel input = (JdbcRel) call.rels[2];
>> >>
>> >>        BigDecimal limitValue =
>> >>            (BigDecimal) ((RexLiteral) limit.fetch).getValue();
>> >>
>> >>        GelbanaScanWithLimit newInput =
>> >>            new GelbanaScanWithLimit(input, limitValue.intValue());
>> >>        LogicalProject newProject = project.copy(project.getTraitSet(),
>> >>            newInput, project.getProjects(), project.getRowType());
>> >>        Sort newLimit = limit.copy(limit.getTraitSet(), newProject,
>> >>            limit.getCollation());
>> >>
>> >>        call.transformTo(newLimit);
>> >>    }
>> >> }
>> >>
>> >> This is a portion of the code of the *GelbanaScanWithLimit* node.
>> >>
>> >> public class GelbanaScanWithLimit implements JdbcRel {
>> >>    private JdbcRel input;
>> >>    private Integer limit;
>> >>
>> >>    public GelbanaScanWithLimit(JdbcRel input, Integer limit) {
>> >>        this.input = input;
>> >>        this.limit = limit;
>> >>    }
>> >>
>> >>    public boolean hasLimit() {
>> >>        return this.limit != null;
>> >>    }
>> >>
>> >>    public int getLimit() {
>> >>        assert hasLimit();
>> >>        return this.limit;
>> >>    }
>> >>
>> >>    public RelOptCost computeSelfCost(RelOptPlanner planner) {
>> >>        // It's the best case for me to push down JdbcTableScans and joins to
>> >>        // my datasource, with limits of course.
>> >>        return planner.getCostFactory().makeZeroCost();
>> >>    }
>> >>
>> >>    // More methods exist to expose the wrapped JdbcRel node. The methods
>> >>    // above override the equivalent ones of the wrapped JdbcRel node.
>> >> }
>> >>
>> >> Thanks !
>> >>
>> >> *---------------------*
>> >> *Muhammad Gelbana*
>> >
>> >
>> >
>> > --
>> > Regards,
>> >
>> > Atri
>> > l'apprenant
>>
>>