You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Stamatis Zampetakis <za...@gmail.com> on 2020/06/01 21:30:44 UTC

Re: Regarding CALCITE-3997

Hello,

Since I was among the people who raised some concerns about this during the
vote I would like to mention that it was mostly about the fact it came very
late in the release cycle.
It is better to avoid breaking changes in the RC phase since it may be
difficult for people to react or test it.

I agree with Haisheng that Calcite code should follow the best practice so
that other projects can draw inspiration from it. Nevertheless, I would
like to mention again that Enumerable is not a convention that is used only
for demonstration purposes but it is also used in production systems so we
have to be careful about backward compatibility. Still nobody complaint so
far so it means that we are doing pretty good on this aspect :)

The release brought some very cool features and I fully support the
direction on which we are going. I don't think there is a need to open a
new thread here since we can discuss under the respective jira(s) if
necessary.

Best,
Stamatis






On Wed, May 27, 2020 at 5:07 AM Haisheng Yuan <hy...@apache.org> wrote:

> Hi all,
>
> During vote of 1.23.0 rc1, there are some discussions about the solution
> of CALCITE-3997.
>
> Some may think it is a matter of best practice, but in many cases it is a
> matter of right or wrong.
>
> Here is the rationale.
>
> The original query in CALCITE-3997 is
>
> SELECT t1.k1, t2.k1 FROM t1, t2
> ON t1.k1=t2.k1
> WHERE t1.n1 + 1 = t2.n3
>
> After FilterIntoJoin and MergeJoinRule, planner generates an alternative:
> MergeJoin on t1.k1=t2.k1 and t1.n1 + 1 = t2.n3
>
> The mergejoin's collation is [k1].
>
> This mergejoin matches JoinPushExpressionsRule, which pushes expression in
> join condition, and generates a Project in child.
> Project t1.k1, t2.k1
>   -> MergeJoin on t1.k1=t2.k1 and t1.col = t2.n3
>        -> Project k1, n1+1 as col
>             -> TableScan t1
>        -> TableScan t2
>
> The new mergejoin now has 2 equi-conditions, but the collation is the same
> with old mergejoin, still [k1], which is wrong.
> It should be updated to [k1,col] and [k1, n3]. Hence error.
>
> The similar issue already happened before, see CALCITE-3353. And it is
> very easy to trigger same error using different query even without
> FilterIntoJoin. If we change WHERE to AND in the query,
> SELECT t1.k1, t2.k1 FROM t1, t2
> ON t1.k1=t2.k1
> AND t1.n1 + 1 = t2.n3
>
> or with the following query with join condition const reductions on, we
> can still see the issue.
> SELECT t1.k1, t2.k1 FROM t1, t2
> ON t1.k1=t2.k1
> AND (t1.n1 + 1 = t2.n3 or t1.n1 + 1 = t2.n3)
>
> Can we use the similar approach that in CALCITE-3353 to fix the issue?
> We can, but it just hide the issue. Doing that way can generate mergejoin
> with correct traitset, but the rule is still generating an invalid plan.
> Because the MergeJoin's new input is a LogicalProject, which is generated
> by RelBuilder.
>
> In Calcite, RelNode is a very special class. Unlike operators in Cascades
> or Volcano paper, where logical/physical operator and plan expression are
> separated, when constructing logical/physical expression tree, each
> operator doesn't require anything trait from its child. But RelNode in
> Calcite not only represents operator, but also plan expression, and can
> even represent a MEMO group with required traitset, which is RelSubset.
> When constructing a RelNode XXX, we need to specify the input RelNodes, the
> input nodes' traitsets will become node XXX's requirement on those inputs,
> even we are not aware that we are requesting it, which is extremely
> error-prone.
>
> In above case, the new MergeJoin's left input is a LogicalProject, which
> means the physical MergeJoin is requesting its left input to be logical
> operator with NONE convention, which has {inf} large cost. That means the
> newly generated MergeJoin is never implementable.
>
> What if we have a physical RelBuilder to generate the Project in the rule?
> The consequence is worse.
> With logical RelBuilder, it just generates an invalid plan, we are still
> safe because it won't be selected as the best plan. If we get a physical
> RelBuilder, which generates a physical plan that is implementable, but this
> is a wrong plan. Because in RelOptUtil.pushDownJoinConditions, when
> creating the MergeJoin, it just specifies its traitset and inputs, but it
> doesn't request the correct collation on the MergeJoin's inputs. The
> Project's collation is EMPTY, that means the MergeJoin doesn't require any
> collation on the Project. Apparently this will be a wrong plan.
>
> This kind of issue doesn't limit to JoinPushExpressionsRule, any rule that
> has Join as operand may have the same issue. In fact, all rules that has
> Aggregate as operand may suffer the same issue. Because in 1.23.0, we add a
> new physical operator EnumerableSortedAggregate, which requires its input
> to be sorted by group keys. AggregateProjectMergeRule can generate wrong
> plan if we apply it on EnumerableSortedAggregate. In addition, we are now
> discussing about adding a new EnumerableMergeUnion, to require the UNION's
> input sorted. Maybe in next few releases, we will have sort based
> EnumerableWindow operator. Finally we may find that only ProjectMerge,
> FilterMerge, CalcMerge can work correctly with physical operators, because
> these operators doesn't require any trait from input, so it just happens to
> work. The change of TransformationRule looks risky at first glance,
> however, it makes those rules less risky.
>
> Some thinks we should provide different rule instances / types to match
> logical/physical operators. But IMO, these built-in transformation rules
> shouldn't match physical operators at all in VolcanoPlanner, because most,
> if not all, of them can't worked correctly with physical operators. Even
> for Project/Filter/Calc merge rules, all the merge work can be done through
> logical operators, applying on physical operators will duplicate the
> process again.
>
> Even if we just view this as a best practice, and we (as Calcite project
> members) don't follow the best practice on Calcite's built-in
> EnumerableConvention, how can we tell downstream projects to follow the
> best practices? Is there any document stating the best practices? Most of
> them just follow what Calcite does. Just like children follow the example
> of their parents. If we checkout the code of Apache Druid, Apache Beam, we
> will find that they are doing the similar way as Calcite's Enumerable
> Convention. Beam even adds JoinCommuteRule, JoinAssociateRule,
> JoinPushThroughJoinRule.RIGHT, JoinPushThroughJoinRule.LEFT all together in
> its ruleset. I guess join performance is not a problem for BEAM.
>
> Finally, I don't think CalcMergeRule should be an exception, nothing is
> impossible without physical CalcMergeRule.
>
> Thanks,
> Haisheng
>
>
>

Re: Regarding CALCITE-3997

Posted by Ruben Q L <ru...@gmail.com>.
Hello Haisheng,

thanks for bringing up this discussion. I was also part of the people who
expressed concerns about CALCITE-3997 due to a backwards compatibility
issue in a very specific scenario in a downstream project. This issue was
satisfactory handled with a temporary workaround for the RC, and now this
workaround needs to be removed for the next version (CALCITE-4032).
I agree with your analysis, and I think my specific scenario should not
prevent this new general approach to be fully implemented (including
CalcMergerRule). Thus, I am ok with CALCITE-4032, I will implement my own
rule on my side to solve my specific problem, under no circumstances this
should be the responsibility of the Calcite library.

Best regards,
Ruben




Le lun. 1 juin 2020 à 23:31, Stamatis Zampetakis <za...@gmail.com> a
écrit :

> Hello,
>
> Since I was among the people who raised some concerns about this during the
> vote I would like to mention that it was mostly about the fact it came very
> late in the release cycle.
> It is better to avoid breaking changes in the RC phase since it may be
> difficult for people to react or test it.
>
> I agree with Haisheng that Calcite code should follow the best practice so
> that other projects can draw inspiration from it. Nevertheless, I would
> like to mention again that Enumerable is not a convention that is used only
> for demonstration purposes but it is also used in production systems so we
> have to be careful about backward compatibility. Still nobody complaint so
> far so it means that we are doing pretty good on this aspect :)
>
> The release brought some very cool features and I fully support the
> direction on which we are going. I don't think there is a need to open a
> new thread here since we can discuss under the respective jira(s) if
> necessary.
>
> Best,
> Stamatis
>
>
>
>
>
>
> On Wed, May 27, 2020 at 5:07 AM Haisheng Yuan <hy...@apache.org> wrote:
>
> > Hi all,
> >
> > During vote of 1.23.0 rc1, there are some discussions about the solution
> > of CALCITE-3997.
> >
> > Some may think it is a matter of best practice, but in many cases it is a
> > matter of right or wrong.
> >
> > Here is the rationale.
> >
> > The original query in CALCITE-3997 is
> >
> > SELECT t1.k1, t2.k1 FROM t1, t2
> > ON t1.k1=t2.k1
> > WHERE t1.n1 + 1 = t2.n3
> >
> > After FilterIntoJoin and MergeJoinRule, planner generates an alternative:
> > MergeJoin on t1.k1=t2.k1 and t1.n1 + 1 = t2.n3
> >
> > The mergejoin's collation is [k1].
> >
> > This mergejoin matches JoinPushExpressionsRule, which pushes expression
> in
> > join condition, and generates a Project in child.
> > Project t1.k1, t2.k1
> >   -> MergeJoin on t1.k1=t2.k1 and t1.col = t2.n3
> >        -> Project k1, n1+1 as col
> >             -> TableScan t1
> >        -> TableScan t2
> >
> > The new mergejoin now has 2 equi-conditions, but the collation is the
> same
> > with old mergejoin, still [k1], which is wrong.
> > It should be updated to [k1,col] and [k1, n3]. Hence error.
> >
> > The similar issue already happened before, see CALCITE-3353. And it is
> > very easy to trigger same error using different query even without
> > FilterIntoJoin. If we change WHERE to AND in the query,
> > SELECT t1.k1, t2.k1 FROM t1, t2
> > ON t1.k1=t2.k1
> > AND t1.n1 + 1 = t2.n3
> >
> > or with the following query with join condition const reductions on, we
> > can still see the issue.
> > SELECT t1.k1, t2.k1 FROM t1, t2
> > ON t1.k1=t2.k1
> > AND (t1.n1 + 1 = t2.n3 or t1.n1 + 1 = t2.n3)
> >
> > Can we use the similar approach that in CALCITE-3353 to fix the issue?
> > We can, but it just hide the issue. Doing that way can generate mergejoin
> > with correct traitset, but the rule is still generating an invalid plan.
> > Because the MergeJoin's new input is a LogicalProject, which is generated
> > by RelBuilder.
> >
> > In Calcite, RelNode is a very special class. Unlike operators in Cascades
> > or Volcano paper, where logical/physical operator and plan expression are
> > separated, when constructing logical/physical expression tree, each
> > operator doesn't require anything trait from its child. But RelNode in
> > Calcite not only represents operator, but also plan expression, and can
> > even represent a MEMO group with required traitset, which is RelSubset.
> > When constructing a RelNode XXX, we need to specify the input RelNodes,
> the
> > input nodes' traitsets will become node XXX's requirement on those
> inputs,
> > even we are not aware that we are requesting it, which is extremely
> > error-prone.
> >
> > In above case, the new MergeJoin's left input is a LogicalProject, which
> > means the physical MergeJoin is requesting its left input to be logical
> > operator with NONE convention, which has {inf} large cost. That means the
> > newly generated MergeJoin is never implementable.
> >
> > What if we have a physical RelBuilder to generate the Project in the
> rule?
> > The consequence is worse.
> > With logical RelBuilder, it just generates an invalid plan, we are still
> > safe because it won't be selected as the best plan. If we get a physical
> > RelBuilder, which generates a physical plan that is implementable, but
> this
> > is a wrong plan. Because in RelOptUtil.pushDownJoinConditions, when
> > creating the MergeJoin, it just specifies its traitset and inputs, but it
> > doesn't request the correct collation on the MergeJoin's inputs. The
> > Project's collation is EMPTY, that means the MergeJoin doesn't require
> any
> > collation on the Project. Apparently this will be a wrong plan.
> >
> > This kind of issue doesn't limit to JoinPushExpressionsRule, any rule
> that
> > has Join as operand may have the same issue. In fact, all rules that has
> > Aggregate as operand may suffer the same issue. Because in 1.23.0, we
> add a
> > new physical operator EnumerableSortedAggregate, which requires its input
> > to be sorted by group keys. AggregateProjectMergeRule can generate wrong
> > plan if we apply it on EnumerableSortedAggregate. In addition, we are now
> > discussing about adding a new EnumerableMergeUnion, to require the
> UNION's
> > input sorted. Maybe in next few releases, we will have sort based
> > EnumerableWindow operator. Finally we may find that only ProjectMerge,
> > FilterMerge, CalcMerge can work correctly with physical operators,
> because
> > these operators doesn't require any trait from input, so it just happens
> to
> > work. The change of TransformationRule looks risky at first glance,
> > however, it makes those rules less risky.
> >
> > Some thinks we should provide different rule instances / types to match
> > logical/physical operators. But IMO, these built-in transformation rules
> > shouldn't match physical operators at all in VolcanoPlanner, because
> most,
> > if not all, of them can't worked correctly with physical operators. Even
> > for Project/Filter/Calc merge rules, all the merge work can be done
> through
> > logical operators, applying on physical operators will duplicate the
> > process again.
> >
> > Even if we just view this as a best practice, and we (as Calcite project
> > members) don't follow the best practice on Calcite's built-in
> > EnumerableConvention, how can we tell downstream projects to follow the
> > best practices? Is there any document stating the best practices? Most of
> > them just follow what Calcite does. Just like children follow the example
> > of their parents. If we checkout the code of Apache Druid, Apache Beam,
> we
> > will find that they are doing the similar way as Calcite's Enumerable
> > Convention. Beam even adds JoinCommuteRule, JoinAssociateRule,
> > JoinPushThroughJoinRule.RIGHT, JoinPushThroughJoinRule.LEFT all together
> in
> > its ruleset. I guess join performance is not a problem for BEAM.
> >
> > Finally, I don't think CalcMergeRule should be an exception, nothing is
> > impossible without physical CalcMergeRule.
> >
> > Thanks,
> > Haisheng
> >
> >
> >
>