You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Jacques Nadeau <ja...@apache.org> on 2015/03/01 05:49:47 UTC

Trait Propagation, requesting traits using RelOptRule.convert() & CALCITE-557

Hey Guys,

I was tired of a hack we use in Drill to make trait propagation work as it
blows out the plan space.  As such, I was building a simple example to
discuss here.  Of course, I ran across additional difficulties that I would
like to discuss.

I happen to build my initial example on an old copy of Calcite and got to
the point where I was illustrating the fact that we weren't doing trait
propagation.  However, rather than post that as an example, I wanted to
move to the latest Calcite so we could discuss in that context.

After updating all the code I've found a couple places where things have
changed and I'm not even sure how to implement my original pattern.  The
biggest issue that changed things looks like CALCITE-557 [1] since it
doesn't guarantee a converter is created to ensure a non-root trait
conversion at planning time.  I imagine I'm missing something in the new
paradigm but I'm not sure how to resolve given these changes.

The simple example is:

We have a logical plan that is Agg > Project > EnumTableAccess.

We then convert to a physical plan PhysAgg > PhysSort > PhysProj > PhysScan

The PhysAgg requires a collation trait since we'll imagine it is a
streaming aggregate.  To illustrate the trait propagation problem, I've
defined PhysScan to expose the required collation already.  The goal is to
get the PhysProj to propagate the collation trait so the PhysAgg doesn't
require insertion of a PhysSort.

On the old Optiq branch (~4 months back), I get the following plans using
[2]:
--LOGICAL PLAN--
AggregateRel(group=[{0}], cnt=[COUNT($1)])
  ProjectRel(s=[$0])
    EnumerableTableAccessRel(table=[[t1]])

--PHYSICAL PLAN--
PhysAgg(group=[{0}], cnt=[COUNT($1)])
  PhysSort(sort0=[$0], dir0=[ASC-nulls-first])
    PhysProj(s=[$0])
      PhysTable                      // dir0=[ASC-nulls-first]


On the new Branch using [3], I can't actually get this to plan because of
the removal in [1].

Is there a test case for trait propagation or simply a RelOptRule imposing
an additional trait that works after 557 was merged?

Thanks,
Jacques



[1]
https://github.com/apache/incubator-calcite/commit/506c1ab46b6b3e0279c799acc4e4412d542e8b6d#diff-8
[2] https://gist.github.com/jacques-n/2f004311bd0ead714ca4
[3] https://gist.github.com/jacques-n/01ed0203039374688710

Re: Trait Propagation, requesting traits using RelOptRule.convert() & CALCITE-557

Posted by Jacques Nadeau <ja...@apache.org>.
Few stream of consciouness thoughts (and note that i don't want to shutdown
any brainstorming, just some of my thinking):

@Jinfeng: We can't pick the best rel since we don't know what the best rel
is in the broader context (only in the context of this subset).

@Julian:
 - We've been very hesitant to take a add the kitchen sink and remove it
later strategy because that seems to also have the potential to grossly
expand the plan space.
- We don't actually try to address distribution in the logical planning
phase as it made the planning space too big.  As such, while collation
could possibly be known and propagated at logical planning time,
distribution can only be propagated at physical planning time.  This means
we're going to expect subsets without full traits rather than individual
traits as we'll also be requiring a physical convention.  (



On Wed, Mar 4, 2015 at 4:37 PM, Jinfeng Ni <ji...@gmail.com> wrote:

> I made some change to the rules defined in the testcase, and it seems to
> get the plan w/o thought in the testcase of withoutHack(). Certainly, the
> testcase withHack() will not work now, since I change the PhysProjRule.
> See [1].
>
> Here is the output of withoutHack() testcase:
>
> LOGICAL PLAN
> PhysAgg(group=[{0}], cnt=[COUNT($1)]): rowcount = 1.0, cumulative cost =
> {3.0 rows, 3.0 cpu, 3.0 io}, id = 80
>   PhysProj(s=[$0], i=[$1]): rowcount = 1.0, cumulative cost = {2.0 rows,
> 2.0 cpu, 2.0 io}, id = 79
>     PhysTable: rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 1.0
> io}, id = 32
>
> Here is main idea of the change.
>
> 1. Change PhysProjRule to a subclass of ConverterRule, and make it simply
> convert its child and its own from Logical ('NONE') to Physical convention.
>
> 2. Add a new PhysProjTraitPropUpRule.   This rule will match any *physical
> Project*, and will try to propagate the traits from the project's input's
> *best* RelNode.
>
> Seems to me both the new PhysProjRule and PhysProjTraitPropUpRule probably
> will not blow out the search space, since each rule fire will produce at
> most one new RelNode.
>
> The reason the original rule does not work, in my view, is.
>
>      The original PhysProjRule is fired, it *only matchs with
> LogicalProject*. However, the LogicalProject's child (SCAN) is also
> Logical.  Only when the SCAN is converted into physical, we'll add the
> RelCollation to the Physical Scan. Unfortunately, Physical Scan is not a
> child of LogicalProject. So, when a new Physical Scan node with the desired
> RelCollation is created, the original PhysProjRule will not be fired at
> all. Hence, no RelCollation trait propagation bottom-up.
>
> Does the above change make sense? Thanks!
>
> [1].
> https://github.com/jinfengni/incubator-optiq/tree/CALCITE-606-ALT
>
>
>
> On Wed, Mar 4, 2015 at 2:45 PM, Julian Hyde <ju...@gmail.com> wrote:
>
> > I’m working on this. Here are my two ideas:
> >
> > 1. Try adding SortRemoveRule.INSTANCE
> >
> > 2. Try calling the new “create” methods for each RelNode sub-class. For
> > example, LogicalProject.create:
> >
> > public static LogicalProject create(final RelNode input,
> >     final List<? extends RexNode> projects, RelDataType rowType) {
> >   final RelOptCluster cluster = input.getCluster();
> >   final RelTraitSet traitSet =
> >       cluster.traitSet().replace(Convention.NONE)
> >           .replaceIfs(
> >               RelCollationTraitDef.INSTANCE,
> >               new Supplier<List<RelCollation>>() {
> >                 public List<RelCollation> get() {
> >                   return RelMdCollation.project(input, projects);
> >                 }
> >               });
> >   return new LogicalProject(cluster, traitSet, input, projects, rowType);
> > }
> >
> > This method derives the collation trait, if enabled, before calling the
> > LogicalProject constructor. The caller does not need to derive the traits
> > (which is good, because the caller would likely get it wrong).
> >
> > The traits need to be defined before the super-class constructor is
> called
> > because the RelTraitSet is an immutable field of RelNode (it is not
> final,
> > but it should be, and a lot of code assumes that it is).
> >
> > Julian
> >
> >
> >
> > On Mar 1, 2015, at 8:55 AM, Jacques Nadeau <ja...@apache.org> wrote:
> >
> > > I've created CALCITE-606 to track this and committed some changes to
> [1].
> > > Julian, I'd love for you to look at my updated test.  I have two tests:
> > one
> > > with a propagation hack and one without.  I'd love your thoughts on how
> > to
> > > get the one without to work.
> > >
> > > thanks,
> > > Jacques
> > >
> > > [1] https://github.com/jacques-n/incubator-calcite/tree/CALCITE-606
> > >
> > > On Sat, Feb 28, 2015 at 8:49 PM, Jacques Nadeau <ja...@apache.org>
> > wrote:
> > >
> > >> Hey Guys,
> > >>
> > >> I was tired of a hack we use in Drill to make trait propagation work
> as
> > it
> > >> blows out the plan space.  As such, I was building a simple example to
> > >> discuss here.  Of course, I ran across additional difficulties that I
> > would
> > >> like to discuss.
> > >>
> > >> I happen to build my initial example on an old copy of Calcite and got
> > to
> > >> the point where I was illustrating the fact that we weren't doing
> trait
> > >> propagation.  However, rather than post that as an example, I wanted
> to
> > >> move to the latest Calcite so we could discuss in that context.
> > >>
> > >> After updating all the code I've found a couple places where things
> have
> > >> changed and I'm not even sure how to implement my original pattern.
> The
> > >> biggest issue that changed things looks like CALCITE-557 [1] since it
> > >> doesn't guarantee a converter is created to ensure a non-root trait
> > >> conversion at planning time.  I imagine I'm missing something in the
> new
> > >> paradigm but I'm not sure how to resolve given these changes.
> > >>
> > >> The simple example is:
> > >>
> > >> We have a logical plan that is Agg > Project > EnumTableAccess.
> > >>
> > >> We then convert to a physical plan PhysAgg > PhysSort > PhysProj >
> > PhysScan
> > >>
> > >> The PhysAgg requires a collation trait since we'll imagine it is a
> > >> streaming aggregate.  To illustrate the trait propagation problem,
> I've
> > >> defined PhysScan to expose the required collation already.  The goal
> is
> > to
> > >> get the PhysProj to propagate the collation trait so the PhysAgg
> doesn't
> > >> require insertion of a PhysSort.
> > >>
> > >> On the old Optiq branch (~4 months back), I get the following plans
> > using
> > >> [2]:
> > >> --LOGICAL PLAN--
> > >> AggregateRel(group=[{0}], cnt=[COUNT($1)])
> > >>  ProjectRel(s=[$0])
> > >>    EnumerableTableAccessRel(table=[[t1]])
> > >>
> > >> --PHYSICAL PLAN--
> > >> PhysAgg(group=[{0}], cnt=[COUNT($1)])
> > >>  PhysSort(sort0=[$0], dir0=[ASC-nulls-first])
> > >>    PhysProj(s=[$0])
> > >>      PhysTable                      // dir0=[ASC-nulls-first]
> > >>
> > >>
> > >> On the new Branch using [3], I can't actually get this to plan because
> > of
> > >> the removal in [1].
> > >>
> > >> Is there a test case for trait propagation or simply a RelOptRule
> > imposing
> > >> an additional trait that works after 557 was merged?
> > >>
> > >> Thanks,
> > >> Jacques
> > >>
> > >>
> > >>
> > >> [1]
> > >>
> >
> https://github.com/apache/incubator-calcite/commit/506c1ab46b6b3e0279c799acc4e4412d542e8b6d#diff-8
> > >> [2] https://gist.github.com/jacques-n/2f004311bd0ead714ca4
> > >> [3] https://gist.github.com/jacques-n/01ed0203039374688710
> > >>
> >
> >
>

Re: Trait Propagation, requesting traits using RelOptRule.convert() & CALCITE-557

Posted by Jinfeng Ni <ji...@gmail.com>.
BTW: The branch I posted above is based on Jacques's CALCITE-606 branch [2]

[2]
https://github.com/jacques-n/incubator-calcite/tree/CALCITE-606

On Wed, Mar 4, 2015 at 4:37 PM, Jinfeng Ni <ji...@gmail.com> wrote:

> I made some change to the rules defined in the testcase, and it seems to
> get the plan w/o thought in the testcase of withoutHack(). Certainly, the
> testcase withHack() will not work now, since I change the PhysProjRule.
> See [1].
>
> Here is the output of withoutHack() testcase:
>
> LOGICAL PLAN
> PhysAgg(group=[{0}], cnt=[COUNT($1)]): rowcount = 1.0, cumulative cost =
> {3.0 rows, 3.0 cpu, 3.0 io}, id = 80
>   PhysProj(s=[$0], i=[$1]): rowcount = 1.0, cumulative cost = {2.0 rows,
> 2.0 cpu, 2.0 io}, id = 79
>     PhysTable: rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 1.0
> io}, id = 32
>
> Here is main idea of the change.
>
> 1. Change PhysProjRule to a subclass of ConverterRule, and make it simply
> convert its child and its own from Logical ('NONE') to Physical convention.
>
> 2. Add a new PhysProjTraitPropUpRule.   This rule will match any *physical
> Project*, and will try to propagate the traits from the project's input's
> *best* RelNode.
>
> Seems to me both the new PhysProjRule and PhysProjTraitPropUpRule probably
> will not blow out the search space, since each rule fire will produce at
> most one new RelNode.
>
> The reason the original rule does not work, in my view, is.
>
>      The original PhysProjRule is fired, it *only matchs with
> LogicalProject*. However, the LogicalProject's child (SCAN) is also
> Logical.  Only when the SCAN is converted into physical, we'll add the
> RelCollation to the Physical Scan. Unfortunately, Physical Scan is not a
> child of LogicalProject. So, when a new Physical Scan node with the desired
> RelCollation is created, the original PhysProjRule will not be fired at
> all. Hence, no RelCollation trait propagation bottom-up.
>
> Does the above change make sense? Thanks!
>
> [1].
> https://github.com/jinfengni/incubator-optiq/tree/CALCITE-606-ALT
>
>
>
> On Wed, Mar 4, 2015 at 2:45 PM, Julian Hyde <ju...@gmail.com> wrote:
>
>> I’m working on this. Here are my two ideas:
>>
>> 1. Try adding SortRemoveRule.INSTANCE
>>
>> 2. Try calling the new “create” methods for each RelNode sub-class. For
>> example, LogicalProject.create:
>>
>> public static LogicalProject create(final RelNode input,
>>     final List<? extends RexNode> projects, RelDataType rowType) {
>>   final RelOptCluster cluster = input.getCluster();
>>   final RelTraitSet traitSet =
>>       cluster.traitSet().replace(Convention.NONE)
>>           .replaceIfs(
>>               RelCollationTraitDef.INSTANCE,
>>               new Supplier<List<RelCollation>>() {
>>                 public List<RelCollation> get() {
>>                   return RelMdCollation.project(input, projects);
>>                 }
>>               });
>>   return new LogicalProject(cluster, traitSet, input, projects, rowType);
>> }
>>
>> This method derives the collation trait, if enabled, before calling the
>> LogicalProject constructor. The caller does not need to derive the traits
>> (which is good, because the caller would likely get it wrong).
>>
>> The traits need to be defined before the super-class constructor is
>> called because the RelTraitSet is an immutable field of RelNode (it is not
>> final, but it should be, and a lot of code assumes that it is).
>>
>> Julian
>>
>>
>>
>> On Mar 1, 2015, at 8:55 AM, Jacques Nadeau <ja...@apache.org> wrote:
>>
>> > I've created CALCITE-606 to track this and committed some changes to
>> [1].
>> > Julian, I'd love for you to look at my updated test.  I have two tests:
>> one
>> > with a propagation hack and one without.  I'd love your thoughts on how
>> to
>> > get the one without to work.
>> >
>> > thanks,
>> > Jacques
>> >
>> > [1] https://github.com/jacques-n/incubator-calcite/tree/CALCITE-606
>> >
>> > On Sat, Feb 28, 2015 at 8:49 PM, Jacques Nadeau <ja...@apache.org>
>> wrote:
>> >
>> >> Hey Guys,
>> >>
>> >> I was tired of a hack we use in Drill to make trait propagation work
>> as it
>> >> blows out the plan space.  As such, I was building a simple example to
>> >> discuss here.  Of course, I ran across additional difficulties that I
>> would
>> >> like to discuss.
>> >>
>> >> I happen to build my initial example on an old copy of Calcite and got
>> to
>> >> the point where I was illustrating the fact that we weren't doing trait
>> >> propagation.  However, rather than post that as an example, I wanted to
>> >> move to the latest Calcite so we could discuss in that context.
>> >>
>> >> After updating all the code I've found a couple places where things
>> have
>> >> changed and I'm not even sure how to implement my original pattern.
>> The
>> >> biggest issue that changed things looks like CALCITE-557 [1] since it
>> >> doesn't guarantee a converter is created to ensure a non-root trait
>> >> conversion at planning time.  I imagine I'm missing something in the
>> new
>> >> paradigm but I'm not sure how to resolve given these changes.
>> >>
>> >> The simple example is:
>> >>
>> >> We have a logical plan that is Agg > Project > EnumTableAccess.
>> >>
>> >> We then convert to a physical plan PhysAgg > PhysSort > PhysProj >
>> PhysScan
>> >>
>> >> The PhysAgg requires a collation trait since we'll imagine it is a
>> >> streaming aggregate.  To illustrate the trait propagation problem, I've
>> >> defined PhysScan to expose the required collation already.  The goal
>> is to
>> >> get the PhysProj to propagate the collation trait so the PhysAgg
>> doesn't
>> >> require insertion of a PhysSort.
>> >>
>> >> On the old Optiq branch (~4 months back), I get the following plans
>> using
>> >> [2]:
>> >> --LOGICAL PLAN--
>> >> AggregateRel(group=[{0}], cnt=[COUNT($1)])
>> >>  ProjectRel(s=[$0])
>> >>    EnumerableTableAccessRel(table=[[t1]])
>> >>
>> >> --PHYSICAL PLAN--
>> >> PhysAgg(group=[{0}], cnt=[COUNT($1)])
>> >>  PhysSort(sort0=[$0], dir0=[ASC-nulls-first])
>> >>    PhysProj(s=[$0])
>> >>      PhysTable                      // dir0=[ASC-nulls-first]
>> >>
>> >>
>> >> On the new Branch using [3], I can't actually get this to plan because
>> of
>> >> the removal in [1].
>> >>
>> >> Is there a test case for trait propagation or simply a RelOptRule
>> imposing
>> >> an additional trait that works after 557 was merged?
>> >>
>> >> Thanks,
>> >> Jacques
>> >>
>> >>
>> >>
>> >> [1]
>> >>
>> https://github.com/apache/incubator-calcite/commit/506c1ab46b6b3e0279c799acc4e4412d542e8b6d#diff-8
>> >> [2] https://gist.github.com/jacques-n/2f004311bd0ead714ca4
>> >> [3] https://gist.github.com/jacques-n/01ed0203039374688710
>> >>
>>
>>
>

Re: Trait Propagation, requesting traits using RelOptRule.convert() & CALCITE-557

Posted by Jinfeng Ni <ji...@gmail.com>.
I made some change to the rules defined in the testcase, and it seems to
get the plan w/o thought in the testcase of withoutHack(). Certainly, the
testcase withHack() will not work now, since I change the PhysProjRule.
See [1].

Here is the output of withoutHack() testcase:

LOGICAL PLAN
PhysAgg(group=[{0}], cnt=[COUNT($1)]): rowcount = 1.0, cumulative cost =
{3.0 rows, 3.0 cpu, 3.0 io}, id = 80
  PhysProj(s=[$0], i=[$1]): rowcount = 1.0, cumulative cost = {2.0 rows,
2.0 cpu, 2.0 io}, id = 79
    PhysTable: rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 1.0
io}, id = 32

Here is main idea of the change.

1. Change PhysProjRule to a subclass of ConverterRule, and make it simply
convert its child and its own from Logical ('NONE') to Physical convention.

2. Add a new PhysProjTraitPropUpRule.   This rule will match any *physical
Project*, and will try to propagate the traits from the project's input's
*best* RelNode.

Seems to me both the new PhysProjRule and PhysProjTraitPropUpRule probably
will not blow out the search space, since each rule fire will produce at
most one new RelNode.

The reason the original rule does not work, in my view, is.

     The original PhysProjRule is fired, it *only matchs with
LogicalProject*. However, the LogicalProject's child (SCAN) is also
Logical.  Only when the SCAN is converted into physical, we'll add the
RelCollation to the Physical Scan. Unfortunately, Physical Scan is not a
child of LogicalProject. So, when a new Physical Scan node with the desired
RelCollation is created, the original PhysProjRule will not be fired at
all. Hence, no RelCollation trait propagation bottom-up.

Does the above change make sense? Thanks!

[1].
https://github.com/jinfengni/incubator-optiq/tree/CALCITE-606-ALT



On Wed, Mar 4, 2015 at 2:45 PM, Julian Hyde <ju...@gmail.com> wrote:

> I’m working on this. Here are my two ideas:
>
> 1. Try adding SortRemoveRule.INSTANCE
>
> 2. Try calling the new “create” methods for each RelNode sub-class. For
> example, LogicalProject.create:
>
> public static LogicalProject create(final RelNode input,
>     final List<? extends RexNode> projects, RelDataType rowType) {
>   final RelOptCluster cluster = input.getCluster();
>   final RelTraitSet traitSet =
>       cluster.traitSet().replace(Convention.NONE)
>           .replaceIfs(
>               RelCollationTraitDef.INSTANCE,
>               new Supplier<List<RelCollation>>() {
>                 public List<RelCollation> get() {
>                   return RelMdCollation.project(input, projects);
>                 }
>               });
>   return new LogicalProject(cluster, traitSet, input, projects, rowType);
> }
>
> This method derives the collation trait, if enabled, before calling the
> LogicalProject constructor. The caller does not need to derive the traits
> (which is good, because the caller would likely get it wrong).
>
> The traits need to be defined before the super-class constructor is called
> because the RelTraitSet is an immutable field of RelNode (it is not final,
> but it should be, and a lot of code assumes that it is).
>
> Julian
>
>
>
> On Mar 1, 2015, at 8:55 AM, Jacques Nadeau <ja...@apache.org> wrote:
>
> > I've created CALCITE-606 to track this and committed some changes to [1].
> > Julian, I'd love for you to look at my updated test.  I have two tests:
> one
> > with a propagation hack and one without.  I'd love your thoughts on how
> to
> > get the one without to work.
> >
> > thanks,
> > Jacques
> >
> > [1] https://github.com/jacques-n/incubator-calcite/tree/CALCITE-606
> >
> > On Sat, Feb 28, 2015 at 8:49 PM, Jacques Nadeau <ja...@apache.org>
> wrote:
> >
> >> Hey Guys,
> >>
> >> I was tired of a hack we use in Drill to make trait propagation work as
> it
> >> blows out the plan space.  As such, I was building a simple example to
> >> discuss here.  Of course, I ran across additional difficulties that I
> would
> >> like to discuss.
> >>
> >> I happen to build my initial example on an old copy of Calcite and got
> to
> >> the point where I was illustrating the fact that we weren't doing trait
> >> propagation.  However, rather than post that as an example, I wanted to
> >> move to the latest Calcite so we could discuss in that context.
> >>
> >> After updating all the code I've found a couple places where things have
> >> changed and I'm not even sure how to implement my original pattern.  The
> >> biggest issue that changed things looks like CALCITE-557 [1] since it
> >> doesn't guarantee a converter is created to ensure a non-root trait
> >> conversion at planning time.  I imagine I'm missing something in the new
> >> paradigm but I'm not sure how to resolve given these changes.
> >>
> >> The simple example is:
> >>
> >> We have a logical plan that is Agg > Project > EnumTableAccess.
> >>
> >> We then convert to a physical plan PhysAgg > PhysSort > PhysProj >
> PhysScan
> >>
> >> The PhysAgg requires a collation trait since we'll imagine it is a
> >> streaming aggregate.  To illustrate the trait propagation problem, I've
> >> defined PhysScan to expose the required collation already.  The goal is
> to
> >> get the PhysProj to propagate the collation trait so the PhysAgg doesn't
> >> require insertion of a PhysSort.
> >>
> >> On the old Optiq branch (~4 months back), I get the following plans
> using
> >> [2]:
> >> --LOGICAL PLAN--
> >> AggregateRel(group=[{0}], cnt=[COUNT($1)])
> >>  ProjectRel(s=[$0])
> >>    EnumerableTableAccessRel(table=[[t1]])
> >>
> >> --PHYSICAL PLAN--
> >> PhysAgg(group=[{0}], cnt=[COUNT($1)])
> >>  PhysSort(sort0=[$0], dir0=[ASC-nulls-first])
> >>    PhysProj(s=[$0])
> >>      PhysTable                      // dir0=[ASC-nulls-first]
> >>
> >>
> >> On the new Branch using [3], I can't actually get this to plan because
> of
> >> the removal in [1].
> >>
> >> Is there a test case for trait propagation or simply a RelOptRule
> imposing
> >> an additional trait that works after 557 was merged?
> >>
> >> Thanks,
> >> Jacques
> >>
> >>
> >>
> >> [1]
> >>
> https://github.com/apache/incubator-calcite/commit/506c1ab46b6b3e0279c799acc4e4412d542e8b6d#diff-8
> >> [2] https://gist.github.com/jacques-n/2f004311bd0ead714ca4
> >> [3] https://gist.github.com/jacques-n/01ed0203039374688710
> >>
>
>

Re: Trait Propagation, requesting traits using RelOptRule.convert() & CALCITE-557

Posted by Julian Hyde <ju...@gmail.com>.
I’m working on this. Here are my two ideas:

1. Try adding SortRemoveRule.INSTANCE

2. Try calling the new “create” methods for each RelNode sub-class. For example, LogicalProject.create:

public static LogicalProject create(final RelNode input,
    final List<? extends RexNode> projects, RelDataType rowType) {
  final RelOptCluster cluster = input.getCluster();
  final RelTraitSet traitSet =
      cluster.traitSet().replace(Convention.NONE)
          .replaceIfs(
              RelCollationTraitDef.INSTANCE,
              new Supplier<List<RelCollation>>() {
                public List<RelCollation> get() {
                  return RelMdCollation.project(input, projects);
                }
              });
  return new LogicalProject(cluster, traitSet, input, projects, rowType);
}

This method derives the collation trait, if enabled, before calling the LogicalProject constructor. The caller does not need to derive the traits (which is good, because the caller would likely get it wrong).

The traits need to be defined before the super-class constructor is called because the RelTraitSet is an immutable field of RelNode (it is not final, but it should be, and a lot of code assumes that it is).

Julian



On Mar 1, 2015, at 8:55 AM, Jacques Nadeau <ja...@apache.org> wrote:

> I've created CALCITE-606 to track this and committed some changes to [1].
> Julian, I'd love for you to look at my updated test.  I have two tests: one
> with a propagation hack and one without.  I'd love your thoughts on how to
> get the one without to work.
> 
> thanks,
> Jacques
> 
> [1] https://github.com/jacques-n/incubator-calcite/tree/CALCITE-606
> 
> On Sat, Feb 28, 2015 at 8:49 PM, Jacques Nadeau <ja...@apache.org> wrote:
> 
>> Hey Guys,
>> 
>> I was tired of a hack we use in Drill to make trait propagation work as it
>> blows out the plan space.  As such, I was building a simple example to
>> discuss here.  Of course, I ran across additional difficulties that I would
>> like to discuss.
>> 
>> I happen to build my initial example on an old copy of Calcite and got to
>> the point where I was illustrating the fact that we weren't doing trait
>> propagation.  However, rather than post that as an example, I wanted to
>> move to the latest Calcite so we could discuss in that context.
>> 
>> After updating all the code I've found a couple places where things have
>> changed and I'm not even sure how to implement my original pattern.  The
>> biggest issue that changed things looks like CALCITE-557 [1] since it
>> doesn't guarantee a converter is created to ensure a non-root trait
>> conversion at planning time.  I imagine I'm missing something in the new
>> paradigm but I'm not sure how to resolve given these changes.
>> 
>> The simple example is:
>> 
>> We have a logical plan that is Agg > Project > EnumTableAccess.
>> 
>> We then convert to a physical plan PhysAgg > PhysSort > PhysProj > PhysScan
>> 
>> The PhysAgg requires a collation trait since we'll imagine it is a
>> streaming aggregate.  To illustrate the trait propagation problem, I've
>> defined PhysScan to expose the required collation already.  The goal is to
>> get the PhysProj to propagate the collation trait so the PhysAgg doesn't
>> require insertion of a PhysSort.
>> 
>> On the old Optiq branch (~4 months back), I get the following plans using
>> [2]:
>> --LOGICAL PLAN--
>> AggregateRel(group=[{0}], cnt=[COUNT($1)])
>>  ProjectRel(s=[$0])
>>    EnumerableTableAccessRel(table=[[t1]])
>> 
>> --PHYSICAL PLAN--
>> PhysAgg(group=[{0}], cnt=[COUNT($1)])
>>  PhysSort(sort0=[$0], dir0=[ASC-nulls-first])
>>    PhysProj(s=[$0])
>>      PhysTable                      // dir0=[ASC-nulls-first]
>> 
>> 
>> On the new Branch using [3], I can't actually get this to plan because of
>> the removal in [1].
>> 
>> Is there a test case for trait propagation or simply a RelOptRule imposing
>> an additional trait that works after 557 was merged?
>> 
>> Thanks,
>> Jacques
>> 
>> 
>> 
>> [1]
>> https://github.com/apache/incubator-calcite/commit/506c1ab46b6b3e0279c799acc4e4412d542e8b6d#diff-8
>> [2] https://gist.github.com/jacques-n/2f004311bd0ead714ca4
>> [3] https://gist.github.com/jacques-n/01ed0203039374688710
>> 


Re: Trait Propagation, requesting traits using RelOptRule.convert() & CALCITE-557

Posted by Jacques Nadeau <ja...@apache.org>.
I've created CALCITE-606 to track this and committed some changes to [1].
Julian, I'd love for you to look at my updated test.  I have two tests: one
with a propagation hack and one without.  I'd love your thoughts on how to
get the one without to work.

thanks,
Jacques

[1] https://github.com/jacques-n/incubator-calcite/tree/CALCITE-606

On Sat, Feb 28, 2015 at 8:49 PM, Jacques Nadeau <ja...@apache.org> wrote:

> Hey Guys,
>
> I was tired of a hack we use in Drill to make trait propagation work as it
> blows out the plan space.  As such, I was building a simple example to
> discuss here.  Of course, I ran across additional difficulties that I would
> like to discuss.
>
> I happen to build my initial example on an old copy of Calcite and got to
> the point where I was illustrating the fact that we weren't doing trait
> propagation.  However, rather than post that as an example, I wanted to
> move to the latest Calcite so we could discuss in that context.
>
> After updating all the code I've found a couple places where things have
> changed and I'm not even sure how to implement my original pattern.  The
> biggest issue that changed things looks like CALCITE-557 [1] since it
> doesn't guarantee a converter is created to ensure a non-root trait
> conversion at planning time.  I imagine I'm missing something in the new
> paradigm but I'm not sure how to resolve given these changes.
>
> The simple example is:
>
> We have a logical plan that is Agg > Project > EnumTableAccess.
>
> We then convert to a physical plan PhysAgg > PhysSort > PhysProj > PhysScan
>
> The PhysAgg requires a collation trait since we'll imagine it is a
> streaming aggregate.  To illustrate the trait propagation problem, I've
> defined PhysScan to expose the required collation already.  The goal is to
> get the PhysProj to propagate the collation trait so the PhysAgg doesn't
> require insertion of a PhysSort.
>
> On the old Optiq branch (~4 months back), I get the following plans using
> [2]:
> --LOGICAL PLAN--
> AggregateRel(group=[{0}], cnt=[COUNT($1)])
>   ProjectRel(s=[$0])
>     EnumerableTableAccessRel(table=[[t1]])
>
> --PHYSICAL PLAN--
> PhysAgg(group=[{0}], cnt=[COUNT($1)])
>   PhysSort(sort0=[$0], dir0=[ASC-nulls-first])
>     PhysProj(s=[$0])
>       PhysTable                      // dir0=[ASC-nulls-first]
>
>
> On the new Branch using [3], I can't actually get this to plan because of
> the removal in [1].
>
> Is there a test case for trait propagation or simply a RelOptRule imposing
> an additional trait that works after 557 was merged?
>
> Thanks,
> Jacques
>
>
>
> [1]
> https://github.com/apache/incubator-calcite/commit/506c1ab46b6b3e0279c799acc4e4412d542e8b6d#diff-8
> [2] https://gist.github.com/jacques-n/2f004311bd0ead714ca4
> [3] https://gist.github.com/jacques-n/01ed0203039374688710
>