You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Jacques Nadeau <ja...@apache.org> on 2021/12/02 05:02:06 UTC

Feedback on a generic return type version of RelShuttle?

Right now, RelNode's accept method and RelShuttle forces a user to return
RelNodes. This makes tree traversal/conversion extra painful in main
circumstances (you want to have more specific relnode classes explicit, you
want to rewrite to non-relnodes, etc). I've worked around this for many
years but really think it should be resolved inside Calcite. I wanted to
get some feedback on introducing a more generic pattern (that RelShuttle
becomes a concrete variation of). If people are on board with the change,
I'll pull it through the codebase.

You can see the rough idea in this wip patch:
https://github.com/apache/calcite/pull/2625

Basically, introduce a new parameterized visitor that people can use. This
doesn't change the existing behavior of RelShuttle (it's now just a
specific version of this generic version) but it does allow other rewrite
patterns.

Thoughts?

Re: Feedback on a generic return type version of RelShuttle?

Posted by Vladimir Sitnikov <si...@gmail.com>.
Steven, James,

>there are 22 implementations of RelShuttleImpl(excluding tests)

14 of them are RelHomogeneousShuttle subclasses.
RelHomogeneousShuttle redirects all customized visit methods into a single
visit(RelNode) method.

That means Calcite has only 6 usages (!) of the "visitor" feature, and the
feature basically forces ALL RelNodes
implement an obscure RelNode#accept.

The method is ill-defined.
Here's its javadoc: "Accepts a visit from a shuttle"
It makes no sense at all. How should I implement this?

@return A copy of this node incorporating changes made by the shuttle to
   * this node's children

None of Calcite's RelNode implementations walk over children or something
when implementing accept(RelShuttle).

----

The only non-test, non-homogeneous subclasses are:

ToLogicalConverter
RelNodesExprsHandler
SqlToRelConverter#convertSelectImpl # "attach hint to the first Hintable
rel"
RelOptMaterialization#tryUseStar
ViewTable#expandView

I do not know if those classes require any of the "visitor" features,
however,
it is clear that the half-visitor is complicating things more often than it
helps.

Vladimir

Re: Feedback on a generic return type version of RelShuttle?

Posted by Vladimir Sitnikov <si...@gmail.com>.
Steven>Look at the examples in the wikipedia page you link to

Let me please copy Wikipedia for you.
I agree it is not that easy to spot, however, please pay attention to
Car#accept method.

class Car {
...
    @Override
    public void accept(CarElementVisitor visitor) {
        for (CarElement element : elements) {
            element.accept(visitor);
        }
        visitor.visit(this);
    }
}

The notable difference is that Wikipedia suggests that Car is in charge of
enumerating Car elements.
Of course, trivial car elements just call visitor.visit(this).
However, compound elements should call the visitor and do re-composition or
whatever.

CarElementPrintVisitor in Wikipedia does **not** enumerate Car's elements.
There's no CarElementPrintVisitor#visitElements

In other words, if we want Visitor Pattern, we need to move
RelShuttleImpl#visitChildren(RelNode rel)
into the **default** implementation of RelNode#accept(RelShuttle)

However, as I said, if we consider moving towards the proper visitor
pattern, we might consider
the approach listed in "From Object Algebras to Finally Tagless
Interpreters" article above.

Vladimir

Re: Feedback on a generic return type version of RelShuttle?

Posted by Steven Phillips <st...@dremio.com>.
I don't really understand what you are trying to say. An accept() method in
the classes on which visitor operates is a standard, normal part of the
visitor pattern. Look at the examples in the wikipedia page you link to.  I
think the accept() method is necessary for compile time binding to work.

And RelShuttleImpl uses the RelNode.accept() method in the visitChild()
method. Wouldn't replacing the accept() method with visit(RelNode), which
is what I think you are suggesting, not have the correct dispatch?

On Thu, Dec 2, 2021 at 10:22 AM Vladimir Sitnikov <
sitnikov.vladimir@gmail.com> wrote:

> Steven>Agree with James, and that's not even including implementations in
> other
> Steven>codebases that use calcite (e.g. there are dozens of implementations
> in
> Steven>Dremio's codebase).
>
> Can you please show RelShuttle in Dremio codebase?
> I see no RelShuttle implementations:
> https://github.com/dremio/dremio-oss/search?q=relshuttle
>
> Note: RelShuttleImpl does not count since RelShuttleImpl does not really
> need RelNode cooperation.
> RelShuttleImpl just enumerates children of the relations, and it does not
> rely on accept(..) calls.
>
> It can do the very same thing without calling accept(this), so we can
> remove  RelNode#accept(RelNode) methods,
> and we can still get the same RelShuttleImpl.
> It can have a series of if (... instanceof) checks or a Map#get(getClass())
> or even https://github.com/forax/exotic#visitor---javadoc
>
> Vladimir
>

Re: Feedback on a generic return type version of RelShuttle?

Posted by Jacques Nadeau <ja...@apache.org>.
Stamatis et al, any additional feedback on desired features for such a
change?

Thanks,
Jacques

On Mon, Dec 6, 2021 at 11:48 AM Julian Hyde <jh...@gmail.com> wrote:

>
>
> > On Dec 4, 2021, at 9:41 PM, Jacques Nadeau <ja...@apache.org> wrote:
> >
> > A few thoughts:
> >
> > Algebra approach.
> >
> >
> > If I understand the model correctly, I actually implemented this in a
> > different project last year (not knowing it was now a design pattern). My
> > main finding was that while it had certain benefits, it was pretty
> > confusing for most people.
>
> Thanks, Vladimir, for educating us about the algebra approach.
>
> I have a feeling that OperandBuilder (used in the Config of many planner
> rules) uses something like this pattern. I was not aware of the algebraic
> approach, but OperandBuilder emerged as I tried to make the code type-safe
> and reusable.
>
> The irony of the “algebra approach” is that Calcite is an algebra engine.
> And a very powerful one at that. People who are new to Calcite often try to
> use a visitor pattern on their RelNode tree, and we have to direct them to
> the “algebra approach” of using collections of rewrite rules to accomplish
> their task.
>
> For this reason, I think we should keep the built-in visitors fairly
> simple.
>
> Also, my work in https://issues.apache.org/jira/browse/CALCITE-4559 <
> https://issues.apache.org/jira/browse/CALCITE-4559> (not finished, sadly)
> can be viewed as an attempt to convert RexSimplify (which is becoming
> spaghetti code) into something more algebraic.
>
> Julian
>
>
>

Re: Feedback on a generic return type version of RelShuttle?

Posted by Julian Hyde <jh...@gmail.com>.

> On Dec 4, 2021, at 9:41 PM, Jacques Nadeau <ja...@apache.org> wrote:
> 
> A few thoughts:
> 
> Algebra approach.
> 
> 
> If I understand the model correctly, I actually implemented this in a
> different project last year (not knowing it was now a design pattern). My
> main finding was that while it had certain benefits, it was pretty
> confusing for most people.

Thanks, Vladimir, for educating us about the algebra approach.

I have a feeling that OperandBuilder (used in the Config of many planner rules) uses something like this pattern. I was not aware of the algebraic approach, but OperandBuilder emerged as I tried to make the code type-safe and reusable.

The irony of the “algebra approach” is that Calcite is an algebra engine. And a very powerful one at that. People who are new to Calcite often try to use a visitor pattern on their RelNode tree, and we have to direct them to the “algebra approach” of using collections of rewrite rules to accomplish their task.

For this reason, I think we should keep the built-in visitors fairly simple.

Also, my work in https://issues.apache.org/jira/browse/CALCITE-4559 <https://issues.apache.org/jira/browse/CALCITE-4559> (not finished, sadly) can be viewed as an attempt to convert RexSimplify (which is becoming spaghetti code) into something more algebraic.

Julian



Re: Feedback on a generic return type version of RelShuttle?

Posted by Jacques Nadeau <ja...@apache.org>.
A few thoughts:

Algebra approach.


If I understand the model correctly, I actually implemented this in a
different project last year (not knowing it was now a design pattern). My
main finding was that while it had certain benefits, it was pretty
confusing for most people.

I am planning to go again over the extensions of RelVisitor/RelShuttle in
> Calcite and Hive and see if there are other changes that may make sense to
> incorporate in the new design.
>

This would be very helpful. I hope others will do the same. I hate
introducing new apis at this level so let's make sure we're accomplishing
support of enough key use cases to make it worth it.

use default methods to avoid forcing everybody to implement if not
> necessary;


Agreed.


> base dispatching on abstract relational expressions and not logical ones.


Agreed again. A wish (maybe difficult without making a big mess) would
ideally one could do something similar to how the relmetadataquery handlers
work where things are priority bound (from specific to generic). For
example, I may want to handle LogicalProject in a special way that is
different than the base class of Project (imagine a tree that might be
mixing multiple conventions or have multiple variations of a project e.g. a
columnar project and a row-wise project).

Another wish that the algebra model seems to hit on: have a clean pattern
for new kinds of relnodes outside the core. Let's say I want to make a
visitor that handles DruidAggregateNode in a special way. Is there a clean
way to extend traversal to work with that node type cleanly?



On Sat, Dec 4, 2021 at 2:53 PM Stamatis Zampetakis <za...@gmail.com>
wrote:

> The visitor pattern is not something written in stone. We all have a
> general understanding of what a visitor is and how it is used but there are
> also widely used variations.
>
> The following snippet (referring to the visitor pattern) is taken from
> "Design Patterns: Elements of Reusable Object-Oriented Software", the first
> reference of Wikipedia and a book I like a lot.
>
> "Who is responsible for traversing the object structure? A visitor must
> visit each
> element of the object structure. The question is, how does it get there? We
> can
> put responsibility for traversal in any of three places: in the object
> structure,
> in the visitor, or in a separate iterator object (see Iterator (257))."
>
> There is not really right or wrong here; there are alternatives with
> different advantages/disadvantages.
>
> I like the idea of having a generic return type for the shuttle, I think it
> is useful.
> I think we should take the opportunity to see how we can make the new
> API useful for the vast majority of use-cases.
> Ideally, I would like to see a solution that can effectively replace both
> RelVisitor and RelShuttle interfaces.
> I am not saying remove RelVisitor, and RelShuttle directly but it would be
> nice if we were able to deprecate them and remove them in a few releases
> by providing a better alternative.
>
> Both RelVisitor, and RelShuttle, have been in the code base for quite some
> time now and they have many extensions both in the Calcite repo and in
> other projects using Calcite.
> I would suggest spending a bit of time on these extensions and see what
> other makes sense to introduce in the new API.
> I am planning to go again over the extensions of RelVisitor/RelShuttle in
> Calcite and Hive and see if there are other changes that may make sense to
> incorporate in the new design.
> I would encourage others to do the same.
>
> A few things that I had in mind at some point regarding the design of the
> visitor are:
> use default methods to avoid forcing everybody to implement if not
> necessary;
> do not rely on method overloading;
> base dispatching on abstract relational expressions and not logical ones.
>
> Example
>
> Instead of:
>
> RelNode visit(LogicalFilter filter);
>
> use:
>
> default <T extends Filter> R visitFilter(T filter) {
>     return null;
> }
>
> and other similar things but in order to get this right I think we have to
> work out some concrete examples and rework existing use-cases.
>
> The object-algebra approach shared by Vladimir sounds interesting but again
> we should evaluate the existing use-cases and see what we really need.
>
> Many thanks Jacques for taking this initiative, and all the rest for the
> valuable feedback. I really think it is worth pushing this forward and
> getting feedback is important to get this right.
>
> Best,
> Stamatis
>
>
> On Fri, Dec 3, 2021 at 10:30 PM James Starr <ja...@gmail.com> wrote:
>
> > What if RelNode.accept(RelShuttle) was deprecated and the RelShuttleImpl
> > had an if(node instanceof Logical*Node) instead?
> >
> > I am against further enhancing RelNode.accept(RelShuttle), because, as
> > Vladimir pointed out, it does not actually follow a visitor pattern(aka
> > walking the tree) and fixing that could be a breaking change.  I would
> > prefer if the tree traversal and the dispatch was independent of the
> > RelNode implementation.  This would help with loose coupling so we do not
> > have email threads talking visit patterns when people like want to do
> > something rather simple.
> >
> > James
> >
> > On Fri, Dec 3, 2021 at 1:11 PM Jacques Nadeau <ja...@apache.org>
> wrote:
> >
> > > I think this thread has forked into two distinct topics:
> > >
> > >    1. A general discussion of the utility of visitors
> > >    2. Support for generic return type for RelShuttle (the original
> > purpose
> > >    of the thread)
> > >
> > >
> > > My quick thoughts on #1: I've historically used the visitors in Calcite
> > > extensively (external to Calcite and typically not using RelShuttleImpl
> > > which requires an instance per user/thread due to internal use of a
> > deque.
> > > I don't support the removal of visitors or the current impl of accept
> > from
> > > RelNode. I think that would be disruptive to the community.
> > >
> > > For #2: It seems like most people are supportive of this kind of
> pattern
> > > assuming it doesn't impact compatibility. I 100% agree on compat and am
> > > exploring the best way to actually expose this safely. The problem I
> > > haven't yet found a great solution for is if someone has built a new
> > > relnode type that overrides the default RelShuttle based accept
> methods.
> > In
> > > this situation, to maintain compatibility, you can't actually delegate
> > the
> > > relshuttle use to the generic method (as you would skip over the
> > > customization). The options I've identified:
> > >
> > > a) Have a (* instanceof RelShuttle) check in the generic path that
> routes
> > > to the specific accept methods.
> > > b) Use reflection to statically determine if a particular relnode class
> > has
> > > a relsthuttle based override and then route to it in those
> circumstances
> > > (similar to (a) but more complex and likely slightly more performant).
> > > c) Don't try to collapse the existing RelShuttle into a new pattern and
> > > instead just introduce a new pattern independently.
> > > d) Start with a or b but also deprecate the RelShuttle specific
> 'accept'
> > > methods. Then in a subsequent release, remove the RelShuttle specific
> > > methods entirely.
> > >
> > > I'm currently leaning towards starting with (a) since it is the easiest
> > for
> > > users to understand. However, I'd love to hear if others have better
> > ideas
> > > or opinions on these existing ideas.
> > >
> > > Thanks!
> > > Jacques
> > >
> > >
> > >
> > >
> > > On Thu, Dec 2, 2021 at 10:22 AM Vladimir Sitnikov <
> > > sitnikov.vladimir@gmail.com> wrote:
> > >
> > > > Steven>Agree with James, and that's not even including
> implementations
> > in
> > > > other
> > > > Steven>codebases that use calcite (e.g. there are dozens of
> > > implementations
> > > > in
> > > > Steven>Dremio's codebase).
> > > >
> > > > Can you please show RelShuttle in Dremio codebase?
> > > > I see no RelShuttle implementations:
> > > > https://github.com/dremio/dremio-oss/search?q=relshuttle
> > > >
> > > > Note: RelShuttleImpl does not count since RelShuttleImpl does not
> > really
> > > > need RelNode cooperation.
> > > > RelShuttleImpl just enumerates children of the relations, and it does
> > not
> > > > rely on accept(..) calls.
> > > >
> > > > It can do the very same thing without calling accept(this), so we can
> > > > remove  RelNode#accept(RelNode) methods,
> > > > and we can still get the same RelShuttleImpl.
> > > > It can have a series of if (... instanceof) checks or a
> > > Map#get(getClass())
> > > > or even https://github.com/forax/exotic#visitor---javadoc
> > > >
> > > > Vladimir
> > > >
> > >
> >
>

Re: Feedback on a generic return type version of RelShuttle?

Posted by Stamatis Zampetakis <za...@gmail.com>.
The visitor pattern is not something written in stone. We all have a
general understanding of what a visitor is and how it is used but there are
also widely used variations.

The following snippet (referring to the visitor pattern) is taken from
"Design Patterns: Elements of Reusable Object-Oriented Software", the first
reference of Wikipedia and a book I like a lot.

"Who is responsible for traversing the object structure? A visitor must
visit each
element of the object structure. The question is, how does it get there? We
can
put responsibility for traversal in any of three places: in the object
structure,
in the visitor, or in a separate iterator object (see Iterator (257))."

There is not really right or wrong here; there are alternatives with
different advantages/disadvantages.

I like the idea of having a generic return type for the shuttle, I think it
is useful.
I think we should take the opportunity to see how we can make the new
API useful for the vast majority of use-cases.
Ideally, I would like to see a solution that can effectively replace both
RelVisitor and RelShuttle interfaces.
I am not saying remove RelVisitor, and RelShuttle directly but it would be
nice if we were able to deprecate them and remove them in a few releases
by providing a better alternative.

Both RelVisitor, and RelShuttle, have been in the code base for quite some
time now and they have many extensions both in the Calcite repo and in
other projects using Calcite.
I would suggest spending a bit of time on these extensions and see what
other makes sense to introduce in the new API.
I am planning to go again over the extensions of RelVisitor/RelShuttle in
Calcite and Hive and see if there are other changes that may make sense to
incorporate in the new design.
I would encourage others to do the same.

A few things that I had in mind at some point regarding the design of the
visitor are:
use default methods to avoid forcing everybody to implement if not
necessary;
do not rely on method overloading;
base dispatching on abstract relational expressions and not logical ones.

Example

Instead of:

RelNode visit(LogicalFilter filter);

use:

default <T extends Filter> R visitFilter(T filter) {
    return null;
}

and other similar things but in order to get this right I think we have to
work out some concrete examples and rework existing use-cases.

The object-algebra approach shared by Vladimir sounds interesting but again
we should evaluate the existing use-cases and see what we really need.

Many thanks Jacques for taking this initiative, and all the rest for the
valuable feedback. I really think it is worth pushing this forward and
getting feedback is important to get this right.

Best,
Stamatis


On Fri, Dec 3, 2021 at 10:30 PM James Starr <ja...@gmail.com> wrote:

> What if RelNode.accept(RelShuttle) was deprecated and the RelShuttleImpl
> had an if(node instanceof Logical*Node) instead?
>
> I am against further enhancing RelNode.accept(RelShuttle), because, as
> Vladimir pointed out, it does not actually follow a visitor pattern(aka
> walking the tree) and fixing that could be a breaking change.  I would
> prefer if the tree traversal and the dispatch was independent of the
> RelNode implementation.  This would help with loose coupling so we do not
> have email threads talking visit patterns when people like want to do
> something rather simple.
>
> James
>
> On Fri, Dec 3, 2021 at 1:11 PM Jacques Nadeau <ja...@apache.org> wrote:
>
> > I think this thread has forked into two distinct topics:
> >
> >    1. A general discussion of the utility of visitors
> >    2. Support for generic return type for RelShuttle (the original
> purpose
> >    of the thread)
> >
> >
> > My quick thoughts on #1: I've historically used the visitors in Calcite
> > extensively (external to Calcite and typically not using RelShuttleImpl
> > which requires an instance per user/thread due to internal use of a
> deque.
> > I don't support the removal of visitors or the current impl of accept
> from
> > RelNode. I think that would be disruptive to the community.
> >
> > For #2: It seems like most people are supportive of this kind of pattern
> > assuming it doesn't impact compatibility. I 100% agree on compat and am
> > exploring the best way to actually expose this safely. The problem I
> > haven't yet found a great solution for is if someone has built a new
> > relnode type that overrides the default RelShuttle based accept methods.
> In
> > this situation, to maintain compatibility, you can't actually delegate
> the
> > relshuttle use to the generic method (as you would skip over the
> > customization). The options I've identified:
> >
> > a) Have a (* instanceof RelShuttle) check in the generic path that routes
> > to the specific accept methods.
> > b) Use reflection to statically determine if a particular relnode class
> has
> > a relsthuttle based override and then route to it in those circumstances
> > (similar to (a) but more complex and likely slightly more performant).
> > c) Don't try to collapse the existing RelShuttle into a new pattern and
> > instead just introduce a new pattern independently.
> > d) Start with a or b but also deprecate the RelShuttle specific 'accept'
> > methods. Then in a subsequent release, remove the RelShuttle specific
> > methods entirely.
> >
> > I'm currently leaning towards starting with (a) since it is the easiest
> for
> > users to understand. However, I'd love to hear if others have better
> ideas
> > or opinions on these existing ideas.
> >
> > Thanks!
> > Jacques
> >
> >
> >
> >
> > On Thu, Dec 2, 2021 at 10:22 AM Vladimir Sitnikov <
> > sitnikov.vladimir@gmail.com> wrote:
> >
> > > Steven>Agree with James, and that's not even including implementations
> in
> > > other
> > > Steven>codebases that use calcite (e.g. there are dozens of
> > implementations
> > > in
> > > Steven>Dremio's codebase).
> > >
> > > Can you please show RelShuttle in Dremio codebase?
> > > I see no RelShuttle implementations:
> > > https://github.com/dremio/dremio-oss/search?q=relshuttle
> > >
> > > Note: RelShuttleImpl does not count since RelShuttleImpl does not
> really
> > > need RelNode cooperation.
> > > RelShuttleImpl just enumerates children of the relations, and it does
> not
> > > rely on accept(..) calls.
> > >
> > > It can do the very same thing without calling accept(this), so we can
> > > remove  RelNode#accept(RelNode) methods,
> > > and we can still get the same RelShuttleImpl.
> > > It can have a series of if (... instanceof) checks or a
> > Map#get(getClass())
> > > or even https://github.com/forax/exotic#visitor---javadoc
> > >
> > > Vladimir
> > >
> >
>

Re: Feedback on a generic return type version of RelShuttle?

Posted by James Starr <ja...@gmail.com>.
What if RelNode.accept(RelShuttle) was deprecated and the RelShuttleImpl
had an if(node instanceof Logical*Node) instead?

I am against further enhancing RelNode.accept(RelShuttle), because, as
Vladimir pointed out, it does not actually follow a visitor pattern(aka
walking the tree) and fixing that could be a breaking change.  I would
prefer if the tree traversal and the dispatch was independent of the
RelNode implementation.  This would help with loose coupling so we do not
have email threads talking visit patterns when people like want to do
something rather simple.

James

On Fri, Dec 3, 2021 at 1:11 PM Jacques Nadeau <ja...@apache.org> wrote:

> I think this thread has forked into two distinct topics:
>
>    1. A general discussion of the utility of visitors
>    2. Support for generic return type for RelShuttle (the original purpose
>    of the thread)
>
>
> My quick thoughts on #1: I've historically used the visitors in Calcite
> extensively (external to Calcite and typically not using RelShuttleImpl
> which requires an instance per user/thread due to internal use of a deque.
> I don't support the removal of visitors or the current impl of accept from
> RelNode. I think that would be disruptive to the community.
>
> For #2: It seems like most people are supportive of this kind of pattern
> assuming it doesn't impact compatibility. I 100% agree on compat and am
> exploring the best way to actually expose this safely. The problem I
> haven't yet found a great solution for is if someone has built a new
> relnode type that overrides the default RelShuttle based accept methods. In
> this situation, to maintain compatibility, you can't actually delegate the
> relshuttle use to the generic method (as you would skip over the
> customization). The options I've identified:
>
> a) Have a (* instanceof RelShuttle) check in the generic path that routes
> to the specific accept methods.
> b) Use reflection to statically determine if a particular relnode class has
> a relsthuttle based override and then route to it in those circumstances
> (similar to (a) but more complex and likely slightly more performant).
> c) Don't try to collapse the existing RelShuttle into a new pattern and
> instead just introduce a new pattern independently.
> d) Start with a or b but also deprecate the RelShuttle specific 'accept'
> methods. Then in a subsequent release, remove the RelShuttle specific
> methods entirely.
>
> I'm currently leaning towards starting with (a) since it is the easiest for
> users to understand. However, I'd love to hear if others have better ideas
> or opinions on these existing ideas.
>
> Thanks!
> Jacques
>
>
>
>
> On Thu, Dec 2, 2021 at 10:22 AM Vladimir Sitnikov <
> sitnikov.vladimir@gmail.com> wrote:
>
> > Steven>Agree with James, and that's not even including implementations in
> > other
> > Steven>codebases that use calcite (e.g. there are dozens of
> implementations
> > in
> > Steven>Dremio's codebase).
> >
> > Can you please show RelShuttle in Dremio codebase?
> > I see no RelShuttle implementations:
> > https://github.com/dremio/dremio-oss/search?q=relshuttle
> >
> > Note: RelShuttleImpl does not count since RelShuttleImpl does not really
> > need RelNode cooperation.
> > RelShuttleImpl just enumerates children of the relations, and it does not
> > rely on accept(..) calls.
> >
> > It can do the very same thing without calling accept(this), so we can
> > remove  RelNode#accept(RelNode) methods,
> > and we can still get the same RelShuttleImpl.
> > It can have a series of if (... instanceof) checks or a
> Map#get(getClass())
> > or even https://github.com/forax/exotic#visitor---javadoc
> >
> > Vladimir
> >
>

Re: Feedback on a generic return type version of RelShuttle?

Posted by Jacques Nadeau <ja...@apache.org>.
I think this thread has forked into two distinct topics:

   1. A general discussion of the utility of visitors
   2. Support for generic return type for RelShuttle (the original purpose
   of the thread)


My quick thoughts on #1: I've historically used the visitors in Calcite
extensively (external to Calcite and typically not using RelShuttleImpl
which requires an instance per user/thread due to internal use of a deque.
I don't support the removal of visitors or the current impl of accept from
RelNode. I think that would be disruptive to the community.

For #2: It seems like most people are supportive of this kind of pattern
assuming it doesn't impact compatibility. I 100% agree on compat and am
exploring the best way to actually expose this safely. The problem I
haven't yet found a great solution for is if someone has built a new
relnode type that overrides the default RelShuttle based accept methods. In
this situation, to maintain compatibility, you can't actually delegate the
relshuttle use to the generic method (as you would skip over the
customization). The options I've identified:

a) Have a (* instanceof RelShuttle) check in the generic path that routes
to the specific accept methods.
b) Use reflection to statically determine if a particular relnode class has
a relsthuttle based override and then route to it in those circumstances
(similar to (a) but more complex and likely slightly more performant).
c) Don't try to collapse the existing RelShuttle into a new pattern and
instead just introduce a new pattern independently.
d) Start with a or b but also deprecate the RelShuttle specific 'accept'
methods. Then in a subsequent release, remove the RelShuttle specific
methods entirely.

I'm currently leaning towards starting with (a) since it is the easiest for
users to understand. However, I'd love to hear if others have better ideas
or opinions on these existing ideas.

Thanks!
Jacques




On Thu, Dec 2, 2021 at 10:22 AM Vladimir Sitnikov <
sitnikov.vladimir@gmail.com> wrote:

> Steven>Agree with James, and that's not even including implementations in
> other
> Steven>codebases that use calcite (e.g. there are dozens of implementations
> in
> Steven>Dremio's codebase).
>
> Can you please show RelShuttle in Dremio codebase?
> I see no RelShuttle implementations:
> https://github.com/dremio/dremio-oss/search?q=relshuttle
>
> Note: RelShuttleImpl does not count since RelShuttleImpl does not really
> need RelNode cooperation.
> RelShuttleImpl just enumerates children of the relations, and it does not
> rely on accept(..) calls.
>
> It can do the very same thing without calling accept(this), so we can
> remove  RelNode#accept(RelNode) methods,
> and we can still get the same RelShuttleImpl.
> It can have a series of if (... instanceof) checks or a Map#get(getClass())
> or even https://github.com/forax/exotic#visitor---javadoc
>
> Vladimir
>

Re: Feedback on a generic return type version of RelShuttle?

Posted by Vladimir Sitnikov <si...@gmail.com>.
Steven>Agree with James, and that's not even including implementations in
other
Steven>codebases that use calcite (e.g. there are dozens of implementations
in
Steven>Dremio's codebase).

Can you please show RelShuttle in Dremio codebase?
I see no RelShuttle implementations:
https://github.com/dremio/dremio-oss/search?q=relshuttle

Note: RelShuttleImpl does not count since RelShuttleImpl does not really
need RelNode cooperation.
RelShuttleImpl just enumerates children of the relations, and it does not
rely on accept(..) calls.

It can do the very same thing without calling accept(this), so we can
remove  RelNode#accept(RelNode) methods,
and we can still get the same RelShuttleImpl.
It can have a series of if (... instanceof) checks or a Map#get(getClass())
or even https://github.com/forax/exotic#visitor---javadoc

Vladimir

Re: Feedback on a generic return type version of RelShuttle?

Posted by Steven Phillips <st...@dremio.com>.
Agree with James, and that's not even including implementations in other
codebases that use calcite (e.g. there are dozens of implementations in
Dremio's codebase).

On Thu, Dec 2, 2021 at 9:38 AM James Starr <ja...@gmail.com> wrote:

> Vladimir, there are 22 implementations of RelShuttleImpl(excluding tests)
> in calcite core.  That hardly seems like no value add.  I am not sure why
> RelNode.visit needs to exist though, the dispatch could simply be done in
> some method internal to the visitor/shuttle.
>
> On Thu, Dec 2, 2021 at 2:07 AM Vladimir Sitnikov <
> sitnikov.vladimir@gmail.com> wrote:
>
> > I've no issues with making the existing visitor more generic, however,
> > the choice of accept methods seems arbitrary :-/
> >
> > Would it help if we move or prefer using algebras rather than visitors?
> >
> >
> https://oleksandrmanzyuk.wordpress.com/2014/06/18/from-object-algebras-to-finally-tagless-interpreters-2/
> >
> > Just to clarify:
> > 1) I would like to refrain from breaking compatibility, especially if we
> > add an abstract method that is used in 1% of Calcite calls.
> > If you still want to break backward compatibility, it would be great if
> you
> > could show at least one use case for the change.
> > Otherwise, it would end up a breakage without even having a use case for
> it
> > :-(
> > No kidding. I think RelShuttle is not really useful, and we could just
> drop
> > all RelNode#accept methods and lose virtually no features.
> > Adding generics on top of RelShuttle is not going to heal that.
> >
> > 2) The current RelShuttle does not seem to add much value.
> > In other words, all implementations of accept(RelShuttle) method are
> > trivial.
> > The example from Wikipedia suggests:
> >
> >
> https://en.wikipedia.org/wiki/Visitor_pattern#What_solution_does_the_Visitor_design_pattern_describe
> > ?
> >
> > Wikipedia>Define a separate (visitor) object that implements an operation
> > to be performed on elements of an object structure
> > Wikipedia>*Clients* *traverse *the object structure and *call a
> dispatching
> > operation accept* (visitor) on an element
> >
> > I think the naming or the implementation in Calcite is wrong.
> > We should either stop naming such things as RelShuttle and RexVisitor
> > "visitors" or we should incline to the canonical approach.
> >
> > Vladimir
> >
>

Re: Feedback on a generic return type version of RelShuttle?

Posted by James Starr <ja...@gmail.com>.
Vladimir, there are 22 implementations of RelShuttleImpl(excluding tests)
in calcite core.  That hardly seems like no value add.  I am not sure why
RelNode.visit needs to exist though, the dispatch could simply be done in
some method internal to the visitor/shuttle.

On Thu, Dec 2, 2021 at 2:07 AM Vladimir Sitnikov <
sitnikov.vladimir@gmail.com> wrote:

> I've no issues with making the existing visitor more generic, however,
> the choice of accept methods seems arbitrary :-/
>
> Would it help if we move or prefer using algebras rather than visitors?
>
> https://oleksandrmanzyuk.wordpress.com/2014/06/18/from-object-algebras-to-finally-tagless-interpreters-2/
>
> Just to clarify:
> 1) I would like to refrain from breaking compatibility, especially if we
> add an abstract method that is used in 1% of Calcite calls.
> If you still want to break backward compatibility, it would be great if you
> could show at least one use case for the change.
> Otherwise, it would end up a breakage without even having a use case for it
> :-(
> No kidding. I think RelShuttle is not really useful, and we could just drop
> all RelNode#accept methods and lose virtually no features.
> Adding generics on top of RelShuttle is not going to heal that.
>
> 2) The current RelShuttle does not seem to add much value.
> In other words, all implementations of accept(RelShuttle) method are
> trivial.
> The example from Wikipedia suggests:
>
> https://en.wikipedia.org/wiki/Visitor_pattern#What_solution_does_the_Visitor_design_pattern_describe
> ?
>
> Wikipedia>Define a separate (visitor) object that implements an operation
> to be performed on elements of an object structure
> Wikipedia>*Clients* *traverse *the object structure and *call a dispatching
> operation accept* (visitor) on an element
>
> I think the naming or the implementation in Calcite is wrong.
> We should either stop naming such things as RelShuttle and RexVisitor
> "visitors" or we should incline to the canonical approach.
>
> Vladimir
>

Re: Feedback on a generic return type version of RelShuttle?

Posted by Vladimir Sitnikov <si...@gmail.com>.
I've no issues with making the existing visitor more generic, however,
the choice of accept methods seems arbitrary :-/

Would it help if we move or prefer using algebras rather than visitors?
https://oleksandrmanzyuk.wordpress.com/2014/06/18/from-object-algebras-to-finally-tagless-interpreters-2/

Just to clarify:
1) I would like to refrain from breaking compatibility, especially if we
add an abstract method that is used in 1% of Calcite calls.
If you still want to break backward compatibility, it would be great if you
could show at least one use case for the change.
Otherwise, it would end up a breakage without even having a use case for it
:-(
No kidding. I think RelShuttle is not really useful, and we could just drop
all RelNode#accept methods and lose virtually no features.
Adding generics on top of RelShuttle is not going to heal that.

2) The current RelShuttle does not seem to add much value.
In other words, all implementations of accept(RelShuttle) method are
trivial.
The example from Wikipedia suggests:
https://en.wikipedia.org/wiki/Visitor_pattern#What_solution_does_the_Visitor_design_pattern_describe?

Wikipedia>Define a separate (visitor) object that implements an operation
to be performed on elements of an object structure
Wikipedia>*Clients* *traverse *the object structure and *call a dispatching
operation accept* (visitor) on an element

I think the naming or the implementation in Calcite is wrong.
We should either stop naming such things as RelShuttle and RexVisitor
"visitors" or we should incline to the canonical approach.

Vladimir

Re: Feedback on a generic return type version of RelShuttle?

Posted by Walaa Eldin Moustafa <wa...@gmail.com>.
You are right! Always used RexShuttle directly. To make it parallel, we
could have named the new Rel class RelVisitor<T>, but RelVisitor is already
taken.

On Thu, Dec 2, 2021 at 1:16 AM Ruben Q L <ru...@gmail.com> wrote:

> @Walaa Eldin Moustafa <wa...@gmail.com>, for RexNodes there seems
> to be already this "generic return type visitor pattern" via RexVisitor; in
> fact, RexShuttle is just an implementation of RexVisitor.
>
>
>
>
> On Thu, Dec 2, 2021 at 9:06 AM Walaa Eldin Moustafa <wa...@gmail.com>
> wrote:
>
>> +1 while maintaining backward compatibility. Is there a plan for
>> RexShuttle
>> too?
>>
>> On Thu, Dec 2, 2021 at 1:03 AM Ruben Q L <ru...@gmail.com> wrote:
>>
>> > Sounds like a good idea, as long as it does not break backwards
>> > compatibility.
>> >
>> > Regards,
>> > Ruben
>> >
>> > On Thu, Dec 2, 2021 at 7:22 AM Jay Narale <ja...@gmail.com>
>> wrote:
>> >
>> > > +1  re-writing  to other types would be easier with this
>> > >
>> > > On Thu, Dec 2, 2021 at 16:18 Alessandro Solimando <
>> > > alessandro.solimando@gmail.com> wrote:
>> > >
>> > > > Hi Jacques,
>> > > > I have faced the same issue recently and I think it's a good idea to
>> > > have a
>> > > > generic version of it and have RelShuttle be one of its concrete
>> > > > implementations.
>> > > >
>> > > > Best regards,
>> > > > Alessandro
>> > > >
>> > > >
>> > > > Il Gio 2 Dic 2021, 06:02 Jacques Nadeau <ja...@apache.org> ha
>> > scritto:
>> > > >
>> > > > > Right now, RelNode's accept method and RelShuttle forces a user to
>> > > return
>> > > > > RelNodes. This makes tree traversal/conversion extra painful in
>> main
>> > > > > circumstances (you want to have more specific relnode classes
>> > explicit,
>> > > > you
>> > > > > want to rewrite to non-relnodes, etc). I've worked around this for
>> > many
>> > > > > years but really think it should be resolved inside Calcite. I
>> wanted
>> > > to
>> > > > > get some feedback on introducing a more generic pattern (that
>> > > RelShuttle
>> > > > > becomes a concrete variation of). If people are on board with the
>> > > change,
>> > > > > I'll pull it through the codebase.
>> > > > >
>> > > > > You can see the rough idea in this wip patch:
>> > > > > https://github.com/apache/calcite/pull/2625
>> > > > >
>> > > > > Basically, introduce a new parameterized visitor that people can
>> use.
>> > > > This
>> > > > > doesn't change the existing behavior of RelShuttle (it's now just
>> a
>> > > > > specific version of this generic version) but it does allow other
>> > > rewrite
>> > > > > patterns.
>> > > > >
>> > > > > Thoughts?
>> > > > >
>> > > >
>> > > --
>> > > Warm Regards,
>> > >
>> > > Jay Narale
>> > >
>> >
>>
>

Re: Feedback on a generic return type version of RelShuttle?

Posted by Ruben Q L <ru...@gmail.com>.
@Walaa Eldin Moustafa <wa...@gmail.com>, for RexNodes there seems to
be already this "generic return type visitor pattern" via RexVisitor; in
fact, RexShuttle is just an implementation of RexVisitor.




On Thu, Dec 2, 2021 at 9:06 AM Walaa Eldin Moustafa <wa...@gmail.com>
wrote:

> +1 while maintaining backward compatibility. Is there a plan for RexShuttle
> too?
>
> On Thu, Dec 2, 2021 at 1:03 AM Ruben Q L <ru...@gmail.com> wrote:
>
> > Sounds like a good idea, as long as it does not break backwards
> > compatibility.
> >
> > Regards,
> > Ruben
> >
> > On Thu, Dec 2, 2021 at 7:22 AM Jay Narale <ja...@gmail.com>
> wrote:
> >
> > > +1  re-writing  to other types would be easier with this
> > >
> > > On Thu, Dec 2, 2021 at 16:18 Alessandro Solimando <
> > > alessandro.solimando@gmail.com> wrote:
> > >
> > > > Hi Jacques,
> > > > I have faced the same issue recently and I think it's a good idea to
> > > have a
> > > > generic version of it and have RelShuttle be one of its concrete
> > > > implementations.
> > > >
> > > > Best regards,
> > > > Alessandro
> > > >
> > > >
> > > > Il Gio 2 Dic 2021, 06:02 Jacques Nadeau <ja...@apache.org> ha
> > scritto:
> > > >
> > > > > Right now, RelNode's accept method and RelShuttle forces a user to
> > > return
> > > > > RelNodes. This makes tree traversal/conversion extra painful in
> main
> > > > > circumstances (you want to have more specific relnode classes
> > explicit,
> > > > you
> > > > > want to rewrite to non-relnodes, etc). I've worked around this for
> > many
> > > > > years but really think it should be resolved inside Calcite. I
> wanted
> > > to
> > > > > get some feedback on introducing a more generic pattern (that
> > > RelShuttle
> > > > > becomes a concrete variation of). If people are on board with the
> > > change,
> > > > > I'll pull it through the codebase.
> > > > >
> > > > > You can see the rough idea in this wip patch:
> > > > > https://github.com/apache/calcite/pull/2625
> > > > >
> > > > > Basically, introduce a new parameterized visitor that people can
> use.
> > > > This
> > > > > doesn't change the existing behavior of RelShuttle (it's now just a
> > > > > specific version of this generic version) but it does allow other
> > > rewrite
> > > > > patterns.
> > > > >
> > > > > Thoughts?
> > > > >
> > > >
> > > --
> > > Warm Regards,
> > >
> > > Jay Narale
> > >
> >
>

Re: Feedback on a generic return type version of RelShuttle?

Posted by Walaa Eldin Moustafa <wa...@gmail.com>.
+1 while maintaining backward compatibility. Is there a plan for RexShuttle
too?

On Thu, Dec 2, 2021 at 1:03 AM Ruben Q L <ru...@gmail.com> wrote:

> Sounds like a good idea, as long as it does not break backwards
> compatibility.
>
> Regards,
> Ruben
>
> On Thu, Dec 2, 2021 at 7:22 AM Jay Narale <ja...@gmail.com> wrote:
>
> > +1  re-writing  to other types would be easier with this
> >
> > On Thu, Dec 2, 2021 at 16:18 Alessandro Solimando <
> > alessandro.solimando@gmail.com> wrote:
> >
> > > Hi Jacques,
> > > I have faced the same issue recently and I think it's a good idea to
> > have a
> > > generic version of it and have RelShuttle be one of its concrete
> > > implementations.
> > >
> > > Best regards,
> > > Alessandro
> > >
> > >
> > > Il Gio 2 Dic 2021, 06:02 Jacques Nadeau <ja...@apache.org> ha
> scritto:
> > >
> > > > Right now, RelNode's accept method and RelShuttle forces a user to
> > return
> > > > RelNodes. This makes tree traversal/conversion extra painful in main
> > > > circumstances (you want to have more specific relnode classes
> explicit,
> > > you
> > > > want to rewrite to non-relnodes, etc). I've worked around this for
> many
> > > > years but really think it should be resolved inside Calcite. I wanted
> > to
> > > > get some feedback on introducing a more generic pattern (that
> > RelShuttle
> > > > becomes a concrete variation of). If people are on board with the
> > change,
> > > > I'll pull it through the codebase.
> > > >
> > > > You can see the rough idea in this wip patch:
> > > > https://github.com/apache/calcite/pull/2625
> > > >
> > > > Basically, introduce a new parameterized visitor that people can use.
> > > This
> > > > doesn't change the existing behavior of RelShuttle (it's now just a
> > > > specific version of this generic version) but it does allow other
> > rewrite
> > > > patterns.
> > > >
> > > > Thoughts?
> > > >
> > >
> > --
> > Warm Regards,
> >
> > Jay Narale
> >
>

Re: Feedback on a generic return type version of RelShuttle?

Posted by Ruben Q L <ru...@gmail.com>.
Sounds like a good idea, as long as it does not break backwards
compatibility.

Regards,
Ruben

On Thu, Dec 2, 2021 at 7:22 AM Jay Narale <ja...@gmail.com> wrote:

> +1  re-writing  to other types would be easier with this
>
> On Thu, Dec 2, 2021 at 16:18 Alessandro Solimando <
> alessandro.solimando@gmail.com> wrote:
>
> > Hi Jacques,
> > I have faced the same issue recently and I think it's a good idea to
> have a
> > generic version of it and have RelShuttle be one of its concrete
> > implementations.
> >
> > Best regards,
> > Alessandro
> >
> >
> > Il Gio 2 Dic 2021, 06:02 Jacques Nadeau <ja...@apache.org> ha scritto:
> >
> > > Right now, RelNode's accept method and RelShuttle forces a user to
> return
> > > RelNodes. This makes tree traversal/conversion extra painful in main
> > > circumstances (you want to have more specific relnode classes explicit,
> > you
> > > want to rewrite to non-relnodes, etc). I've worked around this for many
> > > years but really think it should be resolved inside Calcite. I wanted
> to
> > > get some feedback on introducing a more generic pattern (that
> RelShuttle
> > > becomes a concrete variation of). If people are on board with the
> change,
> > > I'll pull it through the codebase.
> > >
> > > You can see the rough idea in this wip patch:
> > > https://github.com/apache/calcite/pull/2625
> > >
> > > Basically, introduce a new parameterized visitor that people can use.
> > This
> > > doesn't change the existing behavior of RelShuttle (it's now just a
> > > specific version of this generic version) but it does allow other
> rewrite
> > > patterns.
> > >
> > > Thoughts?
> > >
> >
> --
> Warm Regards,
>
> Jay Narale
>

Re: Feedback on a generic return type version of RelShuttle?

Posted by Jay Narale <ja...@gmail.com>.
+1  re-writing  to other types would be easier with this

On Thu, Dec 2, 2021 at 16:18 Alessandro Solimando <
alessandro.solimando@gmail.com> wrote:

> Hi Jacques,
> I have faced the same issue recently and I think it's a good idea to have a
> generic version of it and have RelShuttle be one of its concrete
> implementations.
>
> Best regards,
> Alessandro
>
>
> Il Gio 2 Dic 2021, 06:02 Jacques Nadeau <ja...@apache.org> ha scritto:
>
> > Right now, RelNode's accept method and RelShuttle forces a user to return
> > RelNodes. This makes tree traversal/conversion extra painful in main
> > circumstances (you want to have more specific relnode classes explicit,
> you
> > want to rewrite to non-relnodes, etc). I've worked around this for many
> > years but really think it should be resolved inside Calcite. I wanted to
> > get some feedback on introducing a more generic pattern (that RelShuttle
> > becomes a concrete variation of). If people are on board with the change,
> > I'll pull it through the codebase.
> >
> > You can see the rough idea in this wip patch:
> > https://github.com/apache/calcite/pull/2625
> >
> > Basically, introduce a new parameterized visitor that people can use.
> This
> > doesn't change the existing behavior of RelShuttle (it's now just a
> > specific version of this generic version) but it does allow other rewrite
> > patterns.
> >
> > Thoughts?
> >
>
-- 
Warm Regards,

Jay Narale

Re: Feedback on a generic return type version of RelShuttle?

Posted by Alessandro Solimando <al...@gmail.com>.
Hi Jacques,
I have faced the same issue recently and I think it's a good idea to have a
generic version of it and have RelShuttle be one of its concrete
implementations.

Best regards,
Alessandro


Il Gio 2 Dic 2021, 06:02 Jacques Nadeau <ja...@apache.org> ha scritto:

> Right now, RelNode's accept method and RelShuttle forces a user to return
> RelNodes. This makes tree traversal/conversion extra painful in main
> circumstances (you want to have more specific relnode classes explicit, you
> want to rewrite to non-relnodes, etc). I've worked around this for many
> years but really think it should be resolved inside Calcite. I wanted to
> get some feedback on introducing a more generic pattern (that RelShuttle
> becomes a concrete variation of). If people are on board with the change,
> I'll pull it through the codebase.
>
> You can see the rough idea in this wip patch:
> https://github.com/apache/calcite/pull/2625
>
> Basically, introduce a new parameterized visitor that people can use. This
> doesn't change the existing behavior of RelShuttle (it's now just a
> specific version of this generic version) but it does allow other rewrite
> patterns.
>
> Thoughts?
>