You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@calcite.apache.org by Vladimir Sitnikov <si...@gmail.com> on 2020/01/03 12:35:44 UTC

[DISCUSS] RelOptTableImpl#toRel vs EnumerableTableScan

Hi,

RelOptTableImpl#toRel uses if-else logic and it creates LogicalTableScan
or EnumerableTableScan depending on table instance and some properties.
Does anybody know why can't it always create LogicalTableScan and use rules
to convert it to EnumerableTableScan later?

The current logic of creating EnumerableTableScan hurts for cases
like RelToSqlConverterStructsTest.
The test in question uses its own Table implementation (that is new
Table()), and the RelOptTableImpl#toRel tries to create EnumerableTableScan
out of it.

It looks weird as the table could not be implemented anyway, so why don't
we create LogicalTableScan there?

I have filed https://github.com/apache/calcite/pull/1721 to see what
breaks, but it does not seem right to me to EnumerableTableScan which is
known to be non-implementable.

Vladimir

Re: [DISCUSS] RelOptTableImpl#toRel vs EnumerableTableScan

Posted by Stamatis Zampetakis <za...@gmail.com>.

FYI: There is some ongoing work and discussion somehow relevant to this
topic in PR [1] and CALCITE-3769 [2].

[1] https://github.com/apache/calcite/pull/1790
[2] https://issues.apache.org/jira/browse/CALCITE-3769

On Tue, Jan 7, 2020 at 8:48 PM Vladimir Sitnikov <
sitnikov.vladimir@gmail.com> wrote:

> So far I have PR https://github.com/apache/calcite/pull/1721 that adds
> verification to EnumerableTableScan constructor.
>
> Unfortunately, we have a lot of test cases that assume the relation would
> be EnumerableTableScan even
> in the case test verifies logical plan only :(
>
> That is why I added extra logging and asserts to EnumerableTableScan so it
> throws an exception if someone tries to create
> wrong relation.
>
> Note: I discovered Enumerable does not support tables with struct, array,
> and map fields, so I added system properties to
> selectively enable/disable activating enumerable for those rowtypes.
>
> I'm inclined to commit 1721 shortly as it is a pre-requisite for other
> cost-related commits.
>
> Vladimir
>

Re: [DISCUSS] RelOptTableImpl#toRel vs EnumerableTableScan

Posted by Vladimir Sitnikov <si...@gmail.com>.

So far I have PR https://github.com/apache/calcite/pull/1721 that adds
verification to EnumerableTableScan constructor.

Unfortunately, we have a lot of test cases that assume the relation would
be EnumerableTableScan even
in the case test verifies logical plan only :(

That is why I added extra logging and asserts to EnumerableTableScan so it
throws an exception if someone tries to create
wrong relation.

Note: I discovered Enumerable does not support tables with struct, array,
and map fields, so I added system properties to
selectively enable/disable activating enumerable for those rowtypes.

I'm inclined to commit 1721 shortly as it is a pre-requisite for other
cost-related commits.

Vladimir

Re: [DISCUSS] RelOptTableImpl#toRel vs EnumerableTableScan

Posted by Julian Hyde <jh...@apache.org>.

The toRel method was intended to save the effort of writing a rule.
It's easy to write a rule that matches particular types of RelNode
(e.g. a Filter on a TableScan) but it's harder to write a rule that
matches a particular kind of Table inside a TableScan.

If we can obsolete the toRel method without too much pain, let's do so.

We must be able to "plug and play": create a schema that defines a new
type of table (say CsvTable). If obsoleted toRel, that table type
would need to register the rule(s) necessary to convert
LogicalTableScan to its sub-class of CsvTableScan.

Julian

On Tue, Jan 7, 2020 at 3:48 AM Stamatis Zampetakis <za...@gmail.com> wrote:
>
> The fact that RelOptTableImpl creates directly an EnumerableTableScan seems
> to be a residue of the history (at the very beginning the class was only
> used to create EnumerableTableScan operators). I guess it is also an
> attempt to reduce the search space of the optimizer short-circuiting some
> LogicalTableScan to OtherTableScan transformations.
>
> I also feel that generating a LogicalTableScan is more appropriate and this
> seems to be inline with the documentation of RelOptTable#toRel method.
>
> In all cases generating an EnumerableTableScan that is non-implementable
> does not seem to be a good idea.
>
> Best,
> Stamatis
>
> On Fri, Jan 3, 2020 at 1:36 PM Vladimir Sitnikov <
> sitnikov.vladimir@gmail.com> wrote:
>
> > Hi,
> >
> > RelOptTableImpl#toRel uses if-else logic and it creates LogicalTableScan
> > or EnumerableTableScan depending on table instance and some properties.
> > Does anybody know why can't it always create LogicalTableScan and use rules
> > to convert it to EnumerableTableScan later?
> >
> > The current logic of creating EnumerableTableScan hurts for cases
> > like RelToSqlConverterStructsTest.
> > The test in question uses its own Table implementation (that is new
> > Table()), and the RelOptTableImpl#toRel tries to create EnumerableTableScan
> > out of it.
> >
> > It looks weird as the table could not be implemented anyway, so why don't
> > we create LogicalTableScan there?
> >
> > I have filed https://github.com/apache/calcite/pull/1721 to see what
> > breaks, but it does not seem right to me to EnumerableTableScan which is
> > known to be non-implementable.
> >
> > Vladimir
> >

Re: [DISCUSS] RelOptTableImpl#toRel vs EnumerableTableScan

Posted by Stamatis Zampetakis <za...@gmail.com>.

The fact that RelOptTableImpl creates directly an EnumerableTableScan seems
to be a residue of the history (at the very beginning the class was only
used to create EnumerableTableScan operators). I guess it is also an
attempt to reduce the search space of the optimizer short-circuiting some
LogicalTableScan to OtherTableScan transformations.

I also feel that generating a LogicalTableScan is more appropriate and this
seems to be inline with the documentation of RelOptTable#toRel method.

In all cases generating an EnumerableTableScan that is non-implementable
does not seem to be a good idea.

Best,
Stamatis

On Fri, Jan 3, 2020 at 1:36 PM Vladimir Sitnikov <
sitnikov.vladimir@gmail.com> wrote:

> Hi,
>
> RelOptTableImpl#toRel uses if-else logic and it creates LogicalTableScan
> or EnumerableTableScan depending on table instance and some properties.
> Does anybody know why can't it always create LogicalTableScan and use rules
> to convert it to EnumerableTableScan later?
>
> The current logic of creating EnumerableTableScan hurts for cases
> like RelToSqlConverterStructsTest.
> The test in question uses its own Table implementation (that is new
> Table()), and the RelOptTableImpl#toRel tries to create EnumerableTableScan
> out of it.
>
> It looks weird as the table could not be implemented anyway, so why don't
> we create LogicalTableScan there?
>
> I have filed https://github.com/apache/calcite/pull/1721 to see what
> breaks, but it does not seem right to me to EnumerableTableScan which is
> known to be non-implementable.
>
> Vladimir
>