Posted to dev@calcite.apache.org by Julian Feinauer <j....@pragmaticminds.de> on 2022/01/21 12:21:30 UTC

Question on how to integrate Apache IoTDB into Calcite

Hi all,

In the last few weeks I have been working on integrating the Apache IoTDB project with Calcite.
This covers two possible scenarios. One is to use Apache IoTDB as an adapter in Apache Calcite (like MongoDB, Cassandra, et al.); the other is to use Calcite's query optimizer to introduce indexing into the IoTDB server (the IoTDB server builds a RelNode tree and passes it to the planner; after planning, the resulting RelNode is processed further by the IoTDB server, executed, and returned).

I have looked closely at the other adapters and how they are implemented, and I have some questions:

One rather general question is about the Queryable<> interface. I tried to look up all the docs (also for LINQ) but still do not fully understand it. From my understanding it is like an Enumerable<>, but it has a “native” way to do things like ordering or filtering. So if I have a Queryable<> which implements a custom filter, an automated “push down” can be done by the framework without a rule or code generation.

One important requirement for us in IoTDB is to push the query down to the TableScan (this is done implicitly in the current server, but first appears explicitly in the RelNode that we generate).
So what's the best way to “merge” a LogicalFilter and an IoTDBTableScan into a “filtered” scan?
Is the right way to return a QueryableTable as the TableScan, so that the planner takes care of it by generating the call to ‘.filter(…)’?
The same applies to ordering.

Another question that is important for us is the usage of “Materialized Views” or other “Indexes”.
Since we almost always handle time series, in most cases the only suitable index is a “Materialized View” on parts of the time series, which we can use to replace parts of the relational tree and so avoid I/O and computation for parts that are already precomputed.

Is there already existing support for that in Calcite, or would we just write custom rules for our cases?

My last question is about the Convention TraitDef. So far I have only used the Enumerable convention, which results in code generation (which has an impact on query latency). Am I right in assuming that the Bindable convention is similar to the Enumerable convention, with the only difference being that it uses interpretation instead of code generation?
And to potentially use both (depending on whatever switch we set), do we just have to provide converter rules for both?
What would you use in a server setup? Always Enumerable?

Thanks in advance for any responses or hints!
Julian F

Re: Question on how to integrate Apache IoTDB into Calcite

Posted by Nicola Vitucci <ni...@gmail.com>.
Thanks Stamatis, that makes sense. I'll keep using it as shown in those
examples then.

Best,

Nicola


Re: Question on how to integrate Apache IoTDB into Calcite

Posted by Stamatis Zampetakis <za...@gmail.com>.
Hey Nicola,

If you need a way to combine operators from the Enumerable convention,
which generate Java code, with other kinds of operators, you need some
common interfaces to pass from one to the other.

The XToEnumerableConverter needs to know how to generate code that calls the
operators of the other convention, so existing interfaces such as
Queryable come in handy. It would probably be possible to combine
Enumerable with other conventions without using the interfaces you
mentioned, but it would require more effort/code.

Best,
Stamatis


Re: Question on how to integrate Apache IoTDB into Calcite

Posted by Nicola Vitucci <ni...@gmail.com>.
I've seen that in many adapters, although they implement the
TranslatableTable interface, Queryable is still used (see the usage of
table.getExpression(...) in XToEnumerableConverter for Elasticsearch,
Mongo, and Geode, to name a few). What is the reasoning there?


Re: Question on how to integrate Apache IoTDB into Calcite

Posted by Julian Hyde <jh...@gmail.com>.
+1 what Stamatis said. Queryable is for compatibility with LINQ. If you want to build an adapter that supports push-down, you will likely use FilterableTable for simple adapters, TranslatableTable for more complex adapters. In neither case will you need to deal with Queryable.

Stamatis laid out best practices in his excellent BOSS tutorial: https://www.youtube.com/watch?v=meI0W12f_nw. (I am co-presenter in other parts of the tutorial, but all credit for the adapters material goes to Stamatis.)

Julian
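
To make the FilterableTable option a little more concrete, here is a minimal sketch of the contract it is built around: the table receives a mutable list of filters, applies the ones it can evaluate natively, removes those from the list, and leaves the rest for the engine to apply afterwards. This is a toy model only. Point, Cond, TimeSeriesTable and execute are hypothetical names invented for illustration and are not Calcite classes; in Calcite the filters would arrive as RexNode expressions rather than the simplified Cond used here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Toy model of a "filterable scan"; none of these types are Calcite classes. */
class FilterableScanSketch {

  /** Hypothetical time-series row: a timestamp plus one measured value. */
  static final class Point {
    final long time;
    final double value;
    Point(long time, double value) { this.time = time; this.value = value; }
  }

  /** Hypothetical filter meaning "column >= lowerBound". */
  static final class Cond {
    final String column; // "time" or "value"
    final long lowerBound;
    Cond(String column, long lowerBound) { this.column = column; this.lowerBound = lowerBound; }
    boolean test(Point p) {
      double v = column.equals("time") ? p.time : p.value;
      return v >= lowerBound;
    }
  }

  /** Storage that can only evaluate conditions on the time column natively. */
  static final class TimeSeriesTable {
    final List<Point> data;
    TimeSeriesTable(List<Point> data) { this.data = data; }

    /** Applies the filters it understands and removes them from the list. */
    List<Point> scan(List<Cond> filters) {
      long from = Long.MIN_VALUE;
      for (Iterator<Cond> it = filters.iterator(); it.hasNext();) {
        Cond c = it.next();
        if (c.column.equals("time")) {
          from = Math.max(from, c.lowerBound);
          it.remove(); // handled here, so the engine must not apply it again
        }
      }
      List<Point> out = new ArrayList<>();
      for (Point p : data) {
        if (p.time >= from) out.add(p);
      }
      return out;
    }
  }

  /** Plays the role of the engine: scan, then apply whatever filters are left. */
  static List<Point> execute(TimeSeriesTable table, List<Cond> filters) {
    List<Cond> remaining = new ArrayList<>(filters);
    List<Point> out = new ArrayList<>();
    for (Point p : table.scan(remaining)) {
      boolean keep = true;
      for (Cond c : remaining) keep &= c.test(p);
      if (keep) out.add(p);
    }
    return out;
  }

  public static void main(String[] args) {
    TimeSeriesTable table = new TimeSeriesTable(Arrays.asList(
        new Point(1, 5.0), new Point(2, 20.0), new Point(3, 7.0), new Point(4, 30.0)));
    List<Point> rows = execute(table,
        Arrays.asList(new Cond("time", 2), new Cond("value", 10)));
    for (Point p : rows) System.out.println(p.time + " -> " + p.value);
  }
}
```

The appeal of this style is that the adapter needs no planner rules of its own; the trade-off is that the table only sees filters, so anything beyond that (ordering, aggregation, joins) calls for the rule-based route.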





Re: Question on how to integrate Apache IoTDB into Calcite

Posted by Stamatis Zampetakis <za...@gmail.com>.
Hi Julian,

I don't think there is an easy way to understand the Queryable interface
unless you check how it is used. I don't see it as something that you need
to implement no matter what but more as a convenience API that will
facilitate the integration with the Enumerable operators (if you rely on
them). Even in that case it could be possible to skip it.

There are many ways to push a filter down into the underlying engine, and I
think the Calcite code base already has quite a few examples of how this
can be done (JdbcAdapter, ElasticSearch, Druid, etc.). There is no one
option that is best in all cases. With one (e.g., FilterableTable) you may
write more concise code, while another (e.g., custom rules + custom
operators) may lead to more powerful optimizations or a more easily
extensible system.

Regarding materialized views, there are many built-in features in Calcite. The
best place to start would probably be the website [1].

The Bindable convention uses interpretation most of the time, but it also
involves code generation in some parts (e.g., BindableFilter). The
Enumerable convention is more widely used in the wild, so I would say it is
a more stable and better option to begin with. Afterwards you may need to
invest in building in-house operators to work around some inconveniences of
Calcite's built-in conventions.

Best,
Stamatis

[1] https://calcite.apache.org/docs/materialized_views.html
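
The "custom rules + custom operators" route mentioned above can be pictured as a plan rewrite: a rule matches a filter sitting directly on top of a scan and replaces the pair with a single scan that carries the predicate. The sketch below shows only that idea on a toy plan tree. Node, Scan, Filter, canPush and the table name root.sg.d1 are hypothetical and are not Calcite classes; in Calcite the same role would be played by a planner rule that matches a LogicalFilter whose input is the adapter's scan, and predicates would be RexNode expressions rather than strings.

```java
/** Toy plan rewrite; Node, Scan and Filter are illustrative stand-ins, not Calcite classes. */
class FilterIntoScanSketch {

  interface Node {}

  /** Scan of a table, optionally carrying a predicate evaluated by the storage engine. */
  static final class Scan implements Node {
    final String table;
    final String pushedPredicate; // null when nothing has been pushed down
    Scan(String table, String pushedPredicate) {
      this.table = table;
      this.pushedPredicate = pushedPredicate;
    }
  }

  /** Filter evaluated by the engine on top of its input. */
  static final class Filter implements Node {
    final String predicate;
    final Node input;
    Filter(String predicate, Node input) {
      this.predicate = predicate;
      this.input = input;
    }
  }

  /** Assumption for this sketch: the storage can only evaluate predicates on time. */
  static boolean canPush(String predicate) {
    return predicate.startsWith("time");
  }

  /** The "rule": fuses a pushable Filter with the Scan directly below it. */
  static Node pushFilterIntoScan(Node node) {
    if (!(node instanceof Filter)) {
      return node; // only Filter nodes are rewritten
    }
    Filter f = (Filter) node;
    Node input = pushFilterIntoScan(f.input); // rewrite bottom-up
    if (input instanceof Scan && canPush(f.predicate)) {
      Scan s = (Scan) input;
      String merged = s.pushedPredicate == null
          ? f.predicate
          : "(" + s.pushedPredicate + ") AND (" + f.predicate + ")";
      return new Scan(s.table, merged);
    }
    return new Filter(f.predicate, input); // no match: keep the Filter in the plan
  }

  /** Renders a plan as text so the effect of the rewrite is visible. */
  static String describe(Node n) {
    if (n instanceof Scan) {
      Scan s = (Scan) n;
      return s.pushedPredicate == null
          ? "Scan(" + s.table + ")"
          : "Scan(" + s.table + ", where " + s.pushedPredicate + ")";
    }
    Filter f = (Filter) n;
    return "Filter(" + f.predicate + ", " + describe(f.input) + ")";
  }

  public static void main(String[] args) {
    Node plan = new Filter("value > 10",
        new Filter("time >= 100", new Scan("root.sg.d1", null)));
    System.out.println(describe(plan));
    System.out.println(describe(pushFilterIntoScan(plan)));
  }
}
```

Compared with handing filters to the table at scan time, the rewritten plan makes the pushdown visible to the optimizer, and the same pattern extends to other operators such as sorts, at the price of writing and maintaining the rules and operators yourself.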
