Posted to dev@calcite.apache.org by Pranav Deshpande <de...@gmail.com> on 2022/07/20 22:56:40 UTC
Requesting Information Regarding Data Federation
Dear Apache Calcite Team,
I am trying to learn Calcite and wish to build a proof of concept (PoC) for data federation.
In the video here, https://www.youtube.com/watch?v=4JAOkLKrcYE, somehow the
presenter and his team managed to squash parts of the Relational Nodes into
"Spark Tables" and then Spark handled the execution of those.
How exactly do I go about doing this?
As per this discussion I understand that one has to create a RelOptRule to
do the same.
Also, one has to somehow define the cost (I don't know how to do this).
Is there a simple tutorial that demonstrates the basics of this, such as a
small implementation using ListTable?
Thanks & Regards,
Pranav
Re: Requesting Information Regarding Data Federation
Posted by Stamatis Zampetakis <za...@gmail.com>.
Hi Pranav,
A very simplistic example of using Calcite for data integration can be
found here [1] along with some links to presentations and relevant material.
Besides Apache Drill, Apache Hive also uses Calcite to execute
federated queries. The main entry point is CalcitePlannerAction#apply [2],
where most of the Calcite configuration is done.
Best,
Stamatis
[1] https://github.com/zabetak/cy-calcite-tutorial
[2] https://github.com/apache/hive/blob/834308091624c1a69cba7a8b97919ed1ff0fc616/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L1646
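On the cost question raised in the original post, a small sketch may help show the idea behind cost-based plan choice. It is plain Java and does not use Calcite's API: the class FederationCostSketch, the Plan type, and the cost weights are all hypothetical and chosen only for illustration. In Calcite itself, a physical operator typically reports its cost by overriding RelNode#computeSelfCost, and the planner keeps the cheapest of the alternatives that the rules produce.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Toy model of cost-based plan choice. Hypothetical names; not Calcite's API. */
final class FederationCostSketch {

    // Assumed weights: shipping a row between systems costs 10x processing it.
    static final double CPU_PER_ROW = 1.0;
    static final double TRANSFER_PER_ROW = 10.0;

    /** One alternative plan for the same query. */
    static final class Plan {
        final String name;
        final double rowsProcessed;   // rows touched by scan/filter, in any engine
        final double rowsTransferred; // rows shipped out of the remote engine

        Plan(String name, double rowsProcessed, double rowsTransferred) {
            this.name = name;
            this.rowsProcessed = rowsProcessed;
            this.rowsTransferred = rowsTransferred;
        }

        /** Made-up cost formula: processing work plus data movement. */
        double cost() {
            return rowsProcessed * CPU_PER_ROW + rowsTransferred * TRANSFER_PER_ROW;
        }
    }

    /** Two ways to run a filter that keeps 1% of a 1,000,000-row remote table. */
    static List<Plan> candidates() {
        return Arrays.asList(
            // Fetch every row, then filter locally (remote scan + local filter).
            new Plan("local-filter", 2_000_000, 1_000_000),
            // Filter inside the remote engine and ship only the matching rows.
            new Plan("pushdown", 1_000_000, 10_000));
    }

    /** Returns the plan with the lowest cost, as a cost-based planner would. */
    static Plan cheapest(List<Plan> plans) {
        return plans.stream()
            .min(Comparator.comparingDouble(Plan::cost))
            .orElseThrow(() -> new IllegalArgumentException("no candidate plans"));
    }

    public static void main(String[] args) {
        Plan best = cheapest(candidates());
        System.out.println(best.name + " cost=" + best.cost());
    }
}
```

With these assumed weights the pushdown alternative wins because it moves far fewer rows between systems. A rule that rewrites part of a plan into an operator executed by another engine only takes effect if that operator's reported cost makes the rewritten plan cheaper.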
Re: Requesting Information Regarding Data Federation
Posted by Charles Givre <cg...@gmail.com>.
Hi Pranav,
You might want to take a look at Apache Drill, as it uses Calcite as a query planner and can execute federated queries against a pretty wide array of data sets.
Best,
-- C