Posted to dev@calcite.apache.org by Luso Clemens <u2...@gmail.com> on 2019/06/05 02:50:09 UTC

Question about spark adaptor

Hi,
     I'm a system architect working for China Merchants Bank. I want to use
Calcite to run federated queries across different databases and push the
computation down to Spark, so I enabled the Spark option, but I get a
NullPointerException.
     I think the problem is that I didn't configure the Spark adapter, but I
couldn't find any introduction or documentation about it, apart from some
test specs.
     So my question is: is there any way to push computation into Spark, or
is it simply unavailable in the current version?
     Forgive my bad English.

Re: Question about spark adaptor

Posted by Yuzhao Chen <yu...@gmail.com>.
Luso Clemens,

Kind reminder: you'd better not mention the company you work for on the dev mailing list, because the list is archived permanently.

Just posting your questions is fine, and we would love to help you.

Best,
Danny Chan
On 2019-06-05 at 11:52 PM +0800, Luso Clemens <u2...@gmail.com> wrote:
> Hi,
> I'm a system architect working for China Merchants Bank. I want to use
> Calcite to run federated queries across different databases and push the
> computation down to Spark, so I enabled the Spark option, but I get a
> NullPointerException.
> I think the problem is that I didn't configure the Spark adapter, but I
> couldn't find any introduction or documentation about it, apart from some
> test specs.
> So my question is: is there any way to push computation into Spark, or
> is it simply unavailable in the current version?
> Forgive my bad English.

Re: Question about spark adaptor

Posted by Zhu Feng <we...@gmail.com>.
Hi, Luso:
We are also developing such a platform now. As Spark evolves rapidly, the
SparkHandler in Calcite, which is based on the low-level RDD API, is
unfriendly and almost unusable.
Instead, you can convert the RelNode back to SQL queries and execute them in
Spark or any other computation engine.

select a.x, b.y from mysql.test a join pg.test b on a.id=b.id;

You can generate two queries and register them as views in Spark SQL:
(1)view(a): select id, x from mysql.test;
(2)view(b): select id, y from pg.test;

The final SQL executed in Spark SQL is:
select a.x, b.y from a join b on a.id=b.id;
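The flow above (fetch each side from its source, register the results as views, then run the rewritten join in one engine) can be sketched end to end. The snippet below uses SQLite purely as a stand-in for Spark SQL so it is self-contained; the table names mysql_test/pg_test and the sample rows are made up for the illustration. With real Spark you would register the per-source results as temp views instead.

```python
import sqlite3

# One in-memory database plays the role of the single execution engine
# (Spark SQL in the mail above).
conn = sqlite3.connect(":memory:")

# Pretend these two tables hold rows already fetched from MySQL and Postgres.
conn.execute("CREATE TABLE mysql_test (id INTEGER, x TEXT)")
conn.execute("CREATE TABLE pg_test (id INTEGER, y TEXT)")
conn.executemany("INSERT INTO mysql_test VALUES (?, ?)", [(1, "x1"), (2, "x2")])
conn.executemany("INSERT INTO pg_test VALUES (?, ?)", [(1, "y1"), (3, "y3")])

# (1) view(a): select id, x from mysql.test
conn.execute("CREATE VIEW a AS SELECT id, x FROM mysql_test")
# (2) view(b): select id, y from pg.test
conn.execute("CREATE VIEW b AS SELECT id, y FROM pg_test")

# The final rewritten query, executed entirely inside the one engine.
rows = conn.execute("SELECT a.x, b.y FROM a JOIN b ON a.id = b.id").fetchall()
print(rows)  # [('x1', 'y1')]
```

Only id=1 exists on both sides, so the join returns a single row. The same shape works in Spark with `df.createOrReplaceTempView("a")` followed by `spark.sql(...)`.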

best,
DonnyZone

Luso Clemens <u2...@gmail.com> wrote on Wed, 2019-06-05 at 11:52 PM:

> Hi,
>      I'm a system architect working for China Merchants Bank. I want to use
> Calcite to run federated queries across different databases and push the
> computation down to Spark, so I enabled the Spark option, but I get a
> NullPointerException.
>      I think the problem is that I didn't configure the Spark adapter, but I
> couldn't find any introduction or documentation about it, apart from some
> test specs.
>      So my question is: is there any way to push computation into Spark, or
> is it simply unavailable in the current version?
>      Forgive my bad English.
>