Posted to dev@calcite.apache.org by Chin Wei Low <lo...@gmail.com> on 2016/01/06 11:05:29 UTC

How to use Spark Adapter

Hi All,

I am playing around with the Spark adapter. My understanding is that it
converts an enumerable from another adapter, such as the CSV adapter, into
an RDD for processing in Spark and later collects the result.

I have a few issues and questions:

Issues:
1. It looks like the Jetty server upgrade has broken the adapter, because
Spark 0.9 (used by Calcite) still uses the old Jetty server. I tried
reverting the upgrade and ran into further issues.
2. The Spark-related rules are not used by the planner; there is a TODO for
this.
3. The implementation of the child rel is null; there is a TODO for this.

Questions:
1. Is the Spark adapter working, or completely implemented?
2. I am enabling the Spark adapter by specifying spark=true in the
connection string. Is there any documentation on this?

Regards,
Chin Wei
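
For reference, the spark=true setting from question 2 could be passed roughly as in the sketch below. This is only an illustration, not a confirmed recipe: it assumes the usual jdbc:calcite: URL prefix with properties appended after it, and the class name SparkConnectSketch and helper url() are made up for the example. Without the Calcite driver and its Spark module on the classpath, the connection attempt simply reports a failure.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class SparkConnectSketch {
    // Builds the JDBC URL. "spark=true" is the setting quoted in the message
    // above; the "jdbc:calcite:" prefix is an assumption of this sketch.
    static String url(boolean spark) {
        return "jdbc:calcite:" + (spark ? "spark=true" : "");
    }

    public static void main(String[] args) {
        String url = url(true);
        System.out.println("Connecting with " + url);
        // Needs the Calcite JDBC driver on the classpath; otherwise
        // DriverManager throws an SQLException, which is reported below.
        try (Connection connection = DriverManager.getConnection(url)) {
            System.out.println("Connected: "
                + connection.getMetaData().getDriverName());
        } catch (SQLException e) {
            System.out.println("Could not connect: " + e.getMessage());
        }
    }
}
```

Whether a connection opened this way actually plans queries through Spark depends on the state of the adapter discussed in this thread.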

Re: How to use Spark Adapter

Posted by Julian Hyde <jh...@apache.org>.
Yes, I think of Spark as an “engine”, not a data source. Drill and Flink also fit the description of an engine, even though there aren’t adapters for them in Calcite at present.

Julian




Re: How to use Spark Adapter

Posted by Chin Wei Low <lo...@gmail.com>.
Thanks Julian.

My understanding of the Spark adapter is that it can be the computation
engine on top of various data sources.
For now, I am exploring whether we can use the Spark adapter to run queries
on a set of Parquet files, so knowing what it can do is important.

Regards,
Chin Wei


Re: How to use Spark Adapter

Posted by Julian Hyde <jh...@apache.org>.
As you have noticed, we are using an old version of Spark. At some point SparkAdapterTest worked (that is, we could execute queries with a VALUES and WHERE clause), but it wasn’t made part of the test suite, and it stopped working somewhere along the way.

So, I think the first step would be to upgrade Spark. If there is interest in using a Spark adapter I will help you to get it working; and we can start adding additional relational operators.

Julian
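
To make "queries with a VALUES and WHERE clause" concrete, here is a minimal sketch of that kind of query run over plain JDBC. The SQL text, the class name ValuesWhereSketch and the helper run() are hypothetical and are not taken from SparkAdapterTest; the caller is assumed to supply an open connection.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class ValuesWhereSketch {
    // Hypothetical example query: an inline VALUES relation filtered by WHERE.
    static final String SQL =
        "select * from (values (1, 'a'), (2, 'b')) as t(x, y) where x < 2";

    // Runs the query on a caller-supplied connection and returns each row
    // as "x,y" text.
    static List<String> run(Connection connection) throws SQLException {
        List<String> rows = new ArrayList<>();
        try (Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery(SQL)) {
            while (resultSet.next()) {
                rows.add(resultSet.getInt(1) + "," + resultSet.getString(2));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Only prints the query; executing it needs a live connection.
        System.out.println(SQL);
    }
}
```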
