Posted to dev@calcite.apache.org by Kishore Vajjala <kv...@gmail.com> on 2014/11/25 13:06:42 UTC

Using Calcite inside a JDBC Server

Hi,

Apologies if this is not the right forum for my questions below.

I have a few high-level questions about using Calcite as a virtualization platform.

My Scenario

I have multiple data sources and would require a service which exposes a
unified JDBC interface to query the data sources. I was looking at some of
the presentations and found this to be a good fit initially.

To the questions now.

1. Does Calcite come anywhere near what I want: a JDBC driver and a
server shell which parses and federates the query?

2. If it is not in your roadmap, I want to know if you have any ideas on
how I could achieve this.

3. Are there any code examples for implementing SQL over tables in multiple
data sources (as seen in Julian's presentation using Splunk and MySQL)?

I really appreciate your responses.

Thanks

Re: Using Calcite inside a JDBC Server

Posted by Julian Hyde <ju...@gmail.com>.

The JDBC driver is not remotable, so we can’t do this.

A few people have expressed interest in a remotable JDBC driver. It might connect to a REST server running Calcite, and it could build on APIs in the org.apache.calcite.avatica package [1]. But no work has been done yet.

For now your best option is to call the Calcite driver from your client application.
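
As a minimal sketch of what calling the driver from the client application could look like, the snippet below opens an in-process connection using the `jdbc:calcite:` URL and the `model` connection property. It assumes a current Calcite release on the classpath; the class name and the default `model.json` path are hypothetical and only for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

/** Sketch: open an in-process (embedded) Calcite connection from a client application. */
final class EmbeddedCalciteClient {

    /** Builds the driver properties; "model" points at a JSON model file. */
    static Properties connectionProperties(String modelPath) {
        Properties info = new Properties();
        info.setProperty("model", modelPath);
        return info;
    }

    /** Opens the connection. Needs the Calcite core jar on the classpath to succeed. */
    static Connection connect(String modelPath) throws SQLException {
        return DriverManager.getConnection("jdbc:calcite:", connectionProperties(modelPath));
    }

    public static void main(String[] args) {
        // "model.json" is a hypothetical default path, not a file shipped with Calcite.
        String modelPath = args.length > 0 ? args[0] : "model.json";
        try (Connection connection = connect(modelPath)) {
            System.out.println("Connected via " + connection.getMetaData().getDriverName());
        } catch (SQLException e) {
            // Reached when the driver jar is missing or the model cannot be loaded.
            System.out.println("Could not connect: " + e.getMessage());
        }
    }
}
```

Because the driver runs inside the client's JVM, each client application needs Calcite and the back-end JDBC drivers on its own classpath.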

Julian

[1] http://www.postgresql.org/message-id/A70EDEC3-8E7C-44BF-B983-8E10C2677F6E@hydromatic.net

Re: Using Calcite inside a JDBC Server

Posted by Kishore Vajjala <kv...@gmail.com>.

Hi Julian,

Thanks a lot. I was looking at how to run this in a server container
which exposes a JDBC interface. I would like to run it as "Data as a
Service"; I am not looking to run it in embedded mode.

Any pointers on how I can embed this in a server, or should I build
something of my own? I have checked out the code and I see a package
"org.apache.calcite.server" and a driver in org.apache.calcite.jdbc. I was
wondering whether I can use the driver from a remote client to talk to the
Calcite engine.

Thanks again

Re: Using Calcite inside a JDBC Server

Posted by Julian Hyde <ju...@hydromatic.net>.

Yes, this is the right forum.

Calcite makes this kind of data virtualization really easy. You just need to define a model file in JSON, with a schema for each data source. Then you can connect to Calcite via JDBC, and Calcite will route the queries to the right back-end.

Run through the tutorial, https://github.com/apache/incubator-calcite/blob/master/example/csv/TUTORIAL.md, to see how to connect from sqlline and edit models.

Use a model similar to https://github.com/apache/incubator-calcite/blob/master/core/src/test/resources/mysql-foodmart-model.json, but give it two schemas. You can write queries across those schemas. For the sake of example, suppose you had

{
  version: '1.0',
  defaultSchema: 'foodmart',
  schemas: [
    {
      type: 'jdbc',
      name: 'foodmart',
      jdbcUser: 'foodmart',
      jdbcPassword: 'foodmart',
      jdbcUrl: 'jdbc:mysql://localhost',
      jdbcCatalog: 'foodmart',
      jdbcSchema: null
    },
    {
      type: 'jdbc',
      name: 'bardmart',
      jdbcUser: 'foodmart',
      jdbcPassword: 'foodmart',
      jdbcUrl: 'jdbc:mysql://localhost',
      jdbcCatalog: 'foodmart',
      jdbcSchema: null
    }
  ]
}

Then you could write a distributed query

select count(*)
from "foodmart"."sales_fact_1997" as s
join "bardmart"."customer" as c on s."customer_id" = c."customer_id";

The two schemas have identical JDBC credentials in this example, but in real life they would be different JDBC databases, or instances of any schema adapter you like (e.g. CSV, Mongo).
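
To make the query concrete from the client side, here is a minimal sketch that builds the same join with double-quoted identifiers and runs it over an already-open Calcite JDBC connection. The class and method names are hypothetical; the schema, table, and column names are the ones from the example above.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Sketch: run the cross-schema join through a Calcite JDBC connection. */
final class DistributedQuerySketch {

    /** Wraps an identifier in straight double quotes, doubling any embedded quote. */
    static String quote(String identifier) {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    /** Builds the join across the two schemas defined in the model. */
    static String joinCountSql(String factSchema, String customerSchema) {
        return "select count(*)\n"
            + "from " + quote(factSchema) + "." + quote("sales_fact_1997") + " as s\n"
            + "join " + quote(customerSchema) + "." + quote("customer") + " as c"
            + " on s." + quote("customer_id") + " = c." + quote("customer_id");
    }

    /** Executes the join; the connection must have been opened with the two-schema model. */
    static long runJoinCount(Connection calcite) throws SQLException {
        try (Statement statement = calcite.createStatement();
             ResultSet rows = statement.executeQuery(joinCountSql("foodmart", "bardmart"))) {
            rows.next(); // count(*) always yields exactly one row
            return rows.getLong(1);
        }
    }

    public static void main(String[] args) {
        // Prints the generated SQL so it can be pasted into sqlline.
        System.out.println(joinCountSql("foodmart", "bardmart"));
    }
}
```

The identifiers are quoted because the example tables use lower-case names, and the quotes must be straight ASCII double quotes for the SQL parser to accept them.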

Julian
