Posted to dev@calcite.apache.org by Ron Wheeler <rw...@artifact-software.com> on 2016/01/25 04:18:01 UTC
Calcite Packaging question from a newbee
In the "marketing" videos, Calcite seems to be based on the idea that
the whole question of database access can be best solved by a modular
approach.
I was surprised to see a lot of storage layer stuff mixed in with the core.
I want to add an adapter and incorporate this into an application that
has its own database abstraction which currently supports Jackrabbit
with an in-memory configuration.
I cannot afford to add a whole bunch of code and third-party libraries
that are not essential.
Is there a plan to produce a lightweight core package that is suitable
for use with an adapter?
Ron
--
Ron Wheeler
President
Artifact Software Inc
email: rwheeler@artifact-software.com
skype: ronaldmwheeler
phone: 866-970-2435, ext 102
Re: Calcite Packaging question from a newbee
Posted by Julian Hyde <jh...@apache.org>.
Sure - if you have more questions, ask on this list. I’ve found that you can achieve a lot just by implementing the Table interface. Then you can start to push down filters, projections, etc. as you tune the adapter.
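To make that progression concrete, here is a minimal, self-contained sketch of the idea: start with a table that can only be scanned in full, then let it accept a filter and a projection so less data leaves the source. The interfaces and class names below (ScanTable, PushDownTable, MemoryTable) are hypothetical stand-ins written for illustration, not Calcite's API. In Calcite itself the corresponding interfaces are, as far as I know, ScannableTable and ProjectableFilterableTable, whose signatures differ from this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Stand-in model of how an adapter can grow; NOT Calcite's real interfaces. */
class AdapterSketch {

  /** Step 1 (hypothetical): the simplest table only returns every row. */
  interface ScanTable {
    List<Object[]> scan();
  }

  /** Step 2 (hypothetical): a tuned table applies a filter and projection itself. */
  interface PushDownTable extends ScanTable {
    List<Object[]> scan(Predicate<Object[]> filter, int[] projects);
  }

  /** In-memory table standing in for an external store. */
  static class MemoryTable implements PushDownTable {
    private final List<Object[]> rows;

    MemoryTable(List<Object[]> rows) {
      this.rows = rows;
    }

    @Override public List<Object[]> scan() {
      // Full scan: the engine would have to filter and project afterwards.
      return new ArrayList<>(rows);
    }

    @Override public List<Object[]> scan(Predicate<Object[]> filter, int[] projects) {
      List<Object[]> out = new ArrayList<>();
      for (Object[] row : rows) {
        if (!filter.test(row)) {
          continue; // rejected at the source, never handed to the engine
        }
        Object[] projected = new Object[projects.length];
        for (int i = 0; i < projects.length; i++) {
          projected[i] = row[projects[i]]; // keep only the requested columns
        }
        out.add(projected);
      }
      return out;
    }
  }

  /** Builds a small sample table with columns (id, name, dept). */
  static MemoryTable sampleTable() {
    List<Object[]> rows = new ArrayList<>();
    rows.add(new Object[] {1, "ann", "sales"});
    rows.add(new Object[] {2, "bob", "eng"});
    rows.add(new Object[] {3, "cy", "eng"});
    return new MemoryTable(rows);
  }

  public static void main(String[] args) {
    MemoryTable table = sampleTable();
    System.out.println("full scan rows: " + table.scan().size());
    // Equivalent in spirit to: SELECT name FROM t WHERE dept = 'eng'
    for (Object[] row : table.scan(r -> "eng".equals(r[2]), new int[] {1})) {
      System.out.println(Arrays.toString(row));
    }
  }
}
```

The point of the two-step shape is that the first version is enough to get queries working, and the second can be added later without changing callers that only need a full scan.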
Re: Calcite Packaging question from a newbee
Posted by Ron Wheeler <rw...@artifact-software.com>.
Very good explanation. There may be some elements in this that could be
added to the website.
I am just getting started and am pretty excited about the possibility of
using Calcite in our ADTransform package.
ADTransform has a pretty big footprint in its current state due to some
powerful libraries, such as JasperReports, that are themselves large.
After I asked the question, I had to dig a bit deeper to get the CSV
Adapter demo running and got a bit of a picture about the jar files that
exist.
I should have looked at the Maven Central repo to see that I can in fact
get the core as a separate dependency even if the sources are in a
single git project.
I am sorry for putting you through such a long bit of writing, but it is
very helpful as I look ahead.
I have the suspicion that I would have saved a lot of grief and code
writing had I known about Calcite 4 years ago.
I have been using SQL (Oracle and MySQL) since 1982 and am a big fan.
I think that a lot of the plug-ins that we wrote to add functionality to
ADTransform will be made obsolete by a plugin that accepts SQL
statements instead of a string of parameters.
I may have some questions and issues relating to our audit trail and our
error log.
For example, we identify individual records that fail a transformation
or validation.
"Uniqueness test failed on row 270 key "abc" is a duplicate of record 185."
In row 200 "manager" rwheeler not found in "int_person".
These things are not something SQL is very good at reporting.
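As an illustration of the kind of row-level reporting meant here, the following is a minimal sketch of a uniqueness check that remembers where each key was first seen and emits one message per offending row. The class and method names (UniquenessAudit, findDuplicates) are hypothetical and are not taken from ADTransform; rows are assumed to be numbered from 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical row-level uniqueness audit; not ADTransform's actual code. */
class UniquenessAudit {

  /** Returns one message per row whose key repeats an earlier row (rows start at 1). */
  static List<String> findDuplicates(List<String> keys) {
    Map<String, Integer> firstSeen = new HashMap<>();
    List<String> messages = new ArrayList<>();
    for (int i = 0; i < keys.size(); i++) {
      int rowNumber = i + 1;
      String key = keys.get(i);
      // putIfAbsent returns the earlier row number if the key was already recorded.
      Integer earlier = firstSeen.putIfAbsent(key, rowNumber);
      if (earlier != null) {
        messages.add("Uniqueness test failed on row " + rowNumber
            + " key \"" + key + "\" is a duplicate of record " + earlier + ".");
      }
    }
    return messages;
  }

  public static void main(String[] args) {
    for (String message : findDuplicates(Arrays.asList("abc", "xyz", "abc"))) {
      System.out.println(message);
    }
  }
}
```

A plain SQL GROUP BY ... HAVING COUNT(*) > 1 query can find the duplicated keys, but tying each failure back to a specific input row number and the record it collides with is easier in a row-by-row pass like this one.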
I am sure that I will have lots of questions if I write an adapter.
Having never worked on the internals of Oracle or MySQL, I am finding
that the JavaDocs are a bit daunting at times and include concepts that
I have never thought about.
Thanks for your response.
Ron
Re: Calcite Packaging question from a newbee
Posted by Julian Hyde <jh...@apache.org>.
Ron,
I’ll attempt to answer but I might have misunderstood your use case. So if I am off the mark, please describe the architecture you want in more detail.
Let’s suppose you have a NoSQL database X and your application wants to speak SQL to it. You would include three Calcite modules: core, avatica, and the module for database X. The adapter for each database tends to be in a separate module (mongodb, spark, splunk, csv). JDBC is in core, but doesn’t bring in any dependencies.
Core doesn’t contain dependencies on third-party systems. It does depend on things needed by the core functionality, for example Jackson, because we want people to be able to write their models in JSON.
Reading pom.xml, the runtime dependencies of core are as follows:
* calcite-avatica
* protobuf
* jackson
* calcite-linq4j
* commons-dbcp
* findbugs-jsr305
* guava
* eigenbase-properties
* janino
* pentaho-aggdesigner-algorithm
I don’t think any of those has significant downstream dependencies.
I’m very happy to have a discussion of what should be in core, and what its dependencies should be. There’s no “perfect” decomposition of a project into modules, but we can get a little nearer to perfection if we listen to how people (such as you) are deploying the project in the real world.
After all that, if core is too heavyweight, then maybe you want *remote* Calcite rather than *embedded* Calcite. On your client you can include JUST avatica. On your server you would run avatica-server and include avatica and core and anything else you desire. Avatica has extremely low dependencies - you need either Jackson or Protobuf (depending on how you are encoding the RPCs). We didn’t even include Guava, even though we lean heavily on it elsewhere in Calcite.
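As a rough illustration of the difference between the two deployment styles, the sketch below builds the JDBC URL an application might use in each case. The URL prefixes (jdbc:calcite: and jdbc:avatica:remote:) are the ones used by the Calcite and Avatica drivers; the class name, the model file path, and the host and port are assumptions chosen for the example.

```java
/** Hypothetical helper contrasting embedded and remote connection strings. */
class CalciteUrls {

  /** Embedded: Calcite core runs inside the application JVM and reads a JSON model. */
  static String embeddedUrl(String modelPath) {
    return "jdbc:calcite:model=" + modelPath;
  }

  /** Remote: the client needs only the Avatica driver and talks HTTP to a server. */
  static String remoteUrl(String host, int port) {
    return "jdbc:avatica:remote:url=http://" + host + ":" + port;
  }

  public static void main(String[] args) {
    // "model.json", "localhost" and 8765 are placeholder values for this example.
    System.out.println(embeddedUrl("model.json"));
    System.out.println(remoteUrl("localhost", 8765));
    // With the matching driver on the classpath, either string could be passed to
    // java.sql.DriverManager.getConnection(url). That call is left out so the
    // sketch runs without any Calcite or Avatica jars.
  }
}
```

In the embedded case the application's classpath carries core and the adapter; in the remote case those sit on the server and the client's footprint is limited to the Avatica client and its RPC encoding library.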
Julian