Posted to dev@hop.apache.org by Matt Casters <ma...@neo4j.com.INVALID> on 2021/06/15 08:54:52 UTC

[DISCUSS] SQL / JDBC / Calcite

Hello Hoppers,

Converting SQL into distinct operations is something that was done not
only in the glorious past of our tool; it is also happening today in
Apache Beam, Spark and other frameworks.  Carrying over the old code from
that past is simply not a good idea (HOP-2956).

I would suggest we start discussing a move to (also) using Apache Calcite
(a great project from Julian, btw) to achieve the following; a small
parsing sketch follows the list:

1. Implement the metadata ability to expose the output of a pipeline
transform as a SQL table (or re-use the new Web Service metadata)
2. Implement a multi-table SQL parser with very basic (or no) function
support
3. Implement the ability to use the output of the SQL parser to generate
pipeline metadata and then to execute the generated pipeline to stream the
data back to a SQL service on Hop Server.
4. Implement a JDBC driver on top of the SQL service.
5. Investigate how we can implement push-down optimisation on top of
Calcite.
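
To make item 2 concrete, here is a minimal sketch of what the parsing side
could look like with Calcite's SqlParser.  The query and the table/column
names below are invented; the point is that Calcite hands us an AST we can
walk to derive pipeline metadata (item 3):

import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.parser.SqlParseException;
import org.apache.calcite.sql.parser.SqlParser;

public class ParseSketch {
  public static void main(String[] args) throws SqlParseException {
    // A multi-table query with no function calls, matching the very
    // basic level of support proposed in item 2.
    String sql = "SELECT c.name, o.total"
        + " FROM customers c JOIN orders o ON c.id = o.customer_id";
    SqlParser parser = SqlParser.create(sql,
        SqlParser.config().withCaseSensitive(false));
    SqlNode ast = parser.parseQuery();
    // Walking this AST tells us which tables (i.e. which pipelines or
    // web services) the query touches.
    System.out.println(ast);
  }
}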

Please think back to your old use cases.  Did you ever use the old thin
client?  Do you have any specific requirements or tools we should support?
Let us know here or add them to HOP-2957.

Thanks,
Matt

Re: [DISCUSS] SQL / JDBC / Calcite

Posted by Brandon Jackson <us...@gmail.com>.
If I understand correctly, the thin client was the functionality from the
past that let you create a data source using what we now call a pipeline
and then interact with it through SQL.
It was amazing how much could be achieved with SQL this way.
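
For context, once the JDBC driver from item 4 exists, the client side would
be plain JDBC.  The URL scheme, port and table name below are invented for
illustration only:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ThinClientSketch {
  public static void main(String[] args) throws Exception {
    // Hypothetical URL; the real scheme and port would be defined by
    // the JDBC driver proposed in item 4.
    String url = "jdbc:hop://hop-server:8181/sql";
    try (Connection conn = DriverManager.getConnection(url, "user", "secret");
         Statement stmt = conn.createStatement();
         // "sales_blend" stands in for a pipeline exposed as a SQL table.
         ResultSet rs = stmt.executeQuery(
             "SELECT region, SUM(amount) FROM sales_blend GROUP BY region")) {
      while (rs.next()) {
        System.out.println(rs.getString(1) + " -> " + rs.getDouble(2));
      }
    }
  }
}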

Use case 1: System automation and reporting the results.
-- Required: sending parameters to the transformation for it to act upon.
I used this functionality to send parameters to pipelines that would
perform actions on the machine where Kettle was running, then prepare the
results of those actions in the stream, where they were returned as a SQL
table result - via a dummy step designated as the output step.
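
One way Calcite could support this parameter-passing pattern is to pull
simple "column = literal" conditions out of the parsed WHERE clause and
hand them to the pipeline as parameter values.  A sketch with invented
names (the old thin client's actual mechanism may have differed):

import java.util.LinkedHashMap;
import java.util.Map;
import org.apache.calcite.sql.SqlCall;
import org.apache.calcite.sql.SqlIdentifier;
import org.apache.calcite.sql.SqlKind;
import org.apache.calcite.sql.SqlLiteral;
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.SqlSelect;
import org.apache.calcite.sql.parser.SqlParser;

public class WhereToParams {

  // Collect simple "identifier = literal" conditions so they can be
  // passed to the pipeline as parameter values before it runs.
  static void collect(SqlNode node, Map<String, String> params) {
    if (node == null) {
      return;
    }
    if (node.getKind() == SqlKind.AND) {
      for (SqlNode operand : ((SqlCall) node).getOperandList()) {
        collect(operand, params);
      }
    } else if (node.getKind() == SqlKind.EQUALS) {
      SqlCall call = (SqlCall) node;
      SqlNode left = call.operand(0);
      SqlNode right = call.operand(1);
      if (left instanceof SqlIdentifier && right instanceof SqlLiteral) {
        params.put(left.toString(), ((SqlLiteral) right).toValue());
      }
    }
  }

  public static void main(String[] args) throws Exception {
    SqlNode ast = SqlParser.create(
        "SELECT * FROM service_result"
            + " WHERE TASK = 'restart' AND HOST_NAME = 'web01'").parseQuery();
    Map<String, String> params = new LinkedHashMap<>();
    collect(((SqlSelect) ast).getWhere(), params);
    System.out.println(params); // {TASK=restart, HOST_NAME=web01}
  }
}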

Use case 2: Data blending on demand
-- Required: sending parameters, or allowing default parameters to take
effect.
The pipelines allowed us to prepare and query several data sources, blend
the data together, and provide the result in a concise manner without
materializing it to a database table.  The data was always fresh and a
blend of multiple sources - again via a dummy step designated as the
output step.
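
This never-materialized, always-fresh behaviour maps nicely onto Calcite's
table abstraction: a table whose scan streams whatever the pipeline's
output step produces at that moment.  A minimal sketch, with a hard-coded
row set standing in for the live pipeline output (all names and types are
invented):

import org.apache.calcite.DataContext;
import org.apache.calcite.linq4j.Enumerable;
import org.apache.calcite.linq4j.Linq4j;
import org.apache.calcite.rel.type.RelDataType;
import org.apache.calcite.rel.type.RelDataTypeFactory;
import org.apache.calcite.schema.ScannableTable;
import org.apache.calcite.schema.impl.AbstractTable;
import org.apache.calcite.sql.type.SqlTypeName;

/**
 * Hypothetical adapter exposing a pipeline's output transform as a SQL
 * table.  Every scan would re-run (or attach to) the pipeline, so the
 * data stays fresh and is never materialized to a database table.
 */
public class PipelineOutputTable extends AbstractTable
    implements ScannableTable {

  @Override
  public RelDataType getRowType(RelDataTypeFactory typeFactory) {
    // In a real adapter this would come from the output transform's
    // row metadata instead of being hard-coded.
    return typeFactory.builder()
        .add("SOURCE", SqlTypeName.VARCHAR)
        .add("BLENDED_TOTAL", SqlTypeName.DOUBLE)
        .build();
  }

  @Override
  public Enumerable<Object[]> scan(DataContext root) {
    // Stand-in for rows streamed from the designated output step.
    return Linq4j.asEnumerable(new Object[][] {
        {"crm", 42.0},
        {"erp", 17.5},
    });
  }
}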

