Posted to dev@beam.apache.org by Kenneth Knowles <ke...@apache.org> on 2019/01/14 03:53:46 UTC

Vendoring Calcite

After doing the Guava vendoring as a practice run [1], I started on Calcite
[2]. I have a few issues, questions, and suspicions that I wanted to bring
up to see if anyone has good ideas.

 - Calcite has a bunch of transitive deps which I vendored with it.
 - Calcite's transitive deps include proto. Relocating would break
generated code unless it is also bundled. I don't really know the story
here.
 - Our SQL parser links into Calcite, so it needs to share the same
relocation. We do exactly this between vendored gRPC and the generated
portability classes. Perhaps this parser package should be isolated from the
main Beam SQL package.
 - Codegen of SQL statements links into Calcite. Does it link the generated
code with its own support library by using reflection (in which case
relocation is probably fine) or by concatenating strings (in which case
relocation would break things if we cannot configure a custom prefix)?
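
To make the last question concrete, here is a minimal sketch of the two
linking styles. It makes no claim about what Calcite's codegen actually does.
The class names (RelocationDemo, Support) and the "org.example.original"
prefix are hypothetical, chosen only for illustration. A name obtained from a
class reference follows the class wherever it lives, while a name assembled
from string fragments keeps whatever prefix was written into the source.

```java
public class RelocationDemo {
  /** Hypothetical stand-in for a support class that generated code refers to. */
  public static class Support {
    public static int twice(int x) {
      return 2 * x;
    }
  }

  /** Name taken from a class reference: it always matches the class's real location. */
  static String nameFromClassReference() {
    return Support.class.getName();
  }

  /**
   * Name built by concatenating fragments at run time. The prefix is a made-up
   * "pre-relocation" package, so the result is stale by construction.
   */
  static String nameFromFragments() {
    String prefix = "org.example.original";
    return prefix + "." + "RelocationDemo$Support";
  }

  /** Returns true if the given class name can be loaded. */
  static boolean resolves(String className) {
    try {
      Class.forName(className);
      return true;
    } catch (ClassNotFoundException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(nameFromClassReference() + " resolves: "
        + resolves(nameFromClassReference()));
    System.out.println(nameFromFragments() + " resolves: "
        + resolves(nameFromFragments()));
  }
}
```

If the generated code gets its names the first way, relocation should be
transparent. If it uses the second way, we would need a hook to supply the
relocated prefix.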

It is a lot to deal with, but also a decent payoff. We know there are users
who wanted to use their own version of Calcite but could not. Also you have
to tweak your IntelliJ to step into Calcite's code while debugging, whereas
after this vendoring that won't be necessary.

Kenn

[1] https://github.com/apache/beam/pull/7494
[2] https://github.com/kennknowles/beam/commits/vendor-calcite

Re: Vendoring Calcite

Posted by Gleb Kanterov <gl...@spotify.com>.
Great initiative. I was thinking about making a similar proposal. I tried
using Beam SQL in a project that has a Calcite dependency, and it doesn't
work: Calcite makes an internal JDBC connection on the "jdbc:calcite:" URL,
and you can't register two drivers for the same scheme. I'm not sure how
this will work out with vendoring. For instance, see Frameworks.java#L153
<https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/tools/Frameworks.java#L153>
and
MaterializedViewTable.java#L59
<https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/schema/impl/MaterializedViewTable.java#L59>.
It probably needs to be addressed in Calcite itself.
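
A minimal sketch of the scheme clash, using only java.sql and two stub
drivers. The "jdbc:demo:" prefix, the StubDriver class, and the labels are
hypothetical. This is not Calcite's driver, just an illustration of how
DriverManager resolves a URL when two registered drivers accept the same
prefix: the one registered first is returned, and the second is effectively
shadowed.

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Properties;
import java.util.logging.Logger;

public class DriverSchemeDemo {
  // Hypothetical scheme for illustration only.
  static final String PREFIX = "jdbc:demo:";

  /** Stub driver that claims every URL starting with PREFIX. */
  static class StubDriver implements Driver {
    final String label;

    StubDriver(String label) {
      this.label = label;
    }

    @Override public boolean acceptsURL(String url) {
      return url != null && url.startsWith(PREFIX);
    }

    @Override public Connection connect(String url, Properties info) throws SQLException {
      // Never called in this sketch; we only ask which driver is selected.
      throw new SQLFeatureNotSupportedException("stub driver " + label);
    }

    @Override public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) {
      return new DriverPropertyInfo[0];
    }

    @Override public int getMajorVersion() { return 1; }

    @Override public int getMinorVersion() { return 0; }

    @Override public boolean jdbcCompliant() { return false; }

    @Override public Logger getParentLogger() throws SQLFeatureNotSupportedException {
      throw new SQLFeatureNotSupportedException();
    }
  }

  /** Registers two drivers for the same prefix and reports which one is selected. */
  static String selectedDriverLabel() throws SQLException {
    DriverManager.registerDriver(new StubDriver("first"));
    DriverManager.registerDriver(new StubDriver("second"));
    Driver selected = DriverManager.getDriver(PREFIX + "anything");
    return ((StubDriver) selected).label;
  }

  public static void main(String[] args) throws SQLException {
    System.out.println("selected driver: " + selectedDriverLabel());
  }
}
```

Relocating the driver class does not change the URL prefix it claims, which
is why the prefix itself would have to be configurable for two copies to
coexist.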

One of the use cases I see for vendored Calcite is being able to rely on
the "developer" API to extend Beam SQL capabilities.

Gleb

-- 
Cheers,
Gleb