Posted to dev@calcite.apache.org by Kevin Risden <kr...@apache.org> on 2019/04/25 23:08:03 UTC

Google BigQuery - zetasql parser/analyzer

https://github.com/google/zetasql

Saw this come by on Twitter not too long ago and figured I'd share it here since
it definitely overlaps with Calcite.

Kevin Risden

Re: Google BigQuery - zetasql parser/analyzer

Posted by Andrew Pilloud <ap...@google.com.INVALID>.
Sorry for the delay. ZetaSQL is not a threat to Calcite. This project
is Google's internal effort to standardize around a single SQL
dialect. It is a parser and analyzer; Calcite is much more. We also
have an implementation test suite, consisting of a large set of sample
queries and expected results, on our roadmap for open sourcing.

What we would like to contribute now is a thin API by which Calcite
can talk to ZetaSQL for parsing queries. We have a prototype of this
that we plan to contribute to Beam, but it seems like it would be
generally useful to other Calcite users. (The interface would likely
be useful for wrapping other parsers as well.) This will rely on the
ZetaSQL C++ parser, which creates significant limitations around
portability but gives the ability to strictly match BigQuery's
"Standard SQL" dialect.
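
To make the "thin API" idea concrete, here is a minimal sketch of what such a wrapper layer could look like. Every name in it (ExternalSqlParser, ParseOutcome, ParserRegistry, StubStandardSqlParser) is hypothetical and exists in neither Calcite nor ZetaSQL; the stub parser only checks the leading keyword and stands in for a real native-backed parser.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical names throughout; none of these types exist in Calcite or ZetaSQL.

/** Outcome of a parse attempt: a tree summary on success, a message on failure. */
final class ParseOutcome {
    final boolean ok;
    final String detail;

    private ParseOutcome(boolean ok, String detail) {
        this.ok = ok;
        this.detail = detail;
    }

    static ParseOutcome success(String tree) { return new ParseOutcome(true, tree); }
    static ParseOutcome failure(String message) { return new ParseOutcome(false, message); }
}

/** The thin interface: the host only needs a dialect name and a parse call. */
interface ExternalSqlParser {
    String dialect();
    ParseOutcome parse(String sql);
}

/** Stand-in for a parser backed by native code; it only checks the first keyword. */
final class StubStandardSqlParser implements ExternalSqlParser {
    @Override public String dialect() { return "bigquery-standard"; }

    @Override public ParseOutcome parse(String sql) {
        String trimmed = sql.trim();
        if (trimmed.toUpperCase(Locale.ROOT).startsWith("SELECT")) {
            return ParseOutcome.success("Query(" + trimmed + ")");
        }
        return ParseOutcome.failure("unsupported statement: " + trimmed);
    }
}

/** Looks up a wrapped parser by dialect, so other parsers can plug in the same way. */
final class ParserRegistry {
    private final Map<String, ExternalSqlParser> parsers = new HashMap<>();

    void register(ExternalSqlParser parser) {
        parsers.put(parser.dialect(), parser);
    }

    ParseOutcome parse(String dialect, String sql) {
        ExternalSqlParser parser = parsers.get(dialect);
        if (parser == null) {
            return ParseOutcome.failure("no parser registered for dialect: " + dialect);
        }
        return parser.parse(sql);
    }
}

final class ThinParserDemo {
    public static void main(String[] args) {
        ParserRegistry registry = new ParserRegistry();
        registry.register(new StubStandardSqlParser());
        ParseOutcome outcome = registry.parse("bigquery-standard", "SELECT 1");
        System.out.println(outcome.ok + ": " + outcome.detail);
    }
}
```

The point of keeping the interface this small is that the host never sees how the parser is implemented, so a C++-backed parser and a pure Java one could sit behind the same registry.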

Once we open source the test suite, I expect we would be interested in
using that to improve Babel's BigQuery mode. We do not have plans to
write a pure Java ZetaSQL parser, but we do have a need for one. ZetaSQL is
constantly evolving and has a significant number of teams working on
it inside Google. I don't think Babel can replace it or that the
projects can be merged, but we see the long term value in cooperating
with you to improve the Babel parser.

Andrew



From: Julian Hyde <jh...@apache.org>
Date: Fri, Apr 26, 2019 at 11:08 AM
To: dev

> Thanks for chiming in, Andrew.
>
> I’m a little confused about the purpose of ZetaSQL currently. (I’m sure it will start to click as time goes on.)
>
> My initial reaction is that ZetaSQL has similar goals to Calcite - to decompose the traditional DBMS into components from which people can roll their own database, adding their own planning, engine, data format, algorithms, etc. That is a revolutionary vision which I fully support.
>
> Calcite should not feel threatened by ZetaSQL. As Google pursues this vision of the deconstructed database, there will be more opportunities for us to work with the newly exposed components. Our integration with Beam is an example.
>
> I second Michael’s question “could you clarify what you’re hoping to contribute to Babel?” Would you contribute a (thin) API by which Babel could talk to ZetaSQL’s parser, or code that parses various dialects in native Java?
>
> It’s probably too much to hope for Babel to evolve into the Java port of ZetaSQL, but I would be open to the idea. And if that doesn’t happen, another fruitful way of cooperating would be for the projects to have separate code but share a test suite.
>
> Julian

Re: Google BigQuery - zetasql parser/analyzer

Posted by Julian Hyde <jh...@apache.org>.
Thanks for chiming in, Andrew.

I’m a little confused about the purpose of ZetaSQL currently. (I’m sure it will start to click as time goes on.)

My initial reaction is that ZetaSQL has similar goals to Calcite - to decompose the traditional DBMS into components from which people can roll their own database, adding their own planning, engine, data format, algorithms, etc. That is a revolutionary vision which I fully support.

Calcite should not feel threatened by ZetaSQL. As Google pursues this vision of the deconstructed database, there will be more opportunities for us to work with the newly exposed components. Our integration with Beam is an example.

I second Michael’s question “could you clarify what you’re hoping to contribute to Babel?” Would you contribute a (thin) API by which Babel could talk to ZetaSQL’s parser, or code that parses various dialects in native Java?

It’s probably too much to hope for Babel to evolve into the Java port of ZetaSQL, but I would be open to the idea. And if that doesn’t happen, another fruitful way of cooperating would be for the projects to have separate code but share a test suite.

Julian


> On Apr 26, 2019, at 6:14 AM, Michael Mior <mm...@apache.org> wrote:
> 
> This is basically the point of the Babel parser, to be as liberal as
> possible in what SQL is accepted and configurable where things
> necessarily conflict between different implementations.
> 
> Andrew, could you clarify what you're hoping to contribute to Babel? I
> think support for any dialect of SQL will generally be welcomed :)
> 
> --
> Michael Mior
> mmior@apache.org


Re: Google BigQuery - zetasql parser/analyzer

Posted by Michael Mior <mm...@apache.org>.
This is basically the point of the Babel parser, to be as liberal as
possible in what SQL is accepted and configurable where things
necessarily conflict between different implementations.

Andrew, could you clarify what you're hoping to contribute to Babel? I
think support for any dialect of SQL will generally be welcomed :)

--
Michael Mior
mmior@apache.org

On Thu, Apr 25, 2019 at 23:39, Lai Zhou <hh...@gmail.com> wrote:
>
> Good news.
> I found that Julian created a new issue about parsing various SQL dialects:
> https://issues.apache.org/jira/browse/CALCITE-3022
> I think it would be helpful to support parsing various SQL dialects in a
> unified manner.
> Calcite does use SqlConformance to enable various SQL compatibility modes,
> but that is not enough to solve all the problems.
> Besides SQL parsing, different SQL engines also differ in type inference,
> type checking, and implicit casting.

Re: Google BigQuery - zetasql parser/analyzer

Posted by Lai Zhou <hh...@gmail.com>.
Good news.
I found that Julian created a new issue about parsing various SQL dialects:
https://issues.apache.org/jira/browse/CALCITE-3022
I think it would be helpful to support parsing various SQL dialects in a
unified manner.
Calcite does use SqlConformance to enable various SQL compatibility modes,
but that is not enough to solve all the problems.
Besides SQL parsing, different SQL engines also differ in type inference,
type checking, and implicit casting.

On Fri, Apr 26, 2019 at 7:28 AM, Andrew Pilloud <ap...@google.com.invalid> wrote:

> I was intending to send an email about this, so thanks for starting the
> discussion. I'm on the team at Google that is open sourcing ZetaSQL.
> It is a C++ SQL parser used internally for the BigQuery Standard SQL
> parser, among other things.
>
> We've open sourced the Java frontend, and Rui is currently working on an
> adapter between ZetaSQL and Calcite for Apache Beam, but we'd probably
> be interested in contributing it directly to Calcite Babel at some
> point. Any thoughts on that?
>
> Andrew

Re: Google BigQuery - zetasql parser/analyzer

Posted by Andrew Pilloud <ap...@google.com.INVALID>.
I was intending to send an email about this, so thanks for starting the
discussion. I'm on the team at Google that is open sourcing ZetaSQL.
It is a C++ SQL parser used internally for the BigQuery Standard SQL
parser, among other things.

We've open sourced the Java frontend, and Rui is currently working on an
adapter between ZetaSQL and Calcite for Apache Beam, but we'd probably
be interested in contributing it directly to Calcite Babel at some
point. Any thoughts on that?

Andrew

On Thu, Apr 25, 2019 at 4:08 PM Kevin Risden <kr...@apache.org> wrote:
>
> https://github.com/google/zetasql
>
> Saw this come by on Twitter not too long ago and figured I'd share it here since
> it definitely overlaps with Calcite.
>
> Kevin Risden