
Posted to dev@flink.apache.org by Fabian Hueske <fh...@gmail.com> on 2016/01/07 15:05:24 UTC

Effort to add SQL / StreamSQL to Flink

Hi everybody,

Over the last few days, Timo and I refined the design document, started by
Stephan, for adding a SQL / StreamSQL interface on top of Flink.

The document proposes an architecture that is centered around Apache
Calcite. Calcite is an Apache top-level project and includes a SQL parser,
a semantic validator for relational queries, and a rule- and cost-based
relational optimizer. Calcite is used by Apache Hive and Apache Drill
(among other projects). In a nutshell, the plan is to translate Table API
and SQL queries into Calcite's relational expression trees, optimize these
trees, and translate them into DataSet and DataStream programs. The document
breaks down the work into several tasks and subtasks.
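
As a rough illustration of that flow (build a relational expression tree,
optimize it with a rewrite rule, translate it into an executable program),
here is a minimal Python sketch. It does not use Calcite's or Flink's actual
APIs: the node classes Scan, Filter and Project, the single push-down rule in
optimize(), and the use of plain Python lists as a stand-in for a DataSet
program are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

Row = Dict[str, object]


# Hypothetical relational expression nodes (stand-ins, not Calcite classes).
@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    child: "Node"
    column: str
    minimum: int  # keep rows where row[column] >= minimum


@dataclass(frozen=True)
class Project:
    child: "Node"
    columns: Tuple[str, ...]


Node = Union[Scan, Filter, Project]


def optimize(node: Node) -> Node:
    """Apply one illustrative rule: push a Filter below a Project."""
    if isinstance(node, Scan):
        return node
    if isinstance(node, Project):
        return Project(optimize(node.child), node.columns)
    # node is a Filter: optimize its input first, then try the rule.
    child = optimize(node.child)
    if isinstance(child, Project) and node.column in child.columns:
        pushed = optimize(Filter(child.child, node.column, node.minimum))
        return Project(pushed, child.columns)
    return Filter(child, node.column, node.minimum)


def translate(node: Node, tables: Dict[str, List[Row]]) -> Callable[[], List[Row]]:
    """Turn a tree into a callable (stand-in for a DataSet program)."""
    if isinstance(node, Scan):
        return lambda: list(tables[node.table])
    source = translate(node.child, tables)
    if isinstance(node, Filter):
        return lambda: [r for r in source() if r[node.column] >= node.minimum]
    return lambda: [{c: r[c] for c in node.columns} for r in source()]


tables = {"people": [{"name": "Ada", "age": 36, "city": "London"},
                     {"name": "Bob", "age": 25, "city": "Berlin"}]}
# Roughly: project name and age from "people", then keep rows with age >= 30.
plan = Filter(Project(Scan("people"), ("name", "age")), "age", 30)
optimized = optimize(plan)
result = translate(optimized, tables)()
```

The optimized tree applies the filter directly on the scan, and both the
original and the optimized tree translate to programs that return the same
rows, which is the property any rewrite rule has to preserve.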

Please review the design document and comment.

-->
https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit?usp=sharing

Unless there are major concerns with the design, Timo and I want to start
moving the current Table API on top of Apache Calcite next week (Task 1 in
the document). The goal of this task is to keep the same functionality as
today, but with Calcite in the translation process. This is a blocking
task that we hope to complete soon. Afterwards, we can work independently
on different aspects such as extending the Table API, adding a SQL
interface (basically just a parser), integration with external data
sources, better code generation, optimization rules, streaming support for
the Table API, StreamSQL, etc.

Timo and I plan to work on a WIP branch to implement Task 1 and merge it to
the master branch once the task is completed. Of course, everybody is
welcome to contribute to this effort. Please let us know so that we can
coordinate our efforts.

Thanks,
Fabian

Re: Effort to add SQL / StreamSQL to Flink

Posted by Henry Saputra <he...@gmail.com>.

I am excited and nervous at the same time =)

- Henry


RE: Effort to add SQL / StreamSQL to Flink

Posted by "Li, Chengxiang" <ch...@intel.com>.

Very cool work, looking forward to contributing.

-----Original Message-----
From: Chiwan Park [mailto:chiwanpark@apache.org] 
Sent: Friday, January 8, 2016 9:36 AM
To: dev@flink.apache.org
Subject: Re: Effort to add SQL / StreamSQL to Flink

Really good! Many people want to use SQL. :)

> On Jan 8, 2016, at 2:36 AM, Kostas Tzoumas <kt...@apache.org> wrote:
> 
> Wow! Thanks Fabian, this looks fantastic!
> 
> On Thu, Jan 7, 2016 at 4:35 PM, Stephan Ewen <se...@apache.org> wrote:
> 
>> Super, thanks for that detailed effort, Fabian!
>> 
>> On Thu, Jan 7, 2016 at 3:40 PM, Matthias J. Sax <mj...@apache.org> wrote:
>> 
>>> Pretty cool!
>>> 

Regards,
Chiwan Park

Re: Effort to add SQL / StreamSQL to Flink

Posted by Chiwan Park <ch...@apache.org>.

Really good! Many people want to use SQL. :)

> On Jan 8, 2016, at 2:36 AM, Kostas Tzoumas <kt...@apache.org> wrote:
> 
> Wow! Thanks Fabian, this looks fantastic!
> 
> On Thu, Jan 7, 2016 at 4:35 PM, Stephan Ewen <se...@apache.org> wrote:
> 
>> Super, thanks for that detailed effort, Fabian!
>> 
>> On Thu, Jan 7, 2016 at 3:40 PM, Matthias J. Sax <mj...@apache.org> wrote:
>> 
>>> Pretty cool!
>>> 

Regards,
Chiwan Park

Re: Effort to add SQL / StreamSQL to Flink

Posted by Kostas Tzoumas <kt...@apache.org>.

Wow! Thanks Fabian, this looks fantastic!

On Thu, Jan 7, 2016 at 4:35 PM, Stephan Ewen <se...@apache.org> wrote:

> Super, thanks for that detailed effort, Fabian!
>
> On Thu, Jan 7, 2016 at 3:40 PM, Matthias J. Sax <mj...@apache.org> wrote:
>
> > Pretty cool!
> >

Re: Effort to add SQL / StreamSQL to Flink

Posted by Stephan Ewen <se...@apache.org>.

Super, thanks for that detailed effort, Fabian!

On Thu, Jan 7, 2016 at 3:40 PM, Matthias J. Sax <mj...@apache.org> wrote:

> Pretty cool!
>

Re: Effort to add SQL / StreamSQL to Flink

Posted by "Matthias J. Sax" <mj...@apache.org>.

Pretty cool!


Re: Re: Re: Effort to add SQL / StreamSQL to Flink

Posted by Vasiliki Kalavri <va...@gmail.com>.

Great to see people excited about this :)
SQL is indeed coming up next. We should have SQL on DataSet programs
(see FLINK-3640 [1]) pretty soon.

-Vasia.

[1]: https://issues.apache.org/jira/browse/FLINK-3640

On 29 March 2016 at 14:02, Jiangsong (Hi) <hi...@huawei.com> wrote:

> So excited!! Is SQL on Flink ready?
>
> Is there a showcase or a how-to for using it?
>
>
>
> -----Original Message-----
> From: ewenstephan@gmail.com [mailto:ewenstephan@gmail.com] On Behalf Of Stephan Ewen
> Sent: March 29, 2016 20:00
> To: dev@flink.apache.org
> Subject: Re: Re: Effort to add SQL / StreamSQL to Flink
>
> Cool stuff!
>
> SQL coming up next? ;-)
>
>
> On Tue, Mar 29, 2016 at 1:39 PM, Maximilian Michels <mx...@apache.org> wrote:
>
> > Yeah! I'm a little late to the party but exciting stuff! :)
> >
> > On Fri, Mar 18, 2016 at 3:15 PM, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > tableOnCalcite has been merged to master :)
> > >
> > > Cheers,
> > > -Vasia.
> > >
> > > On 17 March 2016 at 11:11, Fabian Hueske <fh...@gmail.com> wrote:
> > >
> > > > Thanks for the initiative Vasia!
> > > > I went over the diff and didn't find anything crucial.
> > > >
> > > > I would like to do another pass over the tests though and improve
> > > > the exceptions for invalid joins before merging.
> > > > Will open a PR later today.
> > > >
> > > > 2016-03-16 21:17 GMT+01:00 Vasiliki Kalavri <vasilikikalavri@gmail.com>:
> > > >
> > > > > Yes, the current state corresponds to Task 1. PR #1770 corresponds to
> > > > > Task 5. Task 6 should come right after :)
> > > > >
> > > > > -V.
> > > > >
> > > > > On 16 March 2016 at 20:35, Robert Metzger <rm...@apache.org> wrote:
> > > > >
> > > > > > Cool, this is great news!
> > > > > > So "Task 1" from the document [1] is done with the merge? And PR #1770
> > > > > > is going towards "Task 6".
> > > > > > I think good support for Stream SQL is a very interesting new feature
> > > > > > for Flink.
> > > > > >
> > > > > > [1] https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit#heading=h.28dvisn56su0
> > > > > >
> > > > > > On Wed, Mar 16, 2016 at 6:17 PM, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:
> > > > > >
> > > > > > > Hello everyone,
> > > > > > >
> > > > > > > We are happy to announce that the "tableOnCalcite" branch is finally
> > > > > > > ready to be merged.
> > > > > > > It essentially provides the existing functionality of the Table API,
> > > > > > > but now the translation happens through Apache Calcite.
> > > > > > > You can find the changes rebased on top of the current master in [1].
> > > > > > > We have removed the prototype streaming Table API functionality,
> > > > > > > which will be added back once PR [2] is merged.
> > > > > > >
> > > > > > > We'll go through the changes once more and, if no objections, we
> > > > > > > would like to go ahead and merge this.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > -Vasia.
> > > > > > >
> > > > > > > [1]: https://github.com/vasia/flink/tree/merge-table
> > > > > > > [2]: https://github.com/apache/flink/pull/1770
> > > > > > >
> > > > > > >
> > > > > > > On 15 January 2016 at 10:59, Fabian Hueske <fh...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi everybody,
> > > > > > > >
> > > > > > > > as previously announced, I pushed a feature branch called
> > > > > > > > "tableOnCalcite" to the Flink repository.
> > > > > > > > We will use this branch to work on FLINK-3221 and its sub-issues.
> > > > > > > >
> > > > > > > > Cheers, Fabian
> > > > > > > >
> > > > > > > > 2016-01-11 18:29 GMT+01:00 Fabian Hueske <fhueske@gmail.com>:
> > > > > > > >
> > > > > > > > > We haven't defined the StreamSQL syntax yet (and I think it will
> > > > > > > > > take some time until we are at that point).
> > > > > > > > > So we are quite flexible with both features.
> > > > > > > > >
> > > > > > > > > Let's keep this opportunity in mind and coordinate before making
> > > > > > > > > decisions about CEP or StreamSQL.
> > > > > > > > >
> > > > > > > > > Fabian
> > > > > > > > >
> > > > > > > > > 2016-01-11 17:29 GMT+01:00 Till Rohrmann <trohrmann@apache.org>:
> > > > > > > > >
> > > > > > > > >> First of all, it's a great design document. Looking forward to
> > > > > > > > >> having stream SQL in the foreseeable future :-)
> > > > > > > > >>
> > > > > > > > >> I think it is a good idea to consolidate stream SQL and CEP in the
> > > > > > > > >> long run. CEP's additional features compared to SQL boil down to
> > > > > > > > >> pattern detection. Once we have this, it should be only a question
> > > > > > > > >> of defining the SQL syntax for event patterns in order to integrate
> > > > > > > > >> CEP with stream SQL. Oracle has already defined an extension [1] to
> > > > > > > > >> detect patterns in a set of table rows. This or Esper's event
> > > > > > > > >> processing language (EPL) [2] could be a good starting point.
> > > > > > > > >>
> > > > > > > > >> [1] https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8959
> > > > > > > > >> [2] http://www.espertech.com/esper/release-5.2.0/esper-reference/html/
> > > > > > > > >>
> > > > > > > > >> Cheers,
> > > > > > > > >> Till
> > > > > > > > >>
> > > > > > > > > >> On Mon, Jan 11, 2016 at 10:12 AM, Fabian Hueske <fhueske@gmail.com> wrote:
> > > > > > > > > >>
> > > > > > > > > >> > Thanks for the feedback!
> > > > > > > > > >> >
> > > > > > > > > >> > We will start the SQL effort with putting the existing (batch)
> > > > > > > > > >> > Table API on top of Apache Calcite.
> > > > > > > > > >> > From there we continue to add streaming support for the Table API
> > > > > > > > > >> > before we put a StreamSQL interface on top.
> > > > > > > > > >> >
> > > > > > > > > >> > Consolidating the efforts with the CEP library sounds like a good
> > > > > > > > > >> > idea to me.
> > > > > > > > > >> > Maybe it can be nicely integrated with the streaming Table API and
> > > > > > > > > >> > later as well with the StreamSQL interface (the StreamSQL dialect
> > > > > > > > > >> > is not defined yet).
> > > > > > > > > >> >
> > > > > > > > > >> > @Till: What do you think about adding CEP features to the Table
> > > > > > > > > >> > API? From the CEP design doc, it looks like we need to add a
> > > > > > > > > >> > pattern matching operator in addition to the window features that
> > > > > > > > > >> > we need to add for the streaming Table API in any case.
> > > > > > > > >> >
> > > > > > > > >> > Best, Fabian
> > > > > > > > >> >
> > > > > > > > >> > 2016-01-11 4:03 GMT+01:00 Jiangsong (Hi) <
> > > > > hi.jiangsong@huawei.com
> > > > > > >:
> > > > > > > > >> >
> > > > > > > > >> > > I suggest refering to Esper EPL[1], which is a
> > > SQL-standard
> > > > > > > language
> > > > > > > > >> > > extend to offering a cluster of window, pattern
> > matching.
> > > > EPL
> > > > > > can
> > > > > > > > >> both
> > > > > > > > >> > > support Streaming SQL and CEP with one unified syntax.
> > > > > > > > >> > >
> > > > > > > > >> > > [1]
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > http://www.espertech.com/esper/release-5.2.0/esper-reference/pdf/esper
> > _reference.pdf
> > > > > > > > >> > >   (Chapter 5. EPL Reference: Clauses)
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> > > Regards
> > > > > > > > >> > > Song
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> > > -----邮件原件-----
> > > > > > > > >> > > 发件人: Chiwan Park [mailto:chiwanpark@apache.org]
> > > > > > > > >> > > 发送时间: 2016年1月11日 10:31
> > > > > > > > >> > > 收件人: dev@flink.apache.org
> > > > > > > > >> > > 主题: Re: Effort to add SQL / StreamSQL to Flink
> > > > > > > > >> > >
> > > > > > > > >> > > We still don’t have a concensus about the streaming
> > > > > > > > >> > > SQL
> > > and
> > > > > CEP
> > > > > > > > >> library
> > > > > > > > >> > on
> > > > > > > > >> > > Flink. Some people want to merge these two libraries.
> > > Maybe
> > > > we
> > > > > > > have
> > > > > > > > to
> > > > > > > > >> > > discuss about this in mailing list.
> > > > > > > > >> > >
> > > > > > > > > >> > > > On Jan 11, 2016, at 10:53 AM, Nick Dimiduk <ndimiduk@gmail.com> wrote:
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > What's the relationship between the streaming SQL proposed here
> > > > > > > > > >> > > > and the CEP syntax proposed earlier in the week?
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > On Sunday, January 10, 2016, Henry Saputra <henry.saputra@gmail.com> wrote:
> > > > > > > > > >> > > >
> > > > > > > > > >> > > >> Awesome! Thanks for the reply, Fabian.
> > > > > > > > > >> > > >>
> > > > > > > > > >> > > >> - Henry
> > > > > > > > > >> > > >>
> > > > > > > > > >> > > >> On Sunday, January 10, 2016, Fabian Hueske <fhueske@gmail.com> wrote:
> > > > > > > > > >> > > >>
> > > > > > > > > >> > > >>> Hi Henry,
> > > > > > > > > >> > > >>>
> > > > > > > > > >> > > >>> There is https://issues.apache.org/jira/browse/FLINK-2099 and a
> > > > > > > > > >> > > >>> few subissues.
> > > > > > > > > >> > > >>> I'll reorganize these and add more issues for the tasks
> > > > > > > > > >> > > >>> described in the design document in the next few days.
> > > > > > > > > >> > > >>>
> > > > > > > > > >> > > >>> Thanks, Fabian
> > > > > > > > > >> > > >>>
> > > > > > > > > >> > > >>> 2016-01-10 2:45 GMT+01:00 Henry Saputra <henry.saputra@gmail.com>:
> > > > > > > > > >> > > >>>
> > > > > > > > > >> > > >>>> Hi Fabian,
> > > > > > > > > >> > > >>>>
> > > > > > > > > >> > > >>>> Have you created a JIRA ticket to keep track of this new
> > > > > > > > > >> > > >>>> feature?
> > > > > > > > > >> > > >>>>
> > > > > > > > > >> > > >>>> - Henry
> > > > > > > > >> > > >>>>
> > > > > > > > >> > >
> > > > > > > > >> > > Regards,
> > > > > > > > >> > > Chiwan Park
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Re: Effort to add SQL / StreamSQL to Flink

Posted by "Jiangsong (Hi)" <hi...@huawei.com>.

So excited!! Is SQL on Flink ready?

Are there any showcases or how-to guides for using it?



-----Original Message-----
From: ewenstephan@gmail.com [mailto:ewenstephan@gmail.com] On Behalf Of Stephan Ewen
Sent: March 29, 2016 20:00
To: dev@flink.apache.org
Subject: Re: Re: Effort to add SQL / StreamSQL to Flink

Cool stuff!

SQL coming up next? ;-)


Re: Re: Effort to add SQL / StreamSQL to Flink

Posted by Stephan Ewen <se...@apache.org>.

Cool stuff!

SQL coming up next? ;-)


On Tue, Mar 29, 2016 at 1:39 PM, Maximilian Michels <mx...@apache.org> wrote:

> Yeah! I'm a little late to the party but exciting stuff! :)

Re: Re: Effort to add SQL / StreamSQL to Flink

Posted by Maximilian Michels <mx...@apache.org>.

Yeah! I'm a little late to the party but exciting stuff! :)

On Fri, Mar 18, 2016 at 3:15 PM, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:

> Hi all,
>
> tableOnCalcite has been merged to master :)
>
> Cheers,
> -Vasia.
>
> On 17 March 2016 at 11:11, Fabian Hueske <fh...@gmail.com> wrote:
>
> > Thanks for the initiative Vasia!
> > I went over the diff and didn't find anything crucial.
> >
> > I would like to do another pass over the tests though and improve the
> > exceptions for invalid joins before merging.
> > Will open a PR later today.
> >
> > 2016-03-16 21:17 GMT+01:00 Vasiliki Kalavri <va...@gmail.com>:
> >
> > > Yes, the current state corresponds to Task 1. PR #1770 corresponds to
> > > Task 5. Task 6 should come right after :)
> > >
> > > -V.
> > >
> > > On 16 March 2016 at 20:35, Robert Metzger <rm...@apache.org> wrote:
> > >
> > > > Cool, this is great news!
> > > > So "Task 1" from the document [1] is done with the merge? And PR #1770 is
> > > > going towards "Task 6".
> > > > I think good support for Stream SQL is a very interesting new feature for
> > > > Flink.
> > > >
> > > > [1]
> > > > https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit#heading=h.28dvisn56su0
> > > >
> > > > On Wed, Mar 16, 2016 at 6:17 PM, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:
> > > >
> > > > > Hello everyone,
> > > > >
> > > > > We are happy to announce that the "tableOnCalcite" branch is finally ready
> > > > > to be merged.
> > > > > It essentially provides the existing functionality of the Table API, but
> > > > > now the translation happens through Apache Calcite.
> > > > > You can find the changes rebased on top of the current master in [1].
> > > > > We have removed the prototype streaming Table API functionality, which will
> > > > > be added back once PR [2] is merged.
> > > > >
> > > > > We'll go through the changes once more and, if there are no objections, we
> > > > > would like to go ahead and merge this.
> > > > >
> > > > > Cheers,
> > > > > -Vasia.
> > > > >
> > > > > [1]: https://github.com/vasia/flink/tree/merge-table
> > > > > [2]: https://github.com/apache/flink/pull/1770
> > > > >
> > > > >
> > > > > On 15 January 2016 at 10:59, Fabian Hueske <fh...@gmail.com>
> > wrote:
> > > > >
> > > > > > Hi everybody,
> > > > > >
> > > > > > as previously announced, I pushed a feature branch called
> > > > > "tableOnCalcite"
> > > > > > to the Flink repository.
> > > > > > We will use this branch to work on FLINK-3221 and its sub-issues.
> > > > > >
> > > > > > Cheers, Fabian
> > > > > >
> > > > > > 2016-01-11 18:29 GMT+01:00 Fabian Hueske <fh...@gmail.com>:
> > > > > >
> > > > > > > We haven't defined the StreamSQL syntax yet (and I think it
> will
> > > take
> > > > > > some
> > > > > > > time until we are at that point).
> > > > > > > So we are quite flexible with both featurs.
> > > > > > >
> > > > > > > Let's keep this opportunity in mind and coordinate when before
> > > making
> > > > > > > decisions about CEP or StreamSQL.
> > > > > > >
> > > > > > > Fabian
> > > > > > >
> > > > > > > 2016-01-11 17:29 GMT+01:00 Till Rohrmann <trohrmann@apache.org
> >:
> > > > > > >
> > > > > > >> First of all, it's a great design document. Looking forward
> > having
> > > > > > stream
> > > > > > >> SQL in the foreseeable future :-)
> > > > > > >>
> > > > > > >> I think it is a good idea to consolidate stream SQL and CEP in
> > the
> > > > > long
> > > > > > >> run. CEP's additional features compared to SQL boil down to
> > > pattern
> > > > > > >> detection. Once we have this, it should be only a question of
> > > > defining
> > > > > > the
> > > > > > >> SQL syntax for event patterns in order to integrate CEP with
> > > stream
> > > > > SQL.
> > > > > > >> Oracle has already defined an extension [1] to detect patterns
> > in
> > > a
> > > > > set
> > > > > > of
> > > > > > >> table rows. This or Esper's event processing language (EPL)
> [2]
> > > > could
> > > > > > be a
> > > > > > >> good starting point.
> > > > > > >>
> > > > > > >> [1]
> > > > https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8959
> > > > > > >> [2]
> > > > > http://www.espertech.com/esper/release-5.2.0/esper-reference/html/
> > > > > > >>
> > > > > > >> Cheers,
> > > > > > >> Till
> > > > > > >>
> > > > > > >> On Mon, Jan 11, 2016 at 10:12 AM, Fabian Hueske <
> > > fhueske@gmail.com>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >> > Thanks for the feedback!
> > > > > > >> >
> > > > > > >> > We will start the SQL effort with putting the existing
> (batch)
> > > > Table
> > > > > > >> API on
> > > > > > >> > top of Apache Calcite.
> > > > > > >> > From there we continue to add streaming support for the
> Table
> > > API
> > > > > > >> before we
> > > > > > >> > put a StreamSQL interface on top.
> > > > > > >> >
> > > > > > >> > Consolidating the efforts with the CEP library sounds like a
> > > good
> > > > > idea
> > > > > > >> to
> > > > > > >> > me.
> > > > > > >> > Maybe it can be nicely integrated with the streaming table
> API
> > > and
> > > > > > >> later as
> > > > > > >> > well with the StreamSQL interface (the StreamSQL dialect is
> > not
> > > > > > defined
> > > > > > >> > yet).
> > > > > > >> >
> > > > > > >> > @Till: What do you think about adding CEP features to the
> > Table
> > > > API.
> > > > > > >> From
> > > > > > >> > the CEP design doc, it looks like we need to add a pattern
> > > > matching
> > > > > > >> > operator in addition to the window features that we need to
> > add
> > > > for
> > > > > > >> > streaming Table API in any case.
> > > > > > >> >
> > > > > > >> > Best, Fabian
> > > > > > >> >
> > > > > > >> > 2016-01-11 4:03 GMT+01:00 Jiangsong (Hi) <
> > > hi.jiangsong@huawei.com
> > > > >:
> > > > > > >> >
> > > > > > >> > > I suggest refering to Esper EPL[1], which is a
> SQL-standard
> > > > > language
> > > > > > >> > > extend to offering a cluster of window, pattern matching.
> > EPL
> > > > can
> > > > > > >> both
> > > > > > >> > > support Streaming SQL and CEP with one unified syntax.
> > > > > > >> > >
> > > > > > >> > > [1]
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> http://www.espertech.com/esper/release-5.2.0/esper-reference/pdf/esper_reference.pdf
> > > > > > >> > >   (Chapter 5. EPL Reference: Clauses)
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > > Regards
> > > > > > >> > > Song
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > > -----邮件原件-----
> > > > > > >> > > 发件人: Chiwan Park [mailto:chiwanpark@apache.org]
> > > > > > >> > > 发送时间: 2016年1月11日 10:31
> > > > > > >> > > 收件人: dev@flink.apache.org
> > > > > > >> > > 主题: Re: Effort to add SQL / StreamSQL to Flink
> > > > > > >> > >
> > > > > > >> > > We still don’t have a concensus about the streaming SQL
> and
> > > CEP
> > > > > > >> library
> > > > > > >> > on
> > > > > > >> > > Flink. Some people want to merge these two libraries.
> Maybe
> > we
> > > > > have
> > > > > > to
> > > > > > >> > > discuss about this in mailing list.
> > > > > > >> > >
> > > > > > >> > > > On Jan 11, 2016, at 10:53 AM, Nick Dimiduk <
> > > > ndimiduk@gmail.com>
> > > > > > >> wrote:
> > > > > > >> > > >
> > > > > > >> > > > What's the relationship between the streaming SQL
> proposed
> > > > here
> > > > > > and
> > > > > > >> > > > the CEP syntax proposed earlier in the week?
> > > > > > >> > > >
> > > > > > >> > > > On Sunday, January 10, 2016, Henry Saputra <
> > > > > > henry.saputra@gmail.com
> > > > > > >> >
> > > > > > >> > > wrote:
> > > > > > >> > > >
> > > > > > >> > > >> Awesome! Thanks for the reply, Fabian.
> > > > > > >> > > >>
> > > > > > >> > > >> - Henry
> > > > > > >> > > >>
> > > > > > >> > > >> On Sunday, January 10, 2016, Fabian Hueske <
> > > > fhueske@gmail.com
> > > > > > >> > > >> <javascript:;>> wrote:
> > > > > > >> > > >>
> > > > > > >> > > >>> Hi Henry,
> > > > > > >> > > >>>
> > > > > > >> > > >>> There is
> > https://issues.apache.org/jira/browse/FLINK-2099
> > > > > and a
> > > > > > >> few
> > > > > > >> > > >>> subissues.
> > > > > > >> > > >>> I'll reorganize these and add more issues for the
> tasks
> > > > > > described
> > > > > > >> in
> > > > > > >> > > >>> the design document in the next days.
> > > > > > >> > > >>>
> > > > > > >> > > >>> Thanks, Fabian
> > > > > > >> > > >>>
> > > > > > >> > > >>> 2016-01-10 2:45 GMT+01:00 Henry Saputra <
> > > > > > henry.saputra@gmail.com
> > > > > > >> > > >> <javascript:;>
> > > > > > >> > > >>> <javascript:;>>:
> > > > > > >> > > >>>
> > > > > > >> > > >>>> HI Fabian,
> > > > > > >> > > >>>>
> > > > > > >> > > >>>> Have you created JIRA ticket to keep track of this
> new
> > > > > feature?
> > > > > > >> > > >>>>
> > > > > > >> > > >>>> - Henry
> > > > > > >> > > >>>>
> > > > > > >> > > >>>> On Thu, Jan 7, 2016 at 6:05 AM, Fabian Hueske <
> > > > > > fhueske@gmail.com
> > > > > > >> > > >> <javascript:;>
> > > > > > >> > > >>> <javascript:;>> wrote:
> > > > > > >> > > >>>>> Hi everybody,
> > > > > > >> > > >>>>>
> > > > > > >> > > >>>>> in the last days, Timo and I refined the design
> > document
> > > > for
> > > > > > >> > > >>>>> adding a
> > > > > > >> > > >>>> SQL /
> > > > > > >> > > >>>>> StreamSQL interface on top of Flink that was started
> > by
> > > > > > Stephan.
> > > > > > >> > > >>>>>
> > > > > > >> > > >>>>> The document proposes an architecture that is
> centered
> > > > > around
> > > > > > >> > > >>>>> Apache Calcite. Calcite is an Apache top-level
> project
> > > and
> > > > > > >> > > >>>>> includes a SQL
> > > > > > >> > > >>>> parser,
> > > > > > >> > > >>>>> a semantic validator for relational queries, and a
> > rule-
> > > > and
> > > > > > >> > > >> cost-based
> > > > > > >> > > >>>>> relational optimizer. Calcite is used by Apache Hive
> > and
> > > > > > Apache
> > > > > > >> > > >>>>> Drill (among other projects). In a nutshell, the
> plan
> > is
> > > > to
> > > > > > >> > > >>>>> translate Table
> > > > > > >> > > >>> API
> > > > > > >> > > >>>>> and SQL queries into Calcite's relational expression
> > > > trees,
> > > > > > >> > > >>>>> optimize
> > > > > > >> > > >>>> these
> > > > > > >> > > >>>>> trees, and translate them into DataSet and
> DataStream
> > > > > > >> programs.The
> > > > > > >> > > >>>> document
> > > > > > >> > > >>>>> breaks down the work into several tasks and
> subtasks.
> > > > > > >> > > >>>>>
> > > > > > >> > > >>>>> Please review the design document and comment.
> > > > > > >> > > >>>>>
> > > > > > >> > > >>>>> -- >
> > > > > > >> > > >>>>>
> > > > > > >> > > >>>>
> > > > > > >> > > >>>
> > > > > > >> > > >>
> > > > > > >>
> > > >
> https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjP
> > > > > > >> > > >> cp1h2TVqdI/edit?usp=sharing
> > > > > > >> > > >>>>>
> > > > > > >> > > >>>>> Unless there are major concerns with the design,
> Timo
> > > and
> > > > I
> > > > > > want
> > > > > > >> > > >>>>> to
> > > > > > >> > > >>> start
> > > > > > >> > > >>>>> next week to move the current Table API on top of
> > Apache
> > > > > > Calcite
> > > > > > >> > > >> (Task
> > > > > > >> > > >>> 1
> > > > > > >> > > >>>> in
> > > > > > >> > > >>>>> the document). The goal of this task is to have the
> > same
> > > > > > >> > > >> functionality
> > > > > > >> > > >>> as
> > > > > > >> > > >>>>> currently, but with Calcite in the translation
> > process.
> > > > This
> > > > > > is
> > > > > > >> a
> > > > > > >> > > >>>> blocking
> > > > > > >> > > >>>>> task that we hope to complete soon. Afterwards, we
> can
> > > > > > >> > > >>>>> independently
> > > > > > >> > > >>> work
> > > > > > >> > > >>>>> on different aspects such as extending the Table
> API,
> > > > > adding a
> > > > > > >> SQL
> > > > > > >> > > >>>>> interface (basically just a parser), integration
> with
> > > > > external
> > > > > > >> > > >>>>> data sources, better code generation, optimization
> > > rules,
> > > > > > >> > > >>>>> streaming
> > > > > > >> > > >> support
> > > > > > >> > > >>>> for
> > > > > > >> > > >>>>> the Table API, StreamSQL, etc..
> > > > > > >> > > >>>>>
> > > > > > >> > > >>>>> Timo and I plan to work on a WIP branch to implement
> > > Task
> > > > 1
> > > > > > and
> > > > > > >> > > >>>>> merge
> > > > > > >> > > >>> it
> > > > > > >> > > >>>> to
> > > > > > >> > > >>>>> the master branch once the task is completed. Of
> > course,
> > > > > > >> everybody
> > > > > > >> > > >>>>> is welcome to contribute to this effort. Please let
> us
> > > > know
> > > > > > such
> > > > > > >> > > >>>>> that we
> > > > > > >> > > >>> can
> > > > > > >> > > >>>>> coordinate our efforts.
> > > > > > >> > > >>>>>
> > > > > > >> > > >>>>> Thanks,
> > > > > > >> > > >>>>> Fabian
> > > > > > >> > >
> > > > > > >> > > Regards,
> > > > > > >> > > Chiwan Park
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Reply: Effort to add SQL / StreamSQL to Flink

Posted by Vasiliki Kalavri <va...@gmail.com>.

Hi all,

tableOnCalcite has been merged to master :)

Cheers,
-Vasia.


Re: Reply: Effort to add SQL / StreamSQL to Flink

Posted by Fabian Hueske <fh...@gmail.com>.

Thanks for the initiative Vasia!
I went over the diff and didn't find anything critical.

I would like to do another pass over the tests though and improve the
exceptions for invalid joins before merging.
Will open a PR later today.

2016-03-16 21:17 GMT+01:00 Vasiliki Kalavri <va...@gmail.com>:

> Yes, the current state corresponds to Task 1. PR #1770 corresponds to Task
> 5. Task 6 should come right after :)
>
> -V.
>
> On 16 March 2016 at 20:35, Robert Metzger <rm...@apache.org> wrote:
>
> > Cool, this is great news!
> > So "Task 1" from the document [1] is done with the merge? And PR #1770 is
> > going towards "Task 6".
> > I think good support for Stream SQL is a very interesting new feature for
> > Flink.
> >
> > [1]
> > https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit#heading=h.28dvisn56su0
> >
> > On Wed, Mar 16, 2016 at 6:17 PM, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:
> >
> > > Hello everyone,
> > >
> > > We are happy to announce that the "tableOnCalcite" branch is finally
> > > ready to be merged.
> > > It essentially provides the existing functionality of the Table API, but
> > > now the translation happens through Apache Calcite.
> > > You can find the changes rebased on top of the current master in [1].
> > > We have removed the prototype streaming Table API functionality, which
> > > will be added back once PR [2] is merged.
> > >
> > > We'll go through the changes once more and, if no objections, we would
> > > like to go ahead and merge this.
> > >
> > > Cheers,
> > > -Vasia.
> > >
> > > [1]: https://github.com/vasia/flink/tree/merge-table
> > > [2]: https://github.com/apache/flink/pull/1770
> > >
> > >
> > > On 15 January 2016 at 10:59, Fabian Hueske <fh...@gmail.com> wrote:
> > >
> > > > Hi everybody,
> > > >
> > > > as previously announced, I pushed a feature branch called
> > > > "tableOnCalcite" to the Flink repository.
> > > > We will use this branch to work on FLINK-3221 and its sub-issues.
> > > >
> > > > Cheers, Fabian
> > > >
> > > > 2016-01-11 18:29 GMT+01:00 Fabian Hueske <fh...@gmail.com>:
> > > >
> > > > > We haven't defined the StreamSQL syntax yet (and I think it will take
> > > > > some time until we are at that point).
> > > > > So we are quite flexible with both features.
> > > > >
> > > > > Let's keep this opportunity in mind and coordinate before making
> > > > > decisions about CEP or StreamSQL.
> > > > >
> > > > > Fabian
> > > > >
> > > > > 2016-01-11 17:29 GMT+01:00 Till Rohrmann <tr...@apache.org>:
> > > > >
> > > > > >> First of all, it's a great design document. Looking forward to having
> > > > > >> stream SQL in the foreseeable future :-)
> > > > > >>
> > > > > >> I think it is a good idea to consolidate stream SQL and CEP in the long
> > > > > >> run. CEP's additional features compared to SQL boil down to pattern
> > > > > >> detection. Once we have this, it should be only a question of defining
> > > > > >> the SQL syntax for event patterns in order to integrate CEP with stream
> > > > > >> SQL. Oracle has already defined an extension [1] to detect patterns in
> > > > > >> a set of table rows. This or Esper's event processing language (EPL) [2]
> > > > > >> could be a good starting point.
> > > > > >>
> > > > > >> [1] https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8959
> > > > > >> [2] http://www.espertech.com/esper/release-5.2.0/esper-reference/html/
> > > > >>
> > > > >> Cheers,
> > > > >> Till
> > > > >>
> > > > > >> On Mon, Jan 11, 2016 at 10:12 AM, Fabian Hueske <fhueske@gmail.com> wrote:
> > > > >>
> > > > >> > Thanks for the feedback!
> > > > >> >
> > > > > >> > We will start the SQL effort by putting the existing (batch) Table
> > > > > >> > API on top of Apache Calcite.
> > > > > >> > From there we continue to add streaming support for the Table API
> > > > > >> > before we put a StreamSQL interface on top.
> > > > > >> >
> > > > > >> > Consolidating the efforts with the CEP library sounds like a good idea
> > > > > >> > to me.
> > > > > >> > Maybe it can be nicely integrated with the streaming Table API and
> > > > > >> > later as well with the StreamSQL interface (the StreamSQL dialect is
> > > > > >> > not defined yet).
> > > > > >> >
> > > > > >> > @Till: What do you think about adding CEP features to the Table API?
> > > > > >> > From the CEP design doc, it looks like we need to add a pattern
> > > > > >> > matching operator in addition to the window features that we need to
> > > > > >> > add for streaming Table API in any case.
> > > > >> >
> > > > >> > Best, Fabian
> > > > >> >
> > > > >> > 2016-01-11 4:03 GMT+01:00 Jiangsong (Hi) <
> hi.jiangsong@huawei.com
> > >:
> > > > >> >
> > > > > >> > > I suggest referring to Esper EPL [1], which is a SQL-standard language
> > > > > >> > > extended to offer a set of window and pattern matching features. EPL
> > > > > >> > > can support both streaming SQL and CEP with one unified syntax.
> > > > > >> > >
> > > > > >> > > [1]
> > > > > >> > > http://www.espertech.com/esper/release-5.2.0/esper-reference/pdf/esper_reference.pdf
> > > > > >> > >   (Chapter 5. EPL Reference: Clauses)
> > > > >> > >
> > > > >> > >
> > > > >> > > Regards
> > > > >> > > Song
> > > > >> > >
> > > > >> > >
> > > > >> > > -----邮件原件-----
> > > > >> > > 发件人: Chiwan Park [mailto:chiwanpark@apache.org]
> > > > >> > > 发送时间: 2016年1月11日 10:31
> > > > >> > > 收件人: dev@flink.apache.org
> > > > >> > > 主题: Re: Effort to add SQL / StreamSQL to Flink
> > > > >> > >
> > > > > >> > > We still don’t have a consensus about the streaming SQL and CEP
> > > > > >> > > libraries on Flink. Some people want to merge these two libraries.
> > > > > >> > > Maybe we have to discuss this on the mailing list.
> > > > >> > >
> > > > > >> > > > On Jan 11, 2016, at 10:53 AM, Nick Dimiduk <ndimiduk@gmail.com> wrote:
> > > > > >> > > >
> > > > > >> > > > What's the relationship between the streaming SQL proposed here and
> > > > > >> > > > the CEP syntax proposed earlier in the week?
> > > > >> > > >
> > > > > >> > > > On Sunday, January 10, 2016, Henry Saputra <henry.saputra@gmail.com> wrote:
> > > > >> > > >
> > > > >> > > >> Awesome! Thanks for the reply, Fabian.
> > > > >> > > >>
> > > > >> > > >> - Henry
> > > > >> > > >>
> > > > > >> > > >> On Sunday, January 10, 2016, Fabian Hueske <fhueske@gmail.com> wrote:
> > > > >> > > >>
> > > > >> > > >>> Hi Henry,
> > > > >> > > >>>
> > > > > >> > > >>> There is https://issues.apache.org/jira/browse/FLINK-2099 and a few
> > > > > >> > > >>> subissues.
> > > > > >> > > >>> I'll reorganize these and add more issues for the tasks described in
> > > > > >> > > >>> the design document in the next days.
> > > > >> > > >>>
> > > > >> > > >>> Thanks, Fabian
> > > > >> > > >>>
> > > > >> > > >>> 2016-01-10 2:45 GMT+01:00 Henry Saputra <
> > > > henry.saputra@gmail.com
> > > > >> > > >> <javascript:;>
> > > > >> > > >>> <javascript:;>>:
> > > > >> > > >>>
> > > > > >> > > >>>> Hi Fabian,
> > > > > >> > > >>>>
> > > > > >> > > >>>> Have you created a JIRA ticket to keep track of this new feature?
> > > > >> > > >>>>
> > > > >> > > >>>> - Henry
> > > > >> > > >>>>
> > > > > >> > > >>>> On Thu, Jan 7, 2016 at 6:05 AM, Fabian Hueske <fhueske@gmail.com> wrote:
> > > > >> > > >>>>> Hi everybody,
> > > > >> > > >>>>>
> > > > > >> > > >>>>> in the last days, Timo and I refined the design document for adding a SQL /
> > > > > >> > > >>>>> StreamSQL interface on top of Flink that was started by Stephan.
> > > > >> > > >>>>>
> > > > > >> > > >>>>> The document proposes an architecture that is centered around Apache
> > > > > >> > > >>>>> Calcite. Calcite is an Apache top-level project and includes a SQL parser,
> > > > > >> > > >>>>> a semantic validator for relational queries, and a rule- and cost-based
> > > > > >> > > >>>>> relational optimizer. Calcite is used by Apache Hive and Apache Drill
> > > > > >> > > >>>>> (among other projects). In a nutshell, the plan is to translate Table API
> > > > > >> > > >>>>> and SQL queries into Calcite's relational expression trees, optimize these
> > > > > >> > > >>>>> trees, and translate them into DataSet and DataStream programs. The document
> > > > > >> > > >>>>> breaks down the work into several tasks and subtasks.
> > > > >> > > >>>>>
> > > > >> > > >>>>> Please review the design document and comment.
> > > > >> > > >>>>>
> > > > >> > > >>>>> -- >
> > > > >> > > >>>>>
> > > > >> > > >>>>
> > > > >> > > >>>
> > > > >> > > >>
> > > > >>
> > https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjP
> > > > >> > > >> cp1h2TVqdI/edit?usp=sharing
> > > > >> > > >>>>>
> > > > >> > > >>>>> Unless there are major concerns with the design, Timo
> and
> > I
> > > > want
> > > > >> > > >>>>> to
> > > > >> > > >>> start
> > > > >> > > >>>>> next week to move the current Table API on top of Apache
> > > > Calcite
> > > > >> > > >> (Task
> > > > >> > > >>> 1
> > > > >> > > >>>> in
> > > > >> > > >>>>> the document). The goal of this task is to have the same
> > > > >> > > >> functionality
> > > > >> > > >>> as
> > > > >> > > >>>>> currently, but with Calcite in the translation process.
> > This
> > > > is
> > > > >> a
> > > > >> > > >>>> blocking
> > > > >> > > >>>>> task that we hope to complete soon. Afterwards, we can
> > > > >> > > >>>>> independently
> > > > >> > > >>> work
> > > > >> > > >>>>> on different aspects such as extending the Table API,
> > > adding a
> > > > >> SQL
> > > > >> > > >>>>> interface (basically just a parser), integration with
> > > external
> > > > >> > > >>>>> data sources, better code generation, optimization
> rules,
> > > > >> > > >>>>> streaming
> > > > >> > > >> support
> > > > >> > > >>>> for
> > > > >> > > >>>>> the Table API, StreamSQL, etc..
> > > > >> > > >>>>>
> > > > >> > > >>>>> Timo and I plan to work on a WIP branch to implement
> Task
> > 1
> > > > and
> > > > >> > > >>>>> merge
> > > > >> > > >>> it
> > > > >> > > >>>> to
> > > > >> > > >>>>> the master branch once the task is completed. Of course,
> > > > >> everybody
> > > > >> > > >>>>> is welcome to contribute to this effort. Please let us
> > know
> > > > such
> > > > >> > > >>>>> that we
> > > > >> > > >>> can
> > > > >> > > >>>>> coordinate our efforts.
> > > > >> > > >>>>>
> > > > >> > > >>>>> Thanks,
> > > > >> > > >>>>> Fabian
> > > > >> > >
> > > > >> > > Regards,
> > > > >> > > Chiwan Park
> > > > >> > >
> > > > >> > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>

Re: Reply: Effort to add SQL / StreamSQL to Flink

Posted by Vasiliki Kalavri <va...@gmail.com>.

Yes, the current state corresponds to Task 1. PR #1770 corresponds to Task
5. Task 6 should come right after :)

-V.

On 16 March 2016 at 20:35, Robert Metzger <rm...@apache.org> wrote:

> Cool, this is great news!
> So "Task 1" from the document [1] is done with the merge? And PR #1770 is
> going towards "Task 6".
> I think good support for Stream SQL is a very interesting new feature for
> Flink.
>
> [1]
> https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit#heading=h.28dvisn56su0

Re: Reply: Effort to add SQL / StreamSQL to Flink

Posted by Robert Metzger <rm...@apache.org>.

Cool, this is great news!
So "Task 1" from the document [1] is done with the merge? And PR #1770 is
going towards "Task 6".
I think good support for Stream SQL is a very interesting new feature for
Flink.

[1]
https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit#heading=h.28dvisn56su0

On Wed, Mar 16, 2016 at 6:17 PM, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:

> Hello everyone,
>
> We are happy to announce that the "tableOnCalcite" branch is finally ready
> to be merged.
> It essentially provides the existing functionality of the Table API, but
> now the translation happens through Apache Calcite.
> You can find the changes rebased on top of the current master in [1].
> We have removed the prototype streaming Table API functionality, which will
> be added back once PR [2] is merged.
>
> We'll go through the changes once more and, if there are no objections, we
> would like to go ahead and merge this.
>
> Cheers,
> -Vasia.
>
> [1]: https://github.com/vasia/flink/tree/merge-table
> [2]: https://github.com/apache/flink/pull/1770

Re: Reply: Effort to add SQL / StreamSQL to Flink

Posted by Vasiliki Kalavri <va...@gmail.com>.

Hello everyone,

We are happy to announce that the "tableOnCalcite" branch is finally ready
to be merged.
It essentially provides the existing functionality of the Table API, but
now the translation happens through Apache Calcite.
You can find the changes rebased on top of the current master in [1].
We have removed the prototype streaming Table API functionality, which will
be added back once PR [2] is merged.

We'll go through the changes once more and, if there are no objections, we
would like to go ahead and merge this.

Cheers,
-Vasia.

[1]: https://github.com/vasia/flink/tree/merge-table
[2]: https://github.com/apache/flink/pull/1770
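
For readers new to the approach, the flow described above (build a relational
expression tree, optimize it with rewrite rules, then translate it into an
executable program) can be sketched roughly as follows. This is only a toy
illustration in Python, not Flink or Calcite code: the Scan, Filter, and
Project classes, the two rewrite rules, and the "orders" table are all
hypothetical names made up for the example, and plain Python lists stand in
for a DataSet program.

```python
import operator
from dataclasses import dataclass

# Comparison operators supported by the toy predicates.
OPS = {"==": operator.eq, ">": operator.gt, "<": operator.lt}


# --- Toy relational expression tree (hypothetical, not Calcite's classes) ---
@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    child: object
    predicates: tuple  # tuple of (field, op, value) triples, ANDed together


@dataclass(frozen=True)
class Project:
    child: object
    fields: tuple


def optimize(node):
    """Rewrite the tree bottom-up with two simple rules."""
    if isinstance(node, Scan):
        return node
    child = optimize(node.child)
    if isinstance(node, Filter):
        if isinstance(child, Filter):
            # Rule 1: merge two adjacent filters into one.
            return optimize(Filter(child.child, child.predicates + node.predicates))
        if isinstance(child, Project):
            # Rule 2: push the filter below the projection.
            return Project(optimize(Filter(child.child, node.predicates)), child.fields)
        return Filter(child, node.predicates)
    return Project(child, node.fields)


def translate(node, tables):
    """Turn a tree into a zero-argument callable over in-memory tables."""
    if isinstance(node, Scan):
        return lambda: list(tables[node.table])
    source = translate(node.child, tables)
    if isinstance(node, Filter):
        preds = node.predicates
        return lambda: [
            row for row in source()
            if all(OPS[op](row[field], value) for field, op, value in preds)
        ]
    fields = node.fields
    return lambda: [{f: row[f] for f in fields} for row in source()]


tables = {
    "orders": [
        {"id": 1, "user": "a", "amount": 10},
        {"id": 2, "user": "b", "amount": 50},
        {"id": 3, "user": "a", "amount": 70},
    ]
}

# A naively built plan: filter, project, then filter again.
plan = Filter(
    Project(Filter(Scan("orders"), (("amount", ">", 20),)), ("user", "amount")),
    (("user", "==", "a"),),
)
optimized = optimize(plan)
result = translate(optimized, tables)()
print(optimized)
print(result)
```

The optimized plan and the naive plan return the same rows; the rewrite only
changes where the filtering happens. A real optimizer such as Calcite's
additionally uses cost estimates to choose between alternative plans, which
this sketch leaves out.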


On 15 January 2016 at 10:59, Fabian Hueske <fh...@gmail.com> wrote:

> Hi everybody,
>
> as previously announced, I pushed a feature branch called "tableOnCalcite"
> to the Flink repository.
> We will use this branch to work on FLINK-3221 and its sub-issues.
>
> Cheers, Fabian

Re: Reply: Effort to add SQL / StreamSQL to Flink

Posted by Fabian Hueske <fh...@gmail.com>.

Hi everybody,

as previously announced, I pushed a feature branch called "tableOnCalcite"
to the Flink repository.
We will use this branch to work on FLINK-3221 and its sub-issues.

Cheers, Fabian

2016-01-11 18:29 GMT+01:00 Fabian Hueske <fh...@gmail.com>:

> We haven't defined the StreamSQL syntax yet (and I think it will take some
> time until we are at that point).
> So we are quite flexible with both features.
>
> Let's keep this opportunity in mind and coordinate before making
> decisions about CEP or StreamSQL.
>
> Fabian
>
> 2016-01-11 17:29 GMT+01:00 Till Rohrmann <tr...@apache.org>:
>
>> First of all, it's a great design document. Looking forward to having
>> stream SQL in the foreseeable future :-)
>>
>> I think it is a good idea to consolidate stream SQL and CEP in the long
>> run. CEP's additional features compared to SQL boil down to pattern
>> detection. Once we have this, it should be only a question of defining the
>> SQL syntax for event patterns in order to integrate CEP with stream SQL.
>> Oracle has already defined an extension [1] to detect patterns in a set of
>> table rows. This or Esper's event processing language (EPL) [2] could be a
>> good starting point.
>>
>> [1] https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8959
>> [2] http://www.espertech.com/esper/release-5.2.0/esper-reference/html/
>>
>> Cheers,
>> Till
>>
>> On Mon, Jan 11, 2016 at 10:12 AM, Fabian Hueske <fh...@gmail.com>
>> wrote:
>>
>> > Thanks for the feedback!
>> >
>> > We will start the SQL effort with putting the existing (batch) Table
>> API on
>> > top of Apache Calcite.
>> > From there we continue to add streaming support for the Table API
>> before we
>> > put a StreamSQL interface on top.
>> >
>> > Consolidating the efforts with the CEP library sounds like a good idea
>> to
>> > me.
>> > Maybe it can be nicely integrated with the streaming table API and
>> later as
>> > well with the StreamSQL interface (the StreamSQL dialect is not defined
>> > yet).
>> >
>> > @Till: What do you think about adding CEP features to the Table API.
>> From
>> > the CEP design doc, it looks like we need to add a pattern matching
>> > operator in addition to the window features that we need to add for
>> > streaming Table API in any case.
>> >
>> > Best, Fabian
>> >
>> > 2016-01-11 4:03 GMT+01:00 Jiangsong (Hi) <hi...@huawei.com>:
>> >
>> > > I suggest refering to Esper EPL[1], which is a SQL-standard language
>> > > extend to offering a cluster of window, pattern matching.  EPL can
>> both
>> > > support Streaming SQL and CEP with one unified syntax.
>> > >
>> > > [1]
>> > >
>> >
>> http://www.espertech.com/esper/release-5.2.0/esper-reference/pdf/esper_reference.pdf
>> > >   (Chapter 5. EPL Reference: Clauses)
>> > >
>> > >
>> > > Regards
>> > > Song
>> > >
>> > >
>> > > -----邮件原件-----
>> > > 发件人: Chiwan Park [mailto:chiwanpark@apache.org]
>> > > 发送时间: 2016年1月11日 10:31
>> > > 收件人: dev@flink.apache.org
>> > > 主题: Re: Effort to add SQL / StreamSQL to Flink
>> > >
>> > > We still don’t have a concensus about the streaming SQL and CEP
>> library
>> > on
>> > > Flink. Some people want to merge these two libraries. Maybe we have to
>> > > discuss about this in mailing list.
>> > >
>> > > > On Jan 11, 2016, at 10:53 AM, Nick Dimiduk <nd...@gmail.com>
>> wrote:
>> > > >
>> > > > What's the relationship between the streaming SQL proposed here and
>> > > > the CEP syntax proposed earlier in the week?
>> > > >
>> > > > On Sunday, January 10, 2016, Henry Saputra <henry.saputra@gmail.com
>> >
>> > > wrote:
>> > > >
>> > > >> Awesome! Thanks for the reply, Fabian.
>> > > >>
>> > > >> - Henry
>> > > >>
>> > > >> On Sunday, January 10, 2016, Fabian Hueske <fhueske@gmail.com
>> > > >> <javascript:;>> wrote:
>> > > >>
>> > > >>> Hi Henry,
>> > > >>>
>> > > >>> There is https://issues.apache.org/jira/browse/FLINK-2099 and a
>> few
>> > > >>> subissues.
>> > > >>> I'll reorganize these and add more issues for the tasks described
>> in
>> > > >>> the design document in the next days.
>> > > >>>
>> > > >>> Thanks, Fabian
>> > > >>>
>> > > >>> 2016-01-10 2:45 GMT+01:00 Henry Saputra <henry.saputra@gmail.com
>> > > >> <javascript:;>
>> > > >>> <javascript:;>>:
>> > > >>>
>> > > >>>> HI Fabian,
>> > > >>>>
>> > > >>>> Have you created JIRA ticket to keep track of this new feature?
>> > > >>>>
>> > > >>>> - Henry
>> > > >>>>
>> > > >>>> On Thu, Jan 7, 2016 at 6:05 AM, Fabian Hueske <fhueske@gmail.com
>> > > >> <javascript:;>
>> > > >>> <javascript:;>> wrote:
>> > > >>>>> Hi everybody,
>> > > >>>>>
>> > > >>>>> in the last days, Timo and I refined the design document for
>> > > >>>>> adding a
>> > > >>>> SQL /
>> > > >>>>> StreamSQL interface on top of Flink that was started by Stephan.
>> > > >>>>>
>> > > >>>>> The document proposes an architecture that is centered around
>> > > >>>>> Apache Calcite. Calcite is an Apache top-level project and
>> > > >>>>> includes a SQL
>> > > >>>> parser,
>> > > >>>>> a semantic validator for relational queries, and a rule- and
>> > > >> cost-based
>> > > >>>>> relational optimizer. Calcite is used by Apache Hive and Apache
>> > > >>>>> Drill (among other projects). In a nutshell, the plan is to
>> > > >>>>> translate Table
>> > > >>> API
>> > > >>>>> and SQL queries into Calcite's relational expression trees,
>> > > >>>>> optimize
>> > > >>>> these
>> > > >>>>> trees, and translate them into DataSet and DataStream
>> programs.The
>> > > >>>> document
>> > > >>>>> breaks down the work into several tasks and subtasks.
>> > > >>>>>
>> > > >>>>> Please review the design document and comment.
>> > > >>>>>
>> > > >>>>> -- >
>> > > >>>>>
>> > > >>>>
>> > > >>>
>> > > >>
>> https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjP
>> > > >> cp1h2TVqdI/edit?usp=sharing
>> > > >>>>>
>> > > >>>>> Unless there are major concerns with the design, Timo and I want
>> > > >>>>> to
>> > > >>> start
>> > > >>>>> next week to move the current Table API on top of Apache Calcite
>> > > >> (Task
>> > > >>> 1
>> > > >>>> in
>> > > >>>>> the document). The goal of this task is to have the same
>> > > >> functionality
>> > > >>> as
>> > > >>>>> currently, but with Calcite in the translation process. This is
>> a
>> > > >>>> blocking
>> > > >>>>> task that we hope to complete soon. Afterwards, we can
>> > > >>>>> independently
>> > > >>> work
>> > > >>>>> on different aspects such as extending the Table API, adding a
>> SQL
>> > > >>>>> interface (basically just a parser), integration with external
>> > > >>>>> data sources, better code generation, optimization rules,
>> > > >>>>> streaming
>> > > >> support
>> > > >>>> for
>> > > >>>>> the Table API, StreamSQL, etc..
>> > > >>>>>
>> > > >>>>> Timo and I plan to work on a WIP branch to implement Task 1 and
>> > > >>>>> merge
>> > > >>> it
>> > > >>>> to
>> > > >>>>> the master branch once the task is completed. Of course,
>> everybody
>> > > >>>>> is welcome to contribute to this effort. Please let us know such
>> > > >>>>> that we
>> > > >>> can
>> > > >>>>> coordinate our efforts.
>> > > >>>>>
>> > > >>>>> Thanks,
>> > > >>>>> Fabian
>> > >
>> > > Regards,
>> > > Chiwan Park
>> > >
>> > >
>> > >
>> >
>>
>
>

Re: Reply: Effort to add SQL / StreamSQL to Flink

Posted by Fabian Hueske <fh...@gmail.com>.

We haven't defined the StreamSQL syntax yet (and I think it will take some
time until we are at that point).
So we are quite flexible with both features.

Let's keep this opportunity in mind and coordinate before making
decisions about CEP or StreamSQL.

Fabian

2016-01-11 17:29 GMT+01:00 Till Rohrmann <tr...@apache.org>:

> First of all, it's a great design document. Looking forward to having
> stream SQL in the foreseeable future :-)
>
> I think it is a good idea to consolidate stream SQL and CEP in the long
> run. CEP's additional features compared to SQL boil down to pattern
> detection. Once we have this, it should be only a question of defining the
> SQL syntax for event patterns in order to integrate CEP with stream SQL.
> Oracle has already defined an extension [1] to detect patterns in a set of
> table rows. This or Esper's event processing language (EPL) [2] could be a
> good starting point.
>
> [1] https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8959
> [2] http://www.espertech.com/esper/release-5.2.0/esper-reference/html/
>
> Cheers,
> Till

Re: Reply: Effort to add SQL / StreamSQL to Flink

Posted by Till Rohrmann <tr...@apache.org>.

First of all, it's a great design document. Looking forward to having stream
SQL in the foreseeable future :-)

I think it is a good idea to consolidate stream SQL and CEP in the long
run. CEP's additional features compared to SQL boil down to pattern
detection. Once we have this, it should be only a question of defining the
SQL syntax for event patterns in order to integrate CEP with stream SQL.
Oracle has already defined an extension [1] to detect patterns in a set of
table rows. This or Esper's event processing language (EPL) [2] could be a
good starting point.

[1] https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8959
[2] http://www.espertech.com/esper/release-5.2.0/esper-reference/html/
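
As a rough illustration of what "detecting patterns in a set of table rows"
means, here is a minimal sketch in plain Python. It is neither Oracle's
syntax nor a Flink or Esper API; the function name find_v_shapes and the
sample data are made up for this example. It looks for a "V-shape" in an
ordered column of values, i.e. one or more falling steps followed by one or
more rising steps:

```python
from typing import List, Tuple


def find_v_shapes(prices: List[float]) -> List[Tuple[int, int, int]]:
    """Return (start, bottom, end) index triples, one per V-shape.

    A V-shape is one or more strictly falling steps (DOWN+) followed by
    one or more strictly rising steps (UP+). After a match, scanning
    resumes at the last row of that match. Hypothetical helper, for
    illustration only.
    """
    matches = []
    i = 0
    n = len(prices)
    while i < n - 1:
        start = i
        # DOWN+: consume strictly falling steps.
        while i < n - 1 and prices[i + 1] < prices[i]:
            i += 1
        bottom = i
        if bottom == start:
            # No falling step at this row, try the next one.
            i += 1
            continue
        # UP+: consume strictly rising steps.
        while i < n - 1 and prices[i + 1] > prices[i]:
            i += 1
        if i > bottom:
            matches.append((start, bottom, i))
        # If no rising step followed, i == bottom > start, so the loop
        # still makes progress.
    return matches


rows = [10, 8, 6, 7, 9, 5, 4, 6]
print(find_v_shapes(rows))
```

A SQL syntax for event patterns would let a user declare such a pattern
(for example as a sequence of row conditions) instead of hand-writing the
scan loop.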

Cheers,
Till

On Mon, Jan 11, 2016 at 10:12 AM, Fabian Hueske <fh...@gmail.com> wrote:

> Thanks for the feedback!
>
> We will start the SQL effort with putting the existing (batch) Table API on
> top of Apache Calcite.
> From there we continue to add streaming support for the Table API before we
> put a StreamSQL interface on top.
>
> Consolidating the efforts with the CEP library sounds like a good idea to
> me.
> Maybe it can be nicely integrated with the streaming table API and later as
> well with the StreamSQL interface (the StreamSQL dialect is not defined
> yet).
>
> @Till: What do you think about adding CEP features to the Table API? From
> the CEP design doc, it looks like we need to add a pattern matching
> operator in addition to the window features that we need to add for
> streaming Table API in any case.
>
> Best, Fabian

Re: Reply: Effort to add SQL / StreamSQL to Flink

Posted by Fabian Hueske <fh...@gmail.com>.

Thanks for the feedback!

We will start the SQL effort with putting the existing (batch) Table API on
top of Apache Calcite.
From there we continue to add streaming support for the Table API before we
put a StreamSQL interface on top.

Consolidating the efforts with the CEP library sounds like a good idea to
me.
Maybe it can be nicely integrated with the streaming table API and later as
well with the StreamSQL interface (the StreamSQL dialect is not defined
yet).

@Till: What do you think about adding CEP features to the Table API? From
the CEP design doc, it looks like we need to add a pattern matching
operator in addition to the window features that we need to add for
streaming Table API in any case.

Best, Fabian

2016-01-11 4:03 GMT+01:00 Jiangsong (Hi) <hi...@huawei.com>:

> I suggest referring to Esper EPL [1], a language that extends standard SQL
> with a family of window and pattern-matching constructs. EPL can support
> both streaming SQL and CEP with one unified syntax.
>
> [1]
> http://www.espertech.com/esper/release-5.2.0/esper-reference/pdf/esper_reference.pdf
>   (Chapter 5. EPL Reference: Clauses)
>
>
> Regards
> Song
>
>
> -----Original Message-----
> From: Chiwan Park [mailto:chiwanpark@apache.org]
> Sent: January 11, 2016 10:31
> To: dev@flink.apache.org
> Subject: Re: Effort to add SQL / StreamSQL to Flink
>
> We still don’t have a consensus about the streaming SQL and CEP libraries
> in Flink. Some people want to merge these two libraries. Maybe we have to
> discuss this on the mailing list.
>
> Regards,
> Chiwan Park
>
>
>

Reply: Effort to add SQL / StreamSQL to Flink

Posted by "Jiangsong (Hi)" <hi...@huawei.com>.

I suggest referring to Esper EPL [1], a language that extends standard SQL with a family of window and pattern-matching constructs. EPL can support both streaming SQL and CEP with one unified syntax.

[1] http://www.espertech.com/esper/release-5.2.0/esper-reference/pdf/esper_reference.pdf (Chapter 5. EPL Reference: Clauses)


Regards
Song


-----Original Message-----
From: Chiwan Park [mailto:chiwanpark@apache.org]
Sent: January 11, 2016 10:31
To: dev@flink.apache.org
Subject: Re: Effort to add SQL / StreamSQL to Flink

We still don’t have a consensus about the streaming SQL and CEP libraries in Flink. Some people want to merge these two libraries. Maybe we have to discuss this on the mailing list.

> On Jan 11, 2016, at 10:53 AM, Nick Dimiduk <nd...@gmail.com> wrote:
> 
> What's the relationship between the streaming SQL proposed here and 
> the CEP syntax proposed earlier in the week?

Regards,
Chiwan Park

Re: Effort to add SQL / StreamSQL to Flink

Posted by Chiwan Park <ch...@apache.org>.

We still don’t have a consensus about the streaming SQL and CEP libraries in Flink. Some people want to merge these two libraries. Maybe we have to discuss this on the mailing list.

> On Jan 11, 2016, at 10:53 AM, Nick Dimiduk <nd...@gmail.com> wrote:
> 
> What's the relationship between the streaming SQL proposed here and the CEP
> syntax proposed earlier in the week?

Regards,
Chiwan Park

Re: Effort to add SQL / StreamSQL to Flink

Posted by Nick Dimiduk <nd...@gmail.com>.

What's the relationship between the streaming SQL proposed here and the CEP
syntax proposed earlier in the week?

On Sunday, January 10, 2016, Henry Saputra <he...@gmail.com> wrote:

> Awesome! Thanks for the reply, Fabian.
>
> - Henry
>
> On Sunday, January 10, 2016, Fabian Hueske <fhueske@gmail.com> wrote:
>
> > Hi Henry,
> >
> > There is https://issues.apache.org/jira/browse/FLINK-2099 and a few
> > subissues.
> > I'll reorganize these and add more issues for the tasks described in the
> > design document in the next days.
> >
> > Thanks, Fabian
> >
> > 2016-01-10 2:45 GMT+01:00 Henry Saputra <henry.saputra@gmail.com>:
> >
> > > Hi Fabian,
> > >
> > > Have you created a JIRA ticket to keep track of this new feature?
> > >
> > > - Henry
> > >
> > > On Thu, Jan 7, 2016 at 6:05 AM, Fabian Hueske <fhueske@gmail.com> wrote:
> > > > Hi everybody,
> > > >
> > > > in the last days, Timo and I refined the design document for adding a SQL /
> > > > StreamSQL interface on top of Flink that was started by Stephan.
> > > >
> > > > The document proposes an architecture that is centered around Apache
> > > > Calcite. Calcite is an Apache top-level project and includes a SQL parser,
> > > > a semantic validator for relational queries, and a rule- and cost-based
> > > > relational optimizer. Calcite is used by Apache Hive and Apache Drill
> > > > (among other projects). In a nutshell, the plan is to translate Table API
> > > > and SQL queries into Calcite's relational expression trees, optimize these
> > > > trees, and translate them into DataSet and DataStream programs.The document
> > > > breaks down the work into several tasks and subtasks.
> > > >
> > > > Please review the design document and comment.
> > > >
> > > > -- >
> > > > https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit?usp=sharing
> > > >
> > > > Unless there are major concerns with the design, Timo and I want to start
> > > > next week to move the current Table API on top of Apache Calcite (Task 1 in
> > > > the document). The goal of this task is to have the same functionality as
> > > > currently, but with Calcite in the translation process. This is a blocking
> > > > task that we hope to complete soon. Afterwards, we can independently work
> > > > on different aspects such as extending the Table API, adding a SQL
> > > > interface (basically just a parser), integration with external data
> > > > sources, better code generation, optimization rules, streaming support for
> > > > the Table API, StreamSQL, etc..
> > > >
> > > > Timo and I plan to work on a WIP branch to implement Task 1 and merge it to
> > > > the master branch once the task is completed. Of course, everybody is
> > > > welcome to contribute to this effort. Please let us know such that we can
> > > > coordinate our efforts.
> > > >
> > > > Thanks,
> > > > Fabian
> > >
> >
>

Re: Effort to add SQL / StreamSQL to Flink

Posted by Henry Saputra <he...@gmail.com>.

Awesome! Thanks for the reply, Fabian.

- Henry

On Sunday, January 10, 2016, Fabian Hueske <fh...@gmail.com> wrote:

> Hi Henry,
>
> There is https://issues.apache.org/jira/browse/FLINK-2099 and a few
> subissues.
> I'll reorganize these and add more issues for the tasks described in the
> design document in the next days.
>
> Thanks, Fabian
>
> 2016-01-10 2:45 GMT+01:00 Henry Saputra <henry.saputra@gmail.com>:
>
> > HI Fabian,
> >
> > Have you created JIRA ticket to keep track of this new feature?
> >
> > - Henry
> >
> > On Thu, Jan 7, 2016 at 6:05 AM, Fabian Hueske <fhueske@gmail.com> wrote:
> > > Hi everybody,
> > >
> > > in the last days, Timo and I refined the design document for adding a SQL /
> > > StreamSQL interface on top of Flink that was started by Stephan.
> > >
> > > The document proposes an architecture that is centered around Apache
> > > Calcite. Calcite is an Apache top-level project and includes a SQL parser,
> > > a semantic validator for relational queries, and a rule- and cost-based
> > > relational optimizer. Calcite is used by Apache Hive and Apache Drill
> > > (among other projects). In a nutshell, the plan is to translate Table API
> > > and SQL queries into Calcite's relational expression trees, optimize these
> > > trees, and translate them into DataSet and DataStream programs.The document
> > > breaks down the work into several tasks and subtasks.
> > >
> > > Please review the design document and comment.
> > >
> > > -- >
> > > https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit?usp=sharing
> > >
> > > Unless there are major concerns with the design, Timo and I want to start
> > > next week to move the current Table API on top of Apache Calcite (Task 1 in
> > > the document). The goal of this task is to have the same functionality as
> > > currently, but with Calcite in the translation process. This is a blocking
> > > task that we hope to complete soon. Afterwards, we can independently work
> > > on different aspects such as extending the Table API, adding a SQL
> > > interface (basically just a parser), integration with external data
> > > sources, better code generation, optimization rules, streaming support for
> > > the Table API, StreamSQL, etc..
> > >
> > > Timo and I plan to work on a WIP branch to implement Task 1 and merge it to
> > > the master branch once the task is completed. Of course, everybody is
> > > welcome to contribute to this effort. Please let us know such that we can
> > > coordinate our efforts.
> > >
> > > Thanks,
> > > Fabian
> >
>

Re: Effort to add SQL / StreamSQL to Flink

Posted by Fabian Hueske <fh...@gmail.com>.

Hi Henry,

There is https://issues.apache.org/jira/browse/FLINK-2099 and a few
subissues.
I'll reorganize these and add more issues for the tasks described in the
design document in the next few days.

Thanks, Fabian

2016-01-10 2:45 GMT+01:00 Henry Saputra <he...@gmail.com>:

> HI Fabian,
>
> Have you created JIRA ticket to keep track of this new feature?
>
> - Henry
>
> On Thu, Jan 7, 2016 at 6:05 AM, Fabian Hueske <fh...@gmail.com> wrote:
> > Hi everybody,
> >
> > in the last days, Timo and I refined the design document for adding a SQL /
> > StreamSQL interface on top of Flink that was started by Stephan.
> >
> > The document proposes an architecture that is centered around Apache
> > Calcite. Calcite is an Apache top-level project and includes a SQL parser,
> > a semantic validator for relational queries, and a rule- and cost-based
> > relational optimizer. Calcite is used by Apache Hive and Apache Drill
> > (among other projects). In a nutshell, the plan is to translate Table API
> > and SQL queries into Calcite's relational expression trees, optimize these
> > trees, and translate them into DataSet and DataStream programs.The document
> > breaks down the work into several tasks and subtasks.
> >
> > Please review the design document and comment.
> >
> > -- >
> > https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit?usp=sharing
> >
> > Unless there are major concerns with the design, Timo and I want to start
> > next week to move the current Table API on top of Apache Calcite (Task 1 in
> > the document). The goal of this task is to have the same functionality as
> > currently, but with Calcite in the translation process. This is a blocking
> > task that we hope to complete soon. Afterwards, we can independently work
> > on different aspects such as extending the Table API, adding a SQL
> > interface (basically just a parser), integration with external data
> > sources, better code generation, optimization rules, streaming support for
> > the Table API, StreamSQL, etc..
> >
> > Timo and I plan to work on a WIP branch to implement Task 1 and merge it to
> > the master branch once the task is completed. Of course, everybody is
> > welcome to contribute to this effort. Please let us know such that we can
> > coordinate our efforts.
> >
> > Thanks,
> > Fabian
>

Re: Effort to add SQL / StreamSQL to Flink

Posted by Henry Saputra <he...@gmail.com>.

Hi Fabian,

Have you created a JIRA ticket to keep track of this new feature?

- Henry

On Thu, Jan 7, 2016 at 6:05 AM, Fabian Hueske <fh...@gmail.com> wrote:
> Hi everybody,
>
> in the last days, Timo and I refined the design document for adding a SQL /
> StreamSQL interface on top of Flink that was started by Stephan.
>
> The document proposes an architecture that is centered around Apache
> Calcite. Calcite is an Apache top-level project and includes a SQL parser,
> a semantic validator for relational queries, and a rule- and cost-based
> relational optimizer. Calcite is used by Apache Hive and Apache Drill
> (among other projects). In a nutshell, the plan is to translate Table API
> and SQL queries into Calcite's relational expression trees, optimize these
> trees, and translate them into DataSet and DataStream programs.The document
> breaks down the work into several tasks and subtasks.
>
> Please review the design document and comment.
>
> -- >
> https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit?usp=sharing
>
> Unless there are major concerns with the design, Timo and I want to start
> next week to move the current Table API on top of Apache Calcite (Task 1 in
> the document). The goal of this task is to have the same functionality as
> currently, but with Calcite in the translation process. This is a blocking
> task that we hope to complete soon. Afterwards, we can independently work
> on different aspects such as extending the Table API, adding a SQL
> interface (basically just a parser), integration with external data
> sources, better code generation, optimization rules, streaming support for
> the Table API, StreamSQL, etc..
>
> Timo and I plan to work on a WIP branch to implement Task 1 and merge it to
> the master branch once the task is completed. Of course, everybody is
> welcome to contribute to this effort. Please let us know such that we can
> coordinate our efforts.
>
> Thanks,
> Fabian