Posted to dev@flink.apache.org by godfrey he <go...@gmail.com> on 2020/02/05 09:44:44 UTC

Re: [DISCUSS] FLIP-91 - Support SQL Client Gateway

Hi all,

I also agree with Stephan and Timo that the SQL Client should be a simple
"shell around the table environment". As for "making this a standalone
project", I agree with Timo: keeping the SQL Client in the Flink codebase
keeps it complete (both embedded mode and gateway mode) and preserves the
out-of-the-box experience.

> 1) Can we remove the JDBC discussion from this FLIP? It can be a
> separate discussion. Let's focus on the gateway first.

JDBC is only a small part of the whole discussion, and the FLIP only covers
JDBC on batch. So we can create another FLIP to discuss JDBC later.

> 2) Flink's current REST API is a custom implementation and does not rely
> on any REST framework. It might make sense to not reuse code from
> flink-runtime but use one of the commonly used frameworks. Maybe @Chesnay
> in CC might have an opinion here?

I think we can create a new module named flink-rest-server as the common
framework for Flink's REST APIs and move the common REST-related code from
flink-runtime into it. What do you think, @Chesnay?

> 3) The SQL Client gateway can only be a thin implementation if we also
> find a long-term solution for retrieving results. Currently, all APIs
> use dataStream/dataSet.collect() that might fail for larger data. We
> should solve this issue first. For example, by specifying a temporary
> result connector that would write to Kafka or file? Esp. if we would
> like to base a JDBC interface on top of this, retrieving results must
> handle big amounts of data consistently.

I agree with Timo that the gateway must handle large result sets for JDBC,
but it makes little sense for the CLI client to display a large SELECT
result.
Based on the discussion in [0] and [1], I think table.collect (which returns
an iterator) can meet the requirement. If so, I think the SQL Client does not
need the temporary result connector to store large data sets.
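
To make the iterator idea concrete, here is a minimal Java sketch. It is not
Flink code: ResultPager and fetchBatch are hypothetical names, and an
IntStream stands in for a lazily produced query result. It only illustrates
that a consumer pulling fixed-size batches from an iterator holds one batch
in memory at a time, which is roughly what a gateway would need in order to
serve a large result page by page.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical sketch: the names below are illustrative and are not Flink APIs.
final class ResultPager {

    // Pull at most maxRows rows from the iterator; only this batch is held in memory.
    static <T> List<T> fetchBatch(Iterator<T> rows, int maxRows) {
        List<T> batch = new ArrayList<>();
        while (batch.size() < maxRows && rows.hasNext()) {
            batch.add(rows.next());
        }
        return batch;
    }

    public static void main(String[] args) {
        // Stand-in for a large query result that is produced lazily.
        Iterator<Integer> rows = IntStream.range(0, 10).iterator();
        List<Integer> batch;
        // Keep fetching until the iterator is exhausted (an empty batch).
        while (!(batch = fetchBatch(rows, 4)).isEmpty()) {
            System.out.println(batch);
        }
    }
}
```

A client (CLI, REST or JDBC) would call something like fetchBatch repeatedly
and stop at the first empty batch, so the full result never has to be
materialized in one collection.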

> 4) We should already start discussing changes to Table API before
> implementing this FLIP. This includes parts of FLIP-84 for returning
> "static result tables" for statements like "SHOW MODULES" or
> table.collect().

I think we can push both FLIPs forward concurrently, because FLIP-91
reuses the original code as much as possible, such as Executor and
SqlCommandParser. When implementing FLIP-84, we only need to change that
common code; no gateway-specific code needs to change.
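
As a rough illustration of that reuse, the Java sketch below has two front
ends delegating to one shared executor. All types here (StatementExecutor,
CommonExecutor, EmbeddedCli, GatewayEndpoint) are hypothetical and do not
mirror the signatures of the real Executor or SqlCommandParser classes; the
"core" row is just a placeholder result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: these types only illustrate shared code; they are not the real Flink classes.
interface StatementExecutor {
    List<String> execute(String statement);
}

// Shared implementation: a change made here reaches both front ends.
final class CommonExecutor implements StatementExecutor {
    @Override
    public List<String> execute(String statement) {
        String normalized = statement.trim().toUpperCase(Locale.ROOT);
        if (normalized.equals("SHOW MODULES")) {
            return Arrays.asList("core"); // placeholder row
        }
        throw new IllegalArgumentException("Unsupported statement: " + statement);
    }
}

// Embedded mode: renders rows as text for the CLI.
final class EmbeddedCli {
    private final StatementExecutor executor;

    EmbeddedCli(StatementExecutor executor) {
        this.executor = executor;
    }

    String render(String statement) {
        return String.join("\n", executor.execute(statement));
    }
}

// Gateway mode: hands rows back to a remote caller (REST handling omitted).
final class GatewayEndpoint {
    private final StatementExecutor executor;

    GatewayEndpoint(StatementExecutor executor) {
        this.executor = executor;
    }

    List<String> handle(String statement) {
        return executor.execute(statement);
    }
}

final class SharedExecutorDemo {
    public static void main(String[] args) {
        StatementExecutor shared = new CommonExecutor();
        System.out.println(new EmbeddedCli(shared).render("SHOW MODULES"));
        System.out.println(new GatewayEndpoint(shared).handle("SHOW MODULES"));
    }
}
```

With this shape, a change in how results are returned would be made once in
the shared executor, and neither front end would need gateway-specific code.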


[0] https://issues.apache.org/jira/browse/FLINK-13943
[1] https://issues.apache.org/jira/browse/FLINK-14807

Best,
godfrey

Timo Walther <tw...@apache.org> wrote on Wed, Jan 22, 2020 at 9:25 PM:

> Hi everyone,
>
> I agree with Stephan that the SQL Client should be a simple "shell
> around the table environment". However, I see a contradiction in the
> mentioned advantages "not limited by Flink committer reviews" and
> "quicker independent releases". If most functionality must be contained
> in the table environment, most of the development will still happen in
> the main codebase and is limited by committers. The SQL Client is
> already part of the Flink codebase. Thus, I don't see an advantage of
> moving a thin REST API to some standalone project.
>
> Fabian, Aljoscha and I also went through the proposal, and we had some
> concerns that are mentioned below. In general, this is a desirable
> feature that finalizes FLIP-24.
>
> 1) Can we remove the JDBC discussion from this FLIP? It can be a
> separate discussion. Let's focus on the gateway first.
>
> 2) Flink's current REST API is a custom implementation and does not rely
> on any REST framework. It might make sense to not reuse code from
> flink-runtime but use one of the commonly used frameworks. Maybe @Chesnay
> in CC might have an opinion here?
>
> 3) The SQL Client gateway can only be a thin implementation if we also
> find a long-term solution for retrieving results. Currently, all APIs
> use dataStream/dataSet.collect() that might fail for larger data. We
> should solve this issue first. For example, by specifying a temporary
> result connector that would write to Kafka or file? Esp. if we would
> like to base a JDBC interface on top of this, retrieving results must
> handle big amounts of data consistently.
>
> 4) We should already start discussing changes to Table API before
> implementing this FLIP. This includes parts of FLIP-84 for returning
> "static result tables" for statements like "SHOW MODULES" or
> table.collect().
>
> What do you think?
>
> Regards,
> Timo
>
>
>
> On 22.01.20 11:44, Stephan Ewen wrote:
> > Hi all!
> >
> > I think this is a useful feature.
> >
> > Two questions about this proposal:
> >
> > (1) The SQL client tried to be a hybrid between a SQL client and a gateway
> > server (which blew up in complexity and never finished). Would having a
> > dedicated gateway component mean that we can simplify the client and make
> > it a simple "shell around the table environment"? I think that would be
> > good, it would make it much easier to have new Table API features available
> > in the SQL client.
> >
> > (2) Have you considered making this a standalone project? This seems like
> > a unit of functionality that would be useful to have separately, and it would
> > have a few advantages:
> >
> >    - Flink codebase is already very large and hard to maintain
> >    - A separate project is simpler to develop, not limited by Flink
> >      committer reviews
> >    - Quicker independent releases when new features are added.
> >
> > I see other projects successfully putting ecosystem tools into separate
> > projects, like Livy for Spark.
> > Should we do the same here?
> >
> > Best,
> > Stephan
> >
> >
> > On Fri, Jan 17, 2020 at 1:48 PM godfrey he <go...@gmail.com> wrote:
> >
> >> Hi devs,
> >>
> >> I've updated the FLIP-91 [0] according to the feedback. Please take another
> >> look.
> >>
> >> Best,
> >> godfrey
> >>
> >> [0]
> >> https://docs.google.com/document/d/1DKpFdov1o_ObvrCmU-5xi-VrT6nR2gxq-BbswSSI9j8/
> >> <https://docs.google.com/document/d/1DKpFdov1o_ObvrCmU-5xi-VrT6nR2gxq-BbswSSI9j8/edit#heading=h.cje99dt78an2>
> >>
> >> Kurt Young <yk...@gmail.com> wrote on Thu, Jan 9, 2020 at 4:21 PM:
> >>
> >>> Hi,
> >>>
> >>> +1 to the general idea. Supporting SQL client gateway mode will bridge the
> >>> connection between Flink SQL and production environments. The JDBC driver
> >>> is also a good supplement for the usability of Flink SQL: users will have
> >>> more choices for trying out Flink SQL, such as Tableau.
> >>>
> >>> I went through the document and left some comments there.
> >>>
> >>> Best,
> >>> Kurt
> >>>
> >>>
> >>> On Sun, Jan 5, 2020 at 1:57 PM tison <wa...@gmail.com> wrote:
> >>>
> >>>> The general idea sounds great. I'm going to keep up with the progress soon.
> >>>>
> >>>> Best,
> >>>> tison.
> >>>>
> >>>>
> >>>> Bowen Li <bo...@gmail.com> wrote on Sun, Jan 5, 2020 at 12:59 PM:
> >>>>
> >>>>> +1. It will improve user experience quite a bit.
> >>>>>
> >>>>>
> >>>>> On Thu, Jan 2, 2020 at 22:07 Yangze Guo <ka...@gmail.com> wrote:
> >>>>>
> >>>>>> Thanks for driving this, Xiaoling!
> >>>>>>
> >>>>>> +1 for supporting SQL client gateway.
> >>>>>>
> >>>>>> Best,
> >>>>>> Yangze Guo
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Jan 2, 2020 at 9:58 AM 贺小令 <go...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Hey everyone,
> >>>>>>> FLIP-24
> >>>>>>> <https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client>
> >>>>>>> proposes the overall concept and architecture of the SQL Client. The
> >>>>>>> embedded mode has been supported since release-1.5 and is helpful for
> >>>>>>> debugging/demo purposes.
> >>>>>>> Many users ask how to submit a Flink job to an online environment without
> >>>>>>> programming against the Flink API. To solve this, we created FLIP-91 [0],
> >>>>>>> which supports SQL client gateway mode, so that users can submit a job
> >>>>>>> through the CLI client, REST API or JDBC.
> >>>>>>>
> >>>>>>> I'd be glad to get more feedback on FLIP-91.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> godfreyhe
> >>>>>>>
> >>>>>>> [0]
> >>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Client+Gateway
> >
>
>

Re: [DISCUSS] FLIP-91 - Support SQL Client Gateway

Posted by Sonam Mandal <so...@gmail.com>.

Hi everyone,

I was curious about the progress on FLIP-91. Is it actively being developed?
I believe the code is being developed at https://github.com/ververica/flink-sql-gateway. Is this the right repo?

I haven't seen much activity on this since sometime last year. I wanted to understand if there is still a plan to continue developing this, and if not, I wanted to understand why.

Appreciate your help!

Thanks,
Sonam
