Posted to dev@flink.apache.org by jincheng sun <su...@gmail.com> on 2018/12/10 10:05:41 UTC

[DISCUSS] Enhance convenience of TableEnvironment in TableAPI/SQL

Hi All,

According to user feedback, the design of TableEnvironment is inconvenient:
the wrong class is often imported by the IDE, especially for Java users.
For example:

ExecutionEnvironment env = ...

BatchTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);


The user does not know which BatchTableEnvironment to import, because
there are three implementations of BatchTableEnvironment, shown below:


1. org.apache.flink.table.api.BatchTableEnvironment
2. org.apache.flink.table.api.java.BatchTableEnvironment
3. org.apache.flink.table.api.scala.BatchTableEnvironment


[image: image.png]


This causes unnecessary inconvenience for Flink users. To solve this
problem, Wei Zhong, Hequn Cheng, Dian Fu, Shaoxuan Wang and I discussed it
offline and propose changing the inheritance hierarchy of TableEnvironment
as follows:

1. AbstractTableEnvironment - rename the current TableEnvironment to
AbstractTableEnvironment. It implements the functionality shared by stream
and batch.

2. TableEnvironment - create a new abstract TableEnvironment that defines
all methods of (java/scala)StreamTableEnvironment and
(java/scala)BatchTableEnvironment. In the implementations of
BatchTableEnvironment and StreamTableEnvironment, unsupported operations
are reported as errors.

[image: image.png]
The usage then looks as follows:

ExecutionEnvironment env = …

TableEnvironment tEnv = TableEnvironment.getTableEnvironment(env)
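
A minimal Java sketch of how the proposed hierarchy could look. The class
names follow the mail, but the bodies are toy stand-ins rather than Flink
code: the methods simply return strings, the factory create(boolean) is
hypothetical (the mail's getTableEnvironment(env) takes an execution
environment), and placing AbstractTableEnvironment between TableEnvironment
and the two implementations is an assumption, since the diagram is not
available here.

```java
// Toy stand-ins that mirror the names used in the proposal; not Flink's real classes.

/** Proposed user-facing type: declares every batch and stream method. */
abstract class TableEnvironment {
    abstract String scan(String tableName);        // shared by batch and stream
    abstract String toDataSet(String table);       // meaningful for batch only
    abstract String toAppendStream(String table);  // meaningful for stream only

    /** Hypothetical factory used only for this sketch. */
    static TableEnvironment create(boolean streaming) {
        return streaming ? new StreamTableEnvironment() : new BatchTableEnvironment();
    }
}

/** Renamed former base class: holds the logic shared by batch and stream. */
abstract class AbstractTableEnvironment extends TableEnvironment {
    @Override
    String scan(String tableName) {
        return "scan(" + tableName + ")";
    }

    // By default every mode-specific operation is reported as an error.
    @Override
    String toDataSet(String table) {
        throw new UnsupportedOperationException(
            "toDataSet is not supported by " + getClass().getSimpleName());
    }

    @Override
    String toAppendStream(String table) {
        throw new UnsupportedOperationException(
            "toAppendStream is not supported by " + getClass().getSimpleName());
    }
}

final class BatchTableEnvironment extends AbstractTableEnvironment {
    @Override
    String toDataSet(String table) {
        return "DataSet[" + table + "]";
    }
}

final class StreamTableEnvironment extends AbstractTableEnvironment {
    @Override
    String toAppendStream(String table) {
        return "DataStream[" + table + "]";
    }
}

class HierarchySketch {
    public static void main(String[] args) {
        // The user only ever names the single TableEnvironment type.
        TableEnvironment tEnv = TableEnvironment.create(false);
        System.out.println(tEnv.toDataSet("orders"));
        try {
            tEnv.toAppendStream("orders"); // stream-only call on a batch environment
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this shape the user imports one type, at the price that a batch
environment accepts a stream-only call at compile time and only fails when
the call is made.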


For the detailed proposal, please refer to the Google doc:
https://docs.google.com/document/d/1t-AUGuaChADddyJi6e0WLsTDEnf9ZkupvvBiQ4yTTEI/edit?usp=sharing

Feedback by mail and comments in the Google doc are both welcome.


Thanks,

Jincheng

Re: [DISCUSS] Enhance convenience of TableEnvironment in TableAPI/SQL

Posted by jincheng sun <su...@gmail.com>.
Hi Timo,

Thanks for summarizing the design in the FLINK-11067 discussion!
As I mentioned, this proposal has two core objectives:

 1. Solve the user import problem (a must);

 2. Unify the interface definitions of TableEnvironment for stream and batch.


I think FLINK-11067 can cover #1, and we need to create a new JIRA issue
for #2, so that users only deal with a single TableEnvironment, e.g.:

ExecutionEnvironment env = …

TableEnvironment tEnv = TableEnvironment.getTableEnvironment(env)


Of course, unifying stream and batch will break existing compatibility,
but I still think this is an effort we must make. At the SQL/TableAPI
level, users should not need to know about the existence of
`BatchTableEnvironment` and `StreamTableEnvironment`. What do you think?

Thanks,
Jincheng


Re: [DISCUSS] Enhance convenience of TableEnvironment in TableAPI/SQL

Posted by jincheng sun <su...@gmail.com>.
Hi Timo,

Thanks for your feedback! I'm glad to hear that you are already
thinking about the import issues.

1. I commented on the solution you mentioned in FLINK-11067. I have the
same questions as Dian Fu about the compatibility design in the Google
doc, and I look forward to your reply.

2. About the unified stream/batch interface definition

> However, I don't like the design of putting all methods of the Batch and
> Stream environments into the base class and throwing exceptions if not
> supported by base classes. This does not sound like a nice object-oriented
> design and confuses users.


At present, the stream and batch interface definitions are already unified
on Table, for example in the `orderBy` operator. Although a stream only
supports ordering by time, the interface definition is still unified and
the restriction is checked at runtime: if you `orderBy` a string field on
a stream, an exception is thrown. So we should unify the interface
definition of TableEnvironment in a similar way. Once the stream and batch
execution modes are unified and the stream/batch sources and sinks are
unified, a job can run in mixed (stream/batch) mode. By then, a table can
be converted with either toDataSet or toDataStream.
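
The runtime check described above can be sketched in Java as follows. This
is only an illustration: SketchTable, its constructor, and the use of
IllegalArgumentException are hypothetical, and the check happens directly
in orderBy here, which simplifies whatever Flink actually does.

```java
import java.util.Collections;
import java.util.Set;

/** Toy stand-in for a table; not Flink's real Table class. */
final class SketchTable {
    private final boolean streaming;
    private final Set<String> timeAttributes;

    SketchTable(boolean streaming, Set<String> timeAttributes) {
        this.streaming = streaming;
        this.timeAttributes = timeAttributes;
    }

    /** One signature for batch and stream; the stream-only restriction is checked when called. */
    SketchTable orderBy(String field) {
        if (streaming && !timeAttributes.contains(field)) {
            // Assumption: the real exception type and message differ.
            throw new IllegalArgumentException(
                "A streaming table can only be ordered by a time attribute, got: " + field);
        }
        return this;
    }
}

class OrderBySketch {
    public static void main(String[] args) {
        Set<String> timeAttrs = Collections.singleton("rowtime");
        new SketchTable(false, timeAttrs).orderBy("name");    // batch: any field is accepted
        new SketchTable(true, timeAttrs).orderBy("rowtime");  // stream: a time attribute is accepted
        try {
            new SketchTable(true, timeAttrs).orderBy("name"); // stream: rejected at runtime
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```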

3. About Scala dependencies
IMO, this proposal is not expected to solve the Scala dependency problem
(it keeps the status quo). Solving the Scala dependency problem is the goal
of FLIP-28.
This proposal has two core objectives:
 1) Solve the user import problem (a must);
 2) Do our best to unify the interface definitions of TableEnvironment for
stream and batch.

So I think we can first solve the user problem and unify the interface
between stream and batch. Regarding the separation of Scala and Java, I
agree that when we do FLIP-28, we can have a Java abstraction and a Scala
abstraction in `flink-table-api.java` and `flink-table-api.scala`
respectively, as mentioned in the Q/A section of the Google doc.

Best,
Jincheng



Re: [DISCUSS] Enhance convenience of TableEnvironment in TableAPI/SQL

Posted by Timo Walther <tw...@apache.org>.
Hi Jincheng,

thanks for the proposal. I totally agree with the problem of having 3 
StreamTableEnvironments and 3 BatchTableEnvironments. We also identified 
this problem when doing Flink trainings and introductions to the Table & 
SQL API.

Actually, @Dawid and I were already discussing how to remove this
shortcoming while working on FLINK-11067 [1]. The porting allows us to fix
the class hierarchy because the visibility of some members changes as well
when moving from Scala to Java. This would not break backwards
compatibility as the base classes should not be used by users anyway.

However, I don't like the design of putting all methods of the Batch and
Stream environments into the base class and throwing exceptions if not
supported by base classes. This does not sound like a nice object-oriented
design and confuses users.

I added some comments to the document. I think we can improve the
current situation without breaking backwards compatibility. Methods that
interact with the Scala and Java APIs, such as toDataSet/toDataStream,
should not be moved to an abstract class, as they would otherwise pull in
Scala dependencies transitively or would not interoperate with the type
extraction logic of the target API.
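
One way to read this point, sketched in Java. All names are hypothetical
and this is not the design from FLINK-11067: the shared base exposes only
API-agnostic methods, while the conversion method lives in a Java-specific
type. List<T> stands in for a Java DataStream<T>, and the Class<T>
argument stands in for the target API's type extraction.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical shared base: only API-agnostic methods, no DataSet/DataStream types. */
interface BaseTableEnv {
    String sqlQuery(String query);
}

/** Hypothetical Java-specific environment: adds the conversion that needs Java type information. */
interface JavaStreamTableEnv extends BaseTableEnv {
    // List<T> stands in for a Java DataStream<T>; Class<T> stands in for type extraction.
    <T> List<T> toAppendStream(String table, Class<T> rowType);
}

final class JavaStreamTableEnvImpl implements JavaStreamTableEnv {
    @Override
    public String sqlQuery(String query) {
        return "plan(" + query + ")";
    }

    @Override
    public <T> List<T> toAppendStream(String table, Class<T> rowType) {
        return new ArrayList<>(); // placeholder: a real conversion would produce rows
    }
}

class SplitApiSketch {
    public static void main(String[] args) {
        BaseTableEnv shared = new JavaStreamTableEnvImpl();
        System.out.println(shared.sqlQuery("SELECT 1"));
        // shared.toAppendStream(...) would not compile: the base type has no such method.
        JavaStreamTableEnv javaEnv = new JavaStreamTableEnvImpl();
        List<String> rows = javaEnv.toAppendStream("t", String.class);
        System.out.println(rows.size());
    }
}
```

Here a call that a given environment cannot serve is rejected by the
compiler instead of failing at runtime, and the base type carries no
dependency on either language-specific API.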

Regards,
Timo

[1] https://issues.apache.org/jira/browse/FLINK-11067




Re: [DISCUSS] Enhance convenience of TableEnvironment in TableAPI/SQL

Posted by "fudian.fd" <fu...@alibaba-inc.com>.
Hi Timo,

Thanks a lot for sharing the solution so quickly. I have left some comments on the JIRA page mainly about the backwards compatibility. Looking forward to your reply.

Thanks,
Dian



Re: [DISCUSS] Enhance convenience of TableEnvironment in TableAPI/SQL

Posted by Timo Walther <tw...@apache.org>.
Hi Dian,

In the corresponding issue, I proposed a solution that should be backwards
compatible and solve our Maven dependency problems.

I'm happy to receive feedback.

Regards,
Timo




Re: [DISCUSS] Enhance convenience of TableEnvironment in TableAPI/SQL

Posted by "fudian.fd" <fu...@alibaba-inc.com>.
Hi Timo,

Thanks a lot for your reply. I think the cause of this problem is that TableEnvironment.getTableEnvironment() returns the actual TableEnvironment implementations instead of an interface or an abstract base class. Even once the porting in FLINK-11067 is done, I'm afraid that the problem may still exist. For example, for the batch TableEnvironment, both java.BatchTableEnvironment and api.BatchTableEnvironment may be suggested for import. Could you share more information about what you want to do with the 7 TableEnvironments in FLINK-11067, especially api.BatchTableEnvironment, api.StreamTableEnvironment and TableEnvironment?

Thanks,
Dian



Re: [DISCUSS] Enhance convenience of TableEnvironment in TableAPI/SQL

Posted by jincheng sun <su...@gmail.com>.
Hi Xuefu,

Thanks for your feedback and for raising the compatibility issue.
You are right, the change will introduce a version incompatibility. Our plan
is to release it in 1.8.x.

To be frank, we have considered a compatible approach: retain the current
TableEnvironment, create a new class such as "GeneralTableEnvironment" for
the unified abstraction, and then deprecate TableEnvironment. But we feel
that the resulting code is not clean enough, and the long-term goal is to
make StreamTableEnvironment and BatchTableEnvironment transparent to the
user. So I tend to release this change in 1.8.x and keep the status quo in
1.7.x. What do you think? Any feedback is welcome!
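
To illustrate the trade-off, here is a minimal sketch of the compatible
approach mentioned above. It is only one possible reading of that idea: all
classes are simplified stand-ins rather than the real Flink classes, and the
sqlQuery body and the delegation are assumptions made for illustration.

```java
// Hypothetical sketch: simplified stand-ins, not the real Flink classes.

class CompatibilitySketch {
    @SuppressWarnings("deprecation")
    public static void main(String[] args) {
        // Existing user programs keep compiling; they would only see a
        // deprecation warning (suppressed here).
        TableEnvironment oldEnv = new TableEnvironment();
        // New programs would use the unified abstraction instead.
        GeneralTableEnvironment newEnv = new GeneralTableEnvironment();

        System.out.println(oldEnv.sqlQuery("SELECT 1"));
        System.out.println(newEnv.sqlQuery("SELECT 1"));
    }
}

/** Stand-in for the existing entry point: kept as-is, only marked deprecated. */
@Deprecated
class TableEnvironment {
    String sqlQuery(String query) {
        return "result of: " + query;
    }
}

/** New unified entry point (name taken from the mail; the body is assumed). */
class GeneralTableEnvironment {
    @SuppressWarnings("deprecation")
    private final TableEnvironment legacy = new TableEnvironment();

    String sqlQuery(String query) {
        // Delegates to the legacy class, so two entry points coexist and
        // both have to be maintained until the old one is removed.
        return legacy.sqlQuery(query);
    }
}
```

The sketch shows why this variant does not break existing programs, and also
why it is less clean: two entry points exist side by side for some time.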

Thanks,
Jincheng


Zhang, Xuefu <xu...@alibaba-inc.com> wrote on Tue, Dec 11, 2018 at 1:13 PM:

> Hi Jincheng,
>
> Thanks for bringing this up. It makes good sense to me. However,
> one concern I have is about backward compatibility. Could you clarify
> whether existing user programs will break with the proposed changes?
>
> The answer to the question would largely determine when this can be
> introduced.
>
> Thanks,
> Xuefu
>
>
> ------------------------------------------------------------------
> Sender:jincheng sun <su...@gmail.com>
> Sent at:2018 Dec 10 (Mon) 18:14
> Recipient:dev <de...@flink.apache.org>
> Subject:[DISCUSS] Enhance convenience of TableEnvironment in TableAPI/SQL
>
> Hi All,
>
> According to user feedback, the design of TableEnvironment is inconvenient
> to use, and IDEs often import the wrong class, especially for Java users.
> For example:
>
> ExecutionEnvironment env = ...
> BatchTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);
>
> The user does not know which BatchTableEnvironment should be imported,
> because there are three implementations of BatchTableEnvironment, shown
> below:
>
> 1. org.apache.flink.table.api.BatchTableEnvironment
> 2. org.apache.flink.table.api.java.BatchTableEnvironment
> 3. org.apache.flink.table.api.scala.BatchTableEnvironment
> [image.png]
>
>
> This is an unnecessary inconvenience for Flink users. To solve this
> problem, Wei Zhong, Hequn Cheng, Dian Fu, Shaoxuan Wang and I discussed it
> offline and propose changing the inheritance hierarchy of TableEnvironment
> as follows:
>
> 1. AbstractTableEnvironment - rename the current TableEnvironment to
> AbstractTableEnvironment. The functionality implemented by
> AbstractTableEnvironment is shared by stream and batch.
> 2. TableEnvironment - create a new abstract TableEnvironment that defines
> all methods of (java/scala)StreamTableEnvironment and
> (java/scala)BatchTableEnvironment. In the implementations of
> BatchTableEnvironment and StreamTableEnvironment, unsupported operations
> are reported as errors.
> [image.png]
> The usage then becomes:
>
> ExecutionEnvironment env = …
> TableEnvironment tEnv = TableEnvironment.getTableEnvironment(env)
> For detailed proposals please refer to the Google doc:
> https://docs.google.com/document/d/1t-AUGuaChADddyJi6e0WLsTDEnf9ZkupvvBiQ4yTTEI/edit?usp=sharing
>
> Feedback by mail and comments in the Google doc are welcome.
>
> Thanks,
> Jincheng
>
>

Re: [DISCUSS] Enhance convenience of TableEnvironment in TableAPI/SQL

Posted by "Zhang, Xuefu" <xu...@alibaba-inc.com>.
Hi Jincheng,

Thanks for bringing this up. It makes good sense to me. However, one concern I have is about backward compatibility. Could you clarify whether existing user programs will break with the proposed changes?

The answer to the question would largely determine when this can be introduced.

Thanks,
Xuefu


------------------------------------------------------------------
Sender:jincheng sun <su...@gmail.com>
Sent at:2018 Dec 10 (Mon) 18:14
Recipient:dev <de...@flink.apache.org>
Subject:[DISCUSS] Enhance convenience of TableEnvironment in TableAPI/SQL

Hi All,

According to user feedback, the design of TableEnvironment is inconvenient to use, and IDEs often import the wrong class, especially for Java users. For example:

ExecutionEnvironment env = ...
BatchTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);

The user does not know which BatchTableEnvironment should be imported, because there are three implementations of BatchTableEnvironment, shown below:

1. org.apache.flink.table.api.BatchTableEnvironment
2. org.apache.flink.table.api.java.BatchTableEnvironment
3. org.apache.flink.table.api.scala.BatchTableEnvironment
[image.png]


This is an unnecessary inconvenience for Flink users. To solve this problem, Wei Zhong, Hequn Cheng, Dian Fu, Shaoxuan Wang and I discussed it offline and propose changing the inheritance hierarchy of TableEnvironment as follows:

1. AbstractTableEnvironment - rename the current TableEnvironment to AbstractTableEnvironment. The functionality implemented by AbstractTableEnvironment is shared by stream and batch.
2. TableEnvironment - create a new abstract TableEnvironment that defines all methods of (java/scala)StreamTableEnvironment and (java/scala)BatchTableEnvironment. In the implementations of BatchTableEnvironment and StreamTableEnvironment, unsupported operations are reported as errors.
[image.png]
The usage then becomes:

ExecutionEnvironment env = …
TableEnvironment tEnv = TableEnvironment.getTableEnvironment(env)
For detailed proposals please refer to the Google doc: https://docs.google.com/document/d/1t-AUGuaChADddyJi6e0WLsTDEnf9ZkupvvBiQ4yTTEI/edit?usp=sharing
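
The proposed hierarchy can be sketched roughly as follows. This is only an illustration under stated assumptions, not the design from the Google doc: every class is a simplified stand-in rather than the real Flink class, the String-based method signatures are hypothetical, and the choice of UnsupportedOperationException as the reported error is assumed.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: simplified stand-ins, not the real Flink classes.

class UnifiedTableEnvironmentSketch {
    public static void main(String[] args) {
        // Users only ever name one type: TableEnvironment.
        TableEnvironment tEnv =
                TableEnvironment.getTableEnvironment(new ExecutionEnvironment());
        tEnv.registerTable("orders");
        System.out.println(tEnv.toDataSet("orders"));
        try {
            // A stream-only operation on a batch environment is reported as an error.
            tEnv.toAppendStream("orders");
        } catch (UnsupportedOperationException e) {
            System.out.println("error: " + e.getMessage());
        }
    }
}

/** Stand-ins for the two execution environments (assumed, empty markers). */
class ExecutionEnvironment {}
class StreamExecutionEnvironment {}

/** Functionality shared by stream and batch (the renamed current class). */
abstract class AbstractTableEnvironment {
    private final Set<String> tables = new HashSet<>();

    void registerTable(String name) { tables.add(name); }
    boolean isRegistered(String name) { return tables.contains(name); }
}

/** New abstract class declaring the methods of both stream and batch. */
abstract class TableEnvironment extends AbstractTableEnvironment {
    abstract String toDataSet(String table);       // batch-only operation
    abstract String toAppendStream(String table);  // stream-only operation

    static TableEnvironment getTableEnvironment(ExecutionEnvironment env) {
        return new BatchTableEnvironment();
    }

    static TableEnvironment getTableEnvironment(StreamExecutionEnvironment env) {
        return new StreamTableEnvironment();
    }
}

class BatchTableEnvironment extends TableEnvironment {
    @Override String toDataSet(String table) { return "DataSet(" + table + ")"; }
    @Override String toAppendStream(String table) {
        throw new UnsupportedOperationException("toAppendStream is not supported in batch mode");
    }
}

class StreamTableEnvironment extends TableEnvironment {
    @Override String toDataSet(String table) {
        throw new UnsupportedOperationException("toDataSet is not supported in stream mode");
    }
    @Override String toAppendStream(String table) { return "DataStream(" + table + ")"; }
}
```

The trade-off in this shape is that calling an operation the environment does not support is only detected at runtime, in exchange for a single type that users import.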

Feedback by mail and comments in the Google doc are welcome.

Thanks,
Jincheng