Posted to dev@flink.apache.org by Timo Walther <tw...@apache.org> on 2018/11/22 09:54:42 UTC

Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Hi everyone,

I would like to continue this discussion thread and convert the outcome 
into a FLIP such that users and contributors know what to expect in the 
upcoming releases.

I created a design document [1] that clarifies our motivation for doing 
this, what a Maven module structure could look like, and a suggested 
migration plan.

It would be great to start with the efforts for the 1.8 release such 
that new features can be developed in Java and major refactorings such 
as improvements to the connectors and external catalog support are not 
blocked.

Please let me know what you think.

Regards,
Timo

[1] 
https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing


On 02.07.18 at 17:08, Fabian Hueske wrote:
> Hi Piotr,
>
> thanks for bumping this thread and thanks for Xingcan for the comments.
>
> I think the first step would be to separate the flink-table module into
> multiple sub modules. These could be:
>
> - flink-table-api: All API facing classes. Can be later divided further
> into Java/Scala Table API/SQL
> - flink-table-planning: involves all planning (basically everything we do
> with Calcite)
> - flink-table-runtime: the runtime code
>
> IMO, a realistic mid-term goal is to have the runtime module and certain
> parts of the planning module ported to Java.
> The api module will be much harder to port because of several dependencies
> to Scala core classes (the parser framework, tree iterations, etc.). I'm
> not saying we should not port this to Java, but it is not clear to me (yet)
> how to do it.
>
> I think flink-table-runtime should not be too hard to port. The code does
> not make use of many Scala features, i.e., it's written in a very Java-like style.
> Also, there are not many dependencies and operators can be individually
> ported step-by-step.
> For flink-table-planning, we can have certain packages that we port to Java
> like planning rules or plan nodes. The related classes mostly extend
> Calcite's Java interfaces/classes and would be natural choices for being
> ported. The code generation classes will require more effort to port. There
> are also some dependencies in planning on the api module that we would need
> to resolve somehow.
>
> For SQL, most of the work when adding new features is done in the planning and
> runtime modules. So, this separation should already reduce the "technological
> debt" quite a lot.
> The Table API depends much more on Scala than SQL.
>
> Cheers, Fabian
>
>
>
> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xi...@gmail.com>:
>
>> Hi all,
>>
>> I also think about this problem these days and here are my thoughts.
>>
>> 1) We must admit that it’s really a tough task to make Java and Scala
>> interoperate. E.g., they have different collection types (Scala collections
>> vs. java.util.*) and in Java, it's hard to implement a method which takes
>> Scala functions as parameters (see the sketch below). Considering that the
>> major part of the code base is implemented in Java, +1 for this goal from a
>> long-term view.
>>
>> 2) The ideal solution would be to just expose a Scala API and make all the
>> other parts Scala-free. But I am not sure if that could be achieved even in
>> the long term. Thus, as Timo suggested, keeping the Scala code in
>> "flink-table-core" would be a compromise solution.
>>
>> 3) If the community makes the final decision, maybe any new features
>> should be added in Java (regardless of the modules), in order to prevent
>> the Scala code from growing.
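>>
>> As a rough sketch of the friction mentioned in 1) - the scenario is made up
>> purely for illustration, but the conversion and the Function1 boilerplate are
>> what Java callers of Scala-typed APIs typically end up writing:
>>
>>     import java.util.Arrays;
>>     import java.util.List;
>>     import scala.Function1;
>>     import scala.collection.JavaConverters;
>>     import scala.collection.Seq;
>>     import scala.runtime.AbstractFunction1;
>>
>>     public class InteropSketch {
>>         public static void main(String[] args) {
>>             // Turning a Java list into the scala.collection.Seq that a
>>             // Scala-defined method would expect as a parameter:
>>             List<String> javaFields = Arrays.asList("a", "b");
>>             Seq<String> scalaFields =
>>                 JavaConverters.asScalaBufferConverter(javaFields).asScala();
>>
>>             // Passing "a Scala function" from Java means extending the
>>             // runtime helper class instead of writing a plain Java lambda:
>>             Function1<String, String> trim = new AbstractFunction1<String, String>() {
>>                 @Override
>>                 public String apply(String s) {
>>                     return s.trim();
>>                 }
>>             };
>>
>>             System.out.println(scalaFields.length() + " " + trim.apply("  hi  "));
>>         }
>>     }
>>
>> Even getting this to compile requires scala-library on the Java project's
>> classpath, which is part of the problem.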
>>
>> Best,
>> Xingcan
>>
>>
>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <pi...@data-artisans.com>
>> wrote:
>>> Bumping the topic.
>>>
>>> If we want to do this, the sooner we decide, the less code we will have
>>> to rewrite. I have some objections/counter-proposals to Fabian's proposal
>>> of doing it module-wise and one module at a time.
>>> First, I do not see a problem with having Java/Scala code even within one
>>> module, especially not if there are clean boundaries. Like we could have
>>> API in Scala and optimizer rules/logical nodes written in Java in the same
>>> module. However, I haven't maintained mixed Scala/Java code bases before,
>>> so I might be missing something here.
>>> Secondly, this whole migration might and most likely will take longer than
>>> expected, so that creates a problem for the new code that we will be
>>> creating. After making a decision to migrate to Java, almost any new line of
>>> Scala code will immediately be technological debt and we will have to
>>> rewrite it in Java later.
>>> Thus I would propose to first state our end goal - the module structure and
>>> which parts of the modules we eventually want to have Scala-free. Secondly,
>>> take all steps necessary that will allow us to write new code compliant
>>> with our end goal. Only after that should/could we focus on incrementally
>>> rewriting the old code. Otherwise we could be stuck/blocked for years
>>> writing new code in Scala (and increasing technological debt), because
>>> nobody has found the time to rewrite some unimportant and not actively
>>> developed part of some module.
>>> Piotrek
>>>
>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fh...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> In general, I think this is a good effort. However, it won't be easy
>> and I
>>>> think we have to plan this well.
>>>> I don't like the idea of having the whole code base fragmented into Java
>>>> and Scala code for too long.
>>>>
>>>> I think we should do this one step at a time and focus on migrating one
>>>> module at a time.
>>>> IMO, the easiest start would be to port the runtime to Java.
>>>> Extracting the API classes into an own module, porting them to Java, and
>>>> removing the Scala dependency won't be possible without breaking the API
>>>> since a few classes depend on the Scala Table API.
>>>>
>>>> Best, Fabian
>>>>
>>>>
>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <tr...@apache.org>:
>>>>
>>>>> I think that is a noble and honorable goal and we should strive for it.
>>>>> This, however, must be an iterative process given the sheer size of the
>>>>> code base. I like the approach of defining common Java modules which are used
>>>>> by more specific Scala modules and slowly moving classes from Scala to
>>>>> Java. Thus +1 for the proposal.
>>>>>
>>>>> Cheers,
>>>>> Till
>>>>>
>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
>> piotr@data-artisans.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I do not have experience with how Scala and Java interact with each
>>>>>> other, so I cannot fully validate your proposal, but generally speaking
>>>>>> +1 from me.
>>>>>>
>>>>>> Does it also mean that we should slowly migrate `flink-table-core` to
>>>>>> Java? How would you envision it? It would be nice to be able to add new
>>>>>> classes/features written in Java so that they can coexist with old
>>>>>> Scala code until we gradually switch from Scala to Java.
>>>>>>
>>>>>> Piotrek
>>>>>>
>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <tw...@apache.org> wrote:
>>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> as you all know, currently the Table & SQL API is implemented in Scala.
>>>>>>> This decision was made a long time ago when the initial code base was
>>>>>>> created as part of a master's thesis. The community kept Scala because of
>>>>>>> the nice language features that enable a fluent Table API like
>>>>>>> table.select('field.trim()) and because Scala allows for quick prototyping
>>>>>>> (e.g. multi-line strings for code generation). The committers enforced
>>>>>>> not splitting the code base into two programming languages.
>>>>>>> However, nowadays the flink-table module is more and more becoming an
>>>>>>> important part of the Flink ecosystem. Connectors, formats, and the SQL
>>>>>>> client are actually implemented in Java but need to interoperate with
>>>>>>> flink-table, which makes these modules dependent on Scala. As mentioned in
>>>>>>> an earlier mail thread, using Scala for API classes also exposes member
>>>>>>> variables and methods in Java that should not be exposed to users [1].
>>>>>>> Java is still the most important API language and right now we treat it as
>>>>>>> a second-class citizen. I just noticed that you even need to add Scala if
>>>>>>> you just want to implement a ScalarFunction because of method clashes
>>>>>>> between `public String toString()` and `public scala.Predef.String
>>>>>>> toString()`.
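>>>>>>>
>>>>>>> For illustration, this is roughly what such a function looks like on the
>>>>>>> Java side (a minimal sketch, nothing Flink-specific beyond the base class):
>>>>>>>
>>>>>>>     import org.apache.flink.table.functions.ScalarFunction;
>>>>>>>
>>>>>>>     // A minimal Java UDF: no Scala appears here, yet compiling and shipping
>>>>>>>     // it currently pulls in the Scala-implemented flink-table module (and
>>>>>>>     // with it the Scala library), because ScalarFunction is defined there.
>>>>>>>     public class TrimFunction extends ScalarFunction {
>>>>>>>
>>>>>>>         // Evaluation methods are resolved by reflection; the name "eval" is fixed.
>>>>>>>         public String eval(String value) {
>>>>>>>             return value == null ? null : value.trim();
>>>>>>>         }
>>>>>>>     }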
>>>>>>> Given the size of the current code base, reimplementing the entire
>>>>>>> flink-table code in Java is a goal that we might never reach. However, we
>>>>>>> should at least treat the symptoms and have this as a long-term goal in
>>>>>>> mind. My suggestion would be to convert user-facing and runtime classes
>>>>>>> and split the code base into multiple modules:
>>>>>>>> flink-table-java {depends on flink-table-core}
>>>>>>> Implemented in Java. Java users can use this. This would require
>>>>>>> converting classes like TableEnvironment, Table.
>>>>>>>
>>>>>>>> flink-table-scala {depends on flink-table-core}
>>>>>>> Implemented in Scala. Scala users can use this.
>>>>>>>
>>>>>>>> flink-table-common
>>>>>>> Implemented in Java. Connectors, formats, and UDFs can use this. It
>>>>>>> contains interface classes such as descriptors, table sink, table source.
>>>>>>>
>>>>>>>> flink-table-core {depends on flink-table-common and flink-table-runtime}
>>>>>>> Implemented in Scala. Contains the current main code base.
>>>>>>>
>>>>>>>> flink-table-runtime
>>>>>>> Implemented in Java. This would require converting classes in
>>>>>>> o.a.f.table.runtime but would potentially improve the runtime.
>>>>>>>
>>>>>>> What do you think?
>>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Timo
>>>>>>>
>>>>>>> [1]
>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Convert-main-Table-API-classes-into-traits-tp21335.html
>>>>>>
>>


Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Timo Walther <tw...@apache.org>.
Hi everyone,

I updated FLIP-28 according to the feedback that I received (online and 
offline).

The biggest change is that a user now needs to add two dependencies (api 
and planner) if a table program should be runnable in an IDE (as 
Aljoscha suggested). This allows for a clear separation of API and 
planner/runtime. It might even be possible to *not* expose Calcite 
through the API and thus have minimal external dependencies.
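
For example, a table program run from the IDE would then roughly declare
(module names are illustrative and not final):

    flink-table-api-java (or flink-table-api-scala)  -- the API the program is written against
    flink-table-planner                              -- translates and executes the program locally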

Furthermore, I renamed `flink-table-spi` back to `flink-table-common` 
because `spi` looks too similar to `api` and could cause confusion. 
Aljoscha and Stephan both mentioned that `common` would fit better in 
our current naming scheme.

I will open a PR for FLIP-28 step 1 shortly and am looking forward to feedback.

Thanks,
Timo


On 11.12.18 at 09:10, Timo Walther wrote:
> Hi Aljoscha,
>
> thanks for your feedback. I also don't like the fact that an API 
> depends on runtime. I will try to come up with a better design while 
> implementing a PoC. The general goal should be to make table programs 
> still runnable in an IDE. So maybe there is a better way of doing it.
>
> Regards,
> Timo
>
>
> On 07.12.18 at 16:20, Aljoscha Krettek wrote:
>> Hi,
>>
>> this is a very nice effort!
>>
>> There is one thing that we should change, though. In the batch API we 
>> have a clear separation between API and runtime, and using the API 
>> (depending on flink-batch) does not "expose" the runtime classes that 
>> are in flink-runtime. For the streaming API, we made the mistake of 
>> letting flink-streaming depend on flink-runtime. This means that 
>> depending on flink-streaming pulls in flink-runtime transitively, 
>> which enlarges the surface that users see from Flink and (for 
>> example) makes it harder to package a user fat jar (hence the 
>> excludes/provided workarounds and whatnot).
>>
>> We should avoid this error and have flink-table-api not depend on 
>> flink-table-runtime, but the other way round, as we have it for the 
>> batch API.
>>
>> Btw, another project that has gotten this separation very nicely is 
>> Beam, where there is an sdk package that has all the user-facing API 
>> that people use to create programs, and they see nothing of the 
>> runner/runtime specifics. In this project it comes out of necessity, 
>> because there can be widely different runners, but we should still 
>> strive for this here.
>>
>> Off topic: we also have to achieve this for the streaming API.
>>
>> Best,
>> Aljoscha
>>
>>> On 29. Nov 2018, at 16:58, Timo Walther <tw...@apache.org> wrote:
>>>
>>> Thanks for the feedback, everyone!
>>>
>>> I created a FLIP for these efforts: 
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-28%3A+Long-term+goal+of+making+flink-table+Scala-free
>>>
>>> I will open an umbrella Jira ticket for FLIP-28 with concrete 
>>> subtasks shortly.
>>>
>>> Thanks,
>>> Timo
>>>
>>> On 29.11.18 at 12:44, Jark Wu wrote:
>>>> Thanks Timo,
>>>>
>>>> That makes sense to me. And I left a comment about code generation in the doc.
>>>>
>>>> Looking forward to participating in it!
>>>>
>>>> Best,
>>>> Jark
>>>>
>>>> On Thu, 29 Nov 2018 at 16:42, Timo Walther <tw...@apache.org> wrote:
>>>>
>>>>> @Kurt: Yes, I don't think that forks of Flink will have a hard time
>>>>> keeping up with the porting. That is also why I called this `long-term
>>>>> goal`, because I don't see big resources for the porting to happen
>>>>> quicker. But at least new features, the API, and the runtime profit from
>>>>> the Scala-to-Java conversion.
>>>>>
>>>>> @Jark: I updated the document:
>>>>>
>>>>> 1. flink-table-common has been renamed to flink-table-spi by request.
>>>>>
>>>>> 2. Yes, good point. flink-sql-client can be moved there as well.
>>>>>
>>>>> 3. I added a paragraph to the document. Porting the code generation to
>>>>> Java only makes sense if acceptable tooling for it is in place.
>>>>>
>>>>>
>>>>> Thanks for the feedback,
>>>>>
>>>>> Timo
>>>>>
>>>>>
>>>>> On 29.11.18 at 08:28, Jark Wu wrote:
>>>>>> Hi Timo,
>>>>>>
>>>>>> Thanks for the great work!
>>>>>>
>>>>>> Moving flink-table to Java is a long-awaited thing but will involve much
>>>>>> effort. I agree that we should make it a long-term goal.
>>>>>>
>>>>>> I have read the google doc and +1 for the proposal. Here I have some
>>>>>> questions:
>>>>>>
>>>>>> 1. Where should the flink-table-common module be placed? Will we move the
>>>>>> flink-table-common classes to the new modules?
>>>>>> 2. Should flink-sql-client also be a sub-module under flink-table?
>>>>>> 3. The flink-table-planner contains code generation and will be converted
>>>>>> to Java. Actually, I prefer using Scala for code generation because of the
>>>>>> multiline string and string interpolation (i.e. s"hello $user") features
>>>>>> in Scala. They make the code-generation code more readable. Do we really
>>>>>> want to migrate code generation to Java?
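>>>>>>
>>>>>> For comparison, a rough sketch of the same tiny template written with plain
>>>>>> Java 8 string handling (no multiline strings, no interpolation), which is
>>>>>> what makes me hesitant:
>>>>>>
>>>>>>     // Scala:  s"""return "hello " + $user;"""  -- one readable expression.
>>>>>>     // Java 8: the interpolation has to be spelled out via a StringBuilder.
>>>>>>     String user = "generatedTerm$1";
>>>>>>     StringBuilder code = new StringBuilder();
>>>>>>     code.append("public String eval(String ").append(user).append(") {\n");
>>>>>>     code.append("  return \"hello \" + ").append(user).append(";\n");
>>>>>>     code.append("}\n");
>>>>>>     String generated = code.toString();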
>>>>>>
>>>>>> Best,
>>>>>> Jark
>>>>>>
>>>>>>
>>>>>> On Wed, 28 Nov 2018 at 09:14, Kurt Young <yk...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Timo and Vino,
>>>>>>>
>>>>>>> I agree that table is very active and there is no guarantee of not
>>>>>>> producing any conflicts if you decide to develop based on the community
>>>>>>> version. I think this is the kind of risk we can imagine in the first
>>>>>>> place. But a massive language replacement is something you cannot imagine
>>>>>>> and be ready for: there is no feature added, no refactoring is done, and
>>>>>>> simply changing from Scala to Java will cause lots of conflicts.
>>>>>>>
>>>>>>> But I also agree that this is a "technical debt" that we should
>>>>>>> eventually pay. As you said, we can do this slowly, even one file at a
>>>>>>> time, to let other people have more time to resolve the conflicts.
>>>>>>>
>>>>>>> Best,
>>>>>>> Kurt
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 27, 2018 at 8:37 PM Timo Walther <tw...@apache.org>
>>>>> wrote:
>>>>>>>> Hi Kurt,
>>>>>>>>
>>>>>>>> I understand your concerns. However, there is no concrete roadmap for
>>>>>>>> Flink 2.0 and (as Vino said) flink-table is developed very actively.
>>>>>>>> Major refactorings happened in the past and will also happen with or
>>>>>>>> without the Scala migration. A good example is the proper catalog support,
>>>>>>>> which will refactor big parts of the TableEnvironment class. Or the
>>>>>>>> introduction of "retractions", which needed a big refactoring of the
>>>>>>>> planning phase. Stability is only guaranteed for the API and the general
>>>>>>>> behavior; however, currently flink-table is not using @Public or
>>>>>>>> @PublicEvolving annotations for a reason.
>>>>>>>>
>>>>>>>> I think the migration will still happen slowly because it needs people
>>>>>>>> who allocate time for it. Therefore, even Flink forks can slowly
>>>>>>>> adapt to the evolving Scala-to-Java code base.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Timo
>>>>>>>>
>>>>>>>>
>>>>>>>> On 27.11.18 at 13:16, vino yang wrote:
>>>>>>>>> Hi Kurt,
>>>>>>>>>
>>>>>>>>> Currently, there is still a long time to go until Flink 2.0. Considering
>>>>>>>>> that flink-table is one of the most active modules in the current Flink
>>>>>>>>> project, each version has a number of changes and features added. I think
>>>>>>>>> that refactoring faster will reduce subsequent complexity and workload.
>>>>>>>>> And this may be a gradual and long process. We should be able to regard
>>>>>>>>> it as a "technical debt", and if it is not addressed, it will also affect
>>>>>>>>> the decision-making of other issues.
>>>>>>>>>
>>>>>>>>> Thanks, vino.
>>>>>>>>>
>>>>>>>>> Kurt Young <yk...@gmail.com> wrote on Tue, Nov 27, 2018 at 7:34 PM:
>>>>>>>>>
>>>>>>>>>> Hi Timo,
>>>>>>>>>>
>>>>>>>>>> Thanks for writing up the document. I'm +1 for reorganizing the module
>>>>>>>>>> structure and making table Scala-free. But I have a little concern about
>>>>>>>>>> the timing. Would it be more appropriate to get this done when Flink
>>>>>>>>>> decides to bump to the next big version, like 2.x?
>>>>>>>>>> It's true that you can keep all the classes' package paths as they are,
>>>>>>>>>> and will not introduce API changes. But if some company is developing
>>>>>>>>>> their own Flink, and syncs with the community version by rebasing, they
>>>>>>>>>> may face a lot of conflicts. Although you can avoid conflicts from moving
>>>>>>>>>> source files between packages, I assume you still need to delete the
>>>>>>>>>> original Scala file and add a new Java file when you want to change the
>>>>>>>>>> programming language.
>>>>>>>>>> Best,
>>>>>>>>>> Kurt
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Nov 27, 2018 at 5:57 PM Timo Walther 
>>>>>>>>>> <tw...@apache.org>
>>>>>>>> wrote:
>>>>>>>>>>> Hi Hequn,
>>>>>>>>>>>
>>>>>>>>>>> thanks for your feedback. Yes, migrating the test cases is 
>>>>>>>>>>> another
>>>>>>>> issue
>>>>>>>>>>> that is not represented in the document but should naturally go
>>>>> along
>>>>>>>>>>> with the migration.
>>>>>>>>>>>
>>>>>>>>>>> I agree that we should migrate the main API classes quickly 
>>>>>>>>>>> within
>>>>>>> this
>>>>>>>>>>> 1.8 release after the module split has been performed. Help 
>>>>>>>>>>> here is
>>>>>>>>>>> highly appreciated!
>>>>>>>>>>>
>>>>>>>>>>> I forgot that Java supports static methods in interfaces now, but
>>>>>>>>>>> actually I don't like the design of calling `TableEnvironment.get(env)`.
>>>>>>>>>>> Because people often use `TableEnvironment tEnv =
>>>>>>>>>>> TableEnvironment.get(env)` and then wonder why there is no
>>>>>>>>>>> `toAppendStream` or `toDataSet`, because they are using the base class.
>>>>>>>>>>> However, things like that can be discussed in the corresponding issue
>>>>>>>>>>> when it comes to implementation.
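>>>>>>>>>>>
>>>>>>>>>>> As a rough sketch of the concern (the factories shown here are only
>>>>>>>>>>> illustrative and hypothetical, nothing is decided yet):
>>>>>>>>>>>
>>>>>>>>>>>     // With a generic factory on the base interface, users naturally write:
>>>>>>>>>>>     TableEnvironment tEnv = TableEnvironment.get(env);      // hypothetical
>>>>>>>>>>>     // tEnv.toAppendStream(table, Row.class);   // not available: only the
>>>>>>>>>>>     //                                          // streaming environment has it
>>>>>>>>>>>
>>>>>>>>>>>     // Asking for the specific environment keeps the bridging methods visible:
>>>>>>>>>>>     StreamTableEnvironment streamTEnv = StreamTableEnvironment.get(env); // hypothetical
>>>>>>>>>>>     DataStream<Row> stream = streamTEnv.toAppendStream(table, Row.class);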
>>>>>>>>>>>
>>>>>>>>>>> @Vino: I think your work fits nicely to these efforts.
>>>>>>>>>>>
>>>>>>>>>>> @everyone: I will wait for more feedback until the end of this week.
>>>>>>>>>>> Then I will convert the design document into a FLIP and open subtasks
>>>>>>>>>>> in Jira, if there are no objections?
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Timo
>>>>>>>>>>>
>>>>>>>>>>> On 24.11.18 at 13:45, vino yang wrote:
>>>>>>>>>>>> Hi hequn,
>>>>>>>>>>>>
>>>>>>>>>>>> I am very glad to hear that you are interested in this work.
>>>>>>>>>>>> As we all know, this process involves a lot.
>>>>>>>>>>>> Currently, the migration work has begun. I started with the
>>>>>>>>>>>> Kafka connector's dependency on flink-table and moved the
>>>>>>>>>>>> related dependencies to flink-table-common.
>>>>>>>>>>>> This work is tracked by FLINK-9461.  [1]
>>>>>>>>>>>> I don't know if it will conflict with what you expect to 
>>>>>>>>>>>> do, but
>>>>>>> from
>>>>>>>>>> the
>>>>>>>>>>>> impact I have observed,
>>>>>>>>>>>> it will involve many classes that are currently in 
>>>>>>>>>>>> flink-table.
>>>>>>>>>>>>
>>>>>>>>>>>> *Just a statement to prevent unnecessary conflicts.*
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks, vino.
>>>>>>>>>>>>
>>>>>>>>>>>> [1]: https://issues.apache.org/jira/browse/FLINK-9461
>>>>>>>>>>>>
>>>>>>>>>>>> Hequn Cheng <ch...@gmail.com> wrote on Sat, Nov 24, 2018 at 7:20 PM:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the effort and writing up this document. I like the idea
>>>>>>>>>>>>> of making flink-table Scala-free, so +1 for the proposal!
>>>>>>>>>>>>>
>>>>>>>>>>>>> It's good to make Java the first-class citizen. For a long time, we
>>>>>>>>>>>>> have neglected Java, so that many Table features are missing in the
>>>>>>>>>>>>> Java test cases, such as this one [1] I found recently. And I think we
>>>>>>>>>>>>> may also need to migrate our test cases, i.e., add Java tests.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This definitely is a big change and will break API compatibility. In
>>>>>>>>>>>>> order to reduce the impact on users, I think we should go fast when we
>>>>>>>>>>>>> migrate the user-facing APIs. It's better to introduce the
>>>>>>>>>>>>> user-sensitive changes within a single release. However, it may not be
>>>>>>>>>>>>> that easy. I can help to contribute.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Separation of interface and implementation is a good idea. This may
>>>>>>>>>>>>> introduce a minimum of dependencies or even no dependencies. I saw your
>>>>>>>>>>>>> reply in the google doc. Java 8 already supports static methods in
>>>>>>>>>>>>> interfaces; I think we can make use of that?
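>>>>>>>>>>>>>
>>>>>>>>>>>>> As a tiny sketch of what that separation could look like (the names and
>>>>>>>>>>>>> methods are just for illustration, not a proposal of the actual API):
>>>>>>>>>>>>>
>>>>>>>>>>>>>     // API module (what users compile against):
>>>>>>>>>>>>>     public interface Table {
>>>>>>>>>>>>>         Table select(String fields);
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>>     // Implementation module (e.g. the planner), not a user-facing dependency:
>>>>>>>>>>>>>     final class TableImpl implements Table {
>>>>>>>>>>>>>         private final String plan;
>>>>>>>>>>>>>
>>>>>>>>>>>>>         TableImpl(String plan) {
>>>>>>>>>>>>>             this.plan = plan;
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>
>>>>>>>>>>>>>         @Override
>>>>>>>>>>>>>         public Table select(String fields) {
>>>>>>>>>>>>>             return new TableImpl(plan + " -> select(" + fields + ")");
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>     }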
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Hequn
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-11001
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Nov 23, 2018 at 5:36 PM Timo Walther 
>>>>>>>>>>>>> <tw...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks for the great feedback so far. I updated the 
>>>>>>>>>>>>>> document with
>>>>>>>> the
>>>>>>>>>>>>>> input I got so far
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> @Fabian: I moved the porting of flink-table-runtime 
>>>>>>>>>>>>>> classes up in
>>>>>>>> the
>>>>>>>>>>>>> list.
>>>>>>>>>>>>>> @Xiaowei: Could you elaborate what "interface only" means 
>>>>>>>>>>>>>> to you?
>>>>>>> Do
>>>>>>>>>>> you
>>>>>>>>>>>>>> mean a module containing pure Java `interface`s? Or is the
>>>>>>>> validation
>>>>>>>>>>>>>> logic also part of the API module? Are 50+ expression 
>>>>>>>>>>>>>> classes
>>>>> part
>>>>>>>> of
>>>>>>>>>>>>>> the API interface or already too implementation-specific?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> @Xuefu: I extended the document by almost a page to 
>>>>>>>>>>>>>> clarify when
>>>>>>> we
>>>>>>>>>>>>>> should develop in Scala and when in Java. As Piotr said, 
>>>>>>>>>>>>>> every
>>>>> new
>>>>>>>>>>> Scala
>>>>>>>>>>>>>> line is instant technical debt.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 23.11.18 at 10:29, Piotr Nowojski wrote:
>>>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for writing this down +1 from my side :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm wondering whether we can have a rule, in the interim when Java
>>>>>>>>>>>>>>>> and Scala coexist, that dependencies can only be one-way. I found
>>>>>>>>>>>>>>>> that in the current code base there are cases where a Scala class
>>>>>>>>>>>>>>>> extends a Java class and vice versa. This is quite painful. I'm
>>>>>>>>>>>>>>>> thinking if we could say that extension can only be from Java to
>>>>>>>>>>>>>>>> Scala, which would help the situation. However, I'm not sure if this
>>>>>>>>>>>>>>>> is practical.
>>>>>>>>>>>>>>> Xuefu: I’m also not sure what’s the best approach here; probably we
>>>>>>>>>>>>>>> will have to work it out as we go. One thing to consider is that from
>>>>>>>>>>>>>>> now on, every single new line of code written in Scala anywhere in
>>>>>>>>>>>>>>> flink-table (except for flink-table-api-scala) is an instant
>>>>>>>>>>>>>>> technological debt. From this perspective I would be in favour of
>>>>>>>>>>>>>>> tolerating quite big inconveniences just to avoid any new Scala code.
>>>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 23 Nov 2018, at 03:25, Zhang, Xuefu <
>>>>> xuefu.z@alibaba-inc.com
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for the effort and the Google writeup. During our
>>>>>>> external
>>>>>>>>>>>>>> catalog rework, we found much confusion between Java and 
>>>>>>>>>>>>>> Scala,
>>>>>>> and
>>>>>>>>>>> this
>>>>>>>>>>>>>> Scala-free roadmap should greatly mitigate that.
>>>>>>>>>>>>>>>> I'm wondering whether we can have a rule, in the interim when Java
>>>>>>>>>>>>>>>> and Scala coexist, that dependencies can only be one-way. I found
>>>>>>>>>>>>>>>> that in the current code base there are cases where a Scala class
>>>>>>>>>>>>>>>> extends a Java class and vice versa. This is quite painful. I'm
>>>>>>>>>>>>>>>> thinking if we could say that extension can only be from Java to
>>>>>>>>>>>>>>>> Scala, which would help the situation. However, I'm not sure if this
>>>>>>>>>>>>>>>> is practical.
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Xuefu
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>> ------------------------------------------------------------------
>>>>>>>>>>>>>>>> Sender:jincheng sun <su...@gmail.com>
>>>>>>>>>>>>>>>> Sent at:2018 Nov 23 (Fri) 09:49
>>>>>>>>>>>>>>>> Recipient:dev <de...@flink.apache.org>
>>>>>>>>>>>>>>>> Subject:Re: [DISCUSS] Long-term goal of making flink-table
>>>>>>>>>> Scala-free
>>>>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>>>>> Thanks for initiating this great discussion.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Currently, using SQL/Table API requires including many dependencies.
>>>>>>>>>>>>>>>> In particular, it should not be necessary to introduce the specific
>>>>>>>>>>>>>>>> implementation dependencies which users do not care about. So I am
>>>>>>>>>>>>>>>> glad to see your proposal, and hope we consider splitting the API
>>>>>>>>>>>>>>>> interface into a separate module, so that users can introduce a
>>>>>>>>>>>>>>>> minimum of dependencies.
>>>>>>>>>>>>>>>> So, +1 to [separation of interface and implementation; 
>>>>>>>>>>>>>>>> e.g.
>>>>>>>>>> `Table` &
>>>>>>>>>>>>>>>> `TableImpl`] which you mentioned in the google doc.
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Jincheng
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Xiaowei Jiang <xi...@gmail.com> wrote on Thu, Nov 22, 2018 at 10:50 PM:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Timo, thanks for driving this! I think that this is 
>>>>>>>>>>>>>>>>> a nice
>>>>>>>>>> thing
>>>>>>>>>>>>> to
>>>>>>>>>>>>>> do.
>>>>>>>>>>>>>>>>> While we are doing this, can we also keep in mind that we want to
>>>>>>>>>>>>>>>>> eventually have a Table API interface-only module which users can
>>>>>>>>>>>>>>>>> take a dependency on, but without including any implementation
>>>>>>>>>>>>>>>>> details?
>>>>>>>>>>>>>>>>> Xiaowei
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <
>>>>>>> fhueske@gmail.com
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks for writing up this document.
>>>>>>>>>>>>>>>>>> I like the new structure and agree to prioritize the 
>>>>>>>>>>>>>>>>>> porting
>>>>>>> of
>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> flink-table-common classes.
>>>>>>>>>>>>>>>>>> Since flink-table-runtime is (or should be) 
>>>>>>>>>>>>>>>>>> independent of
>>>>> the
>>>>>>>>>> API
>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>> planner modules, we could start porting these classes 
>>>>>>>>>>>>>>>>>> once
>>>>> the
>>>>>>>>>> code
>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>> split into the new module structure.
>>>>>>>>>>>>>>>>>> The benefit of a Scala-free flink-table-runtime would be a
>>>>>>>>>>>>>>>>>> Scala-free execution Jar.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best, Fabian


Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Timo Walther <tw...@apache.org>.
Hi Aljoscha,

thanks for your feedback. I also don't like the fact that an API depends 
on runtime. I will try to come up with a better design while 
implementing a PoC. The general goal should be to make table programs 
still runnable in an IDE. So maybe there is a better way of doing it.

Regards,
Timo


Am 07.12.18 um 16:20 schrieb Aljoscha Krettek:
> Hi,
>
> this is a very nice effort!
>
> There is one thing that we should change, though. In the batch API we have a clear separation between API and runtime, and using the API (depending on flink-batch) does not "expose" the runtime classes that are in flink-runtime. For the streaming API, we made the mistake of letting flink-streaming depend on flink-runtime. This means that depending on flink-streaming pulls in flink-runtime transitively, which enlarges the surface that users see from Flink and (for example) makes it harder to package a user fat jar (we have the excludes/provided, whatnot).
>
> We should avoid this error and have flink-table-api not depend on flink-table-runtime, but the other way round, as we have it for the batch API.
>
> Btw, another project that has gotten this separation very nicely is Beam, where there is an sdk package, that has all the user facing API that people use to create programs  and they see nothing of the runner/runtime specifics. In this project it comes out of necessity, because there can be widely different runners, but we should still strive for this here.
>
> Off topic: we also have to achieve this for the streaming API.
>
> Best,
> Aljoscha
>
>> On 29. Nov 2018, at 16:58, Timo Walther <tw...@apache.org> wrote:
>>
>> Thanks for the feedback, everyone!
>>
>> I created a FLIP for these efforts: https://cwiki.apache.org/confluence/display/FLINK/FLIP-28%3A+Long-term+goal+of+making+flink-table+Scala-free
>>
>> I will open an umbrella Jira ticket for FLIP-28 with concrete subtasks shortly.
>>
>> Thanks,
>> Timo
>>
>> Am 29.11.18 um 12:44 schrieb Jark Wu:
>>> Thanks Timo,
>>>
>>> That makes sense to me. And I left the comment about code generation in doc.
>>>
>>> Looking forward to participate in it!
>>>
>>> Best,
>>> Jark
>>>
>>> On Thu, 29 Nov 2018 at 16:42, Timo Walther <tw...@apache.org> wrote:
>>>
>>>> @Kurt: Yes, I don't think that that forks of Flink will have a hard time
>>>> to keep up with the porting. That is also why I called this `long-term
>>>> goal` because I don't see big resources for the porting to happen
>>>> quicker. But at least new features, API, and runtime profit from Java to
>>>> Scala conversion.
>>>>
>>>> @Jark: I updated the document:
>>>>
>>>> 1. flink-table-common has been renamed to flink-table-spi by request.
>>>>
>>>> 2. Yes, good point. flink-sql-client can be moved there as well.
>>>>
>>>> 3. I added a paragraph to the document. Porting the code generation to
>>>> Java makes only sense if acceptable tooling for it is in place.
>>>>
>>>>
>>>> Thanks for the feedback,
>>>>
>>>> Timo
>>>>
>>>>
>>>> Am 29.11.18 um 08:28 schrieb Jark Wu:
>>>>> Hi Timo,
>>>>>
>>>>> Thanks for the great work!
>>>>>
>>>>> Moving flink-table to Java is a long-awaited things but will involve much
>>>>> effort. Agree with that we should make it as a long-term goal.
>>>>>
>>>>> I have read the google doc and +1 for the proposal. Here I have some
>>>>> questions:
>>>>>
>>>>> 1. Where should the flink-table-common module place ?  Will we move the
>>>>> flink-table-common classes to the new modules?
>>>>> 2. Should flink-sql-client also as a sub-module under flink-table ?
>>>>> 3. The flink-table-planner contains code generation and will be converted
>>>>> to Java. Actually, I prefer using Scala to code generate because of the
>>>>> Multiline-String and String-Interpolation (i.e. s"hello $user") features
>>>> in
>>>>> Scala. It makes code of code-generation more readable. Do we really
>>>>> want to migrate
>>>>> code generation to Java?
>>>>>
>>>>> Best,
>>>>> Jark
>>>>>
>>>>>
>>>>> On Wed, 28 Nov 2018 at 09:14, Kurt Young <yk...@gmail.com> wrote:
>>>>>
>>>>>> Hi Timo and Vino,
>>>>>>
>>>>>> I agree that table is very active and there is no guarantee for not
>>>>>> producing any conflicts if you decide
>>>>>> to develop based on community version. I think this part is the risk
>>>> what
>>>>>> we can imagine in the first place. But massively
>>>>>> language replacing is something you can not imagine and be ready for,
>>>> there
>>>>>> is no feature added, no refactor is done, simply changing
>>>>>> from scala to java will cause lots of conflicts.
>>>>>>
>>>>>> But I also agree that this is a "technical debt" that we should
>>>> eventually
>>>>>> pay, as you said, we can do this slowly, even one file each time,
>>>>>> let other people have more time to resolve the conflicts.
>>>>>>
>>>>>> Best,
>>>>>> Kurt
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 27, 2018 at 8:37 PM Timo Walther <tw...@apache.org>
>>>> wrote:
>>>>>>> Hi Kurt,
>>>>>>>
>>>>>>> I understand your concerns. However, there is no concrete roadmap for
>>>>>>> Flink 2.0 and (as Vino said) the flink-table is developed very
>>>> actively.
>>>>>>> Major refactorings happened in the past and will also happen with or
>>>>>>> without Scala migration. A good example, is the proper catalog support
>>>>>>> which will refactor big parts of the TableEnvironment class. Or the
>>>>>>> introduction of "retractions" which needed a big refactoring of the
>>>>>>> planning phase. Stability is only guaranteed for the API and the
>>>> general
>>>>>>> behavior, however, currently flink-table is not using @Public or
>>>>>>> @PublicEvolving annotations for a reason.
>>>>>>>
>>>>>>> I think the migration will still happen slowly because it needs people
>>>>>>> that allocate time for that. Therefore, even Flink forks can slowly
>>>>>>> adapt to the evolving Scala-to-Java code base.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Timo
>>>>>>>
>>>>>>>
>>>>>>> Am 27.11.18 um 13:16 schrieb vino yang:
>>>>>>>> Hi Kurt,
>>>>>>>>
>>>>>>>> Currently, there is still a long time to go from flink 2.0.
>>>> Considering
>>>>>>>> that the flink-table
>>>>>>>> is one of the most active modules in the current flink project, each
>>>>>>>> version has
>>>>>>>> a number of changes and features added. I think that refactoring
>>>> faster
>>>>>>>> will reduce subsequent
>>>>>>>> complexity and workload. And this may be a gradual and long process.
>>>> We
>>>>>>>> should be able to
>>>>>>>>     regard it as a "technical debt", and if it does not change it, it
>>>>>> will
>>>>>>>> also affect the decision-making of other issues.
>>>>>>>>
>>>>>>>> Thanks, vino.
>>>>>>>>
>>>>>>>> Kurt Young <yk...@gmail.com> 于2018年11月27日周二 下午7:34写道:
>>>>>>>>
>>>>>>>>> Hi Timo,
>>>>>>>>>
>>>>>>>>> Thanks for writing up the document. I'm +1 for reorganizing the
>>>> module
>>>>>>>>> structure and making flink-table Scala-free. But I have
>>>>>>>>> a little concern about the timing. Would it be more appropriate to get this
>>>>>>>>> done when Flink decides to bump to the next big version, like 2.x?
>>>>>>>>> It's true that you can keep all the classes' package paths as they are and
>>>>>>>>> will not introduce API changes. But if some companies are developing their
>>>>>>>>> own Flink and sync with the community version by rebasing, they may face a
>>>>>>>>> lot of conflicts. Although you can avoid conflicts by always moving source
>>>>>>>>> code between packages, I assume you still need to delete the original Scala
>>>>>>>>> file and add a new Java file when you want to change the programming
>>>>>>>>> language.
>>>>>>>>> Best,
>>>>>>>>> Kurt
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Nov 27, 2018 at 5:57 PM Timo Walther <tw...@apache.org>
>>>>>>> wrote:
>>>>>>>>>> Hi Hequn,
>>>>>>>>>>
>>>>>>>>>> thanks for your feedback. Yes, migrating the test cases is another
>>>>>>> issue
>>>>>>>>>> that is not represented in the document but should naturally go
>>>> along
>>>>>>>>>> with the migration.
>>>>>>>>>>
>>>>>>>>>> I agree that we should migrate the main API classes quickly within
>>>>>> this
>>>>>>>>>> 1.8 release after the module split has been performed. Help here is
>>>>>>>>>> highly appreciated!
>>>>>>>>>>
>>>>>>>>>> I forgot that Java supports static methods in interfaces now, but
>>>>>>>>>> actually I don't like the design of calling
>>>>>>> `TableEnvironment.get(env)`.
>>>>>>>>>> Because people often use `TableEnvironment tEnv =
>>>>>>>>>> TableEnvironment.get(env)` and then wonder why there is no
>>>>>>>>>> `toAppendStream` or `toDataSet` because they are using the base
>>>>>> class.
>>>>>>>>>> However, things like that can be discussed in the corresponding
>>>> issue
>>>>>>>>>> when it comes to implementation.
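
A minimal, hypothetical Java sketch of the pattern discussed above (not the
actual Flink API): a static factory on the base interface returns the base type,
so stream-specific methods such as `toAppendStream` are not visible without a
cast.

    // Hypothetical sketch only; names mirror the discussion, not the real
    // flink-table classes.
    interface SketchTableEnvironment {
        // Static factory methods on interfaces are possible since Java 8.
        static SketchTableEnvironment get(Object executionEnvironment) {
            return new StreamTableEnvironmentImpl();
        }
    }

    interface StreamSketchTableEnvironment extends SketchTableEnvironment {
        // Stream-specific conversions live only on the sub-interface.
        <T> T toAppendStream(String tableName, Class<T> clazz);
    }

    class StreamTableEnvironmentImpl implements StreamSketchTableEnvironment {
        @Override
        public <T> T toAppendStream(String tableName, Class<T> clazz) {
            return null; // conversion logic omitted in this sketch
        }
    }

    class Usage {
        void example() {
            // The static type is the base interface, so toAppendStream is not
            // visible here without a cast -- exactly the confusion described above.
            SketchTableEnvironment tEnv = SketchTableEnvironment.get(new Object());
            // tEnv.toAppendStream("t", String.class); // would not compile
        }
    }
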
>>>>>>>>>>
>>>>>>>>>> @Vino: I think your work fits nicely to these efforts.
>>>>>>>>>>
>>>>>>>>>> @everyone: I will wait for more feedback until end of this week.
>>>>>> Then I
>>>>>>>>>> will convert the design document into a FLIP and open subtasks in
>>>>>> Jira,
>>>>>>>>>> if there are no objections?
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Timo
>>>>>>>>>>
>>>>>>>>>> Am 24.11.18 um 13:45 schrieb vino yang:
>>>>>>>>>>> Hi hequn,
>>>>>>>>>>>
>>>>>>>>>>> I am very glad to hear that you are interested in this work.
>>>>>>>>>>> As we all know, this process involves a lot.
>>>>>>>>>>> Currently, the migration work has begun. I started with the
>>>>>>>>>>> Kafka connector's dependency on flink-table and moved the
>>>>>>>>>>> related dependencies to flink-table-common.
>>>>>>>>>>> This work is tracked by FLINK-9461.  [1]
>>>>>>>>>>> I don't know if it will conflict with what you expect to do, but
>>>>>> from
>>>>>>>>> the
>>>>>>>>>>> impact I have observed,
>>>>>>>>>>> it will involve many classes that are currently in flink-table.
>>>>>>>>>>>
>>>>>>>>>>> *Just a statement to prevent unnecessary conflicts.*
>>>>>>>>>>>
>>>>>>>>>>> Thanks, vino.
>>>>>>>>>>>
>>>>>>>>>>> [1]: https://issues.apache.org/jira/browse/FLINK-9461
>>>>>>>>>>>
>>>>>>>>>>> Hequn Cheng <ch...@gmail.com> 于2018年11月24日周六 下午7:20写道:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the effort and writing up this document. I like the
>>>> idea
>>>>>>> to
>>>>>>>>>> make
>>>>>>>>>>>> flink-table scala free, so +1 for the proposal!
>>>>>>>>>>>>
>>>>>>>>>>>> It's good to make Java the first-class citizen. For a long time, we have
>>>>>>>>>>>> neglected Java, so many features in Table are missing Java test cases,
>>>>>>>>>>>> such as this one [1] I found recently. And I think we may also need to
>>>>>>>>>>>> migrate our test cases, i.e., add Java tests.
>>>>>>>>>>>>
>>>>>>>>>>>> This definitely is a big change and will break API compatibility. In
>>>>>>>>>>>> order to reduce the impact on users, I think we should move fast when we
>>>>>>>>>>>> migrate the user-facing APIs. It's better to introduce the user-sensitive
>>>>>>>>>>>> changes within a single release. However, it may not be that easy. I can
>>>>>>>>>>>> help to contribute.
>>>>>>>>>>>>
>>>>>>>>>>>> Separation of interface and implementation is a good idea. This may
>>>>>>>>>>>> introduce a minimum of dependencies or even no dependencies. I saw your
>>>>>>>>>>>> reply in the Google doc. Java 8 already supports static methods in
>>>>>>>>>>>> interfaces; I think we can make use of that?
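
As a rough sketch of the interface/implementation separation (the `Table` /
`TableImpl` idea; names are illustrative, not the final design): the interface
would live in an API-only module without planner or runtime dependencies, while
the implementation lives in a module that depends on it.

    // Sketch only: "Table" as a pure interface in the API module ...
    interface Table {
        Table select(String fields);
        Table filter(String predicate);
    }

    // ... and "TableImpl" in an implementation module that depends on the API
    // module, never the other way around.
    class TableImpl implements Table {
        private final String plan;

        TableImpl(String plan) {
            this.plan = plan;
        }

        @Override
        public Table select(String fields) {
            return new TableImpl(plan + " -> select(" + fields + ")");
        }

        @Override
        public Table filter(String predicate) {
            return new TableImpl(plan + " -> filter(" + predicate + ")");
        }
    }

Users would then compile only against the interface module, which is what keeps
their dependency footprint small.
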
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Hequn
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-11001
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Nov 23, 2018 at 5:36 PM Timo Walther <tw...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks for the great feedback so far. I updated the document with
>>>>>>> the
>>>>>>>>>>>>> input I got so far
>>>>>>>>>>>>>
>>>>>>>>>>>>> @Fabian: I moved the porting of flink-table-runtime classes up in
>>>>>>> the
>>>>>>>>>>>> list.
>>>>>>>>>>>>> @Xiaowei: Could you elaborate what "interface only" means to you?
>>>>>> Do
>>>>>>>>>> you
>>>>>>>>>>>>> mean a module containing pure Java `interface`s? Or is the
>>>>>>> validation
>>>>>>>>>>>>> logic also part of the API module? Are 50+ expression classes
>>>> part
>>>>>>> of
>>>>>>>>>>>>> the API interface or already too implementation-specific?
>>>>>>>>>>>>>
>>>>>>>>>>>>> @Xuefu: I extended the document by almost a page to clarify when
>>>>>> we
>>>>>>>>>>>>> should develop in Scala and when in Java. As Piotr said, every
>>>> new
>>>>>>>>>> Scala
>>>>>>>>>>>>> line is instant technical debt.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Am 23.11.18 um 10:29 schrieb Piotr Nowojski:
>>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for writing this down +1 from my side :)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm wondering that whether we can have rule in the interim when
>>>>>>>>> Java
>>>>>>>>>>>>> and Scala coexist that dependency can only be one-way. I found
>>>>>> that
>>>>>>>>> in
>>>>>>>>>>>> the
>>>>>>>>>>>>> current code base there are cases where a Scala class extends
>>>> Java
>>>>>>>>> and
>>>>>>>>>>> vice
>>>>>>>>>>>>> versa. This is quite painful. I'm thinking if we could say that
>>>>>>>>>> extension
>>>>>>>>>>>>> can only be from Java to Scala, which will help the situation.
>>>>>>>>> However,
>>>>>>>>>>>> I'm
>>>>>>>>>>>>> not sure if this is practical.
>>>>>>>>>>>>>> Xuefu: I’m also not sure what’s the best approach here, probably
>>>>>> we
>>>>>>>>>>>> will
>>>>>>>>>>>>> have to work it out as we go. One thing to consider is that from
>>>>>> now
>>>>>>>>>> on,
>>>>>>>>>>>>> every single new code line written in Scala anywhere in
>>>>>> Flink-table
>>>>>>>>>>>> (except
>>>>>>>>>>>>> of Flink-table-api-scala) is an instant technological debt. From
>>>>>>> this
>>>>>>>>>>>>> perspective I would be in favour of tolerating quite big
>>>>>>>>> inconveniences
>>>>>>>>>>>>> just to avoid any new Scala code.
>>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 23 Nov 2018, at 03:25, Zhang, Xuefu <
>>>> xuefu.z@alibaba-inc.com
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for the effort and the Google writeup. During our
>>>>>> external
>>>>>>>>>>>>> catalog rework, we found much confusion between Java and Scala,
>>>>>> and
>>>>>>>>>> this
>>>>>>>>>>>>> Scala-free roadmap should greatly mitigate that.
>>>>>>>>>>>>>>> I'm wondering that whether we can have rule in the interim when
>>>>>>>>> Java
>>>>>>>>>>>>> and Scala coexist that dependency can only be one-way. I found
>>>>>> that
>>>>>>>>> in
>>>>>>>>>>>> the
>>>>>>>>>>>>> current code base there are cases where a Scala class extends
>>>> Java
>>>>>>>>> and
>>>>>>>>>> vice
>>>>>>>>>>>>> versa. This is quite painful. I'm thinking if we could say that
>>>>>>>>>> extension
>>>>>>>>>>>>> can only be from Java to Scala, which will help the situation.
>>>>>>>>> However,
>>>>>>>>>>>> I'm
>>>>>>>>>>>>> not sure if this is practical.
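
As a rough illustration of the proposed one-way rule (illustrative names, not
actual Flink classes), the Java side would define the base abstraction and Scala
code would only ever extend it, never the reverse:

    // Java defines the base abstraction ...
    public abstract class JavaPlannerRule {
        public abstract boolean matches(Object relNode);
        public abstract Object apply(Object relNode);
    }
    // ... and Scala code may extend it, e.g. (Scala, shown here only as a comment
    // so the sketch stays in Java):
    //
    //   class PushFilterIntoScanRule extends JavaPlannerRule {
    //     override def matches(relNode: AnyRef): Boolean = ...
    //     override def apply(relNode: AnyRef): AnyRef = ...
    //   }
    //
    // The forbidden direction would be a Java class extending a Scala class or
    // trait, because that forces the Java module to compile against Scala.
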
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Xuefu
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>> ------------------------------------------------------------------
>>>>>>>>>>>>>>> Sender:jincheng sun <su...@gmail.com>
>>>>>>>>>>>>>>> Sent at:2018 Nov 23 (Fri) 09:49
>>>>>>>>>>>>>>> Recipient:dev <de...@flink.apache.org>
>>>>>>>>>>>>>>> Subject:Re: [DISCUSS] Long-term goal of making flink-table
>>>>>>>>> Scala-free
>>>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>>>> Thanks for initiating this great discussion.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Currently, using SQL / the Table API requires including many dependencies. In
>>>>>>>>>>>>>>> particular, it is not necessary to introduce the specific implementation
>>>>>>>>>>>>>>> dependencies which users do not care about. So I am glad to see your
>>>>>>>>>>>>>>> proposal, and hope we consider splitting the API interface into a
>>>>>>>>>>>>>>> separate module, so that users only need to introduce a minimum of
>>>>>>>>>>>>>>> dependencies.
>>>>>>>>>>>>>>> So, +1 to [separation of interface and implementation; e.g.
>>>>>>>>> `Table` &
>>>>>>>>>>>>>>> `TableImpl`] which you mentioned in the google doc.
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Jincheng
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Timo, thanks for driving this! I think that this is a nice
>>>>>>>>> thing
>>>>>>>>>>>> to
>>>>>>>>>>>>> do.
>>>>>>>>>>>>>>>> While we are doing this, can we also keep in mind that we want
>>>>>> to
>>>>>>>>>>>>>>>> eventually have a TableAPI interface only module which users
>>>>>> can
>>>>>>>>>> take
>>>>>>>>>>>>>>>> dependency on, but without including any implementation
>>>>>> details?
>>>>>>>>>>>>>>>> Xiaowei
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <
>>>>>> fhueske@gmail.com
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for writing up this document.
>>>>>>>>>>>>>>>>> I like the new structure and agree to prioritize the porting
>>>>>> of
>>>>>>>>> the
>>>>>>>>>>>>>>>>> flink-table-common classes.
>>>>>>>>>>>>>>>>> Since flink-table-runtime is (or should be) independent of
>>>> the
>>>>>>>>> API
>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>> planner modules, we could start porting these classes once
>>>> the
>>>>>>>>> code
>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>> split into the new module structure.
>>>>>>>>>>>>>>>>> The benefits of a Scala-free flink-table-runtime would be a
>>>>>>>>>>>> Scala-free
>>>>>>>>>>>>>>>>> execution Jar.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best, Fabian
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
>>>>>>>>>>>>>>>>> twalthr@apache.org
>>>>>>>>>>>>>>>>>> :
>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I would like to continue this discussion thread and convert
>>>>>> the
>>>>>>>>>>>>> outcome
>>>>>>>>>>>>>>>>>> into a FLIP such that users and contributors know what to
>>>>>>> expect
>>>>>>>>>> in
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> upcoming releases.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I created a design document [1] that clarifies our
>>>> motivation
>>>>>>>>> why
>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>> want to do this, how a Maven module structure could look
>>>>>> like,
>>>>>>>>> and
>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>> suggestion for a migration plan.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It would be great to start with the efforts for the 1.8
>>>>>> release
>>>>>>>>>>>> such
>>>>>>>>>>>>>>>>>> that new features can be developed in Java and major
>>>>>>>>> refactorings
>>>>>>>>>>>>> such
>>>>>>>>>>>>>>>>>> as improvements to the connectors and external catalog
>>>>>> support
>>>>>>>>> are
>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>> blocked.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Please let me know what you think.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
>>>>>>>>>>>>>>>>>> Am 02.07.18 um 17:08 schrieb Fabian Hueske:
>>>>>>>>>>>>>>>>>>> Hi Piotr,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> thanks for bumping this thread and thanks for Xingcan for
>>>>>> the
>>>>>>>>>>>>>>>> comments.
>>>>>>>>>>>>>>>>>>> I think the first step would be to separate the flink-table
>>>>>>>>>> module
>>>>>>>>>>>>>>>> into
>>>>>>>>>>>>>>>>>>> multiple sub modules. These could be:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - flink-table-api: All API facing classes. Can be later
>>>>>>> divided
>>>>>>>>>>>>>>>> further
>>>>>>>>>>>>>>>>>>> into Java/Scala Table API/SQL
>>>>>>>>>>>>>>>>>>> - flink-table-planning: involves all planning (basically
>>>>>>>>>>>> everything
>>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>> do
>>>>>>>>>>>>>>>>>>> with Calcite)
>>>>>>>>>>>>>>>>>>> - flink-table-runtime: the runtime code
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> IMO, a realistic mid-term goal is to have the runtime
>>>> module
>>>>>>>>> and
>>>>>>>>>>>>>>>>> certain
>>>>>>>>>>>>>>>>>>> parts of the planning module ported to Java.
>>>>>>>>>>>>>>>>>>> The api module will be much harder to port because of
>>>>>> several
>>>>>>>>>>>>>>>>>> dependencies
>>>>>>>>>>>>>>>>>>> to Scala core classes (the parser framework, tree
>>>>>> iterations,
>>>>>>>>>>>> etc.).
>>>>>>>>>>>>>>>>> I'm
>>>>>>>>>>>>>>>>>>> not saying we should not port this to Java, but it is not
>>>>>>> clear
>>>>>>>>>> to
>>>>>>>>>>>>> me
>>>>>>>>>>>>>>>>>> (yet)
>>>>>>>>>>>>>>>>>>> how to do it.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I think flink-table-runtime should not be too hard to port.
>>>>>>> The
>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>> does
>>>>>>>>>>>>>>>>>>> not make use of many Scala features, i.e., it's writing
>>>> very
>>>>>>>>>>>>>>>> Java-like.
>>>>>>>>>>>>>>>>>>> Also, there are not many dependencies and operators can be
>>>>>>>>>>>>>>>> individually
>>>>>>>>>>>>>>>>>>> ported step-by-step.
>>>>>>>>>>>>>>>>>>> For flink-table-planning, we can have certain packages that
>>>>>> we
>>>>>>>>>>>> port
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> Java
>>>>>>>>>>>>>>>>>>> like planning rules or plan nodes. The related classes
>>>>>> mostly
>>>>>>>>>>>> extend
>>>>>>>>>>>>>>>>>>> Calcite's Java interfaces/classes and would be natural
>>>>>> choices
>>>>>>>>>> for
>>>>>>>>>>>>>>>>> being
>>>>>>>>>>>>>>>>>>> ported. The code generation classes will require more
>>>> effort
>>>>>>> to
>>>>>>>>>>>>> port.
>>>>>>>>>>>>>>>>>> There
>>>>>>>>>>>>>>>>>>> are also some dependencies in planning on the api module
>>>>>> that
>>>>>>>>> we
>>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>> need
>>>>>>>>>>>>>>>>>>> to resolve somehow.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> For SQL most work when adding new features is done in the
>>>>>>>>>> planning
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>> runtime modules. So, this separation should already reduce
>>>>>>>>>>>>>>>>> "technological
>>>>>>>>>>>>>>>>>>> dept" quite a lot.
>>>>>>>>>>>>>>>>>>> The Table API depends much more on Scala than SQL.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Cheers, Fabian
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xingcanc@gmail.com
>>>>>>> :
>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I also think about this problem these days and here are my
>>>>>>>>>>>>> thoughts.
>>>>>>>>>>>>>>>>>>>> 1) We must admit that it’s really a tough task to
>>>>>>> interoperate
>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>> Java
>>>>>>>>>>>>>>>>>>>> and Scala. E.g., they have different collection types
>>>>>> (Scala
>>>>>>>>>>>>>>>>> collections
>>>>>>>>>>>>>>>>>>>> v.s. java.util.*) and in Java, it's hard to implement a
>>>>>>> method
>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>> takes
>>>>>>>>>>>>>>>>>>>> Scala functions as parameters. Considering the major part
>>>>>> of
>>>>>>>>> the
>>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>>> base
>>>>>>>>>>>>>>>>>>>> is implemented in Java, +1 for this goal from a long-term
>>>>>>>>> view.
>>>>>>>>>>>>>>>>>>>> 2) The ideal solution would be to just expose a Scala API
>>>>>> and
>>>>>>>>>>>> make
>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> other parts Scala-free. But I am not sure if it could be
>>>>>>>>>> achieved
>>>>>>>>>>>>>>>> even
>>>>>>>>>>>>>>>>>> in a
>>>>>>>>>>>>>>>>>>>> long-term. Thus as Timo suggested, keep the Scala codes in
>>>>>>>>>>>>>>>>>>>> "flink-table-core" would be a compromise solution.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 3) If the community makes the final decision, maybe any
>>>> new
>>>>>>>>>>>>> features
>>>>>>>>>>>>>>>>>>>> should be added in Java (regardless of the modules), in
>>>>>> order
>>>>>>>>> to
>>>>>>>>>>>>>>>>> prevent
>>>>>>>>>>>>>>>>>>>> the Scala codes from growing.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>> Xingcan
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
>>>>>>>>>>>>>>>> piotr@data-artisans.com>
>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>> Bumping the topic.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> If we want to do this, the sooner we decide, the less
>>>> code
>>>>>>> we
>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>>>> to rewrite. I have some objections/counter proposals to
>>>>>>>>> Fabian's
>>>>>>>>>>>>>>>>>> proposal
>>>>>>>>>>>>>>>>>>>> of doing it module wise and one module at a time.
>>>>>>>>>>>>>>>>>>>>> First, I do not see a problem of having java/scala code
>>>>>> even
>>>>>>>>>>>>> within
>>>>>>>>>>>>>>>>> one
>>>>>>>>>>>>>>>>>>>> module, especially not if there are clean boundaries. Like
>>>>>> we
>>>>>>>>>>>> could
>>>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>>>> API in Scala and optimizer rules/logical nodes written in
>>>>>>> Java
>>>>>>>>>> in
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> same
>>>>>>>>>>>>>>>>>>>> module. However I haven’t previously maintained mixed
>>>>>>>>> scala/java
>>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>>> bases
>>>>>>>>>>>>>>>>>>>> before, so I might be missing something here.
>>>>>>>>>>>>>>>>>>>>> Secondly, this whole migration might and most likely will
>>>>>> take
>>>>>>>>>>>> longer
>>>>>>>>>>>>>>>>> than
>>>>>>>>>>>>>>>>>>>> expected, so that creates a problem for a new code that we
>>>>>>>>> will
>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>> creating. After making a decision to migrate to Java,
>>>>>> almost
>>>>>>>>> any
>>>>>>>>>>>>> new
>>>>>>>>>>>>>>>>>> Scala
>>>>>>>>>>>>>>>>>>>> line of code will be immediately a technological debt and
>>>>>> we
>>>>>>>>>> will
>>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>> rewrite it to Java later.
>>>>>>>>>>>>>>>>>>>>> Thus I would propose first to state our end goal -
>>>> modules
>>>>>>>>>>>>>>>> structure
>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>> which parts of modules we want to have eventually
>>>>>> Scala-free.
>>>>>>>>>>>>>>>> Secondly
>>>>>>>>>>>>>>>>>>>> taking all steps necessary that will allow us to write new
>>>>>>>>> code
>>>>>>>>>>>>>>>>>> compliant
>>>>>>>>>>>>>>>>>>>> with our end goal. Only after that we should/could focus
>>>> on
>>>>>>>>>>>>>>>>>> incrementally
>>>>>>>>>>>>>>>>>>>> rewriting the old code. Otherwise we could be
>>>> stuck/blocked
>>>>>>>>> for
>>>>>>>>>>>>>>>> years
>>>>>>>>>>>>>>>>>>>> writing new code in Scala (and increasing technological
>>>>>>> debt),
>>>>>>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>>>>> nobody has found time to rewrite some non-important and
>>>>>>> not
>>>>>>>>>>>>>>>>> actively
>>>>>>>>>>>>>>>>>>>> developed part of some module.
>>>>>>>>>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <
>>>>>> fhueske@gmail.com
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> In general, I think this is a good effort. However, it
>>>>>>> won't
>>>>>>>>>> be
>>>>>>>>>>>>>>>> easy
>>>>>>>>>>>>>>>>>>>> and I
>>>>>>>>>>>>>>>>>>>>>> think we have to plan this well.
>>>>>>>>>>>>>>>>>>>>>> I don't like the idea of having the whole code base
>>>>>>>>> fragmented
>>>>>>>>>>>>>>>> into
>>>>>>>>>>>>>>>>>> Java
>>>>>>>>>>>>>>>>>>>>>> and Scala code for too long.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I think we should do this one step at a time and focus
>>>> on
>>>>>>>>>>>>>>>> migrating
>>>>>>>>>>>>>>>>>> one
>>>>>>>>>>>>>>>>>>>>>> module at a time.
>>>>>>>>>>>>>>>>>>>>>> IMO, the easiest start would be to port the runtime to
>>>>>>> Java.
>>>>>>>>>>>>>>>>>>>>>> Extracting the API classes into an own module, porting
>>>>>> them
>>>>>>>>> to
>>>>>>>>>>>>>>>> Java,
>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>> removing the Scala dependency won't be possible without
>>>>>>>>>>>> breaking
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> API
>>>>>>>>>>>>>>>>>>>>>> since a few classes depend on the Scala Table API.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Best, Fabian
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <
>>>>>>>>>> trohrmann@apache.org
>>>>>>>>>>>>> :
>>>>>>>>>>>>>>>>>>>>>>> I think that is a noble and honorable goal and we
>>>> should
>>>>>>>>>>>> strive
>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>> it.
>>>>>>>>>>>>>>>>>>>>>>> This, however, must be an iterative process given the
>>>>>>> sheer
>>>>>>>>>>>> size
>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> code base. I like the approach to define common Java
>>>>>>>>> modules
>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>>>>>> used
>>>>>>>>>>>>>>>>>>>>>>> by more specific Scala modules and slowly moving
>>>> classes
>>>>>>>>> from
>>>>>>>>>>>>>>>> Scala
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> Java. Thus +1 for the proposal.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
>>>>>>>>>>>>>>>>>>>> piotr@data-artisans.com>
>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I do not have an experience with how scala and java
>>>>>>>>>> interacts
>>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>> each
>>>>>>>>>>>>>>>>>>>>>>>> other, so I can not fully validate your proposal, but
>>>>>>>>>>>> generally
>>>>>>>>>>>>>>>>>>>> speaking
>>>>>>>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>>>>>>>> from me.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Does it also mean, that we should slowly migrate
>>>>>>>>>>>>>>>>> `flink-table-core`
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>> Java? How would you envision it? It would be nice to
>>>> be
>>>>>>>>> able
>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> add
>>>>>>>>>>>>>>>>>>>> new
>>>>>>>>>>>>>>>>>>>>>>>> classes/features written in Java and so that they can
>>>>>>>>>> coexist
>>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>>>>> Scala code until we gradually switch from Scala to
>>>>>> Java.
>>>>>>>>>>>>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <
>>>>>>>>> twalthr@apache.org
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> as you all know, currently the Table & SQL API is
>>>>>>>>>>>> implemented
>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>> Scala.
>>>>>>>>>>>>>>>>>>>>>>>> This decision was made a long-time ago when the
>>>>>> initial
>>>>>>>>>> code
>>>>>>>>>>>>>>>> base
>>>>>>>>>>>>>>>>>> was
>>>>>>>>>>>>>>>>>>>>>>>> created as part of a master's thesis. The community
>>>>>> kept
>>>>>>>>>>>> Scala
>>>>>>>>>>>>>>>>>>>> because of
>>>>>>>>>>>>>>>>>>>>>>>> the nice language features that enable a fluent Table
>>>>>> API
>>>>>>>>>>>> like
>>>>>>>>>>>>>>>>>>>>>>>> table.select('field.trim()) and because Scala allows
>>>>>> for
>>>>>>>>>>>> quick
>>>>>>>>>>>>>>>>>>>>>>> prototyping
>>>>>>>>>>>>>>>>>>>>>>>> (e.g. multi-line comments for code generation). The
>>>>>>>>>>>> committers
>>>>>>>>>>>>>>>>>>>> enforced
>>>>>>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>>>>>> splitting the code-base into two programming
>>>> languages.
>>>>>>>>>>>>>>>>>>>>>>>>> However, nowadays the flink-table module more and
>>>> more
>>>>>>>>>>>> becomes
>>>>>>>>>>>>>>>> an
>>>>>>>>>>>>>>>>>>>>>>>> important part in the Flink ecosystem. Connectors,
>>>>>>>>> formats,
>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>> SQL
>>>>>>>>>>>>>>>>>>>>>>> client
>>>>>>>>>>>>>>>>>>>>>>>> are actually implemented in Java but need to
>>>>>> interoperate
>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>>>>> flink-table
>>>>>>>>>>>>>>>>>>>>>>>> which makes these modules dependent on Scala. As
>>>>>>> mentioned
>>>>>>>>>> in
>>>>>>>>>>>>> an
>>>>>>>>>>>>>>>>>>>> earlier
>>>>>>>>>>>>>>>>>>>>>>>> mail thread, using Scala for API classes also exposes
>>>>>>>>> member
>>>>>>>>>>>>>>>>>> variables
>>>>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>> methods in Java that should not be exposed to users
>>>>>> [1].
>>>>>>>>>> Java
>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>> still
>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>> most important API language and right now we treat it
>>>>>> as
>>>>>>> a
>>>>>>>>>>>>>>>>>>>> second-class
>>>>>>>>>>>>>>>>>>>>>>>> citizen. I just noticed that you even need to add
>>>> Scala
>>>>>>> if
>>>>>>>>>>>> you
>>>>>>>>>>>>>>>>> just
>>>>>>>>>>>>>>>>>>>> want
>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>> implement a ScalarFunction because of method clashes
>>>>>>>>> between
>>>>>>>>>>>>>>>>> `public
>>>>>>>>>>>>>>>>>>>>>>> String
>>>>>>>>>>>>>>>>>>>>>>>> toString()` and `public scala.Predef.String
>>>>>> toString()`.
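
For reference, a scalar UDF of the kind mentioned above is plain Java from the
user's point of view (users extend ScalarFunction and add public eval methods);
the sketch below shows how small such a class is, which makes the extra Scala
classpath requirement all the more surprising.

    import org.apache.flink.table.functions.ScalarFunction;

    // Minimal Java scalar UDF sketch: subclass ScalarFunction and provide
    // public eval(...) methods.
    public class TrimUpper extends ScalarFunction {
        public String eval(String value) {
            return value == null ? null : value.trim().toUpperCase();
        }
    }
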
>>>>>>>>>>>>>>>>>>>>>>>>> Given the size of the current code base,
>>>>>> reimplementing
>>>>>>>>> the
>>>>>>>>>>>>>>>>> entire
>>>>>>>>>>>>>>>>>>>>>>>> flink-table code in Java is a goal that we might never
>>>>>>>>>> reach.
>>>>>>>>>>>>>>>>>>>> However, we
>>>>>>>>>>>>>>>>>>>>>>>> should at least treat the symptoms and have this as a
>>>>>>>>>>>> long-term
>>>>>>>>>>>>>>>>> goal
>>>>>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>>>>> mind. My suggestion would be to convert user-facing
>>>> and
>>>>>>>>>>>> runtime
>>>>>>>>>>>>>>>>>>>> classes
>>>>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>> split the code base into multiple modules:
>>>>>>>>>>>>>>>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
>>>>>>>>>>>>>>>>>>>>>>>>> Implemented in Java. Java users can use this. This
>>>>>> would
>>>>>>>>>>>>>>>> require
>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>> convert classes like TableEnvironment, Table.
>>>>>>>>>>>>>>>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
>>>>>>>>>>>>>>>>>>>>>>>>> Implemented in Scala. Scala users can use this.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> flink-table-common
>>>>>>>>>>>>>>>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs
>>>> can
>>>>>>>>> use
>>>>>>>>>>>>>>>> this.
>>>>>>>>>>>>>>>>> It
>>>>>>>>>>>>>>>>>>>>>>>> contains interface classes such as descriptors, table
>>>>>>>>> sink,
>>>>>>>>>>>>>>>> table
>>>>>>>>>>>>>>>>>>>> source.
>>>>>>>>>>>>>>>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
>>>>>>>>>>>>>>>>>>>>>>>> flink-table-runtime}
>>>>>>>>>>>>>>>>>>>>>>>>> Implemented in Scala. Contains the current main code
>>>>>>>>> base.
>>>>>>>>>>>>>>>>>>>>>>>>>> flink-table-runtime
>>>>>>>>>>>>>>>>>>>>>>>>> Implemented in Java. This would require to convert
>>>>>>>>> classes
>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>>>>> o.a.f.table.runtime but would improve the runtime
>>>>>>>>>>>> potentially.
>>>>>>>>>>>>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
>>>>>>>>>>>>>>>>>>>> traits-tp21335.html



Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Aljoscha Krettek <al...@apache.org>.
Hi,

this is a very nice effort!

There is one thing that we should change, though. In the batch API we have a clear separation between API and runtime, and using the API (depending on flink-batch) does not "expose" the runtime classes that are in flink-runtime. For the streaming API, we made the mistake of letting flink-streaming depend on flink-runtime. This means that depending on flink-streaming pulls in flink-runtime transitively, which enlarges the surface that users see from Flink and (for example) makes it harder to package a user fat jar (hence the excludes/provided setup, and whatnot).

We should avoid this error and have flink-table-api not depend on flink-table-runtime, but the other way round, as we have it for the batch API.
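
One common way to get this dependency direction (sketched below with
hypothetical names; this is not the actual Flink design) is to let the API
module discover its implementation via java.util.ServiceLoader, so that
flink-table-runtime depends on flink-table-api and registers itself, never the
other way around.

    import java.util.Iterator;
    import java.util.ServiceLoader;

    // --- API module: no runtime dependency ---
    interface TableRuntimeFactory {
        Object createTableEnvironment();
    }

    final class TableEnvironments {
        static Object create() {
            Iterator<TableRuntimeFactory> factories =
                    ServiceLoader.load(TableRuntimeFactory.class).iterator();
            if (!factories.hasNext()) {
                throw new IllegalStateException("No table runtime found on the classpath");
            }
            return factories.next().createTableEnvironment();
        }
    }

    // --- runtime module: depends on the API module and is registered via
    // META-INF/services/TableRuntimeFactory ---
    class RuntimeFactory implements TableRuntimeFactory {
        @Override
        public Object createTableEnvironment() {
            return new Object(); // real planner/runtime wiring would go here
        }
    }
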

Btw, another project that has gotten this separation very nicely is Beam, where there is an sdk package that has all the user-facing API that people use to create programs, and they see nothing of the runner/runtime specifics. In that project it comes out of necessity, because there can be widely different runners, but we should still strive for this here.

Off topic: we also have to achieve this for the streaming API.

Best,
Aljoscha

> On 29. Nov 2018, at 16:58, Timo Walther <tw...@apache.org> wrote:
> 
> Thanks for the feedback, everyone!
> 
> I created a FLIP for these efforts: https://cwiki.apache.org/confluence/display/FLINK/FLIP-28%3A+Long-term+goal+of+making+flink-table+Scala-free
> 
> I will open an umbrella Jira ticket for FLIP-28 with concrete subtasks shortly.
> 
> Thanks,
> Timo
> 
> Am 29.11.18 um 12:44 schrieb Jark Wu:
>> Thanks Timo,
>> 
>> That makes sense to me. And I left the comment about code generation in doc.
>> 
>> Looking forward to participate in it!
>> 
>> Best,
>> Jark
>> 
>> On Thu, 29 Nov 2018 at 16:42, Timo Walther <tw...@apache.org> wrote:
>> 
>>> @Kurt: Yes, I don't think that forks of Flink will have a hard time
>>> keeping up with the porting. That is also why I called this `long-term
>>> goal` because I don't see big resources for the porting to happen
>>> quicker. But at least new features, API, and runtime profit from the
>>> Scala-to-Java conversion.
>>> 
>>> @Jark: I updated the document:
>>> 
>>> 1. flink-table-common has been renamed to flink-table-spi by request.
>>> 
>>> 2. Yes, good point. flink-sql-client can be moved there as well.
>>> 
>>> 3. I added a paragraph to the document. Porting the code generation to
>>> Java only makes sense if acceptable tooling for it is in place.
>>> 
>>> 
>>> Thanks for the feedback,
>>> 
>>> Timo
>>> 
>>> 
>>> Am 29.11.18 um 08:28 schrieb Jark Wu:
>>>> Hi Timo,
>>>> 
>>>> Thanks for the great work!
>>>> 
>>>> Moving flink-table to Java is a long-awaited thing but will involve much
>>>> effort. I agree that we should make it a long-term goal.
>>>> 
>>>> I have read the google doc and +1 for the proposal. Here I have some
>>>> questions:
>>>> 
>>>> 1. Where should the flink-table-common module be placed? Will we move the
>>>> flink-table-common classes to the new modules?
>>>> 2. Should flink-sql-client also be a sub-module under flink-table?
>>>> 3. The flink-table-planner contains code generation and will be converted
>>>> to Java. Actually, I prefer using Scala for code generation because of the
>>>> multiline string and string interpolation (i.e. s"hello $user") features in
>>>> Scala. They make the code of the code generator more readable. Do we really
>>>> want to migrate code generation to Java?
>>>> 
>>>> Best,
>>>> Jark
>>>> 
>>>> 
>>>> On Wed, 28 Nov 2018 at 09:14, Kurt Young <yk...@gmail.com> wrote:
>>>> 
>>>>> Hi Timo and Vino,
>>>>> 
>>>>> I agree that table is very active and there is no guarantee for not
>>>>> producing any conflicts if you decide
>>>>> to develop based on community version. I think this part is the risk
>>> what
>>>>> we can imagine in the first place. But massively
>>>>> language replacing is something you can not imagine and be ready for,
>>> there
>>>>> is no feature added, no refactor is done, simply changing
>>>>> from scala to java will cause lots of conflicts.
>>>>> 
>>>>> But I also agree that this is a "technical debt" that we should
>>> eventually
>>>>> pay, as you said, we can do this slowly, even one file each time,
>>>>> let other people have more time to resolve the conflicts.
>>>>> 
>>>>> Best,
>>>>> Kurt
>>>>> 
>>>>> 
>>>>> On Tue, Nov 27, 2018 at 8:37 PM Timo Walther <tw...@apache.org>
>>> wrote:
>>>>>> Hi Kurt,
>>>>>> 
>>>>>> I understand your concerns. However, there is no concrete roadmap for
>>>>>> Flink 2.0 and (as Vino said) the flink-table is developed very
>>> actively.
>>>>>> Major refactorings happened in the past and will also happen with or
>>>>>> without Scala migration. A good example, is the proper catalog support
>>>>>> which will refactor big parts of the TableEnvironment class. Or the
>>>>>> introduction of "retractions" which needed a big refactoring of the
>>>>>> planning phase. Stability is only guaranteed for the API and the
>>> general
>>>>>> behavior, however, currently flink-table is not using @Public or
>>>>>> @PublicEvolving annotations for a reason.
>>>>>> 
>>>>>> I think the migration will still happen slowly because it needs people
>>>>>> that allocate time for that. Therefore, even Flink forks can slowly
>>>>>> adapt to the evolving Scala-to-Java code base.
>>>>>> 
>>>>>> Regards,
>>>>>> Timo
>>>>>> 
>>>>>> 
>>>>>> Am 27.11.18 um 13:16 schrieb vino yang:
>>>>>>> Hi Kurt,
>>>>>>> 
>>>>>>> Currently, there is still a long time to go from flink 2.0.
>>> Considering
>>>>>>> that the flink-table
>>>>>>> is one of the most active modules in the current flink project, each
>>>>>>> version has
>>>>>>> a number of changes and features added. I think that refactoring
>>> faster
>>>>>>> will reduce subsequent
>>>>>>> complexity and workload. And this may be a gradual and long process.
>>> We
>>>>>>> should be able to
>>>>>>>    regard it as a "technical debt", and if it does not change it, it
>>>>> will
>>>>>>> also affect the decision-making of other issues.
>>>>>>> 
>>>>>>> Thanks, vino.
>>>>>>> 
>>>>>>> Kurt Young <yk...@gmail.com> 于2018年11月27日周二 下午7:34写道:
>>>>>>> 
>>>>>>>> Hi Timo,
>>>>>>>> 
>>>>>>>> Thanks for writing up the document. I'm +1 for reorganizing the
>>> module
>>>>>>>> structure and make table scala free. But I have
>>>>>>>> a little concern about the timing. Is it more appropriate to get
>>> this
>>>>>> done
>>>>>>>> when Flink decide to bump to next big version, like 2.x.
>>>>>>>> It's true you can keep all the class's package path as it is, and
>>> will
>>>>>> not
>>>>>>>> introduce API change. But if some company are developing their own
>>>>>>>> Flink, and sync with community version by rebasing, may face a lot of
>>>>>>>> conflicts. Although you can avoid conflicts by always moving source
>>>>>> codes
>>>>>>>> between packages, but I assume you still need to delete the original
>>>>>> scala
>>>>>>>> file and add a new java file when you want to change program
>>> language.
>>>>>>>> Best,
>>>>>>>> Kurt
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Tue, Nov 27, 2018 at 5:57 PM Timo Walther <tw...@apache.org>
>>>>>> wrote:
>>>>>>>>> Hi Hequn,
>>>>>>>>> 
>>>>>>>>> thanks for your feedback. Yes, migrating the test cases is another
>>>>>> issue
>>>>>>>>> that is not represented in the document but should naturally go
>>> along
>>>>>>>>> with the migration.
>>>>>>>>> 
>>>>>>>>> I agree that we should migrate the main API classes quickly within
>>>>> this
>>>>>>>>> 1.8 release after the module split has been performed. Help here is
>>>>>>>>> highly appreciated!
>>>>>>>>> 
>>>>>>>>> I forgot that Java supports static methods in interfaces now, but
>>>>>>>>> actually I don't like the design of calling
>>>>>> `TableEnvironment.get(env)`.
>>>>>>>>> Because people often use `TableEnvironment tEnv =
>>>>>>>>> TableEnvironment.get(env)` and then wonder why there is no
>>>>>>>>> `toAppendStream` or `toDataSet` because they are using the base
>>>>> class.
>>>>>>>>> However, things like that can be discussed in the corresponding
>>> issue
>>>>>>>>> when it comes to implementation.
>>>>>>>>> 
>>>>>>>>> @Vino: I think your work fits nicely to these efforts.
>>>>>>>>> 
>>>>>>>>> @everyone: I will wait for more feedback until end of this week.
>>>>> Then I
>>>>>>>>> will convert the design document into a FLIP and open subtasks in
>>>>> Jira,
>>>>>>>>> if there are no objections?
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Timo
>>>>>>>>> 
>>>>>>>>> Am 24.11.18 um 13:45 schrieb vino yang:
>>>>>>>>>> Hi hequn,
>>>>>>>>>> 
>>>>>>>>>> I am very glad to hear that you are interested in this work.
>>>>>>>>>> As we all know, this process involves a lot.
>>>>>>>>>> Currently, the migration work has begun. I started with the
>>>>>>>>>> Kafka connector's dependency on flink-table and moved the
>>>>>>>>>> related dependencies to flink-table-common.
>>>>>>>>>> This work is tracked by FLINK-9461.  [1]
>>>>>>>>>> I don't know if it will conflict with what you expect to do, but
>>>>> from
>>>>>>>> the
>>>>>>>>>> impact I have observed,
>>>>>>>>>> it will involve many classes that are currently in flink-table.
>>>>>>>>>> 
>>>>>>>>>> *Just a statement to prevent unnecessary conflicts.*
>>>>>>>>>> 
>>>>>>>>>> Thanks, vino.
>>>>>>>>>> 
>>>>>>>>>> [1]: https://issues.apache.org/jira/browse/FLINK-9461
>>>>>>>>>> 
>>>>>>>>>> Hequn Cheng <ch...@gmail.com> 于2018年11月24日周六 下午7:20写道:
>>>>>>>>>> 
>>>>>>>>>>> Hi Timo,
>>>>>>>>>>> 
>>>>>>>>>>> Thanks for the effort and writing up this document. I like the
>>> idea
>>>>>> to
>>>>>>>>> make
>>>>>>>>>>> flink-table scala free, so +1 for the proposal!
>>>>>>>>>>> 
>>>>>>>>>>> It's good to make Java the first-class citizen. For a long time,
>>> we
>>>>>>>> have
>>>>>>>>>>> neglected java so that many features in Table are missed in Java
>>>>> Test
>>>>>>>>>>> cases, such as this one[1] I found recently. And I think we may
>>>>> also
>>>>>>>>> need
>>>>>>>>>>> to migrate our test cases, i.e, add java tests.
>>>>>>>>>>> 
>>>>>>>>>>> This definitely is a big change and will break API compatible. In
>>>>>>>> order
>>>>>>>>> to
>>>>>>>>>>> bring a smaller impact on users, I think we should go fast when we
>>>>>>>>> migrate
>>>>>>>>>>> APIs targeted to users. It's better to introduce the user
>>> sensitive
>>>>>>>>> changes
>>>>>>>>>>> within a release. However, it may be not that easy. I can help to
>>>>>>>>>>> contribute.
>>>>>>>>>>> 
>>>>>>>>>>> Separation of interface and implementation is a good idea. This
>>> may
>>>>>>>>>>> introduce a minimum of dependencies or even no dependencies. I saw
>>>>>>>> your
>>>>>>>>>>> reply in the google doc. Java8 has already supported static method
>>>>>> for
>>>>>>>>>>> interfaces, I think we can make use of it?
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Hequn
>>>>>>>>>>> 
>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-11001
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Fri, Nov 23, 2018 at 5:36 PM Timo Walther <tw...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>> 
>>>>>>>>>>>> thanks for the great feedback so far. I updated the document with
>>>>>> the
>>>>>>>>>>>> input I got so far
>>>>>>>>>>>> 
>>>>>>>>>>>> @Fabian: I moved the porting of flink-table-runtime classes up in
>>>>>> the
>>>>>>>>>>> list.
>>>>>>>>>>>> @Xiaowei: Could you elaborate what "interface only" means to you?
>>>>> Do
>>>>>>>>> you
>>>>>>>>>>>> mean a module containing pure Java `interface`s? Or is the
>>>>>> validation
>>>>>>>>>>>> logic also part of the API module? Are 50+ expression classes
>>> part
>>>>>> of
>>>>>>>>>>>> the API interface or already too implementation-specific?
>>>>>>>>>>>> 
>>>>>>>>>>>> @Xuefu: I extended the document by almost a page to clarify when
>>>>> we
>>>>>>>>>>>> should develop in Scala and when in Java. As Piotr said, every
>>> new
>>>>>>>>> Scala
>>>>>>>>>>>> line is instant technical debt.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Timo
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Am 23.11.18 um 10:29 schrieb Piotr Nowojski:
>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks for writing this down +1 from my side :)
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I'm wondering that whether we can have rule in the interim when
>>>>>>>> Java
>>>>>>>>>>>> and Scala coexist that dependency can only be one-way. I found
>>>>> that
>>>>>>>> in
>>>>>>>>>>> the
>>>>>>>>>>>> current code base there are cases where a Scala class extends
>>> Java
>>>>>>>> and
>>>>>>>>>> vice
>>>>>>>>>>>> versa. This is quite painful. I'm thinking if we could say that
>>>>>>>>> extension
>>>>>>>>>>>> can only be from Java to Scala, which will help the situation.
>>>>>>>> However,
>>>>>>>>>>> I'm
>>>>>>>>>>>> not sure if this is practical.
>>>>>>>>>>>>> Xuefu: I’m also not sure what’s the best approach here, probably
>>>>> we
>>>>>>>>>>> will
>>>>>>>>>>>> have to work it out as we go. One thing to consider is that from
>>>>> now
>>>>>>>>> on,
>>>>>>>>>>>> every single new code line written in Scala anywhere in
>>>>> Flink-table
>>>>>>>>>>> (except
>>>>>>>>>>>> of Flink-table-api-scala) is an instant technological debt. From
>>>>>> this
>>>>>>>>>>>> perspective I would be in favour of tolerating quite big
>>>>>>>> inconveniences
>>>>>>>>>>>> just to avoid any new Scala code.
>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 23 Nov 2018, at 03:25, Zhang, Xuefu <
>>> xuefu.z@alibaba-inc.com
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks for the effort and the Google writeup. During our
>>>>> external
>>>>>>>>>>>> catalog rework, we found much confusion between Java and Scala,
>>>>> and
>>>>>>>>> this
>>>>>>>>>>>> Scala-free roadmap should greatly mitigate that.
>>>>>>>>>>>>>> I'm wondering that whether we can have rule in the interim when
>>>>>>>> Java
>>>>>>>>>>>> and Scala coexist that dependency can only be one-way. I found
>>>>> that
>>>>>>>> in
>>>>>>>>>>> the
>>>>>>>>>>>> current code base there are cases where a Scala class extends
>>> Java
>>>>>>>> and
>>>>>>>>>> vice
>>>>>>>>>>>> versa. This is quite painful. I'm thinking if we could say that
>>>>>>>>> extension
>>>>>>>>>>>> can only be from Java to Scala, which will help the situation.
>>>>>>>> However,
>>>>>>>>>>> I'm
>>>>>>>>>>>> not sure if this is practical.
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Xuefu
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>> ------------------------------------------------------------------
>>>>>>>>>>>>>> Sender:jincheng sun <su...@gmail.com>
>>>>>>>>>>>>>> Sent at:2018 Nov 23 (Fri) 09:49
>>>>>>>>>>>>>> Recipient:dev <de...@flink.apache.org>
>>>>>>>>>>>>>> Subject:Re: [DISCUSS] Long-term goal of making flink-table
>>>>>>>> Scala-free
>>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>>> Thanks for initiating this great discussion.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Currently when using SQL/TableAPI should include many
>>>>> dependence.
>>>>>>>> In
>>>>>>>>>>>>>> particular, it is not necessary to introduce the specific
>>>>>>>>>>> implementation
>>>>>>>>>>>>>> dependencies which users do not care about. So I am glad to see
>>>>>>>> your
>>>>>>>>>>>>>> proposal, and hope when we consider splitting the API interface
>>>>>>>> into
>>>>>>>>> a
>>>>>>>>>>>>>> separate module, so that the user can introduce minimum of
>>>>>>>>>>> dependencies.
>>>>>>>>>>>>>> So, +1 to [separation of interface and implementation; e.g.
>>>>>>>> `Table` &
>>>>>>>>>>>>>> `TableImpl`] which you mentioned in the google doc.
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Jincheng
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hi Timo, thanks for driving this! I think that this is a nice
>>>>>>>> thing
>>>>>>>>>>> to
>>>>>>>>>>>> do.
>>>>>>>>>>>>>>> While we are doing this, can we also keep in mind that we want
>>>>> to
>>>>>>>>>>>>>>> eventually have a TableAPI interface only module which users
>>>>> can
>>>>>>>>> take
>>>>>>>>>>>>>>> dependency on, but without including any implementation
>>>>> details?
>>>>>>>>>>>>>>> Xiaowei
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <
>>>>> fhueske@gmail.com
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks for writing up this document.
>>>>>>>>>>>>>>>> I like the new structure and agree to prioritize the porting
>>>>> of
>>>>>>>> the
>>>>>>>>>>>>>>>> flink-table-common classes.
>>>>>>>>>>>>>>>> Since flink-table-runtime is (or should be) independent of
>>> the
>>>>>>>> API
>>>>>>>>>>> and
>>>>>>>>>>>>>>>> planner modules, we could start porting these classes once
>>> the
>>>>>>>> code
>>>>>>>>>>> is
>>>>>>>>>>>>>>>> split into the new module structure.
>>>>>>>>>>>>>>>> The benefits of a Scala-free flink-table-runtime would be a
>>>>>>>>>>> Scala-free
>>>>>>>>>>>>>>>> execution Jar.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Best, Fabian
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
>>>>>>>>>>>>>>>> twalthr@apache.org
>>>>>>>>>>>>>>>>> :
>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I would like to continue this discussion thread and convert
>>>>> the
>>>>>>>>>>>> outcome
>>>>>>>>>>>>>>>>> into a FLIP such that users and contributors know what to
>>>>>> expect
>>>>>>>>> in
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> upcoming releases.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I created a design document [1] that clarifies our
>>> motivation
>>>>>>>> why
>>>>>>>>>>> we
>>>>>>>>>>>>>>>>> want to do this, how a Maven module structure could look
>>>>> like,
>>>>>>>> and
>>>>>>>>>>> a
>>>>>>>>>>>>>>>>> suggestion for a migration plan.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> It would be great to start with the efforts for the 1.8
>>>>> release
>>>>>>>>>>> such
>>>>>>>>>>>>>>>>> that new features can be developed in Java and major
>>>>>>>> refactorings
>>>>>>>>>>>> such
>>>>>>>>>>>>>>>>> as improvements to the connectors and external catalog
>>>>> support
>>>>>>>> are
>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>> blocked.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Please let me know what you think.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
>>>>>>>>>>>>>>>>> Am 02.07.18 um 17:08 schrieb Fabian Hueske:
>>>>>>>>>>>>>>>>>> Hi Piotr,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> thanks for bumping this thread and thanks for Xingcan for
>>>>> the
>>>>>>>>>>>>>>> comments.
>>>>>>>>>>>>>>>>>> I think the first step would be to separate the flink-table
>>>>>>>>> module
>>>>>>>>>>>>>>> into
>>>>>>>>>>>>>>>>>> multiple sub modules. These could be:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> - flink-table-api: All API facing classes. Can be later
>>>>>> divided
>>>>>>>>>>>>>>> further
>>>>>>>>>>>>>>>>>> into Java/Scala Table API/SQL
>>>>>>>>>>>>>>>>>> - flink-table-planning: involves all planning (basically
>>>>>>>>>>> everything
>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>> do
>>>>>>>>>>>>>>>>>> with Calcite)
>>>>>>>>>>>>>>>>>> - flink-table-runtime: the runtime code
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> IMO, a realistic mid-term goal is to have the runtime
>>> module
>>>>>>>> and
>>>>>>>>>>>>>>>> certain
>>>>>>>>>>>>>>>>>> parts of the planning module ported to Java.
>>>>>>>>>>>>>>>>>> The api module will be much harder to port because of
>>>>> several
>>>>>>>>>>>>>>>>> dependencies
>>>>>>>>>>>>>>>>>> to Scala core classes (the parser framework, tree
>>>>> iterations,
>>>>>>>>>>> etc.).
>>>>>>>>>>>>>>>> I'm
>>>>>>>>>>>>>>>>>> not saying we should not port this to Java, but it is not
>>>>>> clear
>>>>>>>>> to
>>>>>>>>>>>> me
>>>>>>>>>>>>>>>>> (yet)
>>>>>>>>>>>>>>>>>> how to do it.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> I think flink-table-runtime should not be too hard to port.
>>>>>> The
>>>>>>>>>>> code
>>>>>>>>>>>>>>>> does
>>>>>>>>>>>>>>>>>> not make use of many Scala features, i.e., it's writing
>>> very
>>>>>>>>>>>>>>> Java-like.
>>>>>>>>>>>>>>>>>> Also, there are not many dependencies and operators can be
>>>>>>>>>>>>>>> individually
>>>>>>>>>>>>>>>>>> ported step-by-step.
>>>>>>>>>>>>>>>>>> For flink-table-planning, we can have certain packages that
>>>>> we
>>>>>>>>>>> port
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> Java
>>>>>>>>>>>>>>>>>> like planning rules or plan nodes. The related classes
>>>>> mostly
>>>>>>>>>>> extend
>>>>>>>>>>>>>>>>>> Calcite's Java interfaces/classes and would be natural
>>>>> choices
>>>>>>>>> for
>>>>>>>>>>>>>>>> being
>>>>>>>>>>>>>>>>>> ported. The code generation classes will require more
>>> effort
>>>>>> to
>>>>>>>>>>>> port.
>>>>>>>>>>>>>>>>> There
>>>>>>>>>>>>>>>>>> are also some dependencies in planning on the api module
>>>>> that
>>>>>>>> we
>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>> need
>>>>>>>>>>>>>>>>>> to resolve somehow.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> For SQL most work when adding new features is done in the
>>>>>>>>> planning
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>> runtime modules. So, this separation should already reduce
>>>>>>>>>>>>>>>> "technological
>>>>>>>>>>>>>>>>>> dept" quite a lot.
>>>>>>>>>>>>>>>>>> The Table API depends much more on Scala than SQL.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Cheers, Fabian
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xingcanc@gmail.com
>>>>>> :
>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> I also think about this problem these days and here are my
>>>>>>>>>>>> thoughts.
>>>>>>>>>>>>>>>>>>> 1) We must admit that it’s really a tough task to
>>>>>> interoperate
>>>>>>>>>>> with
>>>>>>>>>>>>>>>> Java
>>>>>>>>>>>>>>>>>>> and Scala. E.g., they have different collection types
>>>>> (Scala
>>>>>>>>>>>>>>>> collections
>>>>>>>>>>>>>>>>>>> v.s. java.util.*) and in Java, it's hard to implement a
>>>>>> method
>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>> takes
>>>>>>>>>>>>>>>>>>> Scala functions as parameters. Considering the major part
>>>>> of
>>>>>>>> the
>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>> base
>>>>>>>>>>>>>>>>>>> is implemented in Java, +1 for this goal from a long-term
>>>>>>>> view.
>>>>>>>>>>>>>>>>>>> 2) The ideal solution would be to just expose a Scala API
>>>>> and
>>>>>>>>>>> make
>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> other parts Scala-free. But I am not sure if it could be
>>>>>>>>> achieved
>>>>>>>>>>>>>>> even
>>>>>>>>>>>>>>>>> in a
>>>>>>>>>>>>>>>>>>> long-term. Thus as Timo suggested, keep the Scala codes in
>>>>>>>>>>>>>>>>>>> "flink-table-core" would be a compromise solution.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 3) If the community makes the final decision, maybe any
>>> new
>>>>>>>>>>>> features
>>>>>>>>>>>>>>>>>>> should be added in Java (regardless of the modules), in
>>>>> order
>>>>>>>> to
>>>>>>>>>>>>>>>> prevent
>>>>>>>>>>>>>>>>>>> the Scala codes from growing.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>> Xingcan
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
>>>>>>>>>>>>>>> piotr@data-artisans.com>
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>> Bumping the topic.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> If we want to do this, the sooner we decide, the less
>>> code
>>>>>> we
>>>>>>>>>>> will
>>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>>> to rewrite. I have some objections/counter proposals to
>>>>>>>> Fabian's
>>>>>>>>>>>>>>>>> proposal
>>>>>>>>>>>>>>>>>>> of doing it module wise and one module at a time.
>>>>>>>>>>>>>>>>>>>> First, I do not see a problem of having java/scala code
>>>>> even
>>>>>>>>>>>> within
>>>>>>>>>>>>>>>> one
>>>>>>>>>>>>>>>>>>> module, especially not if there are clean boundaries. Like
>>>>> we
>>>>>>>>>>> could
>>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>>> API in Scala and optimizer rules/logical nodes written in
>>>>>> Java
>>>>>>>>> in
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> same
>>>>>>>>>>>>>>>>>>> module. However I haven’t previously maintained mixed
>>>>>>>> scala/java
>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>> bases
>>>>>>>>>>>>>>>>>>> before, so I might be missing something here.
>>>>>>>>>>>>>>>>>>>> Secondly, this whole migration might and most likely will
>>>>> take
>>>>>>>>>>> longer
>>>>>>>>>>>>>>>> than
>>>>>>>>>>>>>>>>>>> expected, so that creates a problem for a new code that we
>>>>>>>> will
>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>> creating. After making a decision to migrate to Java,
>>>>> almost
>>>>>>>> any
>>>>>>>>>>>> new
>>>>>>>>>>>>>>>>> Scala
>>>>>>>>>>>>>>>>>>> line of code will be immediately a technological debt and
>>>>> we
>>>>>>>>> will
>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>> rewrite it to Java later.
>>>>>>>>>>>>>>>>>>>> Thus I would propose first to state our end goal -
>>> modules
>>>>>>>>>>>>>>> structure
>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>> which parts of modules we want to have eventually
>>>>> Scala-free.
>>>>>>>>>>>>>>> Secondly
>>>>>>>>>>>>>>>>>>> taking all steps necessary that will allow us to write new
>>>>>>>> code
>>>>>>>>>>>>>>>>> compliant
>>>>>>>>>>>>>>>>>>> with our end goal. Only after that we should/could focus
>>> on
>>>>>>>>>>>>>>>>> incrementally
>>>>>>>>>>>>>>>>>>> rewriting the old code. Otherwise we could be
>>> stuck/blocked
>>>>>>>> for
>>>>>>>>>>>>>>> years
>>>>>>>>>>>>>>>>>>> writing new code in Scala (and increasing technological
>>>>>> debt),
>>>>>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>>>> nobody has found time to rewrite some non-important and
>>>>>> not
>>>>>>>>>>>>>>>> actively
>>>>>>>>>>>>>>>>>>> developed part of some module.
>>>>>>>>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <
>>>>> fhueske@gmail.com
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> In general, I think this is a good effort. However, it
>>>>>> won't
>>>>>>>>> be
>>>>>>>>>>>>>>> easy
>>>>>>>>>>>>>>>>>>> and I
>>>>>>>>>>>>>>>>>>>>> think we have to plan this well.
>>>>>>>>>>>>>>>>>>>>> I don't like the idea of having the whole code base
>>>>>>>> fragmented
>>>>>>>>>>>>>>> into
>>>>>>>>>>>>>>>>> Java
>>>>>>>>>>>>>>>>>>>>> and Scala code for too long.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> I think we should do this one step at a time and focus
>>> on
>>>>>>>>>>>>>>> migrating
>>>>>>>>>>>>>>>>> one
>>>>>>>>>>>>>>>>>>>>> module at a time.
>>>>>>>>>>>>>>>>>>>>> IMO, the easiest start would be to port the runtime to
>>>>>> Java.
>>>>>>>>>>>>>>>>>>>>> Extracting the API classes into an own module, porting
>>>>> them
>>>>>>>> to
>>>>>>>>>>>>>>> Java,
>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>> removing the Scala dependency won't be possible without
>>>>>>>>>>> breaking
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> API
>>>>>>>>>>>>>>>>>>>>> since a few classes depend on the Scala Table API.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Best, Fabian
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <
>>>>>>>>> trohrmann@apache.org
>>>>>>>>>>>> :
>>>>>>>>>>>>>>>>>>>>>> I think that is a noble and honorable goal and we
>>> should
>>>>>>>>>>> strive
>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>> it.
>>>>>>>>>>>>>>>>>>>>>> This, however, must be an iterative process given the
>>>>>> sheer
>>>>>>>>>>> size
>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>> code base. I like the approach to define common Java
>>>>>>>> modules
>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>>>>> used
>>>>>>>>>>>>>>>>>>>>>> by more specific Scala modules and slowly moving
>>> classes
>>>>>>>> from
>>>>>>>>>>>>>>> Scala
>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>> Java. Thus +1 for the proposal.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
>>>>>>>>>>>>>>>>>>> piotr@data-artisans.com>
>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> I do not have an experience with how scala and java
>>>>>>>>> interacts
>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>> each
>>>>>>>>>>>>>>>>>>>>>>> other, so I can not fully validate your proposal, but
>>>>>>>>>>> generally
>>>>>>>>>>>>>>>>>>> speaking
>>>>>>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>>>>>>> from me.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Does it also mean, that we should slowly migrate
>>>>>>>>>>>>>>>> `flink-table-core`
>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> Java? How would you envision it? It would be nice to
>>> be
>>>>>>>> able
>>>>>>>>>>> to
>>>>>>>>>>>>>>>> add
>>>>>>>>>>>>>>>>>>> new
>>>>>>>>>>>>>>>>>>>>>>> classes/features written in Java and so that they can
>>>>>>>>> coexist
>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>>>> Scala code until we gradually switch from Scala to
>>>>> Java.
>>>>>>>>>>>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <
>>>>>>>> twalthr@apache.org
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> as you all know, currently the Table & SQL API is
>>>>>>>>>>> implemented
>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>> Scala.
>>>>>>>>>>>>>>>>>>>>>>> This decision was made a long time ago when the
>>>>> initial
>>>>>>>>> code
>>>>>>>>>>>>>>> base
>>>>>>>>>>>>>>>>> was
>>>>>>>>>>>>>>>>>>>>>>> created as part of a master's thesis. The community
>>>>> kept
>>>>>>>>>>> Scala
>>>>>>>>>>>>>>>>>>> because of
>>>>>>>>>>>>>>>>>>>>>>> the nice language features that enable a fluent Table
>>>>> API
>>>>>>>>>>> like
>>>>>>>>>>>>>>>>>>>>>>> table.select('field.trim()) and because Scala allows
>>>>> for
>>>>>>>>>>> quick
>>>>>>>>>>>>>>>>>>>>>> prototyping
>>>>>>>>>>>>>>>>>>>>>>> (e.g. multi-line comments for code generation). The
>>>>>>>>>>> committers
>>>>>>>>>>>>>>>>>>> enforced
>>>>>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>>>>> splitting the code-base into two programming
>>> languages.
>>>>>>>>>>>>>>>>>>>>>>>> However, nowadays the flink-table module more and
>>> more
>>>>>>>>>>> becomes
>>>>>>>>>>>>>>> an
>>>>>>>>>>>>>>>>>>>>>>> important part in the Flink ecosystem. Connectors,
>>>>>>>> formats,
>>>>>>>>>>> and
>>>>>>>>>>>>>>>> SQL
>>>>>>>>>>>>>>>>>>>>>> client
>>>>>>>>>>>>>>>>>>>>>>> are actually implemented in Java but need to
>>>>> interoperate
>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>>>> flink-table
>>>>>>>>>>>>>>>>>>>>>>> which makes these modules dependent on Scala. As
>>>>>> mentioned
>>>>>>>>> in
>>>>>>>>>>>> an
>>>>>>>>>>>>>>>>>>> earlier
>>>>>>>>>>>>>>>>>>>>>>> mail thread, using Scala for API classes also exposes
>>>>>>>> member
>>>>>>>>>>>>>>>>> variables
>>>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>> methods in Java that should not be exposed to users
>>>>> [1].
>>>>>>>>> Java
>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>> still
>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> most important API language and right now we treat it
>>>>> as
>>>>>> a
>>>>>>>>>>>>>>>>>>> second-class
>>>>>>>>>>>>>>>>>>>>>>> citizen. I just noticed that you even need to add
>>> Scala
>>>>>> if
>>>>>>>>>>> you
>>>>>>>>>>>>>>>> just
>>>>>>>>>>>>>>>>>>> want
>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> implement a ScalarFunction because of method clashes
>>>>>>>> between
>>>>>>>>>>>>>>>> `public
>>>>>>>>>>>>>>>>>>>>>> String
>>>>>>>>>>>>>>>>>>>>>>> toString()` and `public scala.Predef.String
>>>>> toString()`.
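For illustration, a minimal sketch of the kind of pure-Java UDF this is about, once the base classes live in a Java-only module (the class below is invented for this example; only the ScalarFunction base class and its reflective eval() contract are Flink's):

    import org.apache.flink.table.functions.ScalarFunction;

    // Invented example UDF: everything here is plain Java, so overriding
    // toString() only ever involves java.lang.String and there is no clash
    // with scala.Predef.String.
    public class TrimUpper extends ScalarFunction {

        // eval() is resolved reflectively by the Table API.
        public String eval(String value) {
            return value == null ? null : value.trim().toUpperCase();
        }

        @Override
        public String toString() {
            return "TrimUpper";
        }
    }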
>>>>>>>>>>>>>>>>>>>>>>>> Given the size of the current code base,
>>>>> reimplementing
>>>>>>>> the
>>>>>>>>>>>>>>>> entire
>>>>>>>>>>>>>>>>>>>>>>> flink-table code in Java is a goal that we might never
>>>>>>>>> reach.
>>>>>>>>>>>>>>>>>>> However, we
>>>>>>>>>>>>>>>>>>>>>>> should at least treat the symptoms and have this as a
>>>>>>>>>>> long-term
>>>>>>>>>>>>>>>> goal
>>>>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>>>> mind. My suggestion would be to convert user-facing
>>> and
>>>>>>>>>>> runtime
>>>>>>>>>>>>>>>>>>> classes
>>>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>> split the code base into multiple modules:
>>>>>>>>>>>>>>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
>>>>>>>>>>>>>>>>>>>>>>>> Implemented in Java. Java users can use this. This
>>>>> would
>>>>>>>>>>>>>>> require
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> convert classes like TableEnvironment, Table.
>>>>>>>>>>>>>>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
>>>>>>>>>>>>>>>>>>>>>>>> Implemented in Scala. Scala users can use this.
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> flink-table-common
>>>>>>>>>>>>>>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs
>>> can
>>>>>>>> use
>>>>>>>>>>>>>>> this.
>>>>>>>>>>>>>>>> It
>>>>>>>>>>>>>>>>>>>>>>> contains interface classes such as descriptors, table
>>>>>>>> sink,
>>>>>>>>>>>>>>> table
>>>>>>>>>>>>>>>>>>> source.
>>>>>>>>>>>>>>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
>>>>>>>>>>>>>>>>>>>>>>> flink-table-runtime}
>>>>>>>>>>>>>>>>>>>>>>>> Implemented in Scala. Contains the current main code
>>>>>>>> base.
>>>>>>>>>>>>>>>>>>>>>>>>> flink-table-runtime
>>>>>>>>>>>>>>>>>>>>>>>> Implemented in Java. This would require to convert
>>>>>>>> classes
>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>>>> o.a.f.table.runtime but would improve the runtime
>>>>>>>>>>> potentially.
>>>>>>>>>>>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
>>>>>>>>>>>>>>>>>>> traits-tp21335.html
>>> 
> 


Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Timo Walther <tw...@apache.org>.
Thanks for the feedback, everyone!

I created a FLIP for these efforts: 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-28%3A+Long-term+goal+of+making+flink-table+Scala-free

I will open an umbrella Jira ticket for FLIP-28 with concrete subtasks 
shortly.

Thanks,
Timo

Am 29.11.18 um 12:44 schrieb Jark Wu:
> Thanks Timo,
>
> That makes sense to me. And I left a comment about code generation in the doc.
>
> Looking forward to participating in it!
>
> Best,
> Jark
>
> On Thu, 29 Nov 2018 at 16:42, Timo Walther <tw...@apache.org> wrote:
>
>> @Kurt: Yes, I don't think that forks of Flink will have a hard time
>> keeping up with the porting. That is also why I called this `long-term
>> goal` because I don't see big resources for the porting to happen
>> quicker. But at least new features, API, and runtime profit from the
>> Scala-to-Java conversion.
>>
>> @Jark: I updated the document:
>>
>> 1. flink-table-common has been renamed to flink-table-spi by request.
>>
>> 2. Yes, good point. flink-sql-client can be moved there as well.
>>
>> 3. I added a paragraph to the document. Porting the code generation to
>> Java only makes sense if acceptable tooling for it is in place.
>>
>>
>> Thanks for the feedback,
>>
>> Timo
>>
>>
>> Am 29.11.18 um 08:28 schrieb Jark Wu:
>>> Hi Timo,
>>>
>>> Thanks for the great work!
>>>
>>> Moving flink-table to Java is a long-awaited thing but will involve much
>>> effort. I agree that we should make it a long-term goal.
>>>
>>> I have read the google doc and +1 for the proposal. Here I have some
>>> questions:
>>>
>>> 1. Where should the flink-table-common module be placed? Will we move the
>>> flink-table-common classes to the new modules?
>>> 2. Should flink-sql-client also be a sub-module under flink-table?
>>> 3. The flink-table-planner contains code generation and will be converted
>>> to Java. Actually, I prefer using Scala for code generation because of the
>>> Multiline-String and String-Interpolation (i.e. s"hello $user") features
>>> in Scala. They make the code-generation code more readable (see the sketch
>>> below). Do we really want to migrate code generation to Java?
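For comparison, a toy sketch of what such a template looks like on the Java side without multiline strings and interpolation (this is not Flink's actual code generator; all names are invented for the example):

    public final class CodeGenSketch {

        // In Scala this would be a single interpolated multiline string; in
        // Java 8 the template has to be assembled with String.format or a
        // templating library.
        static String generateNullCheck(String term, String resultTerm) {
            return String.format(
                "boolean %sIsNull = (%s == null);%n"
                    + "if (!%sIsNull) {%n"
                    + "  %s = %s;%n"
                    + "}",
                term, term, term, resultTerm, term);
        }

        public static void main(String[] args) {
            System.out.println(generateNullCheck("field0", "result0"));
        }
    }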
>>>
>>> Best,
>>> Jark
>>>
>>>
>>> On Wed, 28 Nov 2018 at 09:14, Kurt Young <yk...@gmail.com> wrote:
>>>
>>>> Hi Timo and Vino,
>>>>
>>>> I agree that table is very active and there is no guarantee for not
>>>> producing any conflicts if you decide
>>>> to develop based on the community version. I think this part is a risk
>>>> we can anticipate in the first place. But a massive
>>>> language replacement is something you cannot anticipate and be ready for:
>>>> no feature is added, no refactoring is done, yet simply changing
>>>> from Scala to Java will cause lots of conflicts.
>>>>
>>>> But I also agree that this is a "technical debt" that we should eventually
>>>> pay, and, as you said, we can do this slowly, even one file at a time,
>>>> to give other people more time to resolve the conflicts.
>>>>
>>>> Best,
>>>> Kurt
>>>>
>>>>
>>>> On Tue, Nov 27, 2018 at 8:37 PM Timo Walther <tw...@apache.org>
>> wrote:
>>>>> Hi Kurt,
>>>>>
>>>>> I understand your concerns. However, there is no concrete roadmap for
>>>>> Flink 2.0 and (as Vino said) the flink-table is developed very
>> actively.
>>>>> Major refactorings happened in the past and will also happen with or
>>>>> without Scala migration. A good example is the proper catalog support
>>>>> which will refactor big parts of the TableEnvironment class. Or the
>>>>> introduction of "retractions" which needed a big refactoring of the
>>>>> planning phase. Stability is only guaranteed for the API and the
>> general
>>>>> behavior, however, currently flink-table is not using @Public or
>>>>> @PublicEvolving annotations for a reason.
>>>>>
>>>>> I think the migration will still happen slowly because it needs people
>>>>> that allocate time for that. Therefore, even Flink forks can slowly
>>>>> adapt to the evolving Scala-to-Java code base.
>>>>>
>>>>> Regards,
>>>>> Timo
>>>>>
>>>>>
>>>>> Am 27.11.18 um 13:16 schrieb vino yang:
>>>>>> Hi Kurt,
>>>>>>
>>>>>> Currently, there is still a long time to go before Flink 2.0.
>> Considering
>>>>>> that the flink-table
>>>>>> is one of the most active modules in the current flink project, each
>>>>>> version has
>>>>>> a number of changes and features added. I think that refactoring
>> faster
>>>>>> will reduce subsequent
>>>>>> complexity and workload. And this may be a gradual and long process.
>> We
>>>>>> should be able to
>>>>>>     regard it as a "technical debt", and if we do not change it, it
>>>> will
>>>>>> also affect the decision-making of other issues.
>>>>>>
>>>>>> Thanks, vino.
>>>>>>
>>>>>> Kurt Young <yk...@gmail.com> 于2018年11月27日周二 下午7:34写道:
>>>>>>
>>>>>>> Hi Timo,
>>>>>>>
>>>>>>> Thanks for writing up the document. I'm +1 for reorganizing the
>> module
>>>>>>> structure and make table scala free. But I have
>>>>>>> a little concern about the timing. Is it more appropriate to get
>> this
>>>>> done
>>>>>>> when Flink decides to bump to the next big version, like 2.x.
>>>>>>> It's true you can keep all the class's package path as it is, and
>> will
>>>>> not
>>>>>>> introduce API changes. But if some companies are developing their own
>>>>>>> Flink and sync with the community version by rebasing, they may face a lot of
>>>>>>> conflicts. Although you can avoid conflicts by always moving source
>>>>> code
>>>>>>> between packages, I assume you still need to delete the original
>>>>> scala
>>>>>>> file and add a new java file when you want to change program
>> language.
>>>>>>> Best,
>>>>>>> Kurt
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 27, 2018 at 5:57 PM Timo Walther <tw...@apache.org>
>>>>> wrote:
>>>>>>>> Hi Hequn,
>>>>>>>>
>>>>>>>> thanks for your feedback. Yes, migrating the test cases is another
>>>>> issue
>>>>>>>> that is not represented in the document but should naturally go
>> along
>>>>>>>> with the migration.
>>>>>>>>
>>>>>>>> I agree that we should migrate the main API classes quickly within
>>>> this
>>>>>>>> 1.8 release after the module split has been performed. Help here is
>>>>>>>> highly appreciated!
>>>>>>>>
>>>>>>>> I forgot that Java supports static methods in interfaces now, but
>>>>>>>> actually I don't like the design of calling
>>>>> `TableEnvironment.get(env)`.
>>>>>>>> Because people often use `TableEnvironment tEnv =
>>>>>>>> TableEnvironment.get(env)` and then wonder why there is no
>>>>>>>> `toAppendStream` or `toDataSet` because they are using the base
>>>> class.
>>>>>>>> However, things like that can be discussed in the corresponding
>> issue
>>>>>>>> when it comes to implementation.
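To make the concern concrete, a rough sketch of the pattern under discussion (all names are placeholders, not Flink's actual API): a static factory on the base interface returns the base type, so stream-specific methods become invisible to the user.

    // Placeholder types for illustration only.
    interface TableEnvironmentSketch {

        void registerTable(String name);

        // Java 8 allows static methods in interfaces, so a factory can live on
        // the base interface, but it returns the base type.
        static TableEnvironmentSketch get(Object executionEnvironment) {
            return new StreamTableEnvironmentSketch();
        }
    }

    class StreamTableEnvironmentSketch implements TableEnvironmentSketch {

        @Override
        public void registerTable(String name) { /* ... */ }

        // Stream-specific method that is not part of the base interface.
        public void toAppendStream(String tableName) { /* ... */ }
    }

    class FactoryPitfallSketch {
        void example() {
            TableEnvironmentSketch tEnv = TableEnvironmentSketch.get(new Object());
            // tEnv.toAppendStream("t");  // does not compile: the variable has the
            // base type, which is exactly the confusion described above
        }
    }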
>>>>>>>>
>>>>>>>> @Vino: I think your work fits nicely to these efforts.
>>>>>>>>
>>>>>>>> @everyone: I will wait for more feedback until end of this week.
>>>> Then I
>>>>>>>> will convert the design document into a FLIP and open subtasks in
>>>> Jira,
>>>>>>>> if there are no objections?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Timo
>>>>>>>>
>>>>>>>> Am 24.11.18 um 13:45 schrieb vino yang:
>>>>>>>>> Hi hequn,
>>>>>>>>>
>>>>>>>>> I am very glad to hear that you are interested in this work.
>>>>>>>>> As we all know, this process involves a lot.
>>>>>>>>> Currently, the migration work has begun. I started with the
>>>>>>>>> Kafka connector's dependency on flink-table and moved the
>>>>>>>>> related dependencies to flink-table-common.
>>>>>>>>> This work is tracked by FLINK-9461.  [1]
>>>>>>>>> I don't know if it will conflict with what you expect to do, but
>>>> from
>>>>>>> the
>>>>>>>>> impact I have observed,
>>>>>>>>> it will involve many classes that are currently in flink-table.
>>>>>>>>>
>>>>>>>>> *Just a statement to prevent unnecessary conflicts.*
>>>>>>>>>
>>>>>>>>> Thanks, vino.
>>>>>>>>>
>>>>>>>>> [1]: https://issues.apache.org/jira/browse/FLINK-9461
>>>>>>>>>
>>>>>>>>> Hequn Cheng <ch...@gmail.com> 于2018年11月24日周六 下午7:20写道:
>>>>>>>>>
>>>>>>>>>> Hi Timo,
>>>>>>>>>>
>>>>>>>>>> Thanks for the effort and writing up this document. I like the
>> idea
>>>>> to
>>>>>>>> make
>>>>>>>>>> flink-table scala free, so +1 for the proposal!
>>>>>>>>>>
>>>>>>>>>> It's good to make Java the first-class citizen. For a long time,
>> we
>>>>>>> have
>>>>>>>>>> neglected Java, so that many features in Table are missing Java
>>>> Test
>>>>>>>>>> cases, such as this one[1] I found recently. And I think we may
>>>> also
>>>>>>>> need
>>>>>>>>>> to migrate our test cases, i.e., add Java tests.
>>>>>>>>>>
>>>>>>>>>> This definitely is a big change and will break API compatibility. In
>>>>>>> order
>>>>>>>> to
>>>>>>>>>> bring a smaller impact on users, I think we should go fast when we
>>>>>>>> migrate
>>>>>>>>>> APIs targeted to users. It's better to introduce the user
>> sensitive
>>>>>>>> changes
>>>>>>>>>> within a release. However, it may not be that easy. I can help to
>>>>>>>>>> contribute.
>>>>>>>>>>
>>>>>>>>>> Separation of interface and implementation is a good idea. This
>> may
>>>>>>>>>> introduce a minimum of dependencies or even no dependencies. I saw
>>>>>>> your
>>>>>>>>>> reply in the google doc. Java8 has already supported static method
>>>>> for
>>>>>>>>>> interfaces, I think we can make use of it?
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Hequn
>>>>>>>>>>
>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-11001
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Nov 23, 2018 at 5:36 PM Timo Walther <tw...@apache.org>
>>>>>>>> wrote:
>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>
>>>>>>>>>>> thanks for the great feedback so far. I updated the document with
>>>>> the
>>>>>>>>>>> input I got so far
>>>>>>>>>>>
>>>>>>>>>>> @Fabian: I moved the porting of flink-table-runtime classes up in
>>>>> the
>>>>>>>>>> list.
>>>>>>>>>>> @Xiaowei: Could you elaborate what "interface only" means to you?
>>>> Do
>>>>>>>> you
>>>>>>>>>>> mean a module containing pure Java `interface`s? Or is the
>>>>> validation
>>>>>>>>>>> logic also part of the API module? Are 50+ expression classes
>> part
>>>>> of
>>>>>>>>>>> the API interface or already too implementation-specific?
>>>>>>>>>>>
>>>>>>>>>>> @Xuefu: I extended the document by almost a page to clarify when
>>>> we
>>>>>>>>>>> should develop in Scala and when in Java. As Piotr said, every
>> new
>>>>>>>> Scala
>>>>>>>>>>> line is instant technical debt.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Timo
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Am 23.11.18 um 10:29 schrieb Piotr Nowojski:
>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for writing this down +1 from my side :)
>>>>>>>>>>>>
>>>>>>>>>>>> I'm wondering whether we can have a rule in the interim when
>>>>>>> Java
>>>>>>>>>>> and Scala coexist that dependency can only be one-way. I found
>>>> that
>>>>>>> in
>>>>>>>>>> the
>>>>>>>>>>> current code base there are cases where a Scala class extends
>> Java
>>>>>>> and
>>>>>>>> vice
>>>>>>>>>>> versa. This is quite painful. I'm thinking if we could say that
>>>>>>>> extension
>>>>>>>>>>> can only be from Java to Scala, which will help the situation.
>>>>>>> However,
>>>>>>>>>> I'm
>>>>>>>>>>> not sure if this is practical.
>>>>>>>>>>>> Xuefu: I’m also not sure what’s the best approach here, probably
>>>> we
>>>>>>>>>> will
>>>>>>>>>>> have to work it out as we go. One thing to consider is that from
>>>> now
>>>>>>>> on,
>>>>>>>>>>> every single new code line written in Scala anywhere in
>>>> Flink-table
>>>>>>>>>> (except
>>>>>>>>>>> for flink-table-api-scala) is an instant technological debt. From
>>>>> this
>>>>>>>>>>> perspective I would be in favour of tolerating quite big
>>>>>> inconveniences
>>>>>>>>>>> just to avoid any new Scala code.
>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>
>>>>>>>>>>>>> On 23 Nov 2018, at 03:25, Zhang, Xuefu <
>> xuefu.z@alibaba-inc.com
>>>>>>>>>> wrote:
>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the effort and the Google writeup. During our
>>>> external
>>>>>>>>>>> catalog rework, we found much confusion between Java and Scala,
>>>> and
>>>>>>>> this
>>>>>>>>>>> Scala-free roadmap should greatly mitigate that.
>>>>>>>>>>>>> I'm wondering whether we can have a rule in the interim when
>>>>>>> Java
>>>>>>>>>>> and Scala coexist that dependency can only be one-way. I found
>>>> that
>>>>>>> in
>>>>>>>>>> the
>>>>>>>>>>> current code base there are cases where a Scala class extends
>> Java
>>>>>>> and
>>>>>>>> vice
>>>>>>>>>>> versa. This is quite painful. I'm thinking if we could say that
>>>>>>>> extension
>>>>>>>>>>> can only be from Java to Scala, which will help the situation.
>>>>>>> However,
>>>>>>>>>> I'm
>>>>>>>>>>> not sure if this is practical.
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Xuefu
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>> ------------------------------------------------------------------
>>>>>>>>>>>>> Sender:jincheng sun <su...@gmail.com>
>>>>>>>>>>>>> Sent at:2018 Nov 23 (Fri) 09:49
>>>>>>>>>>>>> Recipient:dev <de...@flink.apache.org>
>>>>>>>>>>>>> Subject:Re: [DISCUSS] Long-term goal of making flink-table
>>>>>>> Scala-free
>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>> Thanks for initiating this great discussion.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Currently, using the SQL/Table API requires including many
>>>> dependencies.
>>>>>>> In
>>>>>>>>>>>>> particular, it is not necessary to introduce the specific
>>>>>>>>>> implementation
>>>>>>>>>>>>> dependencies which users do not care about. So I am glad to see
>>>>>>> your
>>>>>>>>>>>>> proposal, and hope we consider splitting the API interface
>>>>>>> into
>>>>>>>> a
>>>>>>>>>>>>> separate module, so that users can introduce a minimum of
>>>>>>>>>> dependencies.
>>>>>>>>>>>>> So, +1 to [separation of interface and implementation; e.g.
>>>>>>> `Table` &
>>>>>>>>>>>>> `TableImpl`] which you mentioned in the google doc.
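As a rough illustration of that separation (names and methods below are invented; the actual Table/TableImpl split was only a proposal in the design document at this point):

    // The API module would expose only the interface ...
    public interface TableSketch {
        TableSketch select(String fields);
        TableSketch filter(String predicate);
    }

    // ... while the planner module provides the implementation, which users
    // never need on their compile-time classpath.
    class TableImplSketch implements TableSketch {

        @Override
        public TableSketch select(String fields) {
            return this; // planning logic would live here
        }

        @Override
        public TableSketch filter(String predicate) {
            return this;
        }
    }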
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Jincheng
>>>>>>>>>>>>>
>>>>>>>>>>>>> Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Timo, thanks for driving this! I think that this is a nice
>>>>>>> thing
>>>>>>>>>> to
>>>>>>>>>>> do.
>>>>>>>>>>>>>> While we are doing this, can we also keep in mind that we want
>>>> to
>>>>>>>>>>>>>> eventually have a TableAPI interface only module which users
>>>> can
>>>>>>>> take
>>>>>>>>>>>>>> dependency on, but without including any implementation
>>>> details?
>>>>>>>>>>>>>> Xiaowei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <
>>>> fhueske@gmail.com
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for writing up this document.
>>>>>>>>>>>>>>> I like the new structure and agree to prioritize the porting
>>>> of
>>>>>>> the
>>>>>>>>>>>>>>> flink-table-common classes.
>>>>>>>>>>>>>>> Since flink-table-runtime is (or should be) independent of
>> the
>>>>>>> API
>>>>>>>>>> and
>>>>>>>>>>>>>>> planner modules, we could start porting these classes once
>> the
>>>>>>> code
>>>>>>>>>> is
>>>>>>>>>>>>>>> split into the new module structure.
>>>>>>>>>>>>>>> The benefits of a Scala-free flink-table-runtime would be a
>>>>>>>>>> Scala-free
>>>>>>>>>>>>>>> execution Jar.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best, Fabian
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
>>>>>>>>>>>>>>> twalthr@apache.org
>>>>>>>>>>>>>>>> :
>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I would like to continue this discussion thread and convert
>>>> the
>>>>>>>>>>> outcome
>>>>>>>>>>>>>>>> into a FLIP such that users and contributors know what to
>>>>> expect
>>>>>>>> in
>>>>>>>>>>> the
>>>>>>>>>>>>>>>> upcoming releases.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I created a design document [1] that clarifies our
>> motivation
>>>>>>> why
>>>>>>>>>> we
>>>>>>>>>>>>>>>> want to do this, how a Maven module structure could look
>>>> like,
>>>>>>> and
>>>>>>>>>> a
>>>>>>>>>>>>>>>> suggestion for a migration plan.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It would be great to start with the efforts for the 1.8
>>>> release
>>>>>>>>>> such
>>>>>>>>>>>>>>>> that new features can be developed in Java and major
>>>>>>> refactorings
>>>>>>>>>>> such
>>>>>>>>>>>>>>>> as improvements to the connectors and external catalog
>>>> support
>>>>>>> are
>>>>>>>>>>> not
>>>>>>>>>>>>>>>> blocked.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please let me know what you think.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
>>>>>>>>>>>>>>>> Am 02.07.18 um 17:08 schrieb Fabian Hueske:
>>>>>>>>>>>>>>>>> Hi Piotr,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> thanks for bumping this thread and thanks for Xingcan for
>>>> the
>>>>>>>>>>>>>> comments.
>>>>>>>>>>>>>>>>> I think the first step would be to separate the flink-table
>>>>>>>> module
>>>>>>>>>>>>>> into
>>>>>>>>>>>>>>>>> multiple sub modules. These could be:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - flink-table-api: All API facing classes. Can be later
>>>>> divided
>>>>>>>>>>>>>> further
>>>>>>>>>>>>>>>>> into Java/Scala Table API/SQL
>>>>>>>>>>>>>>>>> - flink-table-planning: involves all planning (basically
>>>>>>>>>> everything
>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>> do
>>>>>>>>>>>>>>>>> with Calcite)
>>>>>>>>>>>>>>>>> - flink-table-runtime: the runtime code
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> IMO, a realistic mid-term goal is to have the runtime
>> module
>>>>>>> and
>>>>>>>>>>>>>>> certain
>>>>>>>>>>>>>>>>> parts of the planning module ported to Java.
>>>>>>>>>>>>>>>>> The api module will be much harder to port because of
>>>> several
>>>>>>>>>>>>>>>> dependencies
>>>>>>>>>>>>>>>>> to Scala core classes (the parser framework, tree
>>>> iterations,
>>>>>>>>>> etc.).
>>>>>>>>>>>>>>> I'm
>>>>>>>>>>>>>>>>> not saying we should not port this to Java, but it is not
>>>>> clear
>>>>>>>> to
>>>>>>>>>>> me
>>>>>>>>>>>>>>>> (yet)
>>>>>>>>>>>>>>>>> how to do it.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I think flink-table-runtime should not be too hard to port.
>>>>> The
>>>>>>>>>> code
>>>>>>>>>>>>>>> does
>>>>>>>>>>>>>>>>> not make use of many Scala features, i.e., it's writing
>> very
>>>>>>>>>>>>>> Java-like.
>>>>>>>>>>>>>>>>> Also, there are not many dependencies and operators can be
>>>>>>>>>>>>>> individually
>>>>>>>>>>>>>>>>> ported step-by-step.
>>>>>>>>>>>>>>>>> For flink-table-planning, we can have certain packages that
>>>> we
>>>>>>>>>> port
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>> Java
>>>>>>>>>>>>>>>>> like planning rules or plan nodes. The related classes
>>>> mostly
>>>>>>>>>> extend
>>>>>>>>>>>>>>>>> Calcite's Java interfaces/classes and would be natural
>>>> choices
>>>>>>>> for
>>>>>>>>>>>>>>> being
>>>>>>>>>>>>>>>>> ported. The code generation classes will require more
>> effort
>>>>> to
>>>>>>>>>>> port.
>>>>>>>>>>>>>>>> There
>>>>>>>>>>>>>>>>> are also some dependencies in planning on the api module
>>>> that
>>>>>>> we
>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>> need
>>>>>>>>>>>>>>>>> to resolve somehow.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For SQL most work when adding new features is done in the
>>>>>>>> planning
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>> runtime modules. So, this separation should already reduce
>>>>>>>>>>>>>>> "technological
>>>>>>>>>>>>>>>>> debt" quite a lot.
>>>>>>>>>>>>>>>>> The Table API depends much more on Scala than SQL.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers, Fabian
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xingcanc@gmail.com
>>>>> :
>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I also think about this problem these days and here are my
>>>>>>>>>>> thoughts.
>>>>>>>>>>>>>>>>>> 1) We must admit that it’s really a tough task to
>>>>> interoperate
>>>>>>>>>> with
>>>>>>>>>>>>>>> Java
>>>>>>>>>>>>>>>>>> and Scala. E.g., they have different collection types
>>>> (Scala
>>>>>>>>>>>>>>> collections
>>>>>>>>>>>>>>>>>> v.s. java.util.*) and in Java, it's hard to implement a
>>>>> method
>>>>>>>>>>> which
>>>>>>>>>>>>>>>> takes
>>>>>>>>>>>>>>>>>> Scala functions as parameters. Considering the major part
>>>> of
>>>>>>> the
>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>> base
>>>>>>>>>>>>>>>>>> is implemented in Java, +1 for this goal from a long-term
>>>>>>> view.
>>>>>>>>>>>>>>>>>> 2) The ideal solution would be to just expose a Scala API
>>>> and
>>>>>>>>>> make
>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> other parts Scala-free. But I am not sure if it could be
>>>>>>>> achieved
>>>>>>>>>>>>>> even
>>>>>>>>>>>>>>>> in a
>>>>>>>>>>>>>>>>>> long-term. Thus as Timo suggested, keep the Scala codes in
>>>>>>>>>>>>>>>>>> "flink-table-core" would be a compromise solution.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 3) If the community makes the final decision, maybe any
>> new
>>>>>>>>>>> features
>>>>>>>>>>>>>>>>>> should be added in Java (regardless of the modules), in
>>>> order
>>>>>>> to
>>>>>>>>>>>>>>> prevent
>>>>>>>>>>>>>>>>>> the Scala codes from growing.
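To make point 1) above concrete, a small sketch of the friction when a Scala API expects a Scala function as a parameter and is called from Java (the helper names are invented; scala.Function1 and scala.runtime.AbstractFunction1 are real Scala classes and scala-library must be on the classpath):

    import scala.Function1;
    import scala.runtime.AbstractFunction1;

    public class ScalaInteropSketch {

        // Stand-in for a Scala API that takes a Scala function (invented).
        static String applyOnValue(String value, Function1<String, String> fn) {
            return fn.apply(value);
        }

        public static void main(String[] args) {
            // From Java this needs an anonymous subclass of AbstractFunction1
            // rather than a plain lambda (which only works cleanly against
            // Scala 2.12+ builds).
            Function1<String, String> trim = new AbstractFunction1<String, String>() {
                @Override
                public String apply(String s) {
                    return s.trim();
                }
            };
            System.out.println(applyOnValue("  hello  ", trim));
        }
    }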
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>> Xingcan
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
>>>>>>>>>>>>>> piotr@data-artisans.com>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>> Bumping the topic.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> If we want to do this, the sooner we decide, the less
>> code
>>>>> we
>>>>>>>>>> will
>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>> to rewrite. I have some objections/counter proposals to
>>>>>>> Fabian's
>>>>>>>>>>>>>>>> proposal
>>>>>>>>>>>>>>>>>> of doing it module wise and one module at a time.
>>>>>>>>>>>>>>>>>>> First, I do not see a problem of having java/scala code
>>>> even
>>>>>>>>>>> within
>>>>>>>>>>>>>>> one
>>>>>>>>>>>>>>>>>> module, especially not if there are clean boundaries. Like
>>>> we
>>>>>>>>>> could
>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>> API in Scala and optimizer rules/logical nodes written in
>>>>> Java
>>>>>>>> in
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> same
>>>>>>>>>>>>>>>>>> module. However I haven’t previously maintained mixed
>>>>>>> scala/java
>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>> bases
>>>>>>>>>>>>>>>>>> before, so I might be missing something here.
>>>>>>>>>>>>>>>>>>> Secondly, this whole migration might and most likely will
>>>> take
>>>>>>>>>> longer
>>>>>>>>>>>>>>> than
>>>>>>>>>>>>>>>>>> expected, so that creates a problem for a new code that we
>>>>>>> will
>>>>>>>>>> be
>>>>>>>>>>>>>>>>>> creating. After making a decision to migrate to Java,
>>>> almost
>>>>>>> any
>>>>>>>>>>> new
>>>>>>>>>>>>>>>> Scala
>>>>>>>>>>>>>>>>>> line of code will be immediately a technological debt and
>>>> we
>>>>>>>> will
>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> rewrite it to Java later.
>>>>>>>>>>>>>>>>>>> Thus I would propose first to state our end goal -
>> modules
>>>>>>>>>>>>>> structure
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>> which parts of modules we want to have eventually
>>>> Scala-free.
>>>>>>>>>>>>>> Secondly
>>>>>>>>>>>>>>>>>> taking all steps necessary that will allow us to write new
>>>>>>> code
>>>>>>>>>>>>>>>> compliant
>>>>>>>>>>>>>>>>>> with our end goal. Only after that we should/could focus
>> on
>>>>>>>>>>>>>>>> incrementally
>>>>>>>>>>>>>>>>>> rewriting the old code. Otherwise we could be
>> stuck/blocked
>>>>>>> for
>>>>>>>>>>>>>> years
>>>>>>>>>>>>>>>>>> writing new code in Scala (and increasing technological
>>>>> debt),
>>>>>>>>>>>>>> because
>>>>>>>>>>>>>>>> nobody has found time to rewrite some unimportant and
>>>>> not
>>>>>>>>>>>>>>> actively
>>>>>>>>>>>>>>>>>> developed part of some module.
>>>>>>>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <
>>>> fhueske@gmail.com
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> In general, I think this is a good effort. However, it
>>>>> won't
>>>>>>>> be
>>>>>>>>>>>>>> easy
>>>>>>>>>>>>>>>>>> and I
>>>>>>>>>>>>>>>>>>>> think we have to plan this well.
>>>>>>>>>>>>>>>>>>>> I don't like the idea of having the whole code base
>>>>>>> fragmented
>>>>>>>>>>>>>> into
>>>>>>>>>>>>>>>> Java
>>>>>>>>>>>>>>>>>>>> and Scala code for too long.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I think we should do this one step at a time and focus
>> on
>>>>>>>>>>>>>> migrating
>>>>>>>>>>>>>>>> one
>>>>>>>>>>>>>>>>>>>> module at a time.
>>>>>>>>>>>>>>>>>>>> IMO, the easiest start would be to port the runtime to
>>>>> Java.
>>>>>>>>>>>>>>>>>>>> Extracting the API classes into an own module, porting
>>>> them
>>>>>>> to
>>>>>>>>>>>>>> Java,
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>> removing the Scala dependency won't be possible without
>>>>>>>>>> breaking
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> API
>>>>>>>>>>>>>>>>>>>> since a few classes depend on the Scala Table API.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best, Fabian
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <
>>>>>>>> trohrmann@apache.org
>>>>>>>>>>> :
>>>>>>>>>>>>>>>>>>>>> I think that is a noble and honorable goal and we
>> should
>>>>>>>>>> strive
>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>> it.
>>>>>>>>>>>>>>>>>>>>> This, however, must be an iterative process given the
>>>>> sheer
>>>>>>>>>> size
>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> code base. I like the approach to define common Java
>>>>>>> modules
>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>>>> used
>>>>>>>>>>>>>>>>>>>>> by more specific Scala modules and slowly moving
>> classes
>>>>>>> from
>>>>>>>>>>>>>> Scala
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>> Java. Thus +1 for the proposal.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
>>>>>>>>>>>>>>>>>> piotr@data-artisans.com>
>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I do not have an experience with how scala and java
>>>>>>>> interacts
>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>> each
>>>>>>>>>>>>>>>>>>>>>> other, so I can not fully validate your proposal, but
>>>>>>>>>> generally
>>>>>>>>>>>>>>>>>> speaking
>>>>>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>>>>>> from me.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Does it also mean, that we should slowly migrate
>>>>>>>>>>>>>>> `flink-table-core`
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>> Java? How would you envision it? It would be nice to
>> be
>>>>>>> able
>>>>>>>>>> to
>>>>>>>>>>>>>>> add
>>>>>>>>>>>>>>>>>> new
>>>>>>>>>>>>>>>>>>>>>> classes/features written in Java and so that they can
>>>>>>>> coexist
>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>>> Scala code until we gradually switch from Scala to
>>>> Java.
>>>>>>>>>>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <
>>>>>>> twalthr@apache.org
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> as you all know, currently the Table & SQL API is
>>>>>>>>>> implemented
>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>> Scala.
>>>>>>>>>>>>>>>>>>>>>> This decision was made a long time ago when the
>>>> initial
>>>>>>>> code
>>>>>>>>>>>>>> base
>>>>>>>>>>>>>>>> was
>>>>>>>>>>>>>>>>>>>>>> created as part of a master's thesis. The community
>>>> kept
>>>>>>>>>> Scala
>>>>>>>>>>>>>>>>>> because of
>>>>>>>>>>>>>>>>>>>>>> the nice language features that enable a fluent Table
>>>> API
>>>>>>>>>> like
>>>>>>>>>>>>>>>>>>>>>> table.select('field.trim()) and because Scala allows
>>>> for
>>>>>>>>>> quick
>>>>>>>>>>>>>>>>>>>>> prototyping
>>>>>>>>>>>>>>>>>>>>>> (e.g. multi-line comments for code generation). The
>>>>>>>>>> committers
>>>>>>>>>>>>>>>>>> enforced
>>>>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>>>> splitting the code-base into two programming
>> languages.
>>>>>>>>>>>>>>>>>>>>>>> However, nowadays the flink-table module more and
>> more
>>>>>>>>>> becomes
>>>>>>>>>>>>>> an
>>>>>>>>>>>>>>>>>>>>>> important part in the Flink ecosystem. Connectors,
>>>>>>> formats,
>>>>>>>>>> and
>>>>>>>>>>>>>>> SQL
>>>>>>>>>>>>>>>>>>>>> client
>>>>>>>>>>>>>>>>>>>>>> are actually implemented in Java but need to
>>>> interoperate
>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>>> flink-table
>>>>>>>>>>>>>>>>>>>>>> which makes these modules dependent on Scala. As
>>>>> mentioned
>>>>>>>> in
>>>>>>>>>>> an
>>>>>>>>>>>>>>>>>> earlier
>>>>>>>>>>>>>>>>>>>>>> mail thread, using Scala for API classes also exposes
>>>>>>> member
>>>>>>>>>>>>>>>> variables
>>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>> methods in Java that should not be exposed to users
>>>> [1].
>>>>>>>> Java
>>>>>>>>>>> is
>>>>>>>>>>>>>>>> still
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>> most important API language and right now we treat it
>>>> as
>>>>> a
>>>>>>>>>>>>>>>>>> second-class
>>>>>>>>>>>>>>>>>>>>>> citizen. I just noticed that you even need to add
>> Scala
>>>>> if
>>>>>>>>>> you
>>>>>>>>>>>>>>> just
>>>>>>>>>>>>>>>>>> want
>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>> implement a ScalarFunction because of method clashes
>>>>>>> between
>>>>>>>>>>>>>>> `public
>>>>>>>>>>>>>>>>>>>>> String
>>>>>>>>>>>>>>>>>>>>>> toString()` and `public scala.Predef.String
>>>> toString()`.
>>>>>>>>>>>>>>>>>>>>>>> Given the size of the current code base,
>>>> reimplementing
>>>>>>> the
>>>>>>>>>>>>>>> entire
>>>>>>>>>>>>>>>>>>>>>> flink-table code in Java is a goal that we might never
>>>>>>>> reach.
>>>>>>>>>>>>>>>>>> However, we
>>>>>>>>>>>>>>>>>>>>>> should at least treat the symptoms and have this as a
>>>>>>>>>> long-term
>>>>>>>>>>>>>>> goal
>>>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>>> mind. My suggestion would be to convert user-facing
>> and
>>>>>>>>>> runtime
>>>>>>>>>>>>>>>>>> classes
>>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>> split the code base into multiple modules:
>>>>>>>>>>>>>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
>>>>>>>>>>>>>>>>>>>>>>> Implemented in Java. Java users can use this. This
>>>> would
>>>>>>>>>>>>>> require
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>> convert classes like TableEnvironment, Table.
>>>>>>>>>>>>>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
>>>>>>>>>>>>>>>>>>>>>>> Implemented in Scala. Scala users can use this.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> flink-table-common
>>>>>>>>>>>>>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs
>> can
>>>>>>> use
>>>>>>>>>>>>>> this.
>>>>>>>>>>>>>>> It
>>>>>>>>>>>>>>>>>>>>>> contains interface classes such as descriptors, table
>>>>>>> sink,
>>>>>>>>>>>>>> table
>>>>>>>>>>>>>>>>>> source.
>>>>>>>>>>>>>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
>>>>>>>>>>>>>>>>>>>>>> flink-table-runtime}
>>>>>>>>>>>>>>>>>>>>>>> Implemented in Scala. Contains the current main code
>>>>>>> base.
>>>>>>>>>>>>>>>>>>>>>>>> flink-table-runtime
>>>>>>>>>>>>>>>>>>>>>>> Implemented in Java. This would require to convert
>>>>>>> classes
>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>>> o.a.f.table.runtime but would improve the runtime
>>>>>>>>>> potentially.
>>>>>>>>>>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
>>>>>>>>>>>>>>>>>> traits-tp21335.html
>>


Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Jark Wu <im...@gmail.com>.
Thanks Timo,

That makes sense to me. And I left a comment about code generation in the doc.

Looking forward to participating in it!

Best,
Jark

On Thu, 29 Nov 2018 at 16:42, Timo Walther <tw...@apache.org> wrote:

> @Kurt: Yes, I don't think that forks of Flink will have a hard time
> keeping up with the porting. That is also why I called this `long-term
> goal` because I don't see big resources for the porting to happen
> quicker. But at least new features, API, and runtime profit from the
> Scala-to-Java conversion.
>
> @Jark: I updated the document:
>
> 1. flink-table-common has been renamed to flink-table-spi by request.
>
> 2. Yes, good point. flink-sql-client can be moved there as well.
>
> 3. I added a paragraph to the document. Porting the code generation to
> Java only makes sense if acceptable tooling for it is in place.
>
>
> Thanks for the feedback,
>
> Timo
>
>
> Am 29.11.18 um 08:28 schrieb Jark Wu:
> > Hi Timo,
> >
> > Thanks for the great work!
> >
> > Moving flink-table to Java is a long-awaited thing but will involve much
> > effort. I agree that we should make it a long-term goal.
> >
> > I have read the google doc and +1 for the proposal. Here I have some
> > questions:
> >
> > 1. Where should the flink-table-common module be placed? Will we move the
> > flink-table-common classes to the new modules?
> > 2. Should flink-sql-client also be a sub-module under flink-table?
> > 3. The flink-table-planner contains code generation and will be converted
> > to Java. Actually, I prefer using Scala for code generation because of the
> > Multiline-String and String-Interpolation (i.e. s"hello $user") features
> > in Scala. They make the code-generation code more readable. Do we really
> > want to migrate code generation to Java?
> >
> > Best,
> > Jark
> >
> >
> > On Wed, 28 Nov 2018 at 09:14, Kurt Young <yk...@gmail.com> wrote:
> >
> >> Hi Timo and Vino,
> >>
> >> I agree that table is very active and there is no guarantee for not
> >> producing any conflicts if you decide
> >> to develop based on the community version. I think this part is a risk
> >> we can anticipate in the first place. But a massive
> >> language replacement is something you cannot anticipate and be ready for:
> >> no feature is added, no refactoring is done, yet simply changing
> >> from Scala to Java will cause lots of conflicts.
> >>
> >> But I also agree that this is a "technical debt" that we should eventually
> >> pay, and, as you said, we can do this slowly, even one file at a time,
> >> to give other people more time to resolve the conflicts.
> >>
> >> Best,
> >> Kurt
> >>
> >>
> >> On Tue, Nov 27, 2018 at 8:37 PM Timo Walther <tw...@apache.org>
> wrote:
> >>
> >>> Hi Kurt,
> >>>
> >>> I understand your concerns. However, there is no concrete roadmap for
> >>> Flink 2.0 and (as Vino said) the flink-table is developed very
> actively.
> >>> Major refactorings happened in the past and will also happen with or
> >>> without Scala migration. A good example is the proper catalog support
> >>> which will refactor big parts of the TableEnvironment class. Or the
> >>> introduction of "retractions" which needed a big refactoring of the
> >>> planning phase. Stability is only guaranteed for the API and the
> general
> >>> behavior, however, currently flink-table is not using @Public or
> >>> @PublicEvolving annotations for a reason.
> >>>
> >>> I think the migration will still happen slowly because it needs people
> >>> that allocate time for that. Therefore, even Flink forks can slowly
> >>> adapt to the evolving Scala-to-Java code base.
> >>>
> >>> Regards,
> >>> Timo
> >>>
> >>>
> >>> Am 27.11.18 um 13:16 schrieb vino yang:
> >>>> Hi Kurt,
> >>>>
> >>>> Currently, there is still a long time to go before Flink 2.0.
> Considering
> >>>> that the flink-table
> >>>> is one of the most active modules in the current flink project, each
> >>>> version has
> >>>> a number of changes and features added. I think that refactoring
> faster
> >>>> will reduce subsequent
> >>>> complexity and workload. And this may be a gradual and long process.
> We
> >>>> should be able to
> >>>>    regard it as a "technical debt", and if we do not change it, it
> >> will
> >>>> also affect the decision-making of other issues.
> >>>>
> >>>> Thanks, vino.
> >>>>
> >>>> Kurt Young <yk...@gmail.com> 于2018年11月27日周二 下午7:34写道:
> >>>>
> >>>>> Hi Timo,
> >>>>>
> >>>>> Thanks for writing up the document. I'm +1 for reorganizing the
> module
> >>>>> structure and make table scala free. But I have
> >>>>> a little concern about the timing. Is it more appropriate to get
> this
> >>> done
> >>>>> when Flink decides to bump to the next big version, like 2.x.
> >>>>> It's true you can keep all the class's package path as it is, and
> will
> >>> not
> >>>>> introduce API changes. But if some companies are developing their own
> >>>>> Flink and sync with the community version by rebasing, they may face a lot of
> >>>>> conflicts. Although you can avoid conflicts by always moving source
> >>> code
> >>>>> between packages, I assume you still need to delete the original
> >>> scala
> >>>>> file and add a new java file when you want to change program
> language.
> >>>>>
> >>>>> Best,
> >>>>> Kurt
> >>>>>
> >>>>>
> >>>>> On Tue, Nov 27, 2018 at 5:57 PM Timo Walther <tw...@apache.org>
> >>> wrote:
> >>>>>> Hi Hequn,
> >>>>>>
> >>>>>> thanks for your feedback. Yes, migrating the test cases is another
> >>> issue
> >>>>>> that is not represented in the document but should naturally go
> along
> >>>>>> with the migration.
> >>>>>>
> >>>>>> I agree that we should migrate the main API classes quickly within
> >> this
> >>>>>> 1.8 release after the module split has been performed. Help here is
> >>>>>> highly appreciated!
> >>>>>>
> >>>>>> I forgot that Java supports static methods in interfaces now, but
> >>>>>> actually I don't like the design of calling
> >>> `TableEnvironment.get(env)`.
> >>>>>> Because people often use `TableEnvironment tEnv =
> >>>>>> TableEnvironment.get(env)` and then wonder why there is no
> >>>>>> `toAppendStream` or `toDataSet` because they are using the base
> >> class.
> >>>>>> However, things like that can be discussed in the corresponding
> issue
> >>>>>> when it comes to implementation.
> >>>>>>
> >>>>>> @Vino: I think your work fits nicely to these efforts.
> >>>>>>
> >>>>>> @everyone: I will wait for more feedback until end of this week.
> >> Then I
> >>>>>> will convert the design document into a FLIP and open subtasks in
> >> Jira,
> >>>>>> if there are no objections?
> >>>>>>
> >>>>>> Regards,
> >>>>>> Timo
> >>>>>>
> >>>>>> Am 24.11.18 um 13:45 schrieb vino yang:
> >>>>>>> Hi hequn,
> >>>>>>>
> >>>>>>> I am very glad to hear that you are interested in this work.
> >>>>>>> As we all know, this process involves a lot.
> >>>>>>> Currently, the migration work has begun. I started with the
> >>>>>>> Kafka connector's dependency on flink-table and moved the
> >>>>>>> related dependencies to flink-table-common.
> >>>>>>> This work is tracked by FLINK-9461.  [1]
> >>>>>>> I don't know if it will conflict with what you expect to do, but
> >> from
> >>>>> the
> >>>>>>> impact I have observed,
> >>>>>>> it will involve many classes that are currently in flink-table.
> >>>>>>>
> >>>>>>> *Just a statement to prevent unnecessary conflicts.*
> >>>>>>>
> >>>>>>> Thanks, vino.
> >>>>>>>
> >>>>>>> [1]: https://issues.apache.org/jira/browse/FLINK-9461
> >>>>>>>
> >>>>>>> Hequn Cheng <ch...@gmail.com> 于2018年11月24日周六 下午7:20写道:
> >>>>>>>
> >>>>>>>> Hi Timo,
> >>>>>>>>
> >>>>>>>> Thanks for the effort and writing up this document. I like the
> idea
> >>> to
> >>>>>> make
> >>>>>>>> flink-table scala free, so +1 for the proposal!
> >>>>>>>>
> >>>>>>>> It's good to make Java the first-class citizen. For a long time,
> we
> >>>>> have
> >>>>>>>> neglected Java, so that many features in Table are missing Java
> >> Test
> >>>>>>>> cases, such as this one[1] I found recently. And I think we may
> >> also
> >>>>>> need
> >>>>>>>> to migrate our test cases, i.e., add Java tests.
> >>>>>>>>
> >>>>>>>> This definitely is a big change and will break API compatibility. In
> >>>>> order
> >>>>>> to
> >>>>>>>> bring a smaller impact on users, I think we should go fast when we
> >>>>>> migrate
> >>>>>>>> APIs targeted to users. It's better to introduce the user
> sensitive
> >>>>>> changes
> >>>>>>>> within a release. However, it may not be that easy. I can help to
> >>>>>>>> contribute.
> >>>>>>>>
> >>>>>>>> Separation of interface and implementation is a good idea. This
> may
> >>>>>>>> introduce a minimum of dependencies or even no dependencies. I saw
> >>>>> your
> >>>>>>>> reply in the google doc. Java8 has already supported static method
> >>> for
> >>>>>>>> interfaces, I think we can make use of it?
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Hequn
> >>>>>>>>
> >>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-11001
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, Nov 23, 2018 at 5:36 PM Timo Walther <tw...@apache.org>
> >>>>>> wrote:
> >>>>>>>>> Hi everyone,
> >>>>>>>>>
> >>>>>>>>> thanks for the great feedback so far. I updated the document with
> >>> the
> >>>>>>>>> input I got so far
> >>>>>>>>>
> >>>>>>>>> @Fabian: I moved the porting of flink-table-runtime classes up in
> >>> the
> >>>>>>>> list.
> >>>>>>>>> @Xiaowei: Could you elaborate what "interface only" means to you?
> >> Do
> >>>>>> you
> >>>>>>>>> mean a module containing pure Java `interface`s? Or is the
> >>> validation
> >>>>>>>>> logic also part of the API module? Are 50+ expression classes
> part
> >>> of
> >>>>>>>>> the API interface or already too implementation-specific?
> >>>>>>>>>
> >>>>>>>>> @Xuefu: I extended the document by almost a page to clarify when
> >> we
> >>>>>>>>> should develop in Scala and when in Java. As Piotr said, every
> new
> >>>>>> Scala
> >>>>>>>>> line is instant technical debt.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Timo
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Am 23.11.18 um 10:29 schrieb Piotr Nowojski:
> >>>>>>>>>> Hi Timo,
> >>>>>>>>>>
> >>>>>>>>>> Thanks for writing this down +1 from my side :)
> >>>>>>>>>>
> >>>>>>>>>>> I'm wondering whether we can have a rule in the interim when
> >>>>> Java
> >>>>>>>>> and Scala coexist that dependency can only be one-way. I found
> >> that
> >>>>> in
> >>>>>>>> the
> >>>>>>>>> current code base there are cases where a Scala class extends
> Java
> >>>>> and
> >>>>>>>> vice
> >>>>>>>>> versa. This is quite painful. I'm thinking if we could say that
> >>>>>> extension
> >>>>>>>>> can only be from Java to Scala, which will help the situation.
> >>>>> However,
> >>>>>>>> I'm
> >>>>>>>>> not sure if this is practical.
> >>>>>>>>>> Xuefu: I’m also not sure what’s the best approach here, probably
> >> we
> >>>>>>>> will
> >>>>>>>>> have to work it out as we go. One thing to consider is that from
> >> now
> >>>>>> on,
> >>>>>>>>> every single new code line written in Scala anywhere in
> >> Flink-table
> >>>>>>>> (except
> >>>>>>>>> for flink-table-api-scala) is an instant technological debt. From
> >>> this
> >>>>>>>>> perspective I would be in favour of tolerating quite big
> >>>>> inconveniences
> >>>>>>>>> just to avoid any new Scala code.
> >>>>>>>>>> Piotrek
> >>>>>>>>>>
> >>>>>>>>>>> On 23 Nov 2018, at 03:25, Zhang, Xuefu <
> xuefu.z@alibaba-inc.com
> >>>>>>>> wrote:
> >>>>>>>>>>> Hi Timo,
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for the effort and the Google writeup. During our
> >> external
> >>>>>>>>> catalog rework, we found much confusion between Java and Scala,
> >> and
> >>>>>> this
> >>>>>>>>> Scala-free roadmap should greatly mitigate that.
> >>>>>>>>>>> I'm wondering whether we can have a rule in the interim when
> >>>>> Java
> >>>>>>>>> and Scala coexist that dependency can only be one-way. I found
> >> that
> >>>>> in
> >>>>>>>> the
> >>>>>>>>> current code base there are cases where a Scala class extends
> Java
> >>>>> and
> >>>>>>>> vice
> >>>>>>>>> versa. This is quite painful. I'm thinking if we could say that
> >>>>>> extension
> >>>>>>>>> can only be from Java to Scala, which will help the situation.
> >>>>> However,
> >>>>>>>> I'm
> >>>>>>>>> not sure if this is practical.
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Xuefu
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >> ------------------------------------------------------------------
> >>>>>>>>>>> Sender:jincheng sun <su...@gmail.com>
> >>>>>>>>>>> Sent at:2018 Nov 23 (Fri) 09:49
> >>>>>>>>>>> Recipient:dev <de...@flink.apache.org>
> >>>>>>>>>>> Subject:Re: [DISCUSS] Long-term goal of making flink-table
> >>>>> Scala-free
> >>>>>>>>>>> Hi Timo,
> >>>>>>>>>>> Thanks for initiating this great discussion.
> >>>>>>>>>>>
> >>>>>>>>>>> Currently, using the SQL/Table API requires including many
> >> dependencies.
> >>>>> In
> >>>>>>>>>>> particular, it is not necessary to introduce the specific
> >>>>>>>> implementation
> >>>>>>>>>>> dependencies which users do not care about. So I am glad to see
> >>>>> your
> >>>>>>>>>>> proposal, and hope we consider splitting the API interface
> >>>>> into
> >>>>>> a
> >>>>>>>>>>> separate module, so that users can introduce a minimum of
> >>>>>>>> dependencies.
> >>>>>>>>>>> So, +1 to [separation of interface and implementation; e.g.
> >>>>> `Table` &
> >>>>>>>>>>> `TableImpl`] which you mentioned in the google doc.
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Jincheng
> >>>>>>>>>>>
> >>>>>>>>>>> Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Timo, thanks for driving this! I think that this is a nice
> >>>>> thing
> >>>>>>>> to
> >>>>>>>>> do.
> >>>>>>>>>>>> While we are doing this, can we also keep in mind that we want
> >> to
> >>>>>>>>>>>> eventually have a TableAPI interface only module which users
> >> can
> >>>>>> take
> >>>>>>>>>>>> dependency on, but without including any implementation
> >> details?
> >>>>>>>>>>>> Xiaowei
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <
> >> fhueske@gmail.com
> >>>>>>>>> wrote:
> >>>>>>>>>>>>> Hi Timo,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for writing up this document.
> >>>>>>>>>>>>> I like the new structure and agree to prioritize the porting
> >> of
> >>>>> the
> >>>>>>>>>>>>> flink-table-common classes.
> >>>>>>>>>>>>> Since flink-table-runtime is (or should be) independent of
> the
> >>>>> API
> >>>>>>>> and
> >>>>>>>>>>>>> planner modules, we could start porting these classes once
> the
> >>>>> code
> >>>>>>>> is
> >>>>>>>>>>>>> split into the new module structure.
> >>>>>>>>>>>>> The benefits of a Scala-free flink-table-runtime would be a
> >>>>>>>> Scala-free
> >>>>>>>>>>>>> execution Jar.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Best, Fabian
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
> >>>>>>>>>>>>> twalthr@apache.org
> >>>>>>>>>>>>>> :
> >>>>>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I would like to continue this discussion thread and convert
> >> the
> >>>>>>>>> outcome
> >>>>>>>>>>>>>> into a FLIP such that users and contributors know what to
> >>> expect
> >>>>>> in
> >>>>>>>>> the
> >>>>>>>>>>>>>> upcoming releases.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I created a design document [1] that clarifies our
> motivation
> >>>>> why
> >>>>>>>> we
> >>>>>>>>>>>>>> want to do this, how a Maven module structure could look
> >> like,
> >>>>> and
> >>>>>>>> a
> >>>>>>>>>>>>>> suggestion for a migration plan.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> It would be great to start with the efforts for the 1.8
> >> release
> >>>>>>>> such
> >>>>>>>>>>>>>> that new features can be developed in Java and major
> >>>>> refactorings
> >>>>>>>>> such
> >>>>>>>>>>>>>> as improvements to the connectors and external catalog
> >> support
> >>>>> are
> >>>>>>>>> not
> >>>>>>>>>>>>>> blocked.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Please let me know what you think.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>> Timo
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>
> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
> >>>>>>>>>>>>>> Am 02.07.18 um 17:08 schrieb Fabian Hueske:
> >>>>>>>>>>>>>>> Hi Piotr,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> thanks for bumping this thread and thanks for Xingcan for
> >> the
> >>>>>>>>>>>> comments.
> >>>>>>>>>>>>>>> I think the first step would be to separate the flink-table
> >>>>>> module
> >>>>>>>>>>>> into
> >>>>>>>>>>>>>>> multiple sub modules. These could be:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - flink-table-api: All API facing classes. Can be later
> >>> divided
> >>>>>>>>>>>> further
> >>>>>>>>>>>>>>> into Java/Scala Table API/SQL
> >>>>>>>>>>>>>>> - flink-table-planning: involves all planning (basically
> >>>>>>>> everything
> >>>>>>>>>>>> we
> >>>>>>>>>>>>> do
> >>>>>>>>>>>>>>> with Calcite)
> >>>>>>>>>>>>>>> - flink-table-runtime: the runtime code
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> IMO, a realistic mid-term goal is to have the runtime
> module
> >>>>> and
> >>>>>>>>>>>>> certain
> >>>>>>>>>>>>>>> parts of the planning module ported to Java.
> >>>>>>>>>>>>>>> The api module will be much harder to port because of
> >> several
> >>>>>>>>>>>>>> dependencies
> >>>>>>>>>>>>>>> to Scala core classes (the parser framework, tree
> >> iterations,
> >>>>>>>> etc.).
> >>>>>>>>>>>>> I'm
> >>>>>>>>>>>>>>> not saying we should not port this to Java, but it is not
> >>> clear
> >>>>>> to
> >>>>>>>>> me
> >>>>>>>>>>>>>> (yet)
> >>>>>>>>>>>>>>> how to do it.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I think flink-table-runtime should not be too hard to port.
> >>> The
> >>>>>>>> code
> >>>>>>>>>>>>> does
> >>>>>>>>>>>>>>> not make use of many Scala features, i.e., it's writing
> very
> >>>>>>>>>>>> Java-like.
> >>>>>>>>>>>>>>> Also, there are not many dependencies and operators can be
> >>>>>>>>>>>> individually
> >>>>>>>>>>>>>>> ported step-by-step.
> >>>>>>>>>>>>>>> For flink-table-planning, we can have certain packages that
> >> we
> >>>>>>>> port
> >>>>>>>>>>>> to
> >>>>>>>>>>>>>> Java
> >>>>>>>>>>>>>>> like planning rules or plan nodes. The related classes
> >> mostly
> >>>>>>>> extend
> >>>>>>>>>>>>>>> Calcite's Java interfaces/classes and would be natural
> >> choices
> >>>>>> for
> >>>>>>>>>>>>> being
> >>>>>>>>>>>>>>> ported. The code generation classes will require more
> effort
> >>> to
> >>>>>>>>> port.
> >>>>>>>>>>>>>> There
> >>>>>>>>>>>>>>> are also some dependencies in planning on the api module
> >> that
> >>>>> we
> >>>>>>>>>>>> would
> >>>>>>>>>>>>>> need
> >>>>>>>>>>>>>>> to resolve somehow.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> For SQL most work when adding new features is done in the
> >>>>>> planning
> >>>>>>>>>>>> and
> >>>>>>>>>>>>>>> runtime modules. So, this separation should already reduce
> >>>>>>>>>>>>> "technological
> >>>>>>>>>>>>>>> debt" quite a lot.
> >>>>>>>>>>>>>>> The Table API depends much more on Scala than SQL.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Cheers, Fabian
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xingcanc@gmail.com
> >>> :
> >>>>>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I also think about this problem these days and here are my
> >>>>>>>>> thoughts.
> >>>>>>>>>>>>>>>> 1) We must admit that it’s really a tough task to
> >>> interoperate
> >>>>>>>> with
> >>>>>>>>>>>>> Java
> >>>>>>>>>>>>>>>> and Scala. E.g., they have different collection types
> >> (Scala
> >>>>>>>>>>>>> collections
> >>>>>>>>>>>>>>>> v.s. java.util.*) and in Java, it's hard to implement a
> >>> method
> >>>>>>>>> which
> >>>>>>>>>>>>>> takes
> >>>>>>>>>>>>>>>> Scala functions as parameters. Considering the major part
> >> of
> >>>>> the
> >>>>>>>>>>>> code
> >>>>>>>>>>>>>> base
> >>>>>>>>>>>>>>>> is implemented in Java, +1 for this goal from a long-term
> >>>>> view.
> >>>>>>>>>>>>>>>> 2) The ideal solution would be to just expose a Scala API
> >> and
> >>>>>>>> make
> >>>>>>>>>>>> all
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>> other parts Scala-free. But I am not sure if it could be
> >>>>>> achieved
> >>>>>>>>>>>> even
> >>>>>>>>>>>>>> in a
> >>>>>>>>>>>>>>>> long-term. Thus as Timo suggested, keep the Scala codes in
> >>>>>>>>>>>>>>>> "flink-table-core" would be a compromise solution.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> 3) If the community makes the final decision, maybe any
> new
> >>>>>>>>> features
> >>>>>>>>>>>>>>>> should be added in Java (regardless of the modules), in
> >> order
> >>>>> to
> >>>>>>>>>>>>> prevent
> >>>>>>>>>>>>>>>> the Scala codes from growing.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>> Xingcan
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
> >>>>>>>>>>>> piotr@data-artisans.com>
> >>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>> Bumping the topic.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> If we want to do this, the sooner we decide, the less
> code
> >>> we
> >>>>>>>> will
> >>>>>>>>>>>>> have
> >>>>>>>>>>>>>>>> to rewrite. I have some objections/counter proposals to
> >>>>> Fabian's
> >>>>>>>>>>>>>> proposal
> >>>>>>>>>>>>>>>> of doing it module wise and one module at a time.
> >>>>>>>>>>>>>>>>> First, I do not see a problem of having java/scala code
> >> even
> >>>>>>>>> within
> >>>>>>>>>>>>> one
> >>>>>>>>>>>>>>>> module, especially not if there are clean boundaries. Like
> >> we
> >>>>>>>> could
> >>>>>>>>>>>>> have
> >>>>>>>>>>>>>>>> API in Scala and optimizer rules/logical nodes written in
> >>> Java
> >>>>>> in
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>> same
> >>>>>>>>>>>>>>>> module. However I haven’t previously maintained mixed
> >>>>> scala/java
> >>>>>>>>>>>> code
> >>>>>>>>>>>>>> bases
> >>>>>>>>>>>>>>>> before, so I might be missing something here.
> >>>>>>>>>>>>>>>>> Secondly this whole migration might and most like will
> >> take
> >>>>>>>> longer
> >>>>>>>>>>>>> then
> >>>>>>>>>>>>>>>> expected, so that creates a problem for a new code that we
> >>>>> will
> >>>>>>>> be
> >>>>>>>>>>>>>>>> creating. After making a decision to migrate to Java,
> >> almost
> >>>>> any
> >>>>>>>>> new
> >>>>>>>>>>>>>> Scala
> >>>>>>>>>>>>>>>> line of code will be immediately a technological debt and
> >> we
> >>>>>> will
> >>>>>>>>>>>> have
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>> rewrite it to Java later.
> >>>>>>>>>>>>>>>>> Thus I would propose first to state our end goal -
> modules
> >>>>>>>>>>>> structure
> >>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>> which parts of modules we want to have eventually
> >> Scala-free.
> >>>>>>>>>>>> Secondly
> >>>>>>>>>>>>>>>> taking all steps necessary that will allow us to write new
> >>>>> code
> >>>>>>>>>>>>>> complaint
> >>>>>>>>>>>>>>>> with our end goal. Only after that we should/could focus
> on
> >>>>>>>>>>>>>> incrementally
> >>>>>>>>>>>>>>>> rewriting the old code. Otherwise we could be
> stuck/blocked
> >>>>> for
> >>>>>>>>>>>> years
> >>>>>>>>>>>>>>>> writing new code in Scala (and increasing technological
> >>> debt),
> >>>>>>>>>>>> because
> >>>>>>>>>>>>>>>> nobody have found a time to rewrite some non important and
> >>> not
> >>>>>>>>>>>>> actively
> >>>>>>>>>>>>>>>> developed part of some module.
> >>>>>>>>>>>>>>>>> Piotrek
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <
> >> fhueske@gmail.com
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> In general, I think this is a good effort. However, it
> >>> won't
> >>>>>> be
> >>>>>>>>>>>> easy
> >>>>>>>>>>>>>>>> and I
> >>>>>>>>>>>>>>>>>> think we have to plan this well.
> >>>>>>>>>>>>>>>>>> I don't like the idea of having the whole code base
> >>>>> fragmented
> >>>>>>>>>>>> into
> >>>>>>>>>>>>>> Java
> >>>>>>>>>>>>>>>>>> and Scala code for too long.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I think we should do this one step at a time and focus
> on
> >>>>>>>>>>>> migrating
> >>>>>>>>>>>>>> one
> >>>>>>>>>>>>>>>>>> module at a time.
> >>>>>>>>>>>>>>>>>> IMO, the easiest start would be to port the runtime to
> >>> Java.
> >>>>>>>>>>>>>>>>>> Extracting the API classes into an own module, porting
> >> them
> >>>>> to
> >>>>>>>>>>>> Java,
> >>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>> removing the Scala dependency won't be possible without
> >>>>>>>> breaking
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>> API
> >>>>>>>>>>>>>>>>>> since a few classes depend on the Scala Table API.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Best, Fabian
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <
> >>>>>> trohrmann@apache.org
> >>>>>>>>> :
> >>>>>>>>>>>>>>>>>>> I think that is a noble and honorable goal and we
> should
> >>>>>>>> strive
> >>>>>>>>>>>> for
> >>>>>>>>>>>>>> it.
> >>>>>>>>>>>>>>>>>>> This, however, must be an iterative process given the
> >>> sheer
> >>>>>>>> size
> >>>>>>>>>>>> of
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>> code base. I like the approach to define common Java
> >>>>> modules
> >>>>>>>>>>>> which
> >>>>>>>>>>>>>> are
> >>>>>>>>>>>>>>>> used
> >>>>>>>>>>>>>>>>>>> by more specific Scala modules and slowly moving
> classes
> >>>>> from
> >>>>>>>>>>>> Scala
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>> Java. Thus +1 for the proposal.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>>> Till
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
> >>>>>>>>>>>>>>>> piotr@data-artisans.com>
> >>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I do not have an experience with how scala and java
> >>>>>> interacts
> >>>>>>>>>>>> with
> >>>>>>>>>>>>>>>> each
> >>>>>>>>>>>>>>>>>>>> other, so I can not fully validate your proposal, but
> >>>>>>>> generally
> >>>>>>>>>>>>>>>> speaking
> >>>>>>>>>>>>>>>>>>> +1
> >>>>>>>>>>>>>>>>>>>> from me.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Does it also mean, that we should slowly migrate
> >>>>>>>>>>>>> `flink-table-core`
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>> Java? How would you envision it? It would be nice to
> be
> >>>>> able
> >>>>>>>> to
> >>>>>>>>>>>>> add
> >>>>>>>>>>>>>>>> new
> >>>>>>>>>>>>>>>>>>>> classes/features written in Java and so that they can
> >>>>>> coexist
> >>>>>>>>>>>> with
> >>>>>>>>>>>>>> old
> >>>>>>>>>>>>>>>>>>>> Scala code until we gradually switch from Scala to
> >> Java.
> >>>>>>>>>>>>>>>>>>>> Piotrek
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <
> >>>>> twalthr@apache.org
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> as you all know, currently the Table & SQL API is
> >>>>>>>> implemented
> >>>>>>>>>>>> in
> >>>>>>>>>>>>>>>> Scala.
> >>>>>>>>>>>>>>>>>>>> This decision was made a long-time ago when the
> >> initital
> >>>>>> code
> >>>>>>>>>>>> base
> >>>>>>>>>>>>>> was
> >>>>>>>>>>>>>>>>>>>> created as part of a master's thesis. The community
> >> kept
> >>>>>>>> Scala
> >>>>>>>>>>>>>>>> because of
> >>>>>>>>>>>>>>>>>>>> the nice language features that enable a fluent Table
> >> API
> >>>>>>>> like
> >>>>>>>>>>>>>>>>>>>> table.select('field.trim()) and because Scala allows
> >> for
> >>>>>>>> quick
> >>>>>>>>>>>>>>>>>>> prototyping
> >>>>>>>>>>>>>>>>>>>> (e.g. multi-line comments for code generation). The
> >>>>>>>> committers
> >>>>>>>>>>>>>>>> enforced
> >>>>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>> splitting the code-base into two programming
> languages.
> >>>>>>>>>>>>>>>>>>>>> However, nowadays the flink-table module more and
> more
> >>>>>>>> becomes
> >>>>>>>>>>>> an
> >>>>>>>>>>>>>>>>>>>> important part in the Flink ecosystem. Connectors,
> >>>>> formats,
> >>>>>>>> and
> >>>>>>>>>>>>> SQL
> >>>>>>>>>>>>>>>>>>> client
> >>>>>>>>>>>>>>>>>>>> are actually implemented in Java but need to
> >> interoperate
> >>>>>>>> with
> >>>>>>>>>>>>>>>>>>> flink-table
> >>>>>>>>>>>>>>>>>>>> which makes these modules dependent on Scala. As
> >>> mentioned
> >>>>>> in
> >>>>>>>>> an
> >>>>>>>>>>>>>>>> earlier
> >>>>>>>>>>>>>>>>>>>> mail thread, using Scala for API classes also exposes
> >>>>> member
> >>>>>>>>>>>>>> variables
> >>>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>> methods in Java that should not be exposed to users
> >> [1].
> >>>>>> Java
> >>>>>>>>> is
> >>>>>>>>>>>>>> still
> >>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>> most important API language and right now we treat it
> >> as
> >>> a
> >>>>>>>>>>>>>>>> second-class
> >>>>>>>>>>>>>>>>>>>> citizen. I just noticed that you even need to add
> Scala
> >>> if
> >>>>>>>> you
> >>>>>>>>>>>>> just
> >>>>>>>>>>>>>>>> want
> >>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>> implement a ScalarFunction because of method clashes
> >>>>> between
> >>>>>>>>>>>>> `public
> >>>>>>>>>>>>>>>>>>> String
> >>>>>>>>>>>>>>>>>>>> toString()` and `public scala.Predef.String
> >> toString()`.
> >>>>>>>>>>>>>>>>>>>>> Given the size of the current code base,
> >> reimplementing
> >>>>> the
> >>>>>>>>>>>>> entire
> >>>>>>>>>>>>>>>>>>>> flink-table code in Java is a goal that we might never
> >>>>>> reach.
> >>>>>>>>>>>>>>>> However, we
> >>>>>>>>>>>>>>>>>>>> should at least treat the symptoms and have this as a
> >>>>>>>> long-term
> >>>>>>>>>>>>> goal
> >>>>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>>>>> mind. My suggestion would be to convert user-facing
> and
> >>>>>>>> runtime
> >>>>>>>>>>>>>>>> classes
> >>>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>> split the code base into multiple modules:
> >>>>>>>>>>>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
> >>>>>>>>>>>>>>>>>>>>> Implemented in Java. Java users can use this. This
> >> would
> >>>>>>>>>>>> require
> >>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>> convert classes like TableEnvironment, Table.
> >>>>>>>>>>>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
> >>>>>>>>>>>>>>>>>>>>> Implemented in Scala. Scala users can use this.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> flink-table-common
> >>>>>>>>>>>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs
> can
> >>>>> use
> >>>>>>>>>>>> this.
> >>>>>>>>>>>>> It
> >>>>>>>>>>>>>>>>>>>> contains interface classes such as descriptors, table
> >>>>> sink,
> >>>>>>>>>>>> table
> >>>>>>>>>>>>>>>> source.
> >>>>>>>>>>>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
> >>>>>>>>>>>>>>>>>>>> flink-table-runtime}
> >>>>>>>>>>>>>>>>>>>>> Implemented in Scala. Contains the current main code
> >>>>> base.
> >>>>>>>>>>>>>>>>>>>>>> flink-table-runtime
> >>>>>>>>>>>>>>>>>>>>> Implemented in Java. This would require to convert
> >>>>> classes
> >>>>>>>> in
> >>>>>>>>>>>>>>>>>>>> o.a.f.table.runtime but would improve the runtime
> >>>>>>>> potentially.
> >>>>>>>>>>>>>>>>>>>>> What do you think?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Timo
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
> >>>>>>>>>>>>>>>>>>>
> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
> >>>>>>>>>>>>>>>> traits-tp21335.html
> >>>
>
>

Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Timo Walther <tw...@apache.org>.
@Kurt: Yes, I don't think that forks of Flink will have a hard time 
keeping up with the porting. That is also why I called this a `long-term 
goal`: I don't see big resources being allocated to make the porting 
happen quicker. But at least new features, the API, and the runtime will 
profit from the Scala-to-Java conversion.

@Jark: I updated the document:

1. flink-table-common has been renamed to flink-table-spi by request.

2. Yes, good point. flink-sql-client can be moved there as well.

3. I added a paragraph to the document. Porting the code generation to 
Java only makes sense if acceptable tooling for it is in place.
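
As a purely hypothetical illustration of such tooling (nothing like this 
exists in flink-table today), even a small indentation-aware builder could 
keep generated Java code readable without Scala's multi-line strings:

```java
// Hypothetical sketch only; not part of flink-table. A minimal
// indentation-aware writer that keeps generated Java sources readable
// without multi-line string literals.
public final class CodeWriter {

    private final StringBuilder out = new StringBuilder();
    private int indent = 0;

    /** Appends one line at the current indentation level. */
    public CodeWriter line(String format, Object... args) {
        for (int i = 0; i < indent; i++) {
            out.append("  ");
        }
        out.append(String.format(format, args)).append('\n');
        return this;
    }

    /** Appends a line and increases the indentation, e.g. after an opening brace. */
    public CodeWriter open(String format, Object... args) {
        line(format, args);
        indent++;
        return this;
    }

    /** Decreases the indentation and appends the closing line. */
    public CodeWriter close(String closing) {
        indent--;
        return line(closing);
    }

    @Override
    public String toString() {
        return out.toString();
    }

    public static void main(String[] args) {
        String generated =
                new CodeWriter()
                        .open("public final class GeneratedMapper {")
                        .open("public Object map(Object field0) {")
                        .line("return field0 == null ? null : field0.toString().trim();")
                        .close("}")
                        .close("}")
                        .toString();
        System.out.println(generated);
    }
}
```

Generated classes could then be assembled through such a builder instead of 
raw string concatenation; whether that counts as "acceptable" is of course 
exactly the open question.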


Thanks for the feedback,

Timo


Am 29.11.18 um 08:28 schrieb Jark Wu:
> Hi Timo,
>
> Thanks for the great work!
>
> Moving flink-table to Java is a long-awaited things but will involve much
> effort. Agree with that we should make it as a long-term goal.
>
> I have read the google doc and +1 for the proposal. Here I have some
> questions:
>
> 1. Where should the flink-table-common module place ?  Will we move the
> flink-table-common classes to the new modules?
> 2. Should flink-sql-client also as a sub-module under flink-table ?
> 3. The flink-table-planner contains code generation and will be converted
> to Java. Actually, I prefer using Scala to code generate because of the
> Multiline-String and String-Interpolation (i.e. s"hello $user") features in
> Scala. It makes code of code-generation more readable. Do we really
> want to migrate
> code generation to Java?
>
> Best,
> Jark
>
>
> On Wed, 28 Nov 2018 at 09:14, Kurt Young <yk...@gmail.com> wrote:
>
>> Hi Timo and Vino,
>>
>> I agree that table is very active and there is no guarantee for not
>> producing any conflicts if you decide
>> to develop based on community version. I think this part is the risk what
>> we can imagine in the first place. But massively
>> language replacing is something you can not imagine and be ready for, there
>> is no feature added, no refactor is done, simply changing
>> from scala to java will cause lots of conflicts.
>>
>> But I also agree that this is a "technical debt" that we should eventually
>> pay, as you said, we can do this slowly, even one file each time,
>> let other people have more time to resolve the conflicts.
>>
>> Best,
>> Kurt
>>
>>
>> On Tue, Nov 27, 2018 at 8:37 PM Timo Walther <tw...@apache.org> wrote:
>>
>>> Hi Kurt,
>>>
>>> I understand your concerns. However, there is no concrete roadmap for
>>> Flink 2.0 and (as Vino said) the flink-table is developed very actively.
>>> Major refactorings happened in the past and will also happen with or
>>> without Scala migration. A good example, is the proper catalog support
>>> which will refactor big parts of the TableEnvironment class. Or the
>>> introduction of "retractions" which needed a big refactoring of the
>>> planning phase. Stability is only guaranteed for the API and the general
>>> behavior, however, currently flink-table is not using @Public or
>>> @PublicEvolving annotations for a reason.
>>>
>>> I think the migration will still happen slowly because it needs people
>>> that allocate time for that. Therefore, even Flink forks can slowly
>>> adapt to the evolving Scala-to-Java code base.
>>>
>>> Regards,
>>> Timo
>>>
>>>
>>> Am 27.11.18 um 13:16 schrieb vino yang:
>>>> Hi Kurt,
>>>>
>>>> Currently, there is still a long time to go from flink 2.0. Considering
>>>> that the flink-table
>>>> is one of the most active modules in the current flink project, each
>>>> version has
>>>> a number of changes and features added. I think that refactoring faster
>>>> will reduce subsequent
>>>> complexity and workload. And this may be a gradual and long process. We
>>>> should be able to
>>>>    regard it as a "technical debt", and if it does not change it, it
>> will
>>>> also affect the decision-making of other issues.
>>>>
>>>> Thanks, vino.
>>>>
>>>> Kurt Young <yk...@gmail.com> 于2018年11月27日周二 下午7:34写道:
>>>>
>>>>> Hi Timo,
>>>>>
>>>>> Thanks for writing up the document. I'm +1 for reorganizing the module
>>>>> structure and make table scala free. But I have
>>>>> a little concern abount the timing. Is it more appropriate to get this
>>> done
>>>>> when Flink decide to bump to next big version, like 2.x.
>>>>> It's true you can keep all the class's package path as it is, and will
>>> not
>>>>> introduce API change. But if some company are developing their own
>>>>> Flink, and sync with community version by rebasing, may face a lot of
>>>>> conflicts. Although you can avoid conflicts by always moving source
>>> codes
>>>>> between packages, but I assume you still need to delete the original
>>> scala
>>>>> file and add a new java file when you want to change program language.
>>>>>
>>>>> Best,
>>>>> Kurt
>>>>>
>>>>>
>>>>> On Tue, Nov 27, 2018 at 5:57 PM Timo Walther <tw...@apache.org>
>>> wrote:
>>>>>> Hi Hequn,
>>>>>>
>>>>>> thanks for your feedback. Yes, migrating the test cases is another
>>> issue
>>>>>> that is not represented in the document but should naturally go along
>>>>>> with the migration.
>>>>>>
>>>>>> I agree that we should migrate the main API classes quickly within
>> this
>>>>>> 1.8 release after the module split has been performed. Help here is
>>>>>> highly appreciated!
>>>>>>
>>>>>> I forgot that Java supports static methods in interfaces now, but
>>>>>> actually I don't like the design of calling
>>> `TableEnvironment.get(env)`.
>>>>>> Because people often use `TableEnvironment tEnd =
>>>>>> TableEnvironment.get(env)` and then wonder why there is no
>>>>>> `toAppendStream` or `toDataSet` because they are using the base
>> class.
>>>>>> However, things like that can be discussed in the corresponding issue
>>>>>> when it comes to implementation.
>>>>>>
>>>>>> @Vino: I think your work fits nicely to these efforts.
>>>>>>
>>>>>> @everyone: I will wait for more feedback until end of this week.
>> Then I
>>>>>> will convert the design document into a FLIP and open subtasks in
>> Jira,
>>>>>> if there are no objections?
>>>>>>
>>>>>> Regards,
>>>>>> Timo
>>>>>>
>>>>>> Am 24.11.18 um 13:45 schrieb vino yang:
>>>>>>> Hi hequn,
>>>>>>>
>>>>>>> I am very glad to hear that you are interested in this work.
>>>>>>> As we all know, this process involves a lot.
>>>>>>> Currently, the migration work has begun. I started with the
>>>>>>> Kafka connector's dependency on flink-table and moved the
>>>>>>> related dependencies to flink-table-common.
>>>>>>> This work is tracked by FLINK-9461.  [1]
>>>>>>> I don't know if it will conflict with what you expect to do, but
>> from
>>>>> the
>>>>>>> impact I have observed,
>>>>>>> it will involve many classes that are currently in flink-table.
>>>>>>>
>>>>>>> *Just a statement to prevent unnecessary conflicts.*
>>>>>>>
>>>>>>> Thanks, vino.
>>>>>>>
>>>>>>> [1]: https://issues.apache.org/jira/browse/FLINK-9461
>>>>>>>
>>>>>>> Hequn Cheng <ch...@gmail.com> 于2018年11月24日周六 下午7:20写道:
>>>>>>>
>>>>>>>> Hi Timo,
>>>>>>>>
>>>>>>>> Thanks for the effort and writing up this document. I like the idea
>>> to
>>>>>> make
>>>>>>>> flink-table scala free, so +1 for the proposal!
>>>>>>>>
>>>>>>>> It's good to make Java the first-class citizen. For a long time, we
>>>>> have
>>>>>>>> neglected java so that many features in Table are missed in Java
>> Test
>>>>>>>> cases, such as this one[1] I found recently. And I think we may
>> also
>>>>>> need
>>>>>>>> to migrate our test cases, i.e, add java tests.
>>>>>>>>
>>>>>>>> This definitely is a big change and will break API compatible. In
>>>>> order
>>>>>> to
>>>>>>>> bring a smaller impact on users, I think we should go fast when we
>>>>>> migrate
>>>>>>>> APIs targeted to users. It's better to introduce the user sensitive
>>>>>> changes
>>>>>>>> within a release. However, it may be not that easy. I can help to
>>>>>>>> contribute.
>>>>>>>>
>>>>>>>> Separation of interface and implementation is a good idea. This may
>>>>>>>> introduce a minimum of dependencies or even no dependencies. I saw
>>>>> your
>>>>>>>> reply in the google doc. Java8 has already supported static method
>>> for
>>>>>>>> interfaces, I think we can make use of it?
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Hequn
>>>>>>>>
>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-11001
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Nov 23, 2018 at 5:36 PM Timo Walther <tw...@apache.org>
>>>>>> wrote:
>>>>>>>>> Hi everyone,
>>>>>>>>>
>>>>>>>>> thanks for the great feedback so far. I updated the document with
>>> the
>>>>>>>>> input I got so far
>>>>>>>>>
>>>>>>>>> @Fabian: I moved the porting of flink-table-runtime classes up in
>>> the
>>>>>>>> list.
>>>>>>>>> @Xiaowei: Could you elaborate what "interface only" means to you?
>> Do
>>>>>> you
>>>>>>>>> mean a module containing pure Java `interface`s? Or is the
>>> validation
>>>>>>>>> logic also part of the API module? Are 50+ expression classes part
>>> of
>>>>>>>>> the API interface or already too implementation-specific?
>>>>>>>>>
>>>>>>>>> @Xuefu: I extended the document by almost a page to clarify when
>> we
>>>>>>>>> should develop in Scala and when in Java. As Piotr said, every new
>>>>>> Scala
>>>>>>>>> line is instant technical debt.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Timo
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Am 23.11.18 um 10:29 schrieb Piotr Nowojski:
>>>>>>>>>> Hi Timo,
>>>>>>>>>>
>>>>>>>>>> Thanks for writing this down +1 from my side :)
>>>>>>>>>>
>>>>>>>>>>> I'm wondering that whether we can have rule in the interim when
>>>>> Java
>>>>>>>>> and Scala coexist that dependency can only be one-way. I found
>> that
>>>>> in
>>>>>>>> the
>>>>>>>>> current code base there are cases where a Scala class extends Java
>>>>> and
>>>>>>>> vise
>>>>>>>>> versa. This is quite painful. I'm thinking if we could say that
>>>>>> extension
>>>>>>>>> can only be from Java to Scala, which will help the situation.
>>>>> However,
>>>>>>>> I'm
>>>>>>>>> not sure if this is practical.
>>>>>>>>>> Xuefu: I’m also not sure what’s the best approach here, probably
>> we
>>>>>>>> will
>>>>>>>>> have to work it out as we go. One thing to consider is that from
>> now
>>>>>> on,
>>>>>>>>> every single new code line written in Scala anywhere in
>> Flink-table
>>>>>>>> (except
>>>>>>>>> of Flink-table-api-scala) is an instant technological debt. From
>>> this
>>>>>>>>> perspective I would be in favour of tolerating quite big
>>>>> inchonvieneces
>>>>>>>>> just to avoid any new Scala code.
>>>>>>>>>> Piotrek
>>>>>>>>>>
>>>>>>>>>>> On 23 Nov 2018, at 03:25, Zhang, Xuefu <xuefu.z@alibaba-inc.com
>>>>>>>> wrote:
>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the effort and the Google writeup. During our
>> external
>>>>>>>>> catalog rework, we found much confusion between Java and Scala,
>> and
>>>>>> this
>>>>>>>>> Scala-free roadmap should greatly mitigate that.
>>>>>>>>>>> I'm wondering that whether we can have rule in the interim when
>>>>> Java
>>>>>>>>> and Scala coexist that dependency can only be one-way. I found
>> that
>>>>> in
>>>>>>>> the
>>>>>>>>> current code base there are cases where a Scala class extends Java
>>>>> and
>>>>>>>> vise
>>>>>>>>> versa. This is quite painful. I'm thinking if we could say that
>>>>>> extension
>>>>>>>>> can only be from Java to Scala, which will help the situation.
>>>>> However,
>>>>>>>> I'm
>>>>>>>>> not sure if this is practical.
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Xuefu
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>> ------------------------------------------------------------------
>>>>>>>>>>> Sender:jincheng sun <su...@gmail.com>
>>>>>>>>>>> Sent at:2018 Nov 23 (Fri) 09:49
>>>>>>>>>>> Recipient:dev <de...@flink.apache.org>
>>>>>>>>>>> Subject:Re: [DISCUSS] Long-term goal of making flink-table
>>>>> Scala-free
>>>>>>>>>>> Hi Timo,
>>>>>>>>>>> Thanks for initiating this great discussion.
>>>>>>>>>>>
>>>>>>>>>>> Currently when using SQL/TableAPI should include many
>> dependence.
>>>>> In
>>>>>>>>>>> particular, it is not necessary to introduce the specific
>>>>>>>> implementation
>>>>>>>>>>> dependencies which users do not care about. So I am glad to see
>>>>> your
>>>>>>>>>>> proposal, and hope when we consider splitting the API interface
>>>>> into
>>>>>> a
>>>>>>>>>>> separate module, so that the user can introduce minimum of
>>>>>>>> dependencies.
>>>>>>>>>>> So, +1 to [separation of interface and implementation; e.g.
>>>>> `Table` &
>>>>>>>>>>> `TableImpl`] which you mentioned in the google doc.
>>>>>>>>>>> Best,
>>>>>>>>>>> Jincheng
>>>>>>>>>>>
>>>>>>>>>>> Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Timo, thanks for driving this! I think that this is a nice
>>>>> thing
>>>>>>>> to
>>>>>>>>> do.
>>>>>>>>>>>> While we are doing this, can we also keep in mind that we want
>> to
>>>>>>>>>>>> eventually have a TableAPI interface only module which users
>> can
>>>>>> take
>>>>>>>>>>>> dependency on, but without including any implementation
>> details?
>>>>>>>>>>>> Xiaowei
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <
>> fhueske@gmail.com
>>>>>>>>> wrote:
>>>>>>>>>>>>> Hi Timo,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for writing up this document.
>>>>>>>>>>>>> I like the new structure and agree to prioritize the porting
>> of
>>>>> the
>>>>>>>>>>>>> flink-table-common classes.
>>>>>>>>>>>>> Since flink-table-runtime is (or should be) independent of the
>>>>> API
>>>>>>>> and
>>>>>>>>>>>>> planner modules, we could start porting these classes once the
>>>>> code
>>>>>>>> is
>>>>>>>>>>>>> split into the new module structure.
>>>>>>>>>>>>> The benefits of a Scala-free flink-table-runtime would be a
>>>>>>>> Scala-free
>>>>>>>>>>>>> execution Jar.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best, Fabian
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
>>>>>>>>>>>>> twalthr@apache.org
>>>>>>>>>>>>>> :
>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would like to continue this discussion thread and convert
>> the
>>>>>>>>> outcome
>>>>>>>>>>>>>> into a FLIP such that users and contributors know what to
>>> expect
>>>>>> in
>>>>>>>>> the
>>>>>>>>>>>>>> upcoming releases.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I created a design document [1] that clarifies our motivation
>>>>> why
>>>>>>>> we
>>>>>>>>>>>>>> want to do this, how a Maven module structure could look
>> like,
>>>>> and
>>>>>>>> a
>>>>>>>>>>>>>> suggestion for a migration plan.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It would be great to start with the efforts for the 1.8
>> release
>>>>>>>> such
>>>>>>>>>>>>>> that new features can be developed in Java and major
>>>>> refactorings
>>>>>>>>> such
>>>>>>>>>>>>>> as improvements to the connectors and external catalog
>> support
>>>>> are
>>>>>>>>> not
>>>>>>>>>>>>>> blocked.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please let me know what you think.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
>>>>>>>>>>>>>> Am 02.07.18 um 17:08 schrieb Fabian Hueske:
>>>>>>>>>>>>>>> Hi Piotr,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks for bumping this thread and thanks for Xingcan for
>> the
>>>>>>>>>>>> comments.
>>>>>>>>>>>>>>> I think the first step would be to separate the flink-table
>>>>>> module
>>>>>>>>>>>> into
>>>>>>>>>>>>>>> multiple sub modules. These could be:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - flink-table-api: All API facing classes. Can be later
>>> divided
>>>>>>>>>>>> further
>>>>>>>>>>>>>>> into Java/Scala Table API/SQL
>>>>>>>>>>>>>>> - flink-table-planning: involves all planning (basically
>>>>>>>> everything
>>>>>>>>>>>> we
>>>>>>>>>>>>> do
>>>>>>>>>>>>>>> with Calcite)
>>>>>>>>>>>>>>> - flink-table-runtime: the runtime code
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> IMO, a realistic mid-term goal is to have the runtime module
>>>>> and
>>>>>>>>>>>>> certain
>>>>>>>>>>>>>>> parts of the planning module ported to Java.
>>>>>>>>>>>>>>> The api module will be much harder to port because of
>> several
>>>>>>>>>>>>>> dependencies
>>>>>>>>>>>>>>> to Scala core classes (the parser framework, tree
>> iterations,
>>>>>>>> etc.).
>>>>>>>>>>>>> I'm
>>>>>>>>>>>>>>> not saying we should not port this to Java, but it is not
>>> clear
>>>>>> to
>>>>>>>>> me
>>>>>>>>>>>>>> (yet)
>>>>>>>>>>>>>>> how to do it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think flink-table-runtime should not be too hard to port.
>>> The
>>>>>>>> code
>>>>>>>>>>>>> does
>>>>>>>>>>>>>>> not make use of many Scala features, i.e., it's writing very
>>>>>>>>>>>> Java-like.
>>>>>>>>>>>>>>> Also, there are not many dependencies and operators can be
>>>>>>>>>>>> individually
>>>>>>>>>>>>>>> ported step-by-step.
>>>>>>>>>>>>>>> For flink-table-planning, we can have certain packages that
>> we
>>>>>>>> port
>>>>>>>>>>>> to
>>>>>>>>>>>>>> Java
>>>>>>>>>>>>>>> like planning rules or plan nodes. The related classes
>> mostly
>>>>>>>> extend
>>>>>>>>>>>>>>> Calcite's Java interfaces/classes and would be natural
>> choices
>>>>>> for
>>>>>>>>>>>>> being
>>>>>>>>>>>>>>> ported. The code generation classes will require more effort
>>> to
>>>>>>>>> port.
>>>>>>>>>>>>>> There
>>>>>>>>>>>>>>> are also some dependencies in planning on the api module
>> that
>>>>> we
>>>>>>>>>>>> would
>>>>>>>>>>>>>> need
>>>>>>>>>>>>>>> to resolve somehow.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For SQL most work when adding new features is done in the
>>>>>> planning
>>>>>>>>>>>> and
>>>>>>>>>>>>>>> runtime modules. So, this separation should already reduce
>>>>>>>>>>>>> "technological
>>>>>>>>>>>>>>> dept" quite a lot.
>>>>>>>>>>>>>>> The Table API depends much more on Scala than SQL.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers, Fabian
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xingcanc@gmail.com
>>> :
>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I also think about this problem these days and here are my
>>>>>>>>> thoughts.
>>>>>>>>>>>>>>>> 1) We must admit that it’s really a tough task to
>>> interoperate
>>>>>>>> with
>>>>>>>>>>>>> Java
>>>>>>>>>>>>>>>> and Scala. E.g., they have different collection types
>> (Scala
>>>>>>>>>>>>> collections
>>>>>>>>>>>>>>>> v.s. java.util.*) and in Java, it's hard to implement a
>>> method
>>>>>>>>> which
>>>>>>>>>>>>>> takes
>>>>>>>>>>>>>>>> Scala functions as parameters. Considering the major part
>> of
>>>>> the
>>>>>>>>>>>> code
>>>>>>>>>>>>>> base
>>>>>>>>>>>>>>>> is implemented in Java, +1 for this goal from a long-term
>>>>> view.
>>>>>>>>>>>>>>>> 2) The ideal solution would be to just expose a Scala API
>> and
>>>>>>>> make
>>>>>>>>>>>> all
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> other parts Scala-free. But I am not sure if it could be
>>>>>> achieved
>>>>>>>>>>>> even
>>>>>>>>>>>>>> in a
>>>>>>>>>>>>>>>> long-term. Thus as Timo suggested, keep the Scala codes in
>>>>>>>>>>>>>>>> "flink-table-core" would be a compromise solution.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 3) If the community makes the final decision, maybe any new
>>>>>>>>> features
>>>>>>>>>>>>>>>> should be added in Java (regardless of the modules), in
>> order
>>>>> to
>>>>>>>>>>>>> prevent
>>>>>>>>>>>>>>>> the Scala codes from growing.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Xingcan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
>>>>>>>>>>>> piotr@data-artisans.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> Bumping the topic.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If we want to do this, the sooner we decide, the less code
>>> we
>>>>>>>> will
>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>> to rewrite. I have some objections/counter proposals to
>>>>> Fabian's
>>>>>>>>>>>>>> proposal
>>>>>>>>>>>>>>>> of doing it module wise and one module at a time.
>>>>>>>>>>>>>>>>> First, I do not see a problem of having java/scala code
>> even
>>>>>>>>> within
>>>>>>>>>>>>> one
>>>>>>>>>>>>>>>> module, especially not if there are clean boundaries. Like
>> we
>>>>>>>> could
>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>> API in Scala and optimizer rules/logical nodes written in
>>> Java
>>>>>> in
>>>>>>>>>>>> the
>>>>>>>>>>>>>> same
>>>>>>>>>>>>>>>> module. However I haven’t previously maintained mixed
>>>>> scala/java
>>>>>>>>>>>> code
>>>>>>>>>>>>>> bases
>>>>>>>>>>>>>>>> before, so I might be missing something here.
>>>>>>>>>>>>>>>>> Secondly this whole migration might and most like will
>> take
>>>>>>>> longer
>>>>>>>>>>>>> then
>>>>>>>>>>>>>>>> expected, so that creates a problem for a new code that we
>>>>> will
>>>>>>>> be
>>>>>>>>>>>>>>>> creating. After making a decision to migrate to Java,
>> almost
>>>>> any
>>>>>>>>> new
>>>>>>>>>>>>>> Scala
>>>>>>>>>>>>>>>> line of code will be immediately a technological debt and
>> we
>>>>>> will
>>>>>>>>>>>> have
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>> rewrite it to Java later.
>>>>>>>>>>>>>>>>> Thus I would propose first to state our end goal - modules
>>>>>>>>>>>> structure
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>> which parts of modules we want to have eventually
>> Scala-free.
>>>>>>>>>>>> Secondly
>>>>>>>>>>>>>>>> taking all steps necessary that will allow us to write new
>>>>> code
>>>>>>>>>>>>>> complaint
>>>>>>>>>>>>>>>> with our end goal. Only after that we should/could focus on
>>>>>>>>>>>>>> incrementally
>>>>>>>>>>>>>>>> rewriting the old code. Otherwise we could be stuck/blocked
>>>>> for
>>>>>>>>>>>> years
>>>>>>>>>>>>>>>> writing new code in Scala (and increasing technological
>>> debt),
>>>>>>>>>>>> because
>>>>>>>>>>>>>>>> nobody have found a time to rewrite some non important and
>>> not
>>>>>>>>>>>>> actively
>>>>>>>>>>>>>>>> developed part of some module.
>>>>>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <
>> fhueske@gmail.com
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In general, I think this is a good effort. However, it
>>> won't
>>>>>> be
>>>>>>>>>>>> easy
>>>>>>>>>>>>>>>> and I
>>>>>>>>>>>>>>>>>> think we have to plan this well.
>>>>>>>>>>>>>>>>>> I don't like the idea of having the whole code base
>>>>> fragmented
>>>>>>>>>>>> into
>>>>>>>>>>>>>> Java
>>>>>>>>>>>>>>>>>> and Scala code for too long.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I think we should do this one step at a time and focus on
>>>>>>>>>>>> migrating
>>>>>>>>>>>>>> one
>>>>>>>>>>>>>>>>>> module at a time.
>>>>>>>>>>>>>>>>>> IMO, the easiest start would be to port the runtime to
>>> Java.
>>>>>>>>>>>>>>>>>> Extracting the API classes into an own module, porting
>> them
>>>>> to
>>>>>>>>>>>> Java,
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>> removing the Scala dependency won't be possible without
>>>>>>>> breaking
>>>>>>>>>>>> the
>>>>>>>>>>>>>> API
>>>>>>>>>>>>>>>>>> since a few classes depend on the Scala Table API.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best, Fabian
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <
>>>>>> trohrmann@apache.org
>>>>>>>>> :
>>>>>>>>>>>>>>>>>>> I think that is a noble and honorable goal and we should
>>>>>>>> strive
>>>>>>>>>>>> for
>>>>>>>>>>>>>> it.
>>>>>>>>>>>>>>>>>>> This, however, must be an iterative process given the
>>> sheer
>>>>>>>> size
>>>>>>>>>>>> of
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> code base. I like the approach to define common Java
>>>>> modules
>>>>>>>>>>>> which
>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>> used
>>>>>>>>>>>>>>>>>>> by more specific Scala modules and slowly moving classes
>>>>> from
>>>>>>>>>>>> Scala
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>> Java. Thus +1 for the proposal.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
>>>>>>>>>>>>>>>> piotr@data-artisans.com>
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I do not have an experience with how scala and java
>>>>>> interacts
>>>>>>>>>>>> with
>>>>>>>>>>>>>>>> each
>>>>>>>>>>>>>>>>>>>> other, so I can not fully validate your proposal, but
>>>>>>>> generally
>>>>>>>>>>>>>>>> speaking
>>>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>>>> from me.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Does it also mean, that we should slowly migrate
>>>>>>>>>>>>> `flink-table-core`
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>> Java? How would you envision it? It would be nice to be
>>>>> able
>>>>>>>> to
>>>>>>>>>>>>> add
>>>>>>>>>>>>>>>> new
>>>>>>>>>>>>>>>>>>>> classes/features written in Java and so that they can
>>>>>> coexist
>>>>>>>>>>>> with
>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>> Scala code until we gradually switch from Scala to
>> Java.
>>>>>>>>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <
>>>>> twalthr@apache.org
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> as you all know, currently the Table & SQL API is
>>>>>>>> implemented
>>>>>>>>>>>> in
>>>>>>>>>>>>>>>> Scala.
>>>>>>>>>>>>>>>>>>>> This decision was made a long-time ago when the
>> initital
>>>>>> code
>>>>>>>>>>>> base
>>>>>>>>>>>>>> was
>>>>>>>>>>>>>>>>>>>> created as part of a master's thesis. The community
>> kept
>>>>>>>> Scala
>>>>>>>>>>>>>>>> because of
>>>>>>>>>>>>>>>>>>>> the nice language features that enable a fluent Table
>> API
>>>>>>>> like
>>>>>>>>>>>>>>>>>>>> table.select('field.trim()) and because Scala allows
>> for
>>>>>>>> quick
>>>>>>>>>>>>>>>>>>> prototyping
>>>>>>>>>>>>>>>>>>>> (e.g. multi-line comments for code generation). The
>>>>>>>> committers
>>>>>>>>>>>>>>>> enforced
>>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>> splitting the code-base into two programming languages.
>>>>>>>>>>>>>>>>>>>>> However, nowadays the flink-table module more and more
>>>>>>>> becomes
>>>>>>>>>>>> an
>>>>>>>>>>>>>>>>>>>> important part in the Flink ecosystem. Connectors,
>>>>> formats,
>>>>>>>> and
>>>>>>>>>>>>> SQL
>>>>>>>>>>>>>>>>>>> client
>>>>>>>>>>>>>>>>>>>> are actually implemented in Java but need to
>> interoperate
>>>>>>>> with
>>>>>>>>>>>>>>>>>>> flink-table
>>>>>>>>>>>>>>>>>>>> which makes these modules dependent on Scala. As
>>> mentioned
>>>>>> in
>>>>>>>>> an
>>>>>>>>>>>>>>>> earlier
>>>>>>>>>>>>>>>>>>>> mail thread, using Scala for API classes also exposes
>>>>> member
>>>>>>>>>>>>>> variables
>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>> methods in Java that should not be exposed to users
>> [1].
>>>>>> Java
>>>>>>>>> is
>>>>>>>>>>>>>> still
>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> most important API language and right now we treat it
>> as
>>> a
>>>>>>>>>>>>>>>> second-class
>>>>>>>>>>>>>>>>>>>> citizen. I just noticed that you even need to add Scala
>>> if
>>>>>>>> you
>>>>>>>>>>>>> just
>>>>>>>>>>>>>>>> want
>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>> implement a ScalarFunction because of method clashes
>>>>> between
>>>>>>>>>>>>> `public
>>>>>>>>>>>>>>>>>>> String
>>>>>>>>>>>>>>>>>>>> toString()` and `public scala.Predef.String
>> toString()`.
>>>>>>>>>>>>>>>>>>>>> Given the size of the current code base,
>> reimplementing
>>>>> the
>>>>>>>>>>>>> entire
>>>>>>>>>>>>>>>>>>>> flink-table code in Java is a goal that we might never
>>>>>> reach.
>>>>>>>>>>>>>>>> However, we
>>>>>>>>>>>>>>>>>>>> should at least treat the symptoms and have this as a
>>>>>>>> long-term
>>>>>>>>>>>>> goal
>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>> mind. My suggestion would be to convert user-facing and
>>>>>>>> runtime
>>>>>>>>>>>>>>>> classes
>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>> split the code base into multiple modules:
>>>>>>>>>>>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
>>>>>>>>>>>>>>>>>>>>> Implemented in Java. Java users can use this. This
>> would
>>>>>>>>>>>> require
>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>> convert classes like TableEnvironment, Table.
>>>>>>>>>>>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
>>>>>>>>>>>>>>>>>>>>> Implemented in Scala. Scala users can use this.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> flink-table-common
>>>>>>>>>>>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs can
>>>>> use
>>>>>>>>>>>> this.
>>>>>>>>>>>>> It
>>>>>>>>>>>>>>>>>>>> contains interface classes such as descriptors, table
>>>>> sink,
>>>>>>>>>>>> table
>>>>>>>>>>>>>>>> source.
>>>>>>>>>>>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
>>>>>>>>>>>>>>>>>>>> flink-table-runtime}
>>>>>>>>>>>>>>>>>>>>> Implemented in Scala. Contains the current main code
>>>>> base.
>>>>>>>>>>>>>>>>>>>>>> flink-table-runtime
>>>>>>>>>>>>>>>>>>>>> Implemented in Java. This would require to convert
>>>>> classes
>>>>>>>> in
>>>>>>>>>>>>>>>>>>>> o.a.f.table.runtime but would improve the runtime
>>>>>>>> potentially.
>>>>>>>>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
>>>>>>>>>>>>>>>>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
>>>>>>>>>>>>>>>> traits-tp21335.html
>>>


Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Jark Wu <im...@gmail.com>.
Hi Timo,

Thanks for the great work!

Moving flink-table to Java is a long-awaited change but will involve much
effort. I agree that we should make it a long-term goal.

I have read the Google doc and +1 for the proposal. Here are some
questions:

1. Where should the flink-table-common module be placed? Will we move the
flink-table-common classes to the new modules?
2. Should flink-sql-client also become a sub-module under flink-table?
3. The flink-table-planner contains the code generation and will be
converted to Java. Actually, I prefer using Scala for code generation
because of Scala's multi-line string and string-interpolation features
(e.g. s"hello $user"). They make the code-generation templates much more
readable. Do we really want to migrate the code generation to Java?
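
To make the readability argument concrete, here is a purely hypothetical 
side-by-side (neither snippet is taken from the Flink code base): the 
Scala-style template is shown in the comment, and a Java equivalent 
without multi-line strings follows.

```java
// Hypothetical example only, not from the Flink code base. In Scala, a
// code-generation template can be written as a single interpolated
// multi-line string:
//
//   val code =
//     s"""
//        |public String greet() {
//        |  return "hello $user";
//        |}
//      """.stripMargin
//
// A plain-Java equivalent has to fall back to String.format or
// concatenation, which is harder to read and to keep correctly indented:
public class GreeterTemplate {

    public static String generateGreet(String user) {
        return String.format(
                "public String greet() {%n"
                        + "  return \"hello %s\";%n"
                        + "}%n",
                user);
    }

    public static void main(String[] args) {
        System.out.println(generateGreet("Jark"));
    }
}
```

Keeping the quoting, the %n placeholders, and the indentation straight in 
the Java version is exactly what becomes tedious for larger generated 
classes.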

Best,
Jark


On Wed, 28 Nov 2018 at 09:14, Kurt Young <yk...@gmail.com> wrote:

> Hi Timo and Vino,
>
> I agree that table is very active and there is no guarantee for not
> producing any conflicts if you decide
> to develop based on community version. I think this part is the risk what
> we can imagine in the first place. But massively
> language replacing is something you can not imagine and be ready for, there
> is no feature added, no refactor is done, simply changing
> from scala to java will cause lots of conflicts.
>
> But I also agree that this is a "technical debt" that we should eventually
> pay, as you said, we can do this slowly, even one file each time,
> let other people have more time to resolve the conflicts.
>
> Best,
> Kurt
>
>
> On Tue, Nov 27, 2018 at 8:37 PM Timo Walther <tw...@apache.org> wrote:
>
> > Hi Kurt,
> >
> > I understand your concerns. However, there is no concrete roadmap for
> > Flink 2.0 and (as Vino said) the flink-table is developed very actively.
> > Major refactorings happened in the past and will also happen with or
> > without Scala migration. A good example, is the proper catalog support
> > which will refactor big parts of the TableEnvironment class. Or the
> > introduction of "retractions" which needed a big refactoring of the
> > planning phase. Stability is only guaranteed for the API and the general
> > behavior, however, currently flink-table is not using @Public or
> > @PublicEvolving annotations for a reason.
> >
> > I think the migration will still happen slowly because it needs people
> > that allocate time for that. Therefore, even Flink forks can slowly
> > adapt to the evolving Scala-to-Java code base.
> >
> > Regards,
> > Timo
> >
> >
> > Am 27.11.18 um 13:16 schrieb vino yang:
> > > Hi Kurt,
> > >
> > > Currently, there is still a long time to go from flink 2.0. Considering
> > > that the flink-table
> > > is one of the most active modules in the current flink project, each
> > > version has
> > > a number of changes and features added. I think that refactoring faster
> > > will reduce subsequent
> > > complexity and workload. And this may be a gradual and long process. We
> > > should be able to
> > >   regard it as a "technical debt", and if it does not change it, it
> will
> > > also affect the decision-making of other issues.
> > >
> > > Thanks, vino.
> > >
> > > Kurt Young <yk...@gmail.com> 于2018年11月27日周二 下午7:34写道:
> > >
> > >> Hi Timo,
> > >>
> > >> Thanks for writing up the document. I'm +1 for reorganizing the module
> > >> structure and make table scala free. But I have
> > >> a little concern abount the timing. Is it more appropriate to get this
> > done
> > >> when Flink decide to bump to next big version, like 2.x.
> > >> It's true you can keep all the class's package path as it is, and will
> > not
> > >> introduce API change. But if some company are developing their own
> > >> Flink, and sync with community version by rebasing, may face a lot of
> > >> conflicts. Although you can avoid conflicts by always moving source
> > codes
> > >> between packages, but I assume you still need to delete the original
> > scala
> > >> file and add a new java file when you want to change program language.
> > >>
> > >> Best,
> > >> Kurt
> > >>
> > >>
> > >> On Tue, Nov 27, 2018 at 5:57 PM Timo Walther <tw...@apache.org>
> > wrote:
> > >>
> > >>> Hi Hequn,
> > >>>
> > >>> thanks for your feedback. Yes, migrating the test cases is another
> > issue
> > >>> that is not represented in the document but should naturally go along
> > >>> with the migration.
> > >>>
> > >>> I agree that we should migrate the main API classes quickly within
> this
> > >>> 1.8 release after the module split has been performed. Help here is
> > >>> highly appreciated!
> > >>>
> > >>> I forgot that Java supports static methods in interfaces now, but
> > >>> actually I don't like the design of calling
> > `TableEnvironment.get(env)`.
> > >>> Because people often use `TableEnvironment tEnd =
> > >>> TableEnvironment.get(env)` and then wonder why there is no
> > >>> `toAppendStream` or `toDataSet` because they are using the base
> class.
> > >>> However, things like that can be discussed in the corresponding issue
> > >>> when it comes to implementation.
> > >>>
> > >>> @Vino: I think your work fits nicely to these efforts.
> > >>>
> > >>> @everyone: I will wait for more feedback until end of this week.
> Then I
> > >>> will convert the design document into a FLIP and open subtasks in
> Jira,
> > >>> if there are no objections?
> > >>>
> > >>> Regards,
> > >>> Timo
> > >>>
> > >>> Am 24.11.18 um 13:45 schrieb vino yang:
> > >>>> Hi hequn,
> > >>>>
> > >>>> I am very glad to hear that you are interested in this work.
> > >>>> As we all know, this process involves a lot.
> > >>>> Currently, the migration work has begun. I started with the
> > >>>> Kafka connector's dependency on flink-table and moved the
> > >>>> related dependencies to flink-table-common.
> > >>>> This work is tracked by FLINK-9461.  [1]
> > >>>> I don't know if it will conflict with what you expect to do, but
> from
> > >> the
> > >>>> impact I have observed,
> > >>>> it will involve many classes that are currently in flink-table.
> > >>>>
> > >>>> *Just a statement to prevent unnecessary conflicts.*
> > >>>>
> > >>>> Thanks, vino.
> > >>>>
> > >>>> [1]: https://issues.apache.org/jira/browse/FLINK-9461
> > >>>>
> > >>>> Hequn Cheng <ch...@gmail.com> 于2018年11月24日周六 下午7:20写道:
> > >>>>
> > >>>>> Hi Timo,
> > >>>>>
> > >>>>> Thanks for the effort and writing up this document. I like the idea
> > to
> > >>> make
> > >>>>> flink-table scala free, so +1 for the proposal!
> > >>>>>
> > >>>>> It's good to make Java the first-class citizen. For a long time, we
> > >> have
> > >>>>> neglected java so that many features in Table are missed in Java
> Test
> > >>>>> cases, such as this one[1] I found recently. And I think we may
> also
> > >>> need
> > >>>>> to migrate our test cases, i.e, add java tests.
> > >>>>>
> > >>>>> This definitely is a big change and will break API compatible. In
> > >> order
> > >>> to
> > >>>>> bring a smaller impact on users, I think we should go fast when we
> > >>> migrate
> > >>>>> APIs targeted to users. It's better to introduce the user sensitive
> > >>> changes
> > >>>>> within a release. However, it may be not that easy. I can help to
> > >>>>> contribute.
> > >>>>>
> > >>>>> Separation of interface and implementation is a good idea. This may
> > >>>>> introduce a minimum of dependencies or even no dependencies. I saw
> > >> your
> > >>>>> reply in the google doc. Java8 has already supported static method
> > for
> > >>>>> interfaces, I think we can make use of it?
> > >>>>>
> > >>>>> Best,
> > >>>>> Hequn
> > >>>>>
> > >>>>> [1] https://issues.apache.org/jira/browse/FLINK-11001
> > >>>>>
> > >>>>>
> > >>>>> On Fri, Nov 23, 2018 at 5:36 PM Timo Walther <tw...@apache.org>
> > >>> wrote:
> > >>>>>> Hi everyone,
> > >>>>>>
> > >>>>>> thanks for the great feedback so far. I updated the document with
> > the
> > >>>>>> input I got so far
> > >>>>>>
> > >>>>>> @Fabian: I moved the porting of flink-table-runtime classes up in
> > the
> > >>>>> list.
> > >>>>>> @Xiaowei: Could you elaborate what "interface only" means to you?
> Do
> > >>> you
> > >>>>>> mean a module containing pure Java `interface`s? Or is the
> > validation
> > >>>>>> logic also part of the API module? Are 50+ expression classes part
> > of
> > >>>>>> the API interface or already too implementation-specific?
> > >>>>>>
> > >>>>>> @Xuefu: I extended the document by almost a page to clarify when
> we
> > >>>>>> should develop in Scala and when in Java. As Piotr said, every new
> > >>> Scala
> > >>>>>> line is instant technical debt.
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Timo
> > >>>>>>
> > >>>>>>
> > >>>>>> Am 23.11.18 um 10:29 schrieb Piotr Nowojski:
> > >>>>>>> Hi Timo,
> > >>>>>>>
> > >>>>>>> Thanks for writing this down +1 from my side :)
> > >>>>>>>
> > >>>>>>>> I'm wondering that whether we can have rule in the interim when
> > >> Java
> > >>>>>> and Scala coexist that dependency can only be one-way. I found
> that
> > >> in
> > >>>>> the
> > >>>>>> current code base there are cases where a Scala class extends Java
> > >> and
> > >>>>> vise
> > >>>>>> versa. This is quite painful. I'm thinking if we could say that
> > >>> extension
> > >>>>>> can only be from Java to Scala, which will help the situation.
> > >> However,
> > >>>>> I'm
> > >>>>>> not sure if this is practical.
> > >>>>>>> Xuefu: I’m also not sure what’s the best approach here, probably
> we
> > >>>>> will
> > >>>>>> have to work it out as we go. One thing to consider is that from
> now
> > >>> on,
> > >>>>>> every single new code line written in Scala anywhere in
> Flink-table
> > >>>>> (except
> > >>>>>> of Flink-table-api-scala) is an instant technological debt. From
> > this
> > >>>>>> perspective I would be in favour of tolerating quite big
> > >> inchonvieneces
> > >>>>>> just to avoid any new Scala code.
> > >>>>>>> Piotrek
> > >>>>>>>
> > >>>>>>>> On 23 Nov 2018, at 03:25, Zhang, Xuefu <xuefu.z@alibaba-inc.com
> >
> > >>>>> wrote:
> > >>>>>>>> Hi Timo,
> > >>>>>>>>
> > >>>>>>>> Thanks for the effort and the Google writeup. During our
> external
> > >>>>>> catalog rework, we found much confusion between Java and Scala,
> and
> > >>> this
> > >>>>>> Scala-free roadmap should greatly mitigate that.
> > >>>>>>>> I'm wondering that whether we can have rule in the interim when
> > >> Java
> > >>>>>> and Scala coexist that dependency can only be one-way. I found
> that
> > >> in
> > >>>>> the
> > >>>>>> current code base there are cases where a Scala class extends Java
> > >> and
> > >>>>> vise
> > >>>>>> versa. This is quite painful. I'm thinking if we could say that
> > >>> extension
> > >>>>>> can only be from Java to Scala, which will help the situation.
> > >> However,
> > >>>>> I'm
> > >>>>>> not sure if this is practical.
> > >>>>>>>> Thanks,
> > >>>>>>>> Xuefu
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> ------------------------------------------------------------------
> > >>>>>>>> Sender:jincheng sun <su...@gmail.com>
> > >>>>>>>> Sent at:2018 Nov 23 (Fri) 09:49
> > >>>>>>>> Recipient:dev <de...@flink.apache.org>
> > >>>>>>>> Subject:Re: [DISCUSS] Long-term goal of making flink-table
> > >> Scala-free
> > >>>>>>>> Hi Timo,
> > >>>>>>>> Thanks for initiating this great discussion.
> > >>>>>>>>
> > >>>>>>>> Currently when using SQL/TableAPI should include many
> dependence.
> > >> In
> > >>>>>>>> particular, it is not necessary to introduce the specific
> > >>>>> implementation
> > >>>>>>>> dependencies which users do not care about. So I am glad to see
> > >> your
> > >>>>>>>> proposal, and hope when we consider splitting the API interface
> > >> into
> > >>> a
> > >>>>>>>> separate module, so that the user can introduce minimum of
> > >>>>> dependencies.
> > >>>>>>>> So, +1 to [separation of interface and implementation; e.g.
> > >> `Table` &
> > >>>>>>>> `TableImpl`] which you mentioned in the google doc.
> > >>>>>>>> Best,
> > >>>>>>>> Jincheng
> > >>>>>>>>
> > >>>>>>>> Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:
> > >>>>>>>>
> > >>>>>>>>> Hi Timo, thanks for driving this! I think that this is a nice
> > >> thing
> > >>>>> to
> > >>>>>> do.
> > >>>>>>>>> While we are doing this, can we also keep in mind that we want
> to
> > >>>>>>>>> eventually have a TableAPI interface only module which users
> can
> > >>> take
> > >>>>>>>>> dependency on, but without including any implementation
> details?
> > >>>>>>>>>
> > >>>>>>>>> Xiaowei
> > >>>>>>>>>
> > >>>>>>>>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <
> fhueske@gmail.com
> > >
> > >>>>>> wrote:
> > >>>>>>>>>> Hi Timo,
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks for writing up this document.
> > >>>>>>>>>> I like the new structure and agree to prioritize the porting
> of
> > >> the
> > >>>>>>>>>> flink-table-common classes.
> > >>>>>>>>>> Since flink-table-runtime is (or should be) independent of the
> > >> API
> > >>>>> and
> > >>>>>>>>>> planner modules, we could start porting these classes once the
> > >> code
> > >>>>> is
> > >>>>>>>>>> split into the new module structure.
> > >>>>>>>>>> The benefits of a Scala-free flink-table-runtime would be a
> > >>>>> Scala-free
> > >>>>>>>>>> execution Jar.
> > >>>>>>>>>>
> > >>>>>>>>>> Best, Fabian
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
> > >>>>>>>>>> twalthr@apache.org
> > >>>>>>>>>>> :
> > >>>>>>>>>>> Hi everyone,
> > >>>>>>>>>>>
> > >>>>>>>>>>> I would like to continue this discussion thread and convert
> the
> > >>>>>> outcome
> > >>>>>>>>>>> into a FLIP such that users and contributors know what to
> > expect
> > >>> in
> > >>>>>> the
> > >>>>>>>>>>> upcoming releases.
> > >>>>>>>>>>>
> > >>>>>>>>>>> I created a design document [1] that clarifies our motivation
> > >> why
> > >>>>> we
> > >>>>>>>>>>> want to do this, how a Maven module structure could look
> like,
> > >> and
> > >>>>> a
> > >>>>>>>>>>> suggestion for a migration plan.
> > >>>>>>>>>>>
> > >>>>>>>>>>> It would be great to start with the efforts for the 1.8
> release
> > >>>>> such
> > >>>>>>>>>>> that new features can be developed in Java and major
> > >> refactorings
> > >>>>>> such
> > >>>>>>>>>>> as improvements to the connectors and external catalog
> support
> > >> are
> > >>>>>> not
> > >>>>>>>>>>> blocked.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Please let me know what you think.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Regards,
> > >>>>>>>>>>> Timo
> > >>>>>>>>>>>
> > >>>>>>>>>>> [1]
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>
> >
> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
> > >>>>>>>>>>> Am 02.07.18 um 17:08 schrieb Fabian Hueske:
> > >>>>>>>>>>>> Hi Piotr,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> thanks for bumping this thread and thanks for Xingcan for
> the
> > >>>>>>>>> comments.
> > >>>>>>>>>>>> I think the first step would be to separate the flink-table
> > >>> module
> > >>>>>>>>> into
> > >>>>>>>>>>>> multiple sub modules. These could be:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> - flink-table-api: All API facing classes. Can be later
> > divided
> > >>>>>>>>> further
> > >>>>>>>>>>>> into Java/Scala Table API/SQL
> > >>>>>>>>>>>> - flink-table-planning: involves all planning (basically
> > >>>>> everything
> > >>>>>>>>> we
> > >>>>>>>>>> do
> > >>>>>>>>>>>> with Calcite)
> > >>>>>>>>>>>> - flink-table-runtime: the runtime code
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> IMO, a realistic mid-term goal is to have the runtime module
> > >> and
> > >>>>>>>>>> certain
> > >>>>>>>>>>>> parts of the planning module ported to Java.
> > >>>>>>>>>>>> The api module will be much harder to port because of
> several
> > >>>>>>>>>>> dependencies
> > >>>>>>>>>>>> to Scala core classes (the parser framework, tree
> iterations,
> > >>>>> etc.).
> > >>>>>>>>>> I'm
> > >>>>>>>>>>>> not saying we should not port this to Java, but it is not
> > clear
> > >>> to
> > >>>>>> me
> > >>>>>>>>>>> (yet)
> > >>>>>>>>>>>> how to do it.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I think flink-table-runtime should not be too hard to port.
> > The
> > >>>>> code
> > >>>>>>>>>> does
> > >>>>>>>>>>>> not make use of many Scala features, i.e., it's writing very
> > >>>>>>>>> Java-like.
> > >>>>>>>>>>>> Also, there are not many dependencies and operators can be
> > >>>>>>>>> individually
> > >>>>>>>>>>>> ported step-by-step.
> > >>>>>>>>>>>> For flink-table-planning, we can have certain packages that
> we
> > >>>>> port
> > >>>>>>>>> to
> > >>>>>>>>>>> Java
> > >>>>>>>>>>>> like planning rules or plan nodes. The related classes
> mostly
> > >>>>> extend
> > >>>>>>>>>>>> Calcite's Java interfaces/classes and would be natural
> choices
> > >>> for
> > >>>>>>>>>> being
> > >>>>>>>>>>>> ported. The code generation classes will require more effort
> > to
> > >>>>>> port.
> > >>>>>>>>>>> There
> > >>>>>>>>>>>> are also some dependencies in planning on the api module
> that
> > >> we
> > >>>>>>>>> would
> > >>>>>>>>>>> need
> > >>>>>>>>>>>> to resolve somehow.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> For SQL most work when adding new features is done in the
> > >>> planning
> > >>>>>>>>> and
> > >>>>>>>>>>>> runtime modules. So, this separation should already reduce
> > >>>>>>>>>> "technological
> > >>>>>>>>>>>> dept" quite a lot.
> > >>>>>>>>>>>> The Table API depends much more on Scala than SQL.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Cheers, Fabian
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xingcanc@gmail.com
> >:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Hi all,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I also think about this problem these days and here are my
> > >>>>>> thoughts.
> > >>>>>>>>>>>>> 1) We must admit that it’s really a tough task to
> > interoperate
> > >>>>> with
> > >>>>>>>>>> Java
> > >>>>>>>>>>>>> and Scala. E.g., they have different collection types
> (Scala
> > >>>>>>>>>> collections
> > >>>>>>>>>>>>> v.s. java.util.*) and in Java, it's hard to implement a
> > method
> > >>>>>> which
> > >>>>>>>>>>> takes
> > >>>>>>>>>>>>> Scala functions as parameters. Considering the major part
> of
> > >> the
> > >>>>>>>>> code
> > >>>>>>>>>>> base
> > >>>>>>>>>>>>> is implemented in Java, +1 for this goal from a long-term
> > >> view.
> > >>>>>>>>>>>>> 2) The ideal solution would be to just expose a Scala API
> and
> > >>>>> make
> > >>>>>>>>> all
> > >>>>>>>>>>> the
> > >>>>>>>>>>>>> other parts Scala-free. But I am not sure if it could be
> > >>> achieved
> > >>>>>>>>> even
> > >>>>>>>>>>> in a
> > >>>>>>>>>>>>> long-term. Thus as Timo suggested, keep the Scala codes in
> > >>>>>>>>>>>>> "flink-table-core" would be a compromise solution.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> 3) If the community makes the final decision, maybe any new
> > >>>>>> features
> > >>>>>>>>>>>>> should be added in Java (regardless of the modules), in
> order
> > >> to
> > >>>>>>>>>> prevent
> > >>>>>>>>>>>>> the Scala codes from growing.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>> Xingcan
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
> > >>>>>>>>> piotr@data-artisans.com>
> > >>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>> Bumping the topic.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> If we want to do this, the sooner we decide, the less code
> > we
> > >>>>> will
> > >>>>>>>>>> have
> > >>>>>>>>>>>>> to rewrite. I have some objections/counter proposals to
> > >> Fabian's
> > >>>>>>>>>>> proposal
> > >>>>>>>>>>>>> of doing it module wise and one module at a time.
> > >>>>>>>>>>>>>> First, I do not see a problem of having java/scala code
> even
> > >>>>>> within
> > >>>>>>>>>> one
> > >>>>>>>>>>>>> module, especially not if there are clean boundaries. Like
> we
> > >>>>> could
> > >>>>>>>>>> have
> > >>>>>>>>>>>>> API in Scala and optimizer rules/logical nodes written in
> > Java
> > >>> in
> > >>>>>>>>> the
> > >>>>>>>>>>> same
> > >>>>>>>>>>>>> module. However I haven’t previously maintained mixed
> > >> scala/java
> > >>>>>>>>> code
> > >>>>>>>>>>> bases
> > >>>>>>>>>>>>> before, so I might be missing something here.
> > >>>>>>>>>>>>>> Secondly this whole migration might and most like will
> take
> > >>>>> longer
> > >>>>>>>>>> then
> > >>>>>>>>>>>>> expected, so that creates a problem for a new code that we
> > >> will
> > >>>>> be
> > >>>>>>>>>>>>> creating. After making a decision to migrate to Java,
> almost
> > >> any
> > >>>>>> new
> > >>>>>>>>>>> Scala
> > >>>>>>>>>>>>> line of code will be immediately a technological debt and
> we
> > >>> will
> > >>>>>>>>> have
> > >>>>>>>>>>> to
> > >>>>>>>>>>>>> rewrite it to Java later.
> > >>>>>>>>>>>>>> Thus I would propose first to state our end goal - modules
> > >>>>>>>>> structure
> > >>>>>>>>>>> and
> > >>>>>>>>>>>>> which parts of modules we want to have eventually
> Scala-free.
> > >>>>>>>>> Secondly
> > >>>>>>>>>>>>> taking all steps necessary that will allow us to write new
> > >> code
> > >>>>>>>>>>> complaint
> > >>>>>>>>>>>>> with our end goal. Only after that we should/could focus on
> > >>>>>>>>>>> incrementally
> > >>>>>>>>>>>>> rewriting the old code. Otherwise we could be stuck/blocked
> > >> for
> > >>>>>>>>> years
> > >>>>>>>>>>>>> writing new code in Scala (and increasing technological
> > debt),
> > >>>>>>>>> because
> > >>>>>>>>>>>>> nobody have found a time to rewrite some non important and
> > not
> > >>>>>>>>>> actively
> > >>>>>>>>>>>>> developed part of some module.
> > >>>>>>>>>>>>>> Piotrek
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <
> fhueske@gmail.com
> > >
> > >>>>>>>>> wrote:
> > >>>>>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> In general, I think this is a good effort. However, it
> > won't
> > >>> be
> > >>>>>>>>> easy
> > >>>>>>>>>>>>> and I
> > >>>>>>>>>>>>>>> think we have to plan this well.
> > >>>>>>>>>>>>>>> I don't like the idea of having the whole code base
> > >> fragmented
> > >>>>>>>>> into
> > >>>>>>>>>>> Java
> > >>>>>>>>>>>>>>> and Scala code for too long.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I think we should do this one step at a time and focus on
> > >>>>>>>>> migrating
> > >>>>>>>>>>> one
> > >>>>>>>>>>>>>>> module at a time.
> > >>>>>>>>>>>>>>> IMO, the easiest start would be to port the runtime to
> > Java.
> > >>>>>>>>>>>>>>> Extracting the API classes into an own module, porting
> them
> > >> to
> > >>>>>>>>> Java,
> > >>>>>>>>>>> and
> > >>>>>>>>>>>>>>> removing the Scala dependency won't be possible without
> > >>>>> breaking
> > >>>>>>>>> the
> > >>>>>>>>>>> API
> > >>>>>>>>>>>>>>> since a few classes depend on the Scala Table API.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Best, Fabian
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <
> > >>> trohrmann@apache.org
> > >>>>>> :
> > >>>>>>>>>>>>>>>> I think that is a noble and honorable goal and we should
> > >>>>> strive
> > >>>>>>>>> for
> > >>>>>>>>>>> it.
> > >>>>>>>>>>>>>>>> This, however, must be an iterative process given the
> > sheer
> > >>>>> size
> > >>>>>>>>> of
> > >>>>>>>>>>> the
> > >>>>>>>>>>>>>>>> code base. I like the approach to define common Java
> > >> modules
> > >>>>>>>>> which
> > >>>>>>>>>>> are
> > >>>>>>>>>>>>> used
> > >>>>>>>>>>>>>>>> by more specific Scala modules and slowly moving classes
> > >> from
> > >>>>>>>>> Scala
> > >>>>>>>>>>> to
> > >>>>>>>>>>>>>>>> Java. Thus +1 for the proposal.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>>>>> Till
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
> > >>>>>>>>>>>>> piotr@data-artisans.com>
> > >>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> I do not have an experience with how scala and java
> > >>> interacts
> > >>>>>>>>> with
> > >>>>>>>>>>>>> each
> > >>>>>>>>>>>>>>>>> other, so I can not fully validate your proposal, but
> > >>>>> generally
> > >>>>>>>>>>>>> speaking
> > >>>>>>>>>>>>>>>> +1
> > >>>>>>>>>>>>>>>>> from me.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Does it also mean, that we should slowly migrate
> > >>>>>>>>>> `flink-table-core`
> > >>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>> Java? How would you envision it? It would be nice to be
> > >> able
> > >>>>> to
> > >>>>>>>>>> add
> > >>>>>>>>>>>>> new
> > >>>>>>>>>>>>>>>>> classes/features written in Java and so that they can
> > >>> coexist
> > >>>>>>>>> with
> > >>>>>>>>>>> old
> > >>>>>>>>>>>>>>>>> Scala code until we gradually switch from Scala to
> Java.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Piotrek
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <
> > >> twalthr@apache.org
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>> Hi everyone,
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> as you all know, currently the Table & SQL API is
> > >>>>> implemented
> > >>>>>>>>> in
> > >>>>>>>>>>>>> Scala.
> > >>>>>>>>>>>>>>>>> This decision was made a long-time ago when the
> initital
> > >>> code
> > >>>>>>>>> base
> > >>>>>>>>>>> was
> > >>>>>>>>>>>>>>>>> created as part of a master's thesis. The community
> kept
> > >>>>> Scala
> > >>>>>>>>>>>>> because of
> > >>>>>>>>>>>>>>>>> the nice language features that enable a fluent Table
> API
> > >>>>> like
> > >>>>>>>>>>>>>>>>> table.select('field.trim()) and because Scala allows
> for
> > >>>>> quick
> > >>>>>>>>>>>>>>>> prototyping
> > >>>>>>>>>>>>>>>>> (e.g. multi-line comments for code generation). The
> > >>>>> committers
> > >>>>>>>>>>>>> enforced
> > >>>>>>>>>>>>>>>> not
> > >>>>>>>>>>>>>>>>> splitting the code-base into two programming languages.
> > >>>>>>>>>>>>>>>>>> However, nowadays the flink-table module more and more
> > >>>>> becomes
> > >>>>>>>>> an
> > >>>>>>>>>>>>>>>>> important part in the Flink ecosystem. Connectors,
> > >> formats,
> > >>>>> and
> > >>>>>>>>>> SQL
> > >>>>>>>>>>>>>>>> client
> > >>>>>>>>>>>>>>>>> are actually implemented in Java but need to
> interoperate
> > >>>>> with
> > >>>>>>>>>>>>>>>> flink-table
> > >>>>>>>>>>>>>>>>> which makes these modules dependent on Scala. As
> > mentioned
> > >>> in
> > >>>>>> an
> > >>>>>>>>>>>>> earlier
> > >>>>>>>>>>>>>>>>> mail thread, using Scala for API classes also exposes
> > >> member
> > >>>>>>>>>>> variables
> > >>>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>> methods in Java that should not be exposed to users
> [1].
> > >>> Java
> > >>>>>> is
> > >>>>>>>>>>> still
> > >>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>> most important API language and right now we treat it
> as
> > a
> > >>>>>>>>>>>>> second-class
> > >>>>>>>>>>>>>>>>> citizen. I just noticed that you even need to add Scala
> > if
> > >>>>> you
> > >>>>>>>>>> just
> > >>>>>>>>>>>>> want
> > >>>>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>> implement a ScalarFunction because of method clashes
> > >> between
> > >>>>>>>>>> `public
> > >>>>>>>>>>>>>>>> String
> > >>>>>>>>>>>>>>>>> toString()` and `public scala.Predef.String
> toString()`.
> > >>>>>>>>>>>>>>>>>> Given the size of the current code base,
> reimplementing
> > >> the
> > >>>>>>>>>> entire
> > >>>>>>>>>>>>>>>>> flink-table code in Java is a goal that we might never
> > >>> reach.
> > >>>>>>>>>>>>> However, we
> > >>>>>>>>>>>>>>>>> should at least treat the symptoms and have this as a
> > >>>>> long-term
> > >>>>>>>>>> goal
> > >>>>>>>>>>>>> in
> > >>>>>>>>>>>>>>>>> mind. My suggestion would be to convert user-facing and
> > >>>>> runtime
> > >>>>>>>>>>>>> classes
> > >>>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>> split the code base into multiple modules:
> > >>>>>>>>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
> > >>>>>>>>>>>>>>>>>> Implemented in Java. Java users can use this. This
> would
> > >>>>>>>>> require
> > >>>>>>>>>> to
> > >>>>>>>>>>>>>>>>> convert classes like TableEnvironment, Table.
> > >>>>>>>>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
> > >>>>>>>>>>>>>>>>>> Implemented in Scala. Scala users can use this.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> flink-table-common
> > >>>>>>>>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs can
> > >> use
> > >>>>>>>>> this.
> > >>>>>>>>>> It
> > >>>>>>>>>>>>>>>>> contains interface classes such as descriptors, table
> > >> sink,
> > >>>>>>>>> table
> > >>>>>>>>>>>>> source.
> > >>>>>>>>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
> > >>>>>>>>>>>>>>>>> flink-table-runtime}
> > >>>>>>>>>>>>>>>>>> Implemented in Scala. Contains the current main code
> > >> base.
> > >>>>>>>>>>>>>>>>>>> flink-table-runtime
> > >>>>>>>>>>>>>>>>>> Implemented in Java. This would require to convert
> > >> classes
> > >>>>> in
> > >>>>>>>>>>>>>>>>> o.a.f.table.runtime but would improve the runtime
> > >>>>> potentially.
> > >>>>>>>>>>>>>>>>>> What do you think?
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Timo
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
> > >>>>>>>>>>>>>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
> > >>>>>>>>>>>>> traits-tp21335.html
> > >>>
> >
> >
>

Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Kurt Young <yk...@gmail.com>.
Hi Timo and Vino,

I agree that flink-table is very active and that there is no guarantee against
conflicts if you decide to develop based on the community version. That part of
the risk can at least be anticipated up front. But a massive language replacement
is something you cannot anticipate and prepare for: no feature is added and no
refactoring is done, yet simply changing from Scala to Java will cause lots of
conflicts.

But I also agree that this is a "technical debt" we should eventually pay. As you
said, we can do this slowly, even one file at a time, so that other people have
more time to resolve the conflicts.

Best,
Kurt
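
To make the rebasing concern above concrete, here is a minimal, made-up example
(the class name and fields are invented for illustration and are not an actual
flink-table class): a small Scala case class and its 1:1 Java port. No behavior
changes, yet not a single line of the original file survives, so a fork carrying
local patches on that file sees a full delete-and-add during a rebase.

    // Hypothetical Scala original, shown as a comment for comparison:
    //   case class ExampleStats(distinctValues: Long, nullCount: Long)
    //
    // The same class after a pure 1:1 port to Java. The behavior is identical,
    // but every line of the file changes.
    public final class ExampleStats {

        private final long distinctValues;  // number of distinct values
        private final long nullCount;       // number of null values

        public ExampleStats(long distinctValues, long nullCount) {
            this.distinctValues = distinctValues;
            this.nullCount = nullCount;
        }

        public long getDistinctValues() {
            return distinctValues;
        }

        public long getNullCount() {
            return nullCount;
        }
    }

Porting one file at a time, as suggested above, at least keeps each of these
conflicts small and reviewable.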


On Tue, Nov 27, 2018 at 8:37 PM Timo Walther <tw...@apache.org> wrote:

> Hi Kurt,
>
> I understand your concerns. However, there is no concrete roadmap for
> Flink 2.0 and (as Vino said) the flink-table is developed very actively.
> Major refactorings happened in the past and will also happen with or
> without Scala migration. A good example, is the proper catalog support
> which will refactor big parts of the TableEnvironment class. Or the
> introduction of "retractions" which needed a big refactoring of the
> planning phase. Stability is only guaranteed for the API and the general
> behavior, however, currently flink-table is not using @Public or
> @PublicEvolving annotations for a reason.
>
> I think the migration will still happen slowly because it needs people
> that allocate time for that. Therefore, even Flink forks can slowly
> adapt to the evolving Scala-to-Java code base.
>
> Regards,
> Timo
>
>
> Am 27.11.18 um 13:16 schrieb vino yang:
> > Hi Kurt,
> >
> > Currently, there is still a long time to go from flink 2.0. Considering
> > that the flink-table
> > is one of the most active modules in the current flink project, each
> > version has
> > a number of changes and features added. I think that refactoring faster
> > will reduce subsequent
> > complexity and workload. And this may be a gradual and long process. We
> > should be able to
> >   regard it as a "technical debt", and if it does not change it, it will
> > also affect the decision-making of other issues.
> >
> > Thanks, vino.
> >
> > Kurt Young <yk...@gmail.com> 于2018年11月27日周二 下午7:34写道:
> >
> >> Hi Timo,
> >>
> >> Thanks for writing up the document. I'm +1 for reorganizing the module
> >> structure and make table scala free. But I have
> >> a little concern abount the timing. Is it more appropriate to get this
> done
> >> when Flink decide to bump to next big version, like 2.x.
> >> It's true you can keep all the class's package path as it is, and will
> not
> >> introduce API change. But if some company are developing their own
> >> Flink, and sync with community version by rebasing, may face a lot of
> >> conflicts. Although you can avoid conflicts by always moving source
> codes
> >> between packages, but I assume you still need to delete the original
> scala
> >> file and add a new java file when you want to change program language.
> >>
> >> Best,
> >> Kurt
> >>
> >>
> >> On Tue, Nov 27, 2018 at 5:57 PM Timo Walther <tw...@apache.org>
> wrote:
> >>
> >>> Hi Hequn,
> >>>
> >>> thanks for your feedback. Yes, migrating the test cases is another
> issue
> >>> that is not represented in the document but should naturally go along
> >>> with the migration.
> >>>
> >>> I agree that we should migrate the main API classes quickly within this
> >>> 1.8 release after the module split has been performed. Help here is
> >>> highly appreciated!
> >>>
> >>> I forgot that Java supports static methods in interfaces now, but
> >>> actually I don't like the design of calling
> `TableEnvironment.get(env)`.
> >>> Because people often use `TableEnvironment tEnd =
> >>> TableEnvironment.get(env)` and then wonder why there is no
> >>> `toAppendStream` or `toDataSet` because they are using the base class.
> >>> However, things like that can be discussed in the corresponding issue
> >>> when it comes to implementation.
> >>>
> >>> @Vino: I think your work fits nicely to these efforts.
> >>>
> >>> @everyone: I will wait for more feedback until end of this week. Then I
> >>> will convert the design document into a FLIP and open subtasks in Jira,
> >>> if there are no objections?
> >>>
> >>> Regards,
> >>> Timo
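
As a side note on the `TableEnvironment.get(env)` concern in the quoted message
above, the following rough sketch (simplified, made-up interfaces, not the real
flink-table classes) shows why such a static factory on the base interface can
confuse users: Java 8 allows the static method, but a caller that only holds the
base type cannot see the stream-specific conversion methods.

    interface Table {
    }

    interface TableEnvironment {
        // Since Java 8, interfaces may declare static methods, so a factory
        // like this compiles -- but it can only promise the base type.
        static TableEnvironment get(Object executionEnvironment) {
            return new StreamTableEnvironment() { };
        }
    }

    interface StreamTableEnvironment extends TableEnvironment {
        // Conversions like toAppendStream exist only on the sub-interface.
        default <T> void toAppendStream(Table table, Class<T> targetType) {
        }
    }

    class Usage {
        void example() {
            TableEnvironment tEnv = TableEnvironment.get(new Object());
            // tEnv.toAppendStream(...) does not compile here: the variable is
            // typed as the base interface, which is exactly the confusion
            // described in the quoted message.
        }
    }
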
> >>>
> >>> Am 24.11.18 um 13:45 schrieb vino yang:
> >>>> Hi hequn,
> >>>>
> >>>> I am very glad to hear that you are interested in this work.
> >>>> As we all know, this process involves a lot.
> >>>> Currently, the migration work has begun. I started with the
> >>>> Kafka connector's dependency on flink-table and moved the
> >>>> related dependencies to flink-table-common.
> >>>> This work is tracked by FLINK-9461.  [1]
> >>>> I don't know if it will conflict with what you expect to do, but from
> >> the
> >>>> impact I have observed,
> >>>> it will involve many classes that are currently in flink-table.
> >>>>
> >>>> *Just a statement to prevent unnecessary conflicts.*
> >>>>
> >>>> Thanks, vino.
> >>>>
> >>>> [1]: https://issues.apache.org/jira/browse/FLINK-9461
> >>>>
> >>>> Hequn Cheng <ch...@gmail.com> 于2018年11月24日周六 下午7:20写道:
> >>>>
> >>>>> Hi Timo,
> >>>>>
> >>>>> Thanks for the effort and writing up this document. I like the idea
> to
> >>> make
> >>>>> flink-table scala free, so +1 for the proposal!
> >>>>>
> >>>>> It's good to make Java the first-class citizen. For a long time, we
> >> have
> >>>>> neglected java so that many features in Table are missed in Java Test
> >>>>> cases, such as this one[1] I found recently. And I think we may also
> >>> need
> >>>>> to migrate our test cases, i.e, add java tests.
> >>>>>
> >>>>> This definitely is a big change and will break API compatible. In
> >> order
> >>> to
> >>>>> bring a smaller impact on users, I think we should go fast when we
> >>> migrate
> >>>>> APIs targeted to users. It's better to introduce the user sensitive
> >>> changes
> >>>>> within a release. However, it may be not that easy. I can help to
> >>>>> contribute.
> >>>>>
> >>>>> Separation of interface and implementation is a good idea. This may
> >>>>> introduce a minimum of dependencies or even no dependencies. I saw
> >> your
> >>>>> reply in the google doc. Java8 has already supported static method
> for
> >>>>> interfaces, I think we can make use of it?
> >>>>>
> >>>>> Best,
> >>>>> Hequn
> >>>>>
> >>>>> [1] https://issues.apache.org/jira/browse/FLINK-11001
> >>>>>
> >>>>>
> >>>>> On Fri, Nov 23, 2018 at 5:36 PM Timo Walther <tw...@apache.org>
> >>> wrote:
> >>>>>> Hi everyone,
> >>>>>>
> >>>>>> thanks for the great feedback so far. I updated the document with
> the
> >>>>>> input I got so far
> >>>>>>
> >>>>>> @Fabian: I moved the porting of flink-table-runtime classes up in
> the
> >>>>> list.
> >>>>>> @Xiaowei: Could you elaborate what "interface only" means to you? Do
> >>> you
> >>>>>> mean a module containing pure Java `interface`s? Or is the
> validation
> >>>>>> logic also part of the API module? Are 50+ expression classes part
> of
> >>>>>> the API interface or already too implementation-specific?
> >>>>>>
> >>>>>> @Xuefu: I extended the document by almost a page to clarify when we
> >>>>>> should develop in Scala and when in Java. As Piotr said, every new
> >>> Scala
> >>>>>> line is instant technical debt.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Timo
> >>>>>>
> >>>>>>
> >>>>>> Am 23.11.18 um 10:29 schrieb Piotr Nowojski:
> >>>>>>> Hi Timo,
> >>>>>>>
> >>>>>>> Thanks for writing this down +1 from my side :)
> >>>>>>>
> >>>>>>>> I'm wondering that whether we can have rule in the interim when
> >> Java
> >>>>>> and Scala coexist that dependency can only be one-way. I found that
> >> in
> >>>>> the
> >>>>>> current code base there are cases where a Scala class extends Java
> >> and
> >>>>> vise
> >>>>>> versa. This is quite painful. I'm thinking if we could say that
> >>> extension
> >>>>>> can only be from Java to Scala, which will help the situation.
> >> However,
> >>>>> I'm
> >>>>>> not sure if this is practical.
> >>>>>>> Xuefu: I’m also not sure what’s the best approach here, probably we
> >>>>> will
> >>>>>> have to work it out as we go. One thing to consider is that from now
> >>> on,
> >>>>>> every single new code line written in Scala anywhere in Flink-table
> >>>>> (except
> >>>>>> of Flink-table-api-scala) is an instant technological debt. From
> this
> >>>>>> perspective I would be in favour of tolerating quite big
> >> inchonvieneces
> >>>>>> just to avoid any new Scala code.
> >>>>>>> Piotrek
> >>>>>>>
> >>>>>>>> On 23 Nov 2018, at 03:25, Zhang, Xuefu <xu...@alibaba-inc.com>
> >>>>> wrote:
> >>>>>>>> Hi Timo,
> >>>>>>>>
> >>>>>>>> Thanks for the effort and the Google writeup. During our external
> >>>>>> catalog rework, we found much confusion between Java and Scala, and
> >>> this
> >>>>>> Scala-free roadmap should greatly mitigate that.
> >>>>>>>> I'm wondering that whether we can have rule in the interim when
> >> Java
> >>>>>> and Scala coexist that dependency can only be one-way. I found that
> >> in
> >>>>> the
> >>>>>> current code base there are cases where a Scala class extends Java
> >> and
> >>>>> vise
> >>>>>> versa. This is quite painful. I'm thinking if we could say that
> >>> extension
> >>>>>> can only be from Java to Scala, which will help the situation.
> >> However,
> >>>>> I'm
> >>>>>> not sure if this is practical.
> >>>>>>>> Thanks,
> >>>>>>>> Xuefu
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> ------------------------------------------------------------------
> >>>>>>>> Sender:jincheng sun <su...@gmail.com>
> >>>>>>>> Sent at:2018 Nov 23 (Fri) 09:49
> >>>>>>>> Recipient:dev <de...@flink.apache.org>
> >>>>>>>> Subject:Re: [DISCUSS] Long-term goal of making flink-table
> >> Scala-free
> >>>>>>>> Hi Timo,
> >>>>>>>> Thanks for initiating this great discussion.
> >>>>>>>>
> >>>>>>>> Currently when using SQL/TableAPI should include many dependence.
> >> In
> >>>>>>>> particular, it is not necessary to introduce the specific
> >>>>> implementation
> >>>>>>>> dependencies which users do not care about. So I am glad to see
> >> your
> >>>>>>>> proposal, and hope when we consider splitting the API interface
> >> into
> >>> a
> >>>>>>>> separate module, so that the user can introduce minimum of
> >>>>> dependencies.
> >>>>>>>> So, +1 to [separation of interface and implementation; e.g.
> >> `Table` &
> >>>>>>>> `TableImpl`] which you mentioned in the google doc.
> >>>>>>>> Best,
> >>>>>>>> Jincheng
> >>>>>>>>
> >>>>>>>> Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:
> >>>>>>>>
> >>>>>>>>> Hi Timo, thanks for driving this! I think that this is a nice
> >> thing
> >>>>> to
> >>>>>> do.
> >>>>>>>>> While we are doing this, can we also keep in mind that we want to
> >>>>>>>>> eventually have a TableAPI interface only module which users can
> >>> take
> >>>>>>>>> dependency on, but without including any implementation details?
> >>>>>>>>>
> >>>>>>>>> Xiaowei
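
The separation of interface and implementation raised in the two quoted messages
above (e.g. `Table` & `TableImpl`, and an interface-only Table API module) could
look roughly like the following sketch. The names and methods are illustrative
only and are not the actual flink-table classes:

    // Would live in a lightweight, dependency-free API module that users
    // compile against.
    interface Table {
        Table select(String fields);
        Table filter(String predicate);
    }

    // Would live in the planner/implementation module and never be referenced
    // directly by user code.
    final class TableImpl implements Table {

        private final String logicalPlan;

        TableImpl(String logicalPlan) {
            this.logicalPlan = logicalPlan;
        }

        @Override
        public Table select(String fields) {
            return new TableImpl(logicalPlan + " -> select(" + fields + ")");
        }

        @Override
        public Table filter(String predicate) {
            return new TableImpl(logicalPlan + " -> filter(" + predicate + ")");
        }
    }
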
> >>>>>>>>>
> >>>>>>>>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <fhueske@gmail.com
> >
> >>>>>> wrote:
> >>>>>>>>>> Hi Timo,
> >>>>>>>>>>
> >>>>>>>>>> Thanks for writing up this document.
> >>>>>>>>>> I like the new structure and agree to prioritize the porting of
> >> the
> >>>>>>>>>> flink-table-common classes.
> >>>>>>>>>> Since flink-table-runtime is (or should be) independent of the
> >> API
> >>>>> and
> >>>>>>>>>> planner modules, we could start porting these classes once the
> >> code
> >>>>> is
> >>>>>>>>>> split into the new module structure.
> >>>>>>>>>> The benefits of a Scala-free flink-table-runtime would be a
> >>>>> Scala-free
> >>>>>>>>>> execution Jar.
> >>>>>>>>>>
> >>>>>>>>>> Best, Fabian
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
> >>>>>>>>>> twalthr@apache.org
> >>>>>>>>>>> :
> >>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>
> >>>>>>>>>>> I would like to continue this discussion thread and convert the
> >>>>>> outcome
> >>>>>>>>>>> into a FLIP such that users and contributors know what to
> expect
> >>> in
> >>>>>> the
> >>>>>>>>>>> upcoming releases.
> >>>>>>>>>>>
> >>>>>>>>>>> I created a design document [1] that clarifies our motivation
> >> why
> >>>>> we
> >>>>>>>>>>> want to do this, how a Maven module structure could look like,
> >> and
> >>>>> a
> >>>>>>>>>>> suggestion for a migration plan.
> >>>>>>>>>>>
> >>>>>>>>>>> It would be great to start with the efforts for the 1.8 release
> >>>>> such
> >>>>>>>>>>> that new features can be developed in Java and major
> >> refactorings
> >>>>>> such
> >>>>>>>>>>> as improvements to the connectors and external catalog support
> >> are
> >>>>>> not
> >>>>>>>>>>> blocked.
> >>>>>>>>>>>
> >>>>>>>>>>> Please let me know what you think.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Timo
> >>>>>>>>>>>
> >>>>>>>>>>> [1]
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>
> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
> >>>>>>>>>>> Am 02.07.18 um 17:08 schrieb Fabian Hueske:
> >>>>>>>>>>>> Hi Piotr,
> >>>>>>>>>>>>
> >>>>>>>>>>>> thanks for bumping this thread and thanks for Xingcan for the
> >>>>>>>>> comments.
> >>>>>>>>>>>> I think the first step would be to separate the flink-table
> >>> module
> >>>>>>>>> into
> >>>>>>>>>>>> multiple sub modules. These could be:
> >>>>>>>>>>>>
> >>>>>>>>>>>> - flink-table-api: All API facing classes. Can be later
> divided
> >>>>>>>>> further
> >>>>>>>>>>>> into Java/Scala Table API/SQL
> >>>>>>>>>>>> - flink-table-planning: involves all planning (basically
> >>>>> everything
> >>>>>>>>> we
> >>>>>>>>>> do
> >>>>>>>>>>>> with Calcite)
> >>>>>>>>>>>> - flink-table-runtime: the runtime code
> >>>>>>>>>>>>
> >>>>>>>>>>>> IMO, a realistic mid-term goal is to have the runtime module
> >> and
> >>>>>>>>>> certain
> >>>>>>>>>>>> parts of the planning module ported to Java.
> >>>>>>>>>>>> The api module will be much harder to port because of several
> >>>>>>>>>>> dependencies
> >>>>>>>>>>>> to Scala core classes (the parser framework, tree iterations,
> >>>>> etc.).
> >>>>>>>>>> I'm
> >>>>>>>>>>>> not saying we should not port this to Java, but it is not
> clear
> >>> to
> >>>>>> me
> >>>>>>>>>>> (yet)
> >>>>>>>>>>>> how to do it.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I think flink-table-runtime should not be too hard to port.
> The
> >>>>> code
> >>>>>>>>>> does
> >>>>>>>>>>>> not make use of many Scala features, i.e., it's writing very
> >>>>>>>>> Java-like.
> >>>>>>>>>>>> Also, there are not many dependencies and operators can be
> >>>>>>>>> individually
> >>>>>>>>>>>> ported step-by-step.
> >>>>>>>>>>>> For flink-table-planning, we can have certain packages that we
> >>>>> port
> >>>>>>>>> to
> >>>>>>>>>>> Java
> >>>>>>>>>>>> like planning rules or plan nodes. The related classes mostly
> >>>>> extend
> >>>>>>>>>>>> Calcite's Java interfaces/classes and would be natural choices
> >>> for
> >>>>>>>>>> being
> >>>>>>>>>>>> ported. The code generation classes will require more effort
> to
> >>>>>> port.
> >>>>>>>>>>> There
> >>>>>>>>>>>> are also some dependencies in planning on the api module that
> >> we
> >>>>>>>>> would
> >>>>>>>>>>> need
> >>>>>>>>>>>> to resolve somehow.
> >>>>>>>>>>>>
> >>>>>>>>>>>> For SQL most work when adding new features is done in the
> >>> planning
> >>>>>>>>> and
> >>>>>>>>>>>> runtime modules. So, this separation should already reduce
> >>>>>>>>>> "technological
> >>>>>>>>>>>> dept" quite a lot.
> >>>>>>>>>>>> The Table API depends much more on Scala than SQL.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Cheers, Fabian
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xi...@gmail.com>:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I also think about this problem these days and here are my
> >>>>>> thoughts.
> >>>>>>>>>>>>> 1) We must admit that it’s really a tough task to
> interoperate
> >>>>> with
> >>>>>>>>>> Java
> >>>>>>>>>>>>> and Scala. E.g., they have different collection types (Scala
> >>>>>>>>>> collections
> >>>>>>>>>>>>> v.s. java.util.*) and in Java, it's hard to implement a
> method
> >>>>>> which
> >>>>>>>>>>> takes
> >>>>>>>>>>>>> Scala functions as parameters. Considering the major part of
> >> the
> >>>>>>>>> code
> >>>>>>>>>>> base
> >>>>>>>>>>>>> is implemented in Java, +1 for this goal from a long-term
> >> view.
> >>>>>>>>>>>>> 2) The ideal solution would be to just expose a Scala API and
> >>>>> make
> >>>>>>>>> all
> >>>>>>>>>>> the
> >>>>>>>>>>>>> other parts Scala-free. But I am not sure if it could be
> >>> achieved
> >>>>>>>>> even
> >>>>>>>>>>> in a
> >>>>>>>>>>>>> long-term. Thus as Timo suggested, keep the Scala codes in
> >>>>>>>>>>>>> "flink-table-core" would be a compromise solution.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 3) If the community makes the final decision, maybe any new
> >>>>>> features
> >>>>>>>>>>>>> should be added in Java (regardless of the modules), in order
> >> to
> >>>>>>>>>> prevent
> >>>>>>>>>>>>> the Scala codes from growing.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Best,
> >>>>>>>>>>>>> Xingcan
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
> >>>>>>>>> piotr@data-artisans.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>> Bumping the topic.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> If we want to do this, the sooner we decide, the less code
> we
> >>>>> will
> >>>>>>>>>> have
> >>>>>>>>>>>>> to rewrite. I have some objections/counter proposals to
> >> Fabian's
> >>>>>>>>>>> proposal
> >>>>>>>>>>>>> of doing it module wise and one module at a time.
> >>>>>>>>>>>>>> First, I do not see a problem of having java/scala code even
> >>>>>> within
> >>>>>>>>>> one
> >>>>>>>>>>>>> module, especially not if there are clean boundaries. Like we
> >>>>> could
> >>>>>>>>>> have
> >>>>>>>>>>>>> API in Scala and optimizer rules/logical nodes written in
> Java
> >>> in
> >>>>>>>>> the
> >>>>>>>>>>> same
> >>>>>>>>>>>>> module. However I haven’t previously maintained mixed
> >> scala/java
> >>>>>>>>> code
> >>>>>>>>>>> bases
> >>>>>>>>>>>>> before, so I might be missing something here.
> >>>>>>>>>>>>>> Secondly this whole migration might and most like will take
> >>>>> longer
> >>>>>>>>>> then
> >>>>>>>>>>>>> expected, so that creates a problem for a new code that we
> >> will
> >>>>> be
> >>>>>>>>>>>>> creating. After making a decision to migrate to Java, almost
> >> any
> >>>>>> new
> >>>>>>>>>>> Scala
> >>>>>>>>>>>>> line of code will be immediately a technological debt and we
> >>> will
> >>>>>>>>> have
> >>>>>>>>>>> to
> >>>>>>>>>>>>> rewrite it to Java later.
> >>>>>>>>>>>>>> Thus I would propose first to state our end goal - modules
> >>>>>>>>> structure
> >>>>>>>>>>> and
> >>>>>>>>>>>>> which parts of modules we want to have eventually Scala-free.
> >>>>>>>>> Secondly
> >>>>>>>>>>>>> taking all steps necessary that will allow us to write new
> >> code
> >>>>>>>>>>> complaint
> >>>>>>>>>>>>> with our end goal. Only after that we should/could focus on
> >>>>>>>>>>> incrementally
> >>>>>>>>>>>>> rewriting the old code. Otherwise we could be stuck/blocked
> >> for
> >>>>>>>>> years
> >>>>>>>>>>>>> writing new code in Scala (and increasing technological
> debt),
> >>>>>>>>> because
> >>>>>>>>>>>>> nobody have found a time to rewrite some non important and
> not
> >>>>>>>>>> actively
> >>>>>>>>>>>>> developed part of some module.
> >>>>>>>>>>>>>> Piotrek
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fhueske@gmail.com
> >
> >>>>>>>>> wrote:
> >>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> In general, I think this is a good effort. However, it
> won't
> >>> be
> >>>>>>>>> easy
> >>>>>>>>>>>>> and I
> >>>>>>>>>>>>>>> think we have to plan this well.
> >>>>>>>>>>>>>>> I don't like the idea of having the whole code base
> >> fragmented
> >>>>>>>>> into
> >>>>>>>>>>> Java
> >>>>>>>>>>>>>>> and Scala code for too long.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I think we should do this one step at a time and focus on
> >>>>>>>>> migrating
> >>>>>>>>>>> one
> >>>>>>>>>>>>>>> module at a time.
> >>>>>>>>>>>>>>> IMO, the easiest start would be to port the runtime to
> Java.
> >>>>>>>>>>>>>>> Extracting the API classes into an own module, porting them
> >> to
> >>>>>>>>> Java,
> >>>>>>>>>>> and
> >>>>>>>>>>>>>>> removing the Scala dependency won't be possible without
> >>>>> breaking
> >>>>>>>>> the
> >>>>>>>>>>> API
> >>>>>>>>>>>>>>> since a few classes depend on the Scala Table API.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best, Fabian
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <
> >>> trohrmann@apache.org
> >>>>>> :
> >>>>>>>>>>>>>>>> I think that is a noble and honorable goal and we should
> >>>>> strive
> >>>>>>>>> for
> >>>>>>>>>>> it.
> >>>>>>>>>>>>>>>> This, however, must be an iterative process given the
> sheer
> >>>>> size
> >>>>>>>>> of
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>>> code base. I like the approach to define common Java
> >> modules
> >>>>>>>>> which
> >>>>>>>>>>> are
> >>>>>>>>>>>>> used
> >>>>>>>>>>>>>>>> by more specific Scala modules and slowly moving classes
> >> from
> >>>>>>>>> Scala
> >>>>>>>>>>> to
> >>>>>>>>>>>>>>>> Java. Thus +1 for the proposal.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>> Till
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
> >>>>>>>>>>>>> piotr@data-artisans.com>
> >>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I do not have an experience with how scala and java
> >>> interacts
> >>>>>>>>> with
> >>>>>>>>>>>>> each
> >>>>>>>>>>>>>>>>> other, so I can not fully validate your proposal, but
> >>>>> generally
> >>>>>>>>>>>>> speaking
> >>>>>>>>>>>>>>>> +1
> >>>>>>>>>>>>>>>>> from me.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Does it also mean, that we should slowly migrate
> >>>>>>>>>> `flink-table-core`
> >>>>>>>>>>> to
> >>>>>>>>>>>>>>>>> Java? How would you envision it? It would be nice to be
> >> able
> >>>>> to
> >>>>>>>>>> add
> >>>>>>>>>>>>> new
> >>>>>>>>>>>>>>>>> classes/features written in Java and so that they can
> >>> coexist
> >>>>>>>>> with
> >>>>>>>>>>> old
> >>>>>>>>>>>>>>>>> Scala code until we gradually switch from Scala to Java.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Piotrek
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <
> >> twalthr@apache.org
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> as you all know, currently the Table & SQL API is
> >>>>> implemented
> >>>>>>>>> in
> >>>>>>>>>>>>> Scala.
> >>>>>>>>>>>>>>>>> This decision was made a long-time ago when the initital
> >>> code
> >>>>>>>>> base
> >>>>>>>>>>> was
> >>>>>>>>>>>>>>>>> created as part of a master's thesis. The community kept
> >>>>> Scala
> >>>>>>>>>>>>> because of
> >>>>>>>>>>>>>>>>> the nice language features that enable a fluent Table API
> >>>>> like
> >>>>>>>>>>>>>>>>> table.select('field.trim()) and because Scala allows for
> >>>>> quick
> >>>>>>>>>>>>>>>> prototyping
> >>>>>>>>>>>>>>>>> (e.g. multi-line comments for code generation). The
> >>>>> committers
> >>>>>>>>>>>>> enforced
> >>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>> splitting the code-base into two programming languages.
> >>>>>>>>>>>>>>>>>> However, nowadays the flink-table module more and more
> >>>>> becomes
> >>>>>>>>> an
> >>>>>>>>>>>>>>>>> important part in the Flink ecosystem. Connectors,
> >> formats,
> >>>>> and
> >>>>>>>>>> SQL
> >>>>>>>>>>>>>>>> client
> >>>>>>>>>>>>>>>>> are actually implemented in Java but need to interoperate
> >>>>> with
> >>>>>>>>>>>>>>>> flink-table
> >>>>>>>>>>>>>>>>> which makes these modules dependent on Scala. As
> mentioned
> >>> in
> >>>>>> an
> >>>>>>>>>>>>> earlier
> >>>>>>>>>>>>>>>>> mail thread, using Scala for API classes also exposes
> >> member
> >>>>>>>>>>> variables
> >>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>> methods in Java that should not be exposed to users [1].
> >>> Java
> >>>>>> is
> >>>>>>>>>>> still
> >>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> most important API language and right now we treat it as
> a
> >>>>>>>>>>>>> second-class
> >>>>>>>>>>>>>>>>> citizen. I just noticed that you even need to add Scala
> if
> >>>>> you
> >>>>>>>>>> just
> >>>>>>>>>>>>> want
> >>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>> implement a ScalarFunction because of method clashes
> >> between
> >>>>>>>>>> `public
> >>>>>>>>>>>>>>>> String
> >>>>>>>>>>>>>>>>> toString()` and `public scala.Predef.String toString()`.
> >>>>>>>>>>>>>>>>>> Given the size of the current code base, reimplementing
> >> the
> >>>>>>>>>> entire
> >>>>>>>>>>>>>>>>> flink-table code in Java is a goal that we might never
> >>> reach.
> >>>>>>>>>>>>> However, we
> >>>>>>>>>>>>>>>>> should at least treat the symptoms and have this as a
> >>>>> long-term
> >>>>>>>>>> goal
> >>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>> mind. My suggestion would be to convert user-facing and
> >>>>> runtime
> >>>>>>>>>>>>> classes
> >>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>> split the code base into multiple modules:
> >>>>>>>>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
> >>>>>>>>>>>>>>>>>> Implemented in Java. Java users can use this. This would
> >>>>>>>>> require
> >>>>>>>>>> to
> >>>>>>>>>>>>>>>>> convert classes like TableEnvironment, Table.
> >>>>>>>>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
> >>>>>>>>>>>>>>>>>> Implemented in Scala. Scala users can use this.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> flink-table-common
> >>>>>>>>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs can
> >> use
> >>>>>>>>> this.
> >>>>>>>>>> It
> >>>>>>>>>>>>>>>>> contains interface classes such as descriptors, table
> >> sink,
> >>>>>>>>> table
> >>>>>>>>>>>>> source.
> >>>>>>>>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
> >>>>>>>>>>>>>>>>> flink-table-runtime}
> >>>>>>>>>>>>>>>>>> Implemented in Scala. Contains the current main code
> >> base.
> >>>>>>>>>>>>>>>>>>> flink-table-runtime
> >>>>>>>>>>>>>>>>>> Implemented in Java. This would require to convert
> >> classes
> >>>>> in
> >>>>>>>>>>>>>>>>> o.a.f.table.runtime but would improve the runtime
> >>>>> potentially.
> >>>>>>>>>>>>>>>>>> What do you think?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Timo
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
> >>>>>>>>>>>>>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
> >>>>>>>>>>>>> traits-tp21335.html
> >>>
>
>

Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Timo Walther <tw...@apache.org>.
Hi Kurt,

I understand your concerns. However, there is no concrete roadmap for
Flink 2.0 and (as Vino said) flink-table is developed very actively.
Major refactorings happened in the past and will also happen with or
without the Scala migration. A good example is the proper catalog support,
which will refactor big parts of the TableEnvironment class, or the
introduction of "retractions", which needed a big refactoring of the
planning phase. Stability is only guaranteed for the API and the general
behavior; however, flink-table currently does not use @Public or
@PublicEvolving annotations, for a reason.

I think the migration will still happen slowly because it needs people
who allocate time for it. Therefore, even Flink forks can slowly adapt
to the evolving Scala-to-Java code base.

Regards,
Timo
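
Regarding the @Public / @PublicEvolving annotations mentioned above: these come
from Flink's flink-annotations module and document the stability guarantee of a
class. A minimal sketch of how a ported flink-table class could be marked (the
class and method below are invented for illustration, not the actual API):

    import org.apache.flink.annotation.PublicEvolving;

    /**
     * Marking a class with PublicEvolving documents that it is user-facing but
     * may still change between minor releases. As noted above, flink-table
     * deliberately does not carry such annotations yet, so no formal stability
     * guarantee is given beyond the API's general behavior.
     */
    @PublicEvolving
    public class ExampleTableApiClass {

        /** Illustrative method only. */
        public void doSomething() {
        }
    }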


Am 27.11.18 um 13:16 schrieb vino yang:
> Hi Kurt,
>
> Currently, there is still a long time to go from flink 2.0. Considering
> that the flink-table
> is one of the most active modules in the current flink project, each
> version has
> a number of changes and features added. I think that refactoring faster
> will reduce subsequent
> complexity and workload. And this may be a gradual and long process. We
> should be able to
>   regard it as a "technical debt", and if it does not change it, it will
> also affect the decision-making of other issues.
>
> Thanks, vino.
>
> Kurt Young <yk...@gmail.com> 于2018年11月27日周二 下午7:34写道:
>
>> Hi Timo,
>>
>> Thanks for writing up the document. I'm +1 for reorganizing the module
>> structure and make table scala free. But I have
>> a little concern abount the timing. Is it more appropriate to get this done
>> when Flink decide to bump to next big version, like 2.x.
>> It's true you can keep all the class's package path as it is, and will not
>> introduce API change. But if some company are developing their own
>> Flink, and sync with community version by rebasing, may face a lot of
>> conflicts. Although you can avoid conflicts by always moving source codes
>> between packages, but I assume you still need to delete the original scala
>> file and add a new java file when you want to change program language.
>>
>> Best,
>> Kurt
>>
>>
>> On Tue, Nov 27, 2018 at 5:57 PM Timo Walther <tw...@apache.org> wrote:
>>
>>> Hi Hequn,
>>>
>>> thanks for your feedback. Yes, migrating the test cases is another issue
>>> that is not represented in the document but should naturally go along
>>> with the migration.
>>>
>>> I agree that we should migrate the main API classes quickly within this
>>> 1.8 release after the module split has been performed. Help here is
>>> highly appreciated!
>>>
>>> I forgot that Java supports static methods in interfaces now, but
>>> actually I don't like the design of calling `TableEnvironment.get(env)`.
>>> Because people often use `TableEnvironment tEnd =
>>> TableEnvironment.get(env)` and then wonder why there is no
>>> `toAppendStream` or `toDataSet` because they are using the base class.
>>> However, things like that can be discussed in the corresponding issue
>>> when it comes to implementation.
>>>
>>> @Vino: I think your work fits nicely to these efforts.
>>>
>>> @everyone: I will wait for more feedback until end of this week. Then I
>>> will convert the design document into a FLIP and open subtasks in Jira,
>>> if there are no objections?
>>>
>>> Regards,
>>> Timo
>>>
>>> Am 24.11.18 um 13:45 schrieb vino yang:
>>>> Hi hequn,
>>>>
>>>> I am very glad to hear that you are interested in this work.
>>>> As we all know, this process involves a lot.
>>>> Currently, the migration work has begun. I started with the
>>>> Kafka connector's dependency on flink-table and moved the
>>>> related dependencies to flink-table-common.
>>>> This work is tracked by FLINK-9461.  [1]
>>>> I don't know if it will conflict with what you expect to do, but from
>> the
>>>> impact I have observed,
>>>> it will involve many classes that are currently in flink-table.
>>>>
>>>> *Just a statement to prevent unnecessary conflicts.*
>>>>
>>>> Thanks, vino.
>>>>
>>>> [1]: https://issues.apache.org/jira/browse/FLINK-9461
>>>>
>>>> Hequn Cheng <ch...@gmail.com> 于2018年11月24日周六 下午7:20写道:
>>>>
>>>>> Hi Timo,
>>>>>
>>>>> Thanks for the effort and writing up this document. I like the idea to
>>> make
>>>>> flink-table scala free, so +1 for the proposal!
>>>>>
>>>>> It's good to make Java the first-class citizen. For a long time, we
>> have
>>>>> neglected java so that many features in Table are missed in Java Test
>>>>> cases, such as this one[1] I found recently. And I think we may also
>>> need
>>>>> to migrate our test cases, i.e, add java tests.
>>>>>
>>>>> This definitely is a big change and will break API compatible. In
>> order
>>> to
>>>>> bring a smaller impact on users, I think we should go fast when we
>>> migrate
>>>>> APIs targeted to users. It's better to introduce the user sensitive
>>> changes
>>>>> within a release. However, it may be not that easy. I can help to
>>>>> contribute.
>>>>>
>>>>> Separation of interface and implementation is a good idea. This may
>>>>> introduce a minimum of dependencies or even no dependencies. I saw
>> your
>>>>> reply in the google doc. Java8 has already supported static method for
>>>>> interfaces, I think we can make use of it?
>>>>>
>>>>> Best,
>>>>> Hequn
>>>>>
>>>>> [1] https://issues.apache.org/jira/browse/FLINK-11001
>>>>>
>>>>>
>>>>> On Fri, Nov 23, 2018 at 5:36 PM Timo Walther <tw...@apache.org>
>>> wrote:
>>>>>> Hi everyone,
>>>>>>
>>>>>> thanks for the great feedback so far. I updated the document with the
>>>>>> input I got so far
>>>>>>
>>>>>> @Fabian: I moved the porting of flink-table-runtime classes up in the
>>>>> list.
>>>>>> @Xiaowei: Could you elaborate what "interface only" means to you? Do
>>> you
>>>>>> mean a module containing pure Java `interface`s? Or is the validation
>>>>>> logic also part of the API module? Are 50+ expression classes part of
>>>>>> the API interface or already too implementation-specific?
>>>>>>
>>>>>> @Xuefu: I extended the document by almost a page to clarify when we
>>>>>> should develop in Scala and when in Java. As Piotr said, every new
>>> Scala
>>>>>> line is instant technical debt.
>>>>>>
>>>>>> Thanks,
>>>>>> Timo
>>>>>>
>>>>>>
>>>>>> Am 23.11.18 um 10:29 schrieb Piotr Nowojski:
>>>>>>> Hi Timo,
>>>>>>>
>>>>>>> Thanks for writing this down +1 from my side :)
>>>>>>>
>>>>>>>> I'm wondering that whether we can have rule in the interim when
>> Java
>>>>>> and Scala coexist that dependency can only be one-way. I found that
>> in
>>>>> the
>>>>>> current code base there are cases where a Scala class extends Java
>> and
>>>>> vise
>>>>>> versa. This is quite painful. I'm thinking if we could say that
>>> extension
>>>>>> can only be from Java to Scala, which will help the situation.
>> However,
>>>>> I'm
>>>>>> not sure if this is practical.
>>>>>>> Xuefu: I’m also not sure what’s the best approach here, probably we
>>>>> will
>>>>>> have to work it out as we go. One thing to consider is that from now
>>> on,
>>>>>> every single new code line written in Scala anywhere in Flink-table
>>>>> (except
>>>>>> of Flink-table-api-scala) is an instant technological debt. From this
>>>>>> perspective I would be in favour of tolerating quite big
>> inchonvieneces
>>>>>> just to avoid any new Scala code.
>>>>>>> Piotrek
>>>>>>>
>>>>>>>> On 23 Nov 2018, at 03:25, Zhang, Xuefu <xu...@alibaba-inc.com>
>>>>> wrote:
>>>>>>>> Hi Timo,
>>>>>>>>
>>>>>>>> Thanks for the effort and the Google writeup. During our external
>>>>>> catalog rework, we found much confusion between Java and Scala, and
>>> this
>>>>>> Scala-free roadmap should greatly mitigate that.
>>>>>>>> I'm wondering that whether we can have rule in the interim when
>> Java
>>>>>> and Scala coexist that dependency can only be one-way. I found that
>> in
>>>>> the
>>>>>> current code base there are cases where a Scala class extends Java
>> and
>>>>> vise
>>>>>> versa. This is quite painful. I'm thinking if we could say that
>>> extension
>>>>>> can only be from Java to Scala, which will help the situation.
>> However,
>>>>> I'm
>>>>>> not sure if this is practical.
>>>>>>>> Thanks,
>>>>>>>> Xuefu
>>>>>>>>
>>>>>>>>
>>>>>>>> ------------------------------------------------------------------
>>>>>>>> Sender:jincheng sun <su...@gmail.com>
>>>>>>>> Sent at:2018 Nov 23 (Fri) 09:49
>>>>>>>> Recipient:dev <de...@flink.apache.org>
>>>>>>>> Subject:Re: [DISCUSS] Long-term goal of making flink-table
>> Scala-free
>>>>>>>> Hi Timo,
>>>>>>>> Thanks for initiating this great discussion.
>>>>>>>>
>>>>>>>> Currently when using SQL/TableAPI should include many dependence.
>> In
>>>>>>>> particular, it is not necessary to introduce the specific
>>>>> implementation
>>>>>>>> dependencies which users do not care about. So I am glad to see
>> your
>>>>>>>> proposal, and hope when we consider splitting the API interface
>> into
>>> a
>>>>>>>> separate module, so that the user can introduce minimum of
>>>>> dependencies.
>>>>>>>> So, +1 to [separation of interface and implementation; e.g.
>> `Table` &
>>>>>>>> `TableImpl`] which you mentioned in the google doc.
>>>>>>>> Best,
>>>>>>>> Jincheng
>>>>>>>>
>>>>>>>> Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:
>>>>>>>>
>>>>>>>>> Hi Timo, thanks for driving this! I think that this is a nice
>> thing
>>>>> to
>>>>>> do.
>>>>>>>>> While we are doing this, can we also keep in mind that we want to
>>>>>>>>> eventually have a TableAPI interface only module which users can
>>> take
>>>>>>>>> dependency on, but without including any implementation details?
>>>>>>>>>
>>>>>>>>> Xiaowei
>>>>>>>>>
>>>>>>>>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <fh...@gmail.com>
>>>>>> wrote:
>>>>>>>>>> Hi Timo,
>>>>>>>>>>
>>>>>>>>>> Thanks for writing up this document.
>>>>>>>>>> I like the new structure and agree to prioritize the porting of
>> the
>>>>>>>>>> flink-table-common classes.
>>>>>>>>>> Since flink-table-runtime is (or should be) independent of the
>> API
>>>>> and
>>>>>>>>>> planner modules, we could start porting these classes once the
>> code
>>>>> is
>>>>>>>>>> split into the new module structure.
>>>>>>>>>> The benefits of a Scala-free flink-table-runtime would be a
>>>>> Scala-free
>>>>>>>>>> execution Jar.
>>>>>>>>>>
>>>>>>>>>> Best, Fabian
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
>>>>>>>>>> twalthr@apache.org
>>>>>>>>>>> :
>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>
>>>>>>>>>>> I would like to continue this discussion thread and convert the
>>>>>> outcome
>>>>>>>>>>> into a FLIP such that users and contributors know what to expect
>>> in
>>>>>> the
>>>>>>>>>>> upcoming releases.
>>>>>>>>>>>
>>>>>>>>>>> I created a design document [1] that clarifies our motivation
>> why
>>>>> we
>>>>>>>>>>> want to do this, how a Maven module structure could look like,
>> and
>>>>> a
>>>>>>>>>>> suggestion for a migration plan.
>>>>>>>>>>>
>>>>>>>>>>> It would be great to start with the efforts for the 1.8 release
>>>>> such
>>>>>>>>>>> that new features can be developed in Java and major
>> refactorings
>>>>>> such
>>>>>>>>>>> as improvements to the connectors and external catalog support
>> are
>>>>>> not
>>>>>>>>>>> blocked.
>>>>>>>>>>>
>>>>>>>>>>> Please let me know what you think.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Timo
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>>
>>>>>>>>>>>
>> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
>>>>>>>>>>> Am 02.07.18 um 17:08 schrieb Fabian Hueske:
>>>>>>>>>>>> Hi Piotr,
>>>>>>>>>>>>
>>>>>>>>>>>> thanks for bumping this thread and thanks for Xingcan for the
>>>>>>>>> comments.
>>>>>>>>>>>> I think the first step would be to separate the flink-table
>>> module
>>>>>>>>> into
>>>>>>>>>>>> multiple sub modules. These could be:
>>>>>>>>>>>>
>>>>>>>>>>>> - flink-table-api: All API facing classes. Can be later divided
>>>>>>>>> further
>>>>>>>>>>>> into Java/Scala Table API/SQL
>>>>>>>>>>>> - flink-table-planning: involves all planning (basically
>>>>> everything
>>>>>>>>> we
>>>>>>>>>> do
>>>>>>>>>>>> with Calcite)
>>>>>>>>>>>> - flink-table-runtime: the runtime code
>>>>>>>>>>>>
>>>>>>>>>>>> IMO, a realistic mid-term goal is to have the runtime module
>> and
>>>>>>>>>> certain
>>>>>>>>>>>> parts of the planning module ported to Java.
>>>>>>>>>>>> The api module will be much harder to port because of several
>>>>>>>>>>> dependencies
>>>>>>>>>>>> to Scala core classes (the parser framework, tree iterations,
>>>>> etc.).
>>>>>>>>>> I'm
>>>>>>>>>>>> not saying we should not port this to Java, but it is not clear
>>> to
>>>>>> me
>>>>>>>>>>> (yet)
>>>>>>>>>>>> how to do it.
>>>>>>>>>>>>
>>>>>>>>>>>> I think flink-table-runtime should not be too hard to port. The
>>>>> code
>>>>>>>>>> does
>>>>>>>>>>>> not make use of many Scala features, i.e., it's writing very
>>>>>>>>> Java-like.
>>>>>>>>>>>> Also, there are not many dependencies and operators can be
>>>>>>>>> individually
>>>>>>>>>>>> ported step-by-step.
>>>>>>>>>>>> For flink-table-planning, we can have certain packages that we
>>>>> port
>>>>>>>>> to
>>>>>>>>>>> Java
>>>>>>>>>>>> like planning rules or plan nodes. The related classes mostly
>>>>> extend
>>>>>>>>>>>> Calcite's Java interfaces/classes and would be natural choices
>>> for
>>>>>>>>>> being
>>>>>>>>>>>> ported. The code generation classes will require more effort to
>>>>>> port.
>>>>>>>>>>> There
>>>>>>>>>>>> are also some dependencies in planning on the api module that
>> we
>>>>>>>>> would
>>>>>>>>>>> need
>>>>>>>>>>>> to resolve somehow.
>>>>>>>>>>>>
>>>>>>>>>>>> For SQL most work when adding new features is done in the
>>> planning
>>>>>>>>> and
>>>>>>>>>>>> runtime modules. So, this separation should already reduce
>>>>>>>>>> "technological
>>>>>>>>>>>> dept" quite a lot.
>>>>>>>>>>>> The Table API depends much more on Scala than SQL.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers, Fabian
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xi...@gmail.com>:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I also think about this problem these days and here are my
>>>>>> thoughts.
>>>>>>>>>>>>> 1) We must admit that it’s really a tough task to interoperate
>>>>> with
>>>>>>>>>> Java
>>>>>>>>>>>>> and Scala. E.g., they have different collection types (Scala
>>>>>>>>>> collections
>>>>>>>>>>>>> v.s. java.util.*) and in Java, it's hard to implement a method
>>>>>> which
>>>>>>>>>>> takes
>>>>>>>>>>>>> Scala functions as parameters. Considering the major part of
>> the
>>>>>>>>> code
>>>>>>>>>>> base
>>>>>>>>>>>>> is implemented in Java, +1 for this goal from a long-term
>> view.
>>>>>>>>>>>>> 2) The ideal solution would be to just expose a Scala API and
>>>>> make
>>>>>>>>> all
>>>>>>>>>>> the
>>>>>>>>>>>>> other parts Scala-free. But I am not sure if it could be
>>> achieved
>>>>>>>>> even
>>>>>>>>>>> in a
>>>>>>>>>>>>> long-term. Thus as Timo suggested, keep the Scala codes in
>>>>>>>>>>>>> "flink-table-core" would be a compromise solution.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3) If the community makes the final decision, maybe any new
>>>>>> features
>>>>>>>>>>>>> should be added in Java (regardless of the modules), in order
>> to
>>>>>>>>>> prevent
>>>>>>>>>>>>> the Scala codes from growing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Xingcan
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
>>>>>>>>> piotr@data-artisans.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> Bumping the topic.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If we want to do this, the sooner we decide, the less code we
>>>>> will
>>>>>>>>>> have
>>>>>>>>>>>>> to rewrite. I have some objections/counter proposals to
>> Fabian's
>>>>>>>>>>> proposal
>>>>>>>>>>>>> of doing it module wise and one module at a time.
>>>>>>>>>>>>>> First, I do not see a problem of having java/scala code even
>>>>>> within
>>>>>>>>>> one
>>>>>>>>>>>>> module, especially not if there are clean boundaries. Like we
>>>>> could
>>>>>>>>>> have
>>>>>>>>>>>>> API in Scala and optimizer rules/logical nodes written in Java
>>> in
>>>>>>>>> the
>>>>>>>>>>> same
>>>>>>>>>>>>> module. However I haven’t previously maintained mixed
>> scala/java
>>>>>>>>> code
>>>>>>>>>>> bases
>>>>>>>>>>>>> before, so I might be missing something here.
>>>>>>>>>>>>>> Secondly this whole migration might and most like will take
>>>>> longer
>>>>>>>>>> then
>>>>>>>>>>>>> expected, so that creates a problem for a new code that we
>> will
>>>>> be
>>>>>>>>>>>>> creating. After making a decision to migrate to Java, almost
>> any
>>>>>> new
>>>>>>>>>>> Scala
>>>>>>>>>>>>> line of code will be immediately a technological debt and we
>>> will
>>>>>>>>> have
>>>>>>>>>>> to
>>>>>>>>>>>>> rewrite it to Java later.
>>>>>>>>>>>>>> Thus I would propose first to state our end goal - modules
>>>>>>>>> structure
>>>>>>>>>>> and
>>>>>>>>>>>>> which parts of modules we want to have eventually Scala-free.
>>>>>>>>> Secondly
>>>>>>>>>>>>> taking all steps necessary that will allow us to write new
>> code
>>>>>>>>>>> complaint
>>>>>>>>>>>>> with our end goal. Only after that we should/could focus on
>>>>>>>>>>> incrementally
>>>>>>>>>>>>> rewriting the old code. Otherwise we could be stuck/blocked
>> for
>>>>>>>>> years
>>>>>>>>>>>>> writing new code in Scala (and increasing technological debt),
>>>>>>>>> because
>>>>>>>>>>>>> nobody have found a time to rewrite some non important and not
>>>>>>>>>> actively
>>>>>>>>>>>>> developed part of some module.
>>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fh...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In general, I think this is a good effort. However, it won't
>>> be
>>>>>>>>> easy
>>>>>>>>>>>>> and I
>>>>>>>>>>>>>>> think we have to plan this well.
>>>>>>>>>>>>>>> I don't like the idea of having the whole code base
>> fragmented
>>>>>>>>> into
>>>>>>>>>>> Java
>>>>>>>>>>>>>>> and Scala code for too long.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think we should do this one step at a time and focus on
>>>>>>>>> migrating
>>>>>>>>>>> one
>>>>>>>>>>>>>>> module at a time.
>>>>>>>>>>>>>>> IMO, the easiest start would be to port the runtime to Java.
>>>>>>>>>>>>>>> Extracting the API classes into an own module, porting them
>> to
>>>>>>>>> Java,
>>>>>>>>>>> and
>>>>>>>>>>>>>>> removing the Scala dependency won't be possible without
>>>>> breaking
>>>>>>>>> the
>>>>>>>>>>> API
>>>>>>>>>>>>>>> since a few classes depend on the Scala Table API.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best, Fabian
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <
>>> trohrmann@apache.org
>>>>>> :
>>>>>>>>>>>>>>>> I think that is a noble and honorable goal and we should
>>>>> strive
>>>>>>>>> for
>>>>>>>>>>> it.
>>>>>>>>>>>>>>>> This, however, must be an iterative process given the sheer
>>>>> size
>>>>>>>>> of
>>>>>>>>>>> the
>>>>>>>>>>>>>>>> code base. I like the approach to define common Java
>> modules
>>>>>>>>> which
>>>>>>>>>>> are
>>>>>>>>>>>>> used
>>>>>>>>>>>>>>>> by more specific Scala modules and slowly moving classes
>> from
>>>>>>>>> Scala
>>>>>>>>>>> to
>>>>>>>>>>>>>>>> Java. Thus +1 for the proposal.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
>>>>>>>>>>>>> piotr@data-artisans.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I do not have an experience with how scala and java
>>> interacts
>>>>>>>>> with
>>>>>>>>>>>>> each
>>>>>>>>>>>>>>>>> other, so I can not fully validate your proposal, but
>>>>> generally
>>>>>>>>>>>>> speaking
>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>> from me.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Does it also mean, that we should slowly migrate
>>>>>>>>>> `flink-table-core`
>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> Java? How would you envision it? It would be nice to be
>> able
>>>>> to
>>>>>>>>>> add
>>>>>>>>>>>>> new
>>>>>>>>>>>>>>>>> classes/features written in Java and so that they can
>>> coexist
>>>>>>>>> with
>>>>>>>>>>> old
>>>>>>>>>>>>>>>>> Scala code until we gradually switch from Scala to Java.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <
>> twalthr@apache.org
>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> as you all know, currently the Table & SQL API is
>>>>> implemented
>>>>>>>>> in
>>>>>>>>>>>>> Scala.
>>>>>>>>>>>>>>>>> This decision was made a long-time ago when the initital
>>> code
>>>>>>>>> base
>>>>>>>>>>> was
>>>>>>>>>>>>>>>>> created as part of a master's thesis. The community kept
>>>>> Scala
>>>>>>>>>>>>> because of
>>>>>>>>>>>>>>>>> the nice language features that enable a fluent Table API
>>>>> like
>>>>>>>>>>>>>>>>> table.select('field.trim()) and because Scala allows for
>>>>> quick
>>>>>>>>>>>>>>>> prototyping
>>>>>>>>>>>>>>>>> (e.g. multi-line comments for code generation). The
>>>>> committers
>>>>>>>>>>>>> enforced
>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>> splitting the code-base into two programming languages.
>>>>>>>>>>>>>>>>>> However, nowadays the flink-table module more and more
>>>>> becomes
>>>>>>>>> an
>>>>>>>>>>>>>>>>> important part in the Flink ecosystem. Connectors,
>> formats,
>>>>> and
>>>>>>>>>> SQL
>>>>>>>>>>>>>>>> client
>>>>>>>>>>>>>>>>> are actually implemented in Java but need to interoperate
>>>>> with
>>>>>>>>>>>>>>>> flink-table
>>>>>>>>>>>>>>>>> which makes these modules dependent on Scala. As mentioned
>>> in
>>>>>> an
>>>>>>>>>>>>> earlier
>>>>>>>>>>>>>>>>> mail thread, using Scala for API classes also exposes
>> member
>>>>>>>>>>> variables
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>> methods in Java that should not be exposed to users [1].
>>> Java
>>>>>> is
>>>>>>>>>>> still
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> most important API language and right now we treat it as a
>>>>>>>>>>>>> second-class
>>>>>>>>>>>>>>>>> citizen. I just noticed that you even need to add Scala if
>>>>> you
>>>>>>>>>> just
>>>>>>>>>>>>> want
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> implement a ScalarFunction because of method clashes
>> between
>>>>>>>>>> `public
>>>>>>>>>>>>>>>> String
>>>>>>>>>>>>>>>>> toString()` and `public scala.Predef.String toString()`.
>>>>>>>>>>>>>>>>>> Given the size of the current code base, reimplementing
>> the
>>>>>>>>>> entire
>>>>>>>>>>>>>>>>> flink-table code in Java is a goal that we might never
>>> reach.
>>>>>>>>>>>>> However, we
>>>>>>>>>>>>>>>>> should at least treat the symptoms and have this as a
>>>>> long-term
>>>>>>>>>> goal
>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>> mind. My suggestion would be to convert user-facing and
>>>>> runtime
>>>>>>>>>>>>> classes
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>> split the code base into multiple modules:
>>>>>>>>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
>>>>>>>>>>>>>>>>>> Implemented in Java. Java users can use this. This would
>>>>>>>>> require
>>>>>>>>>> to
>>>>>>>>>>>>>>>>> convert classes like TableEnvironment, Table.
>>>>>>>>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
>>>>>>>>>>>>>>>>>> Implemented in Scala. Scala users can use this.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> flink-table-common
>>>>>>>>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs can
>> use
>>>>>>>>> this.
>>>>>>>>>> It
>>>>>>>>>>>>>>>>> contains interface classes such as descriptors, table
>> sink,
>>>>>>>>> table
>>>>>>>>>>>>> source.
>>>>>>>>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
>>>>>>>>>>>>>>>>> flink-table-runtime}
>>>>>>>>>>>>>>>>>> Implemented in Scala. Contains the current main code
>> base.
>>>>>>>>>>>>>>>>>>> flink-table-runtime
>>>>>>>>>>>>>>>>>> Implemented in Java. This would require to convert
>> classes
>>>>> in
>>>>>>>>>>>>>>>>> o.a.f.table.runtime but would improve the runtime
>>>>> potentially.
>>>>>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
>>>>>>>>>>>>>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
>>>>>>>>>>>>> traits-tp21335.html
>>>
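
As a concrete illustration of the interoperability pain points quoted above
(different collection types, Scala functions as parameters), the following is a
minimal sketch of what calling a Scala-typed API from Java tends to look like.
The class and method names are purely illustrative and are not actual
flink-table classes; the sketch only assumes that scala-library is on the
classpath.

    import java.util.Arrays;
    import java.util.List;

    import scala.collection.JavaConverters;
    import scala.collection.Seq;
    import scala.runtime.AbstractFunction1;

    public class ScalaInteropSketch {

        // Stand-in for a Scala-defined API that exposes Scala types to its callers.
        interface ScalaStyleTable {
            ScalaStyleTable select(Seq<String> fields);
            ScalaStyleTable map(scala.Function1<String, String> mapper);
        }

        static void use(ScalaStyleTable table) {
            // Java collections must be converted before crossing the language boundary.
            List<String> fields = Arrays.asList("a", "b");
            Seq<String> scalaFields =
                    JavaConverters.asScalaBufferConverter(fields).asScala();
            table.select(scalaFields);

            // Depending on the Scala version, a Java lambda may not satisfy
            // scala.Function1, so an anonymous AbstractFunction1 is used instead.
            table.map(new AbstractFunction1<String, String>() {
                @Override
                public String apply(String value) {
                    return value.trim();
                }
            });
        }
    }

Calling Java-typed interfaces from Scala is considerably smoother, which is one
reason the thread argues for keeping the shared interfaces in Java.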


Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by vino yang <ya...@gmail.com>.
Hi Kurt,

Currently, there is still a long way to go before Flink 2.0. Considering
that flink-table is one of the most active modules in the current Flink
project, each version adds a number of changes and features. I think that
refactoring sooner will reduce the subsequent complexity and workload. This
may be a gradual and long process. We should regard it as "technical debt";
if we do not address it, it will also affect the decision-making on other
issues.

Thanks, vino.

Kurt Young <yk...@gmail.com> 于2018年11月27日周二 下午7:34写道:

> Hi Timo,
>
> Thanks for writing up the document. I'm +1 for reorganizing the module
> structure and making flink-table Scala-free. But I have a little concern
> about the timing: would it be more appropriate to get this done when Flink
> decides to bump to the next major version, like 2.x?
> It's true that you can keep all the class package paths as they are and
> avoid API changes. But if some companies are developing their own Flink
> and sync with the community version by rebasing, they may face a lot of
> conflicts. Although you can avoid the conflicts caused by moving source
> files between packages, I assume you still need to delete the original
> Scala file and add a new Java file when you want to change the programming
> language.
>
> Best,
> Kurt
>
>
> On Tue, Nov 27, 2018 at 5:57 PM Timo Walther <tw...@apache.org> wrote:
>
> > Hi Hequn,
> >
> > thanks for your feedback. Yes, migrating the test cases is another issue
> > that is not represented in the document but should naturally go along
> > with the migration.
> >
> > I agree that we should migrate the main API classes quickly within this
> > 1.8 release after the module split has been performed. Help here is
> > highly appreciated!
> >
> > I forgot that Java supports static methods in interfaces now, but
> > actually I don't like the design of calling `TableEnvironment.get(env)`.
> > Because people often use `TableEnvironment tEnv =
> > TableEnvironment.get(env)` and then wonder why there is no
> > `toAppendStream` or `toDataSet` because they are using the base class.
> > However, things like that can be discussed in the corresponding issue
> > when it comes to implementation.
> >
> > @Vino: I think your work fits nicely to these efforts.
> >
> > @everyone: I will wait for more feedback until end of this week. Then I
> > will convert the design document into a FLIP and open subtasks in Jira,
> > if there are no objections?
> >
> > Regards,
> > Timo
> >
> > Am 24.11.18 um 13:45 schrieb vino yang:
> > > Hi hequn,
> > >
> > > I am very glad to hear that you are interested in this work.
> > > As we all know, this process involves a lot.
> > > Currently, the migration work has begun. I started with the
> > > Kafka connector's dependency on flink-table and moved the
> > > related dependencies to flink-table-common.
> > > This work is tracked by FLINK-9461.  [1]
> > > I don't know if it will conflict with what you expect to do, but from
> the
> > > impact I have observed,
> > > it will involve many classes that are currently in flink-table.
> > >
> > > *Just a statement to prevent unnecessary conflicts.*
> > >
> > > Thanks, vino.
> > >
> > > [1]: https://issues.apache.org/jira/browse/FLINK-9461
> > >
> > > Hequn Cheng <ch...@gmail.com> 于2018年11月24日周六 下午7:20写道:
> > >
> > >> Hi Timo,
> > >>
> > >> Thanks for the effort and writing up this document. I like the idea to
> > make
> > >> flink-table scala free, so +1 for the proposal!
> > >>
> > >> It's good to make Java the first-class citizen. For a long time, we
> have
> > >> neglected java so that many features in Table are missed in Java Test
> > >> cases, such as this one[1] I found recently. And I think we may also
> > need
> > >> to migrate our test cases, i.e, add java tests.
> > >>
> > >> This definitely is a big change and will break API compatible. In
> order
> > to
> > >> bring a smaller impact on users, I think we should go fast when we
> > migrate
> > >> APIs targeted to users. It's better to introduce the user sensitive
> > changes
> > >> within a release. However, it may be not that easy. I can help to
> > >> contribute.
> > >>
> > >> Separation of interface and implementation is a good idea. This may
> > >> introduce a minimum of dependencies or even no dependencies. I saw
> your
> > >> reply in the google doc. Java8 has already supported static method for
> > >> interfaces, I think we can make use of it?
> > >>
> > >> Best,
> > >> Hequn
> > >>
> > >> [1] https://issues.apache.org/jira/browse/FLINK-11001
> > >>
> > >>
> > >> On Fri, Nov 23, 2018 at 5:36 PM Timo Walther <tw...@apache.org>
> > wrote:
> > >>
> > >>> Hi everyone,
> > >>>
> > >>> thanks for the great feedback so far. I updated the document with the
> > >>> input I got so far
> > >>>
> > >>> @Fabian: I moved the porting of flink-table-runtime classes up in the
> > >> list.
> > >>> @Xiaowei: Could you elaborate what "interface only" means to you? Do
> > you
> > >>> mean a module containing pure Java `interface`s? Or is the validation
> > >>> logic also part of the API module? Are 50+ expression classes part of
> > >>> the API interface or already too implementation-specific?
> > >>>
> > >>> @Xuefu: I extended the document by almost a page to clarify when we
> > >>> should develop in Scala and when in Java. As Piotr said, every new
> > Scala
> > >>> line is instant technical debt.
> > >>>
> > >>> Thanks,
> > >>> Timo
> > >>>
> > >>>
> > >>> Am 23.11.18 um 10:29 schrieb Piotr Nowojski:
> > >>>> Hi Timo,
> > >>>>
> > >>>> Thanks for writing this down +1 from my side :)
> > >>>>
> > >>>>> I'm wondering that whether we can have rule in the interim when
> Java
> > >>> and Scala coexist that dependency can only be one-way. I found that
> in
> > >> the
> > >>> current code base there are cases where a Scala class extends Java
> and
> > >> vise
> > >>> versa. This is quite painful. I'm thinking if we could say that
> > extension
> > >>> can only be from Java to Scala, which will help the situation.
> However,
> > >> I'm
> > >>> not sure if this is practical.
> > >>>> Xuefu: I’m also not sure what’s the best approach here, probably we
> > >> will
> > >>> have to work it out as we go. One thing to consider is that from now
> > on,
> > >>> every single new code line written in Scala anywhere in Flink-table
> > >> (except
> > >>> of Flink-table-api-scala) is an instant technological debt. From this
> > >>> perspective I would be in favour of tolerating quite big
> inchonvieneces
> > >>> just to avoid any new Scala code.
> > >>>> Piotrek
> > >>>>
> > >>>>> On 23 Nov 2018, at 03:25, Zhang, Xuefu <xu...@alibaba-inc.com>
> > >> wrote:
> > >>>>> Hi Timo,
> > >>>>>
> > >>>>> Thanks for the effort and the Google writeup. During our external
> > >>> catalog rework, we found much confusion between Java and Scala, and
> > this
> > >>> Scala-free roadmap should greatly mitigate that.
> > >>>>> I'm wondering that whether we can have rule in the interim when
> Java
> > >>> and Scala coexist that dependency can only be one-way. I found that
> in
> > >> the
> > >>> current code base there are cases where a Scala class extends Java
> and
> > >> vise
> > >>> versa. This is quite painful. I'm thinking if we could say that
> > extension
> > >>> can only be from Java to Scala, which will help the situation.
> However,
> > >> I'm
> > >>> not sure if this is practical.
> > >>>>> Thanks,
> > >>>>> Xuefu
> > >>>>>
> > >>>>>
> > >>>>> ------------------------------------------------------------------
> > >>>>> Sender:jincheng sun <su...@gmail.com>
> > >>>>> Sent at:2018 Nov 23 (Fri) 09:49
> > >>>>> Recipient:dev <de...@flink.apache.org>
> > >>>>> Subject:Re: [DISCUSS] Long-term goal of making flink-table
> Scala-free
> > >>>>>
> > >>>>> Hi Timo,
> > >>>>> Thanks for initiating this great discussion.
> > >>>>>
> > >>>>> Currently when using SQL/TableAPI should include many dependence.
> In
> > >>>>> particular, it is not necessary to introduce the specific
> > >> implementation
> > >>>>> dependencies which users do not care about. So I am glad to see
> your
> > >>>>> proposal, and hope when we consider splitting the API interface
> into
> > a
> > >>>>> separate module, so that the user can introduce minimum of
> > >> dependencies.
> > >>>>> So, +1 to [separation of interface and implementation; e.g.
> `Table` &
> > >>>>> `TableImpl`] which you mentioned in the google doc.
> > >>>>> Best,
> > >>>>> Jincheng
> > >>>>>
> > >>>>> Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:
> > >>>>>
> > >>>>>> Hi Timo, thanks for driving this! I think that this is a nice
> thing
> > >> to
> > >>> do.
> > >>>>>> While we are doing this, can we also keep in mind that we want to
> > >>>>>> eventually have a TableAPI interface only module which users can
> > take
> > >>>>>> dependency on, but without including any implementation details?
> > >>>>>>
> > >>>>>> Xiaowei
> > >>>>>>
> > >>>>>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <fh...@gmail.com>
> > >>> wrote:
> > >>>>>>> Hi Timo,
> > >>>>>>>
> > >>>>>>> Thanks for writing up this document.
> > >>>>>>> I like the new structure and agree to prioritize the porting of
> the
> > >>>>>>> flink-table-common classes.
> > >>>>>>> Since flink-table-runtime is (or should be) independent of the
> API
> > >> and
> > >>>>>>> planner modules, we could start porting these classes once the
> code
> > >> is
> > >>>>>>> split into the new module structure.
> > >>>>>>> The benefits of a Scala-free flink-table-runtime would be a
> > >> Scala-free
> > >>>>>>> execution Jar.
> > >>>>>>>
> > >>>>>>> Best, Fabian
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
> > >>>>>>> twalthr@apache.org
> > >>>>>>>> :
> > >>>>>>>> Hi everyone,
> > >>>>>>>>
> > >>>>>>>> I would like to continue this discussion thread and convert the
> > >>> outcome
> > >>>>>>>> into a FLIP such that users and contributors know what to expect
> > in
> > >>> the
> > >>>>>>>> upcoming releases.
> > >>>>>>>>
> > >>>>>>>> I created a design document [1] that clarifies our motivation
> why
> > >> we
> > >>>>>>>> want to do this, how a Maven module structure could look like,
> and
> > >> a
> > >>>>>>>> suggestion for a migration plan.
> > >>>>>>>>
> > >>>>>>>> It would be great to start with the efforts for the 1.8 release
> > >> such
> > >>>>>>>> that new features can be developed in Java and major
> refactorings
> > >>> such
> > >>>>>>>> as improvements to the connectors and external catalog support
> are
> > >>> not
> > >>>>>>>> blocked.
> > >>>>>>>>
> > >>>>>>>> Please let me know what you think.
> > >>>>>>>>
> > >>>>>>>> Regards,
> > >>>>>>>> Timo
> > >>>>>>>>
> > >>>>>>>> [1]
> > >>>>>>>>
> > >>>>>>>>
> > >>
> >
> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
> > >>>>>>>> Am 02.07.18 um 17:08 schrieb Fabian Hueske:
> > >>>>>>>>> Hi Piotr,
> > >>>>>>>>>
> > >>>>>>>>> thanks for bumping this thread and thanks for Xingcan for the
> > >>>>>> comments.
> > >>>>>>>>> I think the first step would be to separate the flink-table
> > module
> > >>>>>> into
> > >>>>>>>>> multiple sub modules. These could be:
> > >>>>>>>>>
> > >>>>>>>>> - flink-table-api: All API facing classes. Can be later divided
> > >>>>>> further
> > >>>>>>>>> into Java/Scala Table API/SQL
> > >>>>>>>>> - flink-table-planning: involves all planning (basically
> > >> everything
> > >>>>>> we
> > >>>>>>> do
> > >>>>>>>>> with Calcite)
> > >>>>>>>>> - flink-table-runtime: the runtime code
> > >>>>>>>>>
> > >>>>>>>>> IMO, a realistic mid-term goal is to have the runtime module
> and
> > >>>>>>> certain
> > >>>>>>>>> parts of the planning module ported to Java.
> > >>>>>>>>> The api module will be much harder to port because of several
> > >>>>>>>> dependencies
> > >>>>>>>>> to Scala core classes (the parser framework, tree iterations,
> > >> etc.).
> > >>>>>>> I'm
> > >>>>>>>>> not saying we should not port this to Java, but it is not clear
> > to
> > >>> me
> > >>>>>>>> (yet)
> > >>>>>>>>> how to do it.
> > >>>>>>>>>
> > >>>>>>>>> I think flink-table-runtime should not be too hard to port. The
> > >> code
> > >>>>>>> does
> > >>>>>>>>> not make use of many Scala features, i.e., it's writing very
> > >>>>>> Java-like.
> > >>>>>>>>> Also, there are not many dependencies and operators can be
> > >>>>>> individually
> > >>>>>>>>> ported step-by-step.
> > >>>>>>>>> For flink-table-planning, we can have certain packages that we
> > >> port
> > >>>>>> to
> > >>>>>>>> Java
> > >>>>>>>>> like planning rules or plan nodes. The related classes mostly
> > >> extend
> > >>>>>>>>> Calcite's Java interfaces/classes and would be natural choices
> > for
> > >>>>>>> being
> > >>>>>>>>> ported. The code generation classes will require more effort to
> > >>> port.
> > >>>>>>>> There
> > >>>>>>>>> are also some dependencies in planning on the api module that
> we
> > >>>>>> would
> > >>>>>>>> need
> > >>>>>>>>> to resolve somehow.
> > >>>>>>>>>
> > >>>>>>>>> For SQL most work when adding new features is done in the
> > planning
> > >>>>>> and
> > >>>>>>>>> runtime modules. So, this separation should already reduce
> > >>>>>>> "technological
> > >>>>>>>>> dept" quite a lot.
> > >>>>>>>>> The Table API depends much more on Scala than SQL.
> > >>>>>>>>>
> > >>>>>>>>> Cheers, Fabian
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xi...@gmail.com>:
> > >>>>>>>>>
> > >>>>>>>>>> Hi all,
> > >>>>>>>>>>
> > >>>>>>>>>> I also think about this problem these days and here are my
> > >>> thoughts.
> > >>>>>>>>>> 1) We must admit that it’s really a tough task to interoperate
> > >> with
> > >>>>>>> Java
> > >>>>>>>>>> and Scala. E.g., they have different collection types (Scala
> > >>>>>>> collections
> > >>>>>>>>>> v.s. java.util.*) and in Java, it's hard to implement a method
> > >>> which
> > >>>>>>>> takes
> > >>>>>>>>>> Scala functions as parameters. Considering the major part of
> the
> > >>>>>> code
> > >>>>>>>> base
> > >>>>>>>>>> is implemented in Java, +1 for this goal from a long-term
> view.
> > >>>>>>>>>>
> > >>>>>>>>>> 2) The ideal solution would be to just expose a Scala API and
> > >> make
> > >>>>>> all
> > >>>>>>>> the
> > >>>>>>>>>> other parts Scala-free. But I am not sure if it could be
> > achieved
> > >>>>>> even
> > >>>>>>>> in a
> > >>>>>>>>>> long-term. Thus as Timo suggested, keep the Scala codes in
> > >>>>>>>>>> "flink-table-core" would be a compromise solution.
> > >>>>>>>>>>
> > >>>>>>>>>> 3) If the community makes the final decision, maybe any new
> > >>> features
> > >>>>>>>>>> should be added in Java (regardless of the modules), in order
> to
> > >>>>>>> prevent
> > >>>>>>>>>> the Scala codes from growing.
> > >>>>>>>>>>
> > >>>>>>>>>> Best,
> > >>>>>>>>>> Xingcan
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
> > >>>>>> piotr@data-artisans.com>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>> Bumping the topic.
> > >>>>>>>>>>>
> > >>>>>>>>>>> If we want to do this, the sooner we decide, the less code we
> > >> will
> > >>>>>>> have
> > >>>>>>>>>> to rewrite. I have some objections/counter proposals to
> Fabian's
> > >>>>>>>> proposal
> > >>>>>>>>>> of doing it module wise and one module at a time.
> > >>>>>>>>>>> First, I do not see a problem of having java/scala code even
> > >>> within
> > >>>>>>> one
> > >>>>>>>>>> module, especially not if there are clean boundaries. Like we
> > >> could
> > >>>>>>> have
> > >>>>>>>>>> API in Scala and optimizer rules/logical nodes written in Java
> > in
> > >>>>>> the
> > >>>>>>>> same
> > >>>>>>>>>> module. However I haven’t previously maintained mixed
> scala/java
> > >>>>>> code
> > >>>>>>>> bases
> > >>>>>>>>>> before, so I might be missing something here.
> > >>>>>>>>>>> Secondly this whole migration might and most like will take
> > >> longer
> > >>>>>>> then
> > >>>>>>>>>> expected, so that creates a problem for a new code that we
> will
> > >> be
> > >>>>>>>>>> creating. After making a decision to migrate to Java, almost
> any
> > >>> new
> > >>>>>>>> Scala
> > >>>>>>>>>> line of code will be immediately a technological debt and we
> > will
> > >>>>>> have
> > >>>>>>>> to
> > >>>>>>>>>> rewrite it to Java later.
> > >>>>>>>>>>> Thus I would propose first to state our end goal - modules
> > >>>>>> structure
> > >>>>>>>> and
> > >>>>>>>>>> which parts of modules we want to have eventually Scala-free.
> > >>>>>> Secondly
> > >>>>>>>>>> taking all steps necessary that will allow us to write new
> code
> > >>>>>>>> complaint
> > >>>>>>>>>> with our end goal. Only after that we should/could focus on
> > >>>>>>>> incrementally
> > >>>>>>>>>> rewriting the old code. Otherwise we could be stuck/blocked
> for
> > >>>>>> years
> > >>>>>>>>>> writing new code in Scala (and increasing technological debt),
> > >>>>>> because
> > >>>>>>>>>> nobody have found a time to rewrite some non important and not
> > >>>>>>> actively
> > >>>>>>>>>> developed part of some module.
> > >>>>>>>>>>> Piotrek
> > >>>>>>>>>>>
> > >>>>>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fh...@gmail.com>
> > >>>>>> wrote:
> > >>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> In general, I think this is a good effort. However, it won't
> > be
> > >>>>>> easy
> > >>>>>>>>>> and I
> > >>>>>>>>>>>> think we have to plan this well.
> > >>>>>>>>>>>> I don't like the idea of having the whole code base
> fragmented
> > >>>>>> into
> > >>>>>>>> Java
> > >>>>>>>>>>>> and Scala code for too long.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I think we should do this one step at a time and focus on
> > >>>>>> migrating
> > >>>>>>>> one
> > >>>>>>>>>>>> module at a time.
> > >>>>>>>>>>>> IMO, the easiest start would be to port the runtime to Java.
> > >>>>>>>>>>>> Extracting the API classes into an own module, porting them
> to
> > >>>>>> Java,
> > >>>>>>>> and
> > >>>>>>>>>>>> removing the Scala dependency won't be possible without
> > >> breaking
> > >>>>>> the
> > >>>>>>>> API
> > >>>>>>>>>>>> since a few classes depend on the Scala Table API.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Best, Fabian
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <
> > trohrmann@apache.org
> > >>> :
> > >>>>>>>>>>>>> I think that is a noble and honorable goal and we should
> > >> strive
> > >>>>>> for
> > >>>>>>>> it.
> > >>>>>>>>>>>>> This, however, must be an iterative process given the sheer
> > >> size
> > >>>>>> of
> > >>>>>>>> the
> > >>>>>>>>>>>>> code base. I like the approach to define common Java
> modules
> > >>>>>> which
> > >>>>>>>> are
> > >>>>>>>>>> used
> > >>>>>>>>>>>>> by more specific Scala modules and slowly moving classes
> from
> > >>>>>> Scala
> > >>>>>>>> to
> > >>>>>>>>>>>>> Java. Thus +1 for the proposal.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>> Till
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
> > >>>>>>>>>> piotr@data-artisans.com>
> > >>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I do not have an experience with how scala and java
> > interacts
> > >>>>>> with
> > >>>>>>>>>> each
> > >>>>>>>>>>>>>> other, so I can not fully validate your proposal, but
> > >> generally
> > >>>>>>>>>> speaking
> > >>>>>>>>>>>>> +1
> > >>>>>>>>>>>>>> from me.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Does it also mean, that we should slowly migrate
> > >>>>>>> `flink-table-core`
> > >>>>>>>> to
> > >>>>>>>>>>>>>> Java? How would you envision it? It would be nice to be
> able
> > >> to
> > >>>>>>> add
> > >>>>>>>>>> new
> > >>>>>>>>>>>>>> classes/features written in Java and so that they can
> > coexist
> > >>>>>> with
> > >>>>>>>> old
> > >>>>>>>>>>>>>> Scala code until we gradually switch from Scala to Java.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Piotrek
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <
> twalthr@apache.org
> > >
> > >>>>>>> wrote:
> > >>>>>>>>>>>>>>> Hi everyone,
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> as you all know, currently the Table & SQL API is
> > >> implemented
> > >>>>>> in
> > >>>>>>>>>> Scala.
> > >>>>>>>>>>>>>> This decision was made a long-time ago when the initital
> > code
> > >>>>>> base
> > >>>>>>>> was
> > >>>>>>>>>>>>>> created as part of a master's thesis. The community kept
> > >> Scala
> > >>>>>>>>>> because of
> > >>>>>>>>>>>>>> the nice language features that enable a fluent Table API
> > >> like
> > >>>>>>>>>>>>>> table.select('field.trim()) and because Scala allows for
> > >> quick
> > >>>>>>>>>>>>> prototyping
> > >>>>>>>>>>>>>> (e.g. multi-line comments for code generation). The
> > >> committers
> > >>>>>>>>>> enforced
> > >>>>>>>>>>>>> not
> > >>>>>>>>>>>>>> splitting the code-base into two programming languages.
> > >>>>>>>>>>>>>>> However, nowadays the flink-table module more and more
> > >> becomes
> > >>>>>> an
> > >>>>>>>>>>>>>> important part in the Flink ecosystem. Connectors,
> formats,
> > >> and
> > >>>>>>> SQL
> > >>>>>>>>>>>>> client
> > >>>>>>>>>>>>>> are actually implemented in Java but need to interoperate
> > >> with
> > >>>>>>>>>>>>> flink-table
> > >>>>>>>>>>>>>> which makes these modules dependent on Scala. As mentioned
> > in
> > >>> an
> > >>>>>>>>>> earlier
> > >>>>>>>>>>>>>> mail thread, using Scala for API classes also exposes
> member
> > >>>>>>>> variables
> > >>>>>>>>>>>>> and
> > >>>>>>>>>>>>>> methods in Java that should not be exposed to users [1].
> > Java
> > >>> is
> > >>>>>>>> still
> > >>>>>>>>>>>>> the
> > >>>>>>>>>>>>>> most important API language and right now we treat it as a
> > >>>>>>>>>> second-class
> > >>>>>>>>>>>>>> citizen. I just noticed that you even need to add Scala if
> > >> you
> > >>>>>>> just
> > >>>>>>>>>> want
> > >>>>>>>>>>>>> to
> > >>>>>>>>>>>>>> implement a ScalarFunction because of method clashes
> between
> > >>>>>>> `public
> > >>>>>>>>>>>>> String
> > >>>>>>>>>>>>>> toString()` and `public scala.Predef.String toString()`.
> > >>>>>>>>>>>>>>> Given the size of the current code base, reimplementing
> the
> > >>>>>>> entire
> > >>>>>>>>>>>>>> flink-table code in Java is a goal that we might never
> > reach.
> > >>>>>>>>>> However, we
> > >>>>>>>>>>>>>> should at least treat the symptoms and have this as a
> > >> long-term
> > >>>>>>> goal
> > >>>>>>>>>> in
> > >>>>>>>>>>>>>> mind. My suggestion would be to convert user-facing and
> > >> runtime
> > >>>>>>>>>> classes
> > >>>>>>>>>>>>> and
> > >>>>>>>>>>>>>> split the code base into multiple modules:
> > >>>>>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
> > >>>>>>>>>>>>>>> Implemented in Java. Java users can use this. This would
> > >>>>>> require
> > >>>>>>> to
> > >>>>>>>>>>>>>> convert classes like TableEnvironment, Table.
> > >>>>>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
> > >>>>>>>>>>>>>>> Implemented in Scala. Scala users can use this.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> flink-table-common
> > >>>>>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs can
> use
> > >>>>>> this.
> > >>>>>>> It
> > >>>>>>>>>>>>>> contains interface classes such as descriptors, table
> sink,
> > >>>>>> table
> > >>>>>>>>>> source.
> > >>>>>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
> > >>>>>>>>>>>>>> flink-table-runtime}
> > >>>>>>>>>>>>>>> Implemented in Scala. Contains the current main code
> base.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> flink-table-runtime
> > >>>>>>>>>>>>>>> Implemented in Java. This would require to convert
> classes
> > >> in
> > >>>>>>>>>>>>>> o.a.f.table.runtime but would improve the runtime
> > >> potentially.
> > >>>>>>>>>>>>>>> What do you think?
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Timo
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
> > >>>>>>>>>>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
> > >>>>>>>>>> traits-tp21335.html
> > >>>
> >
> >
>

Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Kurt Young <yk...@gmail.com>.
Hi Timo,

Thanks for writing up the document. I'm +1 for reorganizing the module
structure and making flink-table Scala-free. But I have a little concern
about the timing: would it be more appropriate to get this done when Flink
decides to bump to the next major version, like 2.x?
It's true that you can keep all the class package paths as they are and
avoid API changes. But if some companies are developing their own Flink
and sync with the community version by rebasing, they may face a lot of
conflicts. Although you can avoid the conflicts caused by moving source
files between packages, I assume you still need to delete the original
Scala file and add a new Java file when you want to change the programming
language.

Best,
Kurt


On Tue, Nov 27, 2018 at 5:57 PM Timo Walther <tw...@apache.org> wrote:

> Hi Hequn,
>
> thanks for your feedback. Yes, migrating the test cases is another issue
> that is not represented in the document but should naturally go along
> with the migration.
>
> I agree that we should migrate the main API classes quickly within this
> 1.8 release after the module split has been performed. Help here is
> highly appreciated!
>
> I forgot that Java supports static methods in interfaces now, but
> actually I don't like the design of calling `TableEnvironment.get(env)`.
> Because people often use `TableEnvironment tEnv =
> TableEnvironment.get(env)` and then wonder why there is no
> `toAppendStream` or `toDataSet` because they are using the base class.
> However, things like that can be discussed in the corresponding issue
> when it comes to implementation.
>
> @Vino: I think your work fits nicely to these efforts.
>
> @everyone: I will wait for more feedback until end of this week. Then I
> will convert the design document into a FLIP and open subtasks in Jira,
> if there are no objections?
>
> Regards,
> Timo
>
> Am 24.11.18 um 13:45 schrieb vino yang:
> > Hi hequn,
> >
> > I am very glad to hear that you are interested in this work.
> > As we all know, this process involves a lot.
> > Currently, the migration work has begun. I started with the
> > Kafka connector's dependency on flink-table and moved the
> > related dependencies to flink-table-common.
> > This work is tracked by FLINK-9461.  [1]
> > I don't know if it will conflict with what you expect to do, but from the
> > impact I have observed,
> > it will involve many classes that are currently in flink-table.
> >
> > *Just a statement to prevent unnecessary conflicts.*
> >
> > Thanks, vino.
> >
> > [1]: https://issues.apache.org/jira/browse/FLINK-9461
> >
> > Hequn Cheng <ch...@gmail.com> 于2018年11月24日周六 下午7:20写道:
> >
> >> Hi Timo,
> >>
> >> Thanks for the effort and writing up this document. I like the idea to
> make
> >> flink-table scala free, so +1 for the proposal!
> >>
> >> It's good to make Java the first-class citizen. For a long time, we have
> >> neglected java so that many features in Table are missed in Java Test
> >> cases, such as this one[1] I found recently. And I think we may also
> need
> >> to migrate our test cases, i.e, add java tests.
> >>
> >> This definitely is a big change and will break API compatible. In order
> to
> >> bring a smaller impact on users, I think we should go fast when we
> migrate
> >> APIs targeted to users. It's better to introduce the user sensitive
> changes
> >> within a release. However, it may be not that easy. I can help to
> >> contribute.
> >>
> >> Separation of interface and implementation is a good idea. This may
> >> introduce a minimum of dependencies or even no dependencies. I saw your
> >> reply in the google doc. Java8 has already supported static method for
> >> interfaces, I think we can make use of it?
> >>
> >> Best,
> >> Hequn
> >>
> >> [1] https://issues.apache.org/jira/browse/FLINK-11001
> >>
> >>
> >> On Fri, Nov 23, 2018 at 5:36 PM Timo Walther <tw...@apache.org>
> wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> thanks for the great feedback so far. I updated the document with the
> >>> input I got so far
> >>>
> >>> @Fabian: I moved the porting of flink-table-runtime classes up in the
> >> list.
> >>> @Xiaowei: Could you elaborate what "interface only" means to you? Do
> you
> >>> mean a module containing pure Java `interface`s? Or is the validation
> >>> logic also part of the API module? Are 50+ expression classes part of
> >>> the API interface or already too implementation-specific?
> >>>
> >>> @Xuefu: I extended the document by almost a page to clarify when we
> >>> should develop in Scala and when in Java. As Piotr said, every new
> Scala
> >>> line is instant technical debt.
> >>>
> >>> Thanks,
> >>> Timo
> >>>
> >>>
> >>> Am 23.11.18 um 10:29 schrieb Piotr Nowojski:
> >>>> Hi Timo,
> >>>>
> >>>> Thanks for writing this down +1 from my side :)
> >>>>
> >>>>> I'm wondering that whether we can have rule in the interim when Java
> >>> and Scala coexist that dependency can only be one-way. I found that in
> >> the
> >>> current code base there are cases where a Scala class extends Java and
> >> vise
> >>> versa. This is quite painful. I'm thinking if we could say that
> extension
> >>> can only be from Java to Scala, which will help the situation. However,
> >> I'm
> >>> not sure if this is practical.
> >>>> Xuefu: I’m also not sure what’s the best approach here, probably we
> >> will
> >>> have to work it out as we go. One thing to consider is that from now
> on,
> >>> every single new code line written in Scala anywhere in Flink-table
> >> (except
> >>> of Flink-table-api-scala) is an instant technological debt. From this
> >>> perspective I would be in favour of tolerating quite big inchonvieneces
> >>> just to avoid any new Scala code.
> >>>> Piotrek
> >>>>
> >>>>> On 23 Nov 2018, at 03:25, Zhang, Xuefu <xu...@alibaba-inc.com>
> >> wrote:
> >>>>> Hi Timo,
> >>>>>
> >>>>> Thanks for the effort and the Google writeup. During our external
> >>> catalog rework, we found much confusion between Java and Scala, and
> this
> >>> Scala-free roadmap should greatly mitigate that.
> >>>>> I'm wondering that whether we can have rule in the interim when Java
> >>> and Scala coexist that dependency can only be one-way. I found that in
> >> the
> >>> current code base there are cases where a Scala class extends Java and
> >> vise
> >>> versa. This is quite painful. I'm thinking if we could say that
> extension
> >>> can only be from Java to Scala, which will help the situation. However,
> >> I'm
> >>> not sure if this is practical.
> >>>>> Thanks,
> >>>>> Xuefu
> >>>>>
> >>>>>
> >>>>> ------------------------------------------------------------------
> >>>>> Sender:jincheng sun <su...@gmail.com>
> >>>>> Sent at:2018 Nov 23 (Fri) 09:49
> >>>>> Recipient:dev <de...@flink.apache.org>
> >>>>> Subject:Re: [DISCUSS] Long-term goal of making flink-table Scala-free
> >>>>>
> >>>>> Hi Timo,
> >>>>> Thanks for initiating this great discussion.
> >>>>>
> >>>>> Currently when using SQL/TableAPI should include many dependence. In
> >>>>> particular, it is not necessary to introduce the specific
> >> implementation
> >>>>> dependencies which users do not care about. So I am glad to see your
> >>>>> proposal, and hope when we consider splitting the API interface into
> a
> >>>>> separate module, so that the user can introduce minimum of
> >> dependencies.
> >>>>> So, +1 to [separation of interface and implementation; e.g. `Table` &
> >>>>> `TableImpl`] which you mentioned in the google doc.
> >>>>> Best,
> >>>>> Jincheng
> >>>>>
> >>>>> Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:
> >>>>>
> >>>>>> Hi Timo, thanks for driving this! I think that this is a nice thing
> >> to
> >>> do.
> >>>>>> While we are doing this, can we also keep in mind that we want to
> >>>>>> eventually have a TableAPI interface only module which users can
> take
> >>>>>> dependency on, but without including any implementation details?
> >>>>>>
> >>>>>> Xiaowei
> >>>>>>
> >>>>>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <fh...@gmail.com>
> >>> wrote:
> >>>>>>> Hi Timo,
> >>>>>>>
> >>>>>>> Thanks for writing up this document.
> >>>>>>> I like the new structure and agree to prioritize the porting of the
> >>>>>>> flink-table-common classes.
> >>>>>>> Since flink-table-runtime is (or should be) independent of the API
> >> and
> >>>>>>> planner modules, we could start porting these classes once the code
> >> is
> >>>>>>> split into the new module structure.
> >>>>>>> The benefits of a Scala-free flink-table-runtime would be a
> >> Scala-free
> >>>>>>> execution Jar.
> >>>>>>>
> >>>>>>> Best, Fabian
> >>>>>>>
> >>>>>>>
> >>>>>>> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
> >>>>>>> twalthr@apache.org
> >>>>>>>> :
> >>>>>>>> Hi everyone,
> >>>>>>>>
> >>>>>>>> I would like to continue this discussion thread and convert the
> >>> outcome
> >>>>>>>> into a FLIP such that users and contributors know what to expect
> in
> >>> the
> >>>>>>>> upcoming releases.
> >>>>>>>>
> >>>>>>>> I created a design document [1] that clarifies our motivation why
> >> we
> >>>>>>>> want to do this, how a Maven module structure could look like, and
> >> a
> >>>>>>>> suggestion for a migration plan.
> >>>>>>>>
> >>>>>>>> It would be great to start with the efforts for the 1.8 release
> >> such
> >>>>>>>> that new features can be developed in Java and major refactorings
> >>> such
> >>>>>>>> as improvements to the connectors and external catalog support are
> >>> not
> >>>>>>>> blocked.
> >>>>>>>>
> >>>>>>>> Please let me know what you think.
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Timo
> >>>>>>>>
> >>>>>>>> [1]
> >>>>>>>>
> >>>>>>>>
> >>
> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
> >>>>>>>> Am 02.07.18 um 17:08 schrieb Fabian Hueske:
> >>>>>>>>> Hi Piotr,
> >>>>>>>>>
> >>>>>>>>> thanks for bumping this thread and thanks for Xingcan for the
> >>>>>> comments.
> >>>>>>>>> I think the first step would be to separate the flink-table
> module
> >>>>>> into
> >>>>>>>>> multiple sub modules. These could be:
> >>>>>>>>>
> >>>>>>>>> - flink-table-api: All API facing classes. Can be later divided
> >>>>>> further
> >>>>>>>>> into Java/Scala Table API/SQL
> >>>>>>>>> - flink-table-planning: involves all planning (basically
> >> everything
> >>>>>> we
> >>>>>>> do
> >>>>>>>>> with Calcite)
> >>>>>>>>> - flink-table-runtime: the runtime code
> >>>>>>>>>
> >>>>>>>>> IMO, a realistic mid-term goal is to have the runtime module and
> >>>>>>> certain
> >>>>>>>>> parts of the planning module ported to Java.
> >>>>>>>>> The api module will be much harder to port because of several
> >>>>>>>> dependencies
> >>>>>>>>> to Scala core classes (the parser framework, tree iterations,
> >> etc.).
> >>>>>>> I'm
> >>>>>>>>> not saying we should not port this to Java, but it is not clear
> to
> >>> me
> >>>>>>>> (yet)
> >>>>>>>>> how to do it.
> >>>>>>>>>
> >>>>>>>>> I think flink-table-runtime should not be too hard to port. The
> >> code
> >>>>>>> does
> >>>>>>>>> not make use of many Scala features, i.e., it's writing very
> >>>>>> Java-like.
> >>>>>>>>> Also, there are not many dependencies and operators can be
> >>>>>> individually
> >>>>>>>>> ported step-by-step.
> >>>>>>>>> For flink-table-planning, we can have certain packages that we
> >> port
> >>>>>> to
> >>>>>>>> Java
> >>>>>>>>> like planning rules or plan nodes. The related classes mostly
> >> extend
> >>>>>>>>> Calcite's Java interfaces/classes and would be natural choices
> for
> >>>>>>> being
> >>>>>>>>> ported. The code generation classes will require more effort to
> >>> port.
> >>>>>>>> There
> >>>>>>>>> are also some dependencies in planning on the api module that we
> >>>>>> would
> >>>>>>>> need
> >>>>>>>>> to resolve somehow.
> >>>>>>>>>
> >>>>>>>>> For SQL most work when adding new features is done in the
> planning
> >>>>>> and
> >>>>>>>>> runtime modules. So, this separation should already reduce
> >>>>>>> "technological
> >>>>>>>>> dept" quite a lot.
> >>>>>>>>> The Table API depends much more on Scala than SQL.
> >>>>>>>>>
> >>>>>>>>> Cheers, Fabian
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xi...@gmail.com>:
> >>>>>>>>>
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> I also think about this problem these days and here are my
> >>> thoughts.
> >>>>>>>>>> 1) We must admit that it’s really a tough task to interoperate
> >> with
> >>>>>>> Java
> >>>>>>>>>> and Scala. E.g., they have different collection types (Scala
> >>>>>>> collections
> >>>>>>>>>> v.s. java.util.*) and in Java, it's hard to implement a method
> >>> which
> >>>>>>>> takes
> >>>>>>>>>> Scala functions as parameters. Considering the major part of the
> >>>>>> code
> >>>>>>>> base
> >>>>>>>>>> is implemented in Java, +1 for this goal from a long-term view.
> >>>>>>>>>>
> >>>>>>>>>> 2) The ideal solution would be to just expose a Scala API and
> >> make
> >>>>>> all
> >>>>>>>> the
> >>>>>>>>>> other parts Scala-free. But I am not sure if it could be
> achieved
> >>>>>> even
> >>>>>>>> in a
> >>>>>>>>>> long-term. Thus as Timo suggested, keep the Scala codes in
> >>>>>>>>>> "flink-table-core" would be a compromise solution.
> >>>>>>>>>>
> >>>>>>>>>> 3) If the community makes the final decision, maybe any new
> >>> features
> >>>>>>>>>> should be added in Java (regardless of the modules), in order to
> >>>>>>> prevent
> >>>>>>>>>> the Scala codes from growing.
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Xingcan
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
> >>>>>> piotr@data-artisans.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>> Bumping the topic.
> >>>>>>>>>>>
> >>>>>>>>>>> If we want to do this, the sooner we decide, the less code we
> >> will
> >>>>>>> have
> >>>>>>>>>> to rewrite. I have some objections/counter proposals to Fabian's
> >>>>>>>> proposal
> >>>>>>>>>> of doing it module wise and one module at a time.
> >>>>>>>>>>> First, I do not see a problem of having java/scala code even
> >>> within
> >>>>>>> one
> >>>>>>>>>> module, especially not if there are clean boundaries. Like we
> >> could
> >>>>>>> have
> >>>>>>>>>> API in Scala and optimizer rules/logical nodes written in Java
> in
> >>>>>> the
> >>>>>>>> same
> >>>>>>>>>> module. However I haven’t previously maintained mixed scala/java
> >>>>>> code
> >>>>>>>> bases
> >>>>>>>>>> before, so I might be missing something here.
> >>>>>>>>>>> Secondly this whole migration might and most like will take
> >> longer
> >>>>>>> then
> >>>>>>>>>> expected, so that creates a problem for a new code that we will
> >> be
> >>>>>>>>>> creating. After making a decision to migrate to Java, almost any
> >>> new
> >>>>>>>> Scala
> >>>>>>>>>> line of code will be immediately a technological debt and we
> will
> >>>>>> have
> >>>>>>>> to
> >>>>>>>>>> rewrite it to Java later.
> >>>>>>>>>>> Thus I would propose first to state our end goal - modules
> >>>>>> structure
> >>>>>>>> and
> >>>>>>>>>> which parts of modules we want to have eventually Scala-free.
> >>>>>> Secondly
> >>>>>>>>>> taking all steps necessary that will allow us to write new code
> >>>>>>>> complaint
> >>>>>>>>>> with our end goal. Only after that we should/could focus on
> >>>>>>>> incrementally
> >>>>>>>>>> rewriting the old code. Otherwise we could be stuck/blocked for
> >>>>>> years
> >>>>>>>>>> writing new code in Scala (and increasing technological debt),
> >>>>>> because
> >>>>>>>>>> nobody have found a time to rewrite some non important and not
> >>>>>>> actively
> >>>>>>>>>> developed part of some module.
> >>>>>>>>>>> Piotrek
> >>>>>>>>>>>
> >>>>>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fh...@gmail.com>
> >>>>>> wrote:
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>>
> >>>>>>>>>>>> In general, I think this is a good effort. However, it won't
> be
> >>>>>> easy
> >>>>>>>>>> and I
> >>>>>>>>>>>> think we have to plan this well.
> >>>>>>>>>>>> I don't like the idea of having the whole code base fragmented
> >>>>>> into
> >>>>>>>> Java
> >>>>>>>>>>>> and Scala code for too long.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I think we should do this one step at a time and focus on
> >>>>>> migrating
> >>>>>>>> one
> >>>>>>>>>>>> module at a time.
> >>>>>>>>>>>> IMO, the easiest start would be to port the runtime to Java.
> >>>>>>>>>>>> Extracting the API classes into an own module, porting them to
> >>>>>> Java,
> >>>>>>>> and
> >>>>>>>>>>>> removing the Scala dependency won't be possible without
> >> breaking
> >>>>>> the
> >>>>>>>> API
> >>>>>>>>>>>> since a few classes depend on the Scala Table API.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best, Fabian
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <
> trohrmann@apache.org
> >>> :
> >>>>>>>>>>>>> I think that is a noble and honorable goal and we should
> >> strive
> >>>>>> for
> >>>>>>>> it.
> >>>>>>>>>>>>> This, however, must be an iterative process given the sheer
> >> size
> >>>>>> of
> >>>>>>>> the
> >>>>>>>>>>>>> code base. I like the approach to define common Java modules
> >>>>>> which
> >>>>>>>> are
> >>>>>>>>>> used
> >>>>>>>>>>>>> by more specific Scala modules and slowly moving classes from
> >>>>>> Scala
> >>>>>>>> to
> >>>>>>>>>>>>> Java. Thus +1 for the proposal.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>> Till
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
> >>>>>>>>>> piotr@data-artisans.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I do not have an experience with how scala and java
> interacts
> >>>>>> with
> >>>>>>>>>> each
> >>>>>>>>>>>>>> other, so I can not fully validate your proposal, but
> >> generally
> >>>>>>>>>> speaking
> >>>>>>>>>>>>> +1
> >>>>>>>>>>>>>> from me.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Does it also mean, that we should slowly migrate
> >>>>>>> `flink-table-core`
> >>>>>>>> to
> >>>>>>>>>>>>>> Java? How would you envision it? It would be nice to be able
> >> to
> >>>>>>> add
> >>>>>>>>>> new
> >>>>>>>>>>>>>> classes/features written in Java and so that they can
> coexist
> >>>>>> with
> >>>>>>>> old
> >>>>>>>>>>>>>> Scala code until we gradually switch from Scala to Java.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Piotrek
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <twalthr@apache.org
> >
> >>>>>>> wrote:
> >>>>>>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> as you all know, currently the Table & SQL API is
> >> implemented
> >>>>>> in
> >>>>>>>>>> Scala.
> >>>>>>>>>>>>>> This decision was made a long-time ago when the initital
> code
> >>>>>> base
> >>>>>>>> was
> >>>>>>>>>>>>>> created as part of a master's thesis. The community kept
> >> Scala
> >>>>>>>>>> because of
> >>>>>>>>>>>>>> the nice language features that enable a fluent Table API
> >> like
> >>>>>>>>>>>>>> table.select('field.trim()) and because Scala allows for
> >> quick
> >>>>>>>>>>>>> prototyping
> >>>>>>>>>>>>>> (e.g. multi-line comments for code generation). The
> >> committers
> >>>>>>>>>> enforced
> >>>>>>>>>>>>> not
> >>>>>>>>>>>>>> splitting the code-base into two programming languages.
> >>>>>>>>>>>>>>> However, nowadays the flink-table module more and more
> >> becomes
> >>>>>> an
> >>>>>>>>>>>>>> important part in the Flink ecosystem. Connectors, formats,
> >> and
> >>>>>>> SQL
> >>>>>>>>>>>>> client
> >>>>>>>>>>>>>> are actually implemented in Java but need to interoperate
> >> with
> >>>>>>>>>>>>> flink-table
> >>>>>>>>>>>>>> which makes these modules dependent on Scala. As mentioned
> in
> >>> an
> >>>>>>>>>> earlier
> >>>>>>>>>>>>>> mail thread, using Scala for API classes also exposes member
> >>>>>>>> variables
> >>>>>>>>>>>>> and
> >>>>>>>>>>>>>> methods in Java that should not be exposed to users [1].
> Java
> >>> is
> >>>>>>>> still
> >>>>>>>>>>>>> the
> >>>>>>>>>>>>>> most important API language and right now we treat it as a
> >>>>>>>>>> second-class
> >>>>>>>>>>>>>> citizen. I just noticed that you even need to add Scala if
> >> you
> >>>>>>> just
> >>>>>>>>>> want
> >>>>>>>>>>>>> to
> >>>>>>>>>>>>>> implement a ScalarFunction because of method clashes between
> >>>>>>> `public
> >>>>>>>>>>>>> String
> >>>>>>>>>>>>>> toString()` and `public scala.Predef.String toString()`.
> >>>>>>>>>>>>>>> Given the size of the current code base, reimplementing the
> >>>>>>> entire
> >>>>>>>>>>>>>> flink-table code in Java is a goal that we might never
> reach.
> >>>>>>>>>> However, we
> >>>>>>>>>>>>>> should at least treat the symptoms and have this as a
> >> long-term
> >>>>>>> goal
> >>>>>>>>>> in
> >>>>>>>>>>>>>> mind. My suggestion would be to convert user-facing and
> >> runtime
> >>>>>>>>>> classes
> >>>>>>>>>>>>> and
> >>>>>>>>>>>>>> split the code base into multiple modules:
> >>>>>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
> >>>>>>>>>>>>>>> Implemented in Java. Java users can use this. This would
> >>>>>> require
> >>>>>>> to
> >>>>>>>>>>>>>> convert classes like TableEnvironment, Table.
> >>>>>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
> >>>>>>>>>>>>>>> Implemented in Scala. Scala users can use this.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> flink-table-common
> >>>>>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs can use
> >>>>>> this.
> >>>>>>> It
> >>>>>>>>>>>>>> contains interface classes such as descriptors, table sink,
> >>>>>> table
> >>>>>>>>>> source.
> >>>>>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
> >>>>>>>>>>>>>> flink-table-runtime}
> >>>>>>>>>>>>>>> Implemented in Scala. Contains the current main code base.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> flink-table-runtime
> >>>>>>>>>>>>>>> Implemented in Java. This would require to convert classes
> >> in
> >>>>>>>>>>>>>> o.a.f.table.runtime but would improve the runtime
> >> potentially.
> >>>>>>>>>>>>>>> What do you think?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Timo
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
> >>>>>>>>>>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
> >>>>>>>>>> traits-tp21335.html
> >>>
>
>
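
The separation of interface and implementation discussed earlier in this thread
(`Table` & `TableImpl`) could look roughly like the following sketch; the names
follow the discussion, but the bodies are invented and only illustrate why an
API-only module would need almost no dependencies.

    // Sketch only: the names follow the discussion, the bodies are invented.

    // Lives in the API-only module that users depend on; plain Java, no planner imports.
    interface Table {
        Table select(String fields);
        Table filter(String predicate);
    }

    // Lives in the planner/implementation module and is never referenced directly
    // by user programs; they only ever see the Table interface.
    final class TableImpl implements Table {
        private final String logicalPlan; // placeholder for the real plan representation

        TableImpl(String logicalPlan) {
            this.logicalPlan = logicalPlan;
        }

        @Override
        public Table select(String fields) {
            return new TableImpl(logicalPlan + " -> select(" + fields + ")");
        }

        @Override
        public Table filter(String predicate) {
            return new TableImpl(logicalPlan + " -> filter(" + predicate + ")");
        }
    }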

Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Timo Walther <tw...@apache.org>.
Hi Hequn,

thanks for your feedback. Yes, migrating the test cases is another issue 
that is not represented in the document but should naturally go along 
with the migration.

I agree that we should migrate the main API classes quickly within this 
1.8 release after the module split has been performed. Help here is 
highly appreciated!

I forgot that Java supports static methods in interfaces now, but 
actually I don't like the design of calling `TableEnvironment.get(env)`. 
Because people often use `TableEnvironment tEnv = 
TableEnvironment.get(env)` and then wonder why there is no 
`toAppendStream` or `toDataSet` because they are using the base class. 
However, things like that can be discussed in the corresponding issue 
when it comes to implementation.
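
To sketch that concern (with illustrative names only, not the actual
flink-table API): a static factory on the base interface compiles fine on
Java 8, but it naturally returns the base type, which hides the stream- or
batch-specific conversions from the caller.

    // Sketch only: names are illustrative, not the real flink-table classes.
    interface TableEnvironment {
        // Java 8 allows a static factory directly on the interface ...
        static TableEnvironment get(Object executionEnvironment) {
            return new StreamTableEnvironmentImpl();
        }
    }

    interface StreamTableEnvironment extends TableEnvironment {
        // Stream-specific conversions only exist on the sub-interface.
        Object toAppendStream(Object table);
    }

    class StreamTableEnvironmentImpl implements StreamTableEnvironment {
        @Override
        public Object toAppendStream(Object table) {
            return table; // placeholder body
        }
    }

    class Caller {
        void example() {
            TableEnvironment tEnv = TableEnvironment.get(new Object());
            // tEnv.toAppendStream(...) does not compile here: the variable is typed
            // with the base interface, which is exactly the confusion described above.
        }
    }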

@Vino: I think your work fits nicely to these efforts.

@everyone: I will wait for more feedback until end of this week. Then I 
will convert the design document into a FLIP and open subtasks in Jira, 
if there are no objections?

Regards,
Timo

Am 24.11.18 um 13:45 schrieb vino yang:
> Hi hequn,
>
> I am very glad to hear that you are interested in this work.
> As we all know, this process involves a lot.
> Currently, the migration work has begun. I started with the
> Kafka connector's dependency on flink-table and moved the
> related dependencies to flink-table-common.
> This work is tracked by FLINK-9461.  [1]
> I don't know if it will conflict with what you expect to do, but from the
> impact I have observed,
> it will involve many classes that are currently in flink-table.
>
> *Just a statement to prevent unnecessary conflicts.*
>
> Thanks, vino.
>
> [1]: https://issues.apache.org/jira/browse/FLINK-9461
>
> Hequn Cheng <ch...@gmail.com> 于2018年11月24日周六 下午7:20写道:
>
>> Hi Timo,
>>
>> Thanks for the effort and writing up this document. I like the idea to make
>> flink-table scala free, so +1 for the proposal!
>>
>> It's good to make Java the first-class citizen. For a long time, we have
>> neglected java so that many features in Table are missed in Java Test
>> cases, such as this one[1] I found recently. And I think we may also need
>> to migrate our test cases, i.e, add java tests.
>>
>> This definitely is a big change and will break API compatible. In order to
>> bring a smaller impact on users, I think we should go fast when we migrate
>> APIs targeted to users. It's better to introduce the user sensitive changes
>> within a release. However, it may be not that easy. I can help to
>> contribute.
>>
>> Separation of interface and implementation is a good idea. This may
>> introduce a minimum of dependencies or even no dependencies. I saw your
>> reply in the google doc. Java8 has already supported static method for
>> interfaces, I think we can make use of it?
>>
>> Best,
>> Hequn
>>
>> [1] https://issues.apache.org/jira/browse/FLINK-11001
>>
>>
>> On Fri, Nov 23, 2018 at 5:36 PM Timo Walther <tw...@apache.org> wrote:
>>
>>> Hi everyone,
>>>
>>> thanks for the great feedback so far. I updated the document with the
>>> input I got so far
>>>
>>> @Fabian: I moved the porting of flink-table-runtime classes up in the
>> list.
>>> @Xiaowei: Could you elaborate what "interface only" means to you? Do you
>>> mean a module containing pure Java `interface`s? Or is the validation
>>> logic also part of the API module? Are 50+ expression classes part of
>>> the API interface or already too implementation-specific?
>>>
>>> @Xuefu: I extended the document by almost a page to clarify when we
>>> should develop in Scala and when in Java. As Piotr said, every new Scala
>>> line is instant technical debt.
>>>
>>> Thanks,
>>> Timo
>>>
>>>
>>> Am 23.11.18 um 10:29 schrieb Piotr Nowojski:
>>>> Hi Timo,
>>>>
>>>> Thanks for writing this down +1 from my side :)
>>>>
>>>>> I'm wondering that whether we can have rule in the interim when Java
>>> and Scala coexist that dependency can only be one-way. I found that in
>> the
>>> current code base there are cases where a Scala class extends Java and
>> vise
>>> versa. This is quite painful. I'm thinking if we could say that extension
>>> can only be from Java to Scala, which will help the situation. However,
>> I'm
>>> not sure if this is practical.
>>>> Xuefu: I’m also not sure what’s the best approach here, probably we
>> will
>>> have to work it out as we go. One thing to consider is that from now on,
>>> every single new code line written in Scala anywhere in Flink-table
>> (except
>>> of Flink-table-api-scala) is an instant technological debt. From this
>>> perspective I would be in favour of tolerating quite big inchonvieneces
>>> just to avoid any new Scala code.
>>>> Piotrek
>>>>
>>>>> On 23 Nov 2018, at 03:25, Zhang, Xuefu <xu...@alibaba-inc.com>
>> wrote:
>>>>> Hi Timo,
>>>>>
>>>>> Thanks for the effort and the Google writeup. During our external
>>> catalog rework, we found much confusion between Java and Scala, and this
>>> Scala-free roadmap should greatly mitigate that.
>>>>> I'm wondering that whether we can have rule in the interim when Java
>>> and Scala coexist that dependency can only be one-way. I found that in
>> the
>>> current code base there are cases where a Scala class extends Java and
>> vise
>>> versa. This is quite painful. I'm thinking if we could say that extension
>>> can only be from Java to Scala, which will help the situation. However,
>> I'm
>>> not sure if this is practical.
>>>>> Thanks,
>>>>> Xuefu
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------
>>>>> Sender:jincheng sun <su...@gmail.com>
>>>>> Sent at:2018 Nov 23 (Fri) 09:49
>>>>> Recipient:dev <de...@flink.apache.org>
>>>>> Subject:Re: [DISCUSS] Long-term goal of making flink-table Scala-free
>>>>>
>>>>> Hi Timo,
>>>>> Thanks for initiating this great discussion.
>>>>>
>>>>> Currently when using SQL/TableAPI should include many dependence. In
>>>>> particular, it is not necessary to introduce the specific
>> implementation
>>>>> dependencies which users do not care about. So I am glad to see your
>>>>> proposal, and hope when we consider splitting the API interface into a
>>>>> separate module, so that the user can introduce minimum of
>> dependencies.
>>>>> So, +1 to [separation of interface and implementation; e.g. `Table` &
>>>>> `TableImpl`] which you mentioned in the google doc.
>>>>> Best,
>>>>> Jincheng
>>>>>
>>>>> Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:
>>>>>
>>>>>> Hi Timo, thanks for driving this! I think that this is a nice thing
>> to
>>> do.
>>>>>> While we are doing this, can we also keep in mind that we want to
>>>>>> eventually have a TableAPI interface only module which users can take
>>>>>> dependency on, but without including any implementation details?
>>>>>>
>>>>>> Xiaowei
>>>>>>
>>>>>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <fh...@gmail.com>
>>> wrote:
>>>>>>> Hi Timo,
>>>>>>>
>>>>>>> Thanks for writing up this document.
>>>>>>> I like the new structure and agree to prioritize the porting of the
>>>>>>> flink-table-common classes.
>>>>>>> Since flink-table-runtime is (or should be) independent of the API
>> and
>>>>>>> planner modules, we could start porting these classes once the code
>> is
>>>>>>> split into the new module structure.
>>>>>>> The benefits of a Scala-free flink-table-runtime would be a
>> Scala-free
>>>>>>> execution Jar.
>>>>>>>
>>>>>>> Best, Fabian
>>>>>>>
>>>>>>>
>>>>>>> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
>>>>>>> twalthr@apache.org
>>>>>>>> :
>>>>>>>> Hi everyone,
>>>>>>>>
>>>>>>>> I would like to continue this discussion thread and convert the
>>> outcome
>>>>>>>> into a FLIP such that users and contributors know what to expect in
>>> the
>>>>>>>> upcoming releases.
>>>>>>>>
>>>>>>>> I created a design document [1] that clarifies our motivation why
>> we
>>>>>>>> want to do this, how a Maven module structure could look like, and
>> a
>>>>>>>> suggestion for a migration plan.
>>>>>>>>
>>>>>>>> It would be great to start with the efforts for the 1.8 release
>> such
>>>>>>>> that new features can be developed in Java and major refactorings
>>> such
>>>>>>>> as improvements to the connectors and external catalog support are
>>> not
>>>>>>>> blocked.
>>>>>>>>
>>>>>>>> Please let me know what you think.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Timo
>>>>>>>>
>>>>>>>> [1]
>>>>>>>>
>>>>>>>>
>> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
>>>>>>>> Am 02.07.18 um 17:08 schrieb Fabian Hueske:
>>>>>>>>> Hi Piotr,
>>>>>>>>>
>>>>>>>>> thanks for bumping this thread and thanks for Xingcan for the
>>>>>> comments.
>>>>>>>>> I think the first step would be to separate the flink-table module
>>>>>> into
>>>>>>>>> multiple sub modules. These could be:
>>>>>>>>>
>>>>>>>>> - flink-table-api: All API facing classes. Can be later divided
>>>>>> further
>>>>>>>>> into Java/Scala Table API/SQL
>>>>>>>>> - flink-table-planning: involves all planning (basically
>> everything
>>>>>> we
>>>>>>> do
>>>>>>>>> with Calcite)
>>>>>>>>> - flink-table-runtime: the runtime code
>>>>>>>>>
>>>>>>>>> IMO, a realistic mid-term goal is to have the runtime module and
>>>>>>> certain
>>>>>>>>> parts of the planning module ported to Java.
>>>>>>>>> The api module will be much harder to port because of several
>>>>>>>> dependencies
>>>>>>>>> to Scala core classes (the parser framework, tree iterations,
>> etc.).
>>>>>>> I'm
>>>>>>>>> not saying we should not port this to Java, but it is not clear to
>>> me
>>>>>>>> (yet)
>>>>>>>>> how to do it.
>>>>>>>>>
>>>>>>>>> I think flink-table-runtime should not be too hard to port. The
>> code
>>>>>>> does
>>>>>>>>> not make use of many Scala features, i.e., it's writing very
>>>>>> Java-like.
>>>>>>>>> Also, there are not many dependencies and operators can be
>>>>>> individually
>>>>>>>>> ported step-by-step.
>>>>>>>>> For flink-table-planning, we can have certain packages that we
>> port
>>>>>> to
>>>>>>>> Java
>>>>>>>>> like planning rules or plan nodes. The related classes mostly
>> extend
>>>>>>>>> Calcite's Java interfaces/classes and would be natural choices for
>>>>>>> being
>>>>>>>>> ported. The code generation classes will require more effort to
>>> port.
>>>>>>>> There
>>>>>>>>> are also some dependencies in planning on the api module that we
>>>>>> would
>>>>>>>> need
>>>>>>>>> to resolve somehow.
>>>>>>>>>
>>>>>>>>> For SQL most work when adding new features is done in the planning
>>>>>> and
>>>>>>>>> runtime modules. So, this separation should already reduce
>>>>>>> "technological
>>>>>>>>> dept" quite a lot.
>>>>>>>>> The Table API depends much more on Scala than SQL.
>>>>>>>>>
>>>>>>>>> Cheers, Fabian
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xi...@gmail.com>:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> I also think about this problem these days and here are my
>>> thoughts.
>>>>>>>>>> 1) We must admit that it’s really a tough task to interoperate
>> with
>>>>>>> Java
>>>>>>>>>> and Scala. E.g., they have different collection types (Scala
>>>>>>> collections
>>>>>>>>>> v.s. java.util.*) and in Java, it's hard to implement a method
>>> which
>>>>>>>> takes
>>>>>>>>>> Scala functions as parameters. Considering the major part of the
>>>>>> code
>>>>>>>> base
>>>>>>>>>> is implemented in Java, +1 for this goal from a long-term view.
>>>>>>>>>>
>>>>>>>>>> 2) The ideal solution would be to just expose a Scala API and
>> make
>>>>>> all
>>>>>>>> the
>>>>>>>>>> other parts Scala-free. But I am not sure if it could be achieved
>>>>>> even
>>>>>>>> in a
>>>>>>>>>> long-term. Thus as Timo suggested, keep the Scala codes in
>>>>>>>>>> "flink-table-core" would be a compromise solution.
>>>>>>>>>>
>>>>>>>>>> 3) If the community makes the final decision, maybe any new
>>> features
>>>>>>>>>> should be added in Java (regardless of the modules), in order to
>>>>>>> prevent
>>>>>>>>>> the Scala codes from growing.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Xingcan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
>>>>>> piotr@data-artisans.com>
>>>>>>>>>> wrote:
>>>>>>>>>>> Bumping the topic.
>>>>>>>>>>>
>>>>>>>>>>> If we want to do this, the sooner we decide, the less code we
>> will
>>>>>>> have
>>>>>>>>>> to rewrite. I have some objections/counter proposals to Fabian's
>>>>>>>> proposal
>>>>>>>>>> of doing it module wise and one module at a time.
>>>>>>>>>>> First, I do not see a problem of having java/scala code even
>>> within
>>>>>>> one
>>>>>>>>>> module, especially not if there are clean boundaries. Like we
>> could
>>>>>>> have
>>>>>>>>>> API in Scala and optimizer rules/logical nodes written in Java in
>>>>>> the
>>>>>>>> same
>>>>>>>>>> module. However I haven’t previously maintained mixed scala/java
>>>>>> code
>>>>>>>> bases
>>>>>>>>>> before, so I might be missing something here.
>>>>>>>>>>> Secondly this whole migration might and most like will take
>> longer
>>>>>>> then
>>>>>>>>>> expected, so that creates a problem for a new code that we will
>> be
>>>>>>>>>> creating. After making a decision to migrate to Java, almost any
>>> new
>>>>>>>> Scala
>>>>>>>>>> line of code will be immediately a technological debt and we will
>>>>>> have
>>>>>>>> to
>>>>>>>>>> rewrite it to Java later.
>>>>>>>>>>> Thus I would propose first to state our end goal - modules
>>>>>> structure
>>>>>>>> and
>>>>>>>>>> which parts of modules we want to have eventually Scala-free.
>>>>>> Secondly
>>>>>>>>>> taking all steps necessary that will allow us to write new code
>>>>>>>> complaint
>>>>>>>>>> with our end goal. Only after that we should/could focus on
>>>>>>>> incrementally
>>>>>>>>>> rewriting the old code. Otherwise we could be stuck/blocked for
>>>>>> years
>>>>>>>>>> writing new code in Scala (and increasing technological debt),
>>>>>> because
>>>>>>>>>> nobody have found a time to rewrite some non important and not
>>>>>>> actively
>>>>>>>>>> developed part of some module.
>>>>>>>>>>> Piotrek
>>>>>>>>>>>
>>>>>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fh...@gmail.com>
>>>>>> wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> In general, I think this is a good effort. However, it won't be
>>>>>> easy
>>>>>>>>>> and I
>>>>>>>>>>>> think we have to plan this well.
>>>>>>>>>>>> I don't like the idea of having the whole code base fragmented
>>>>>> into
>>>>>>>> Java
>>>>>>>>>>>> and Scala code for too long.
>>>>>>>>>>>>
>>>>>>>>>>>> I think we should do this one step at a time and focus on
>>>>>> migrating
>>>>>>>> one
>>>>>>>>>>>> module at a time.
>>>>>>>>>>>> IMO, the easiest start would be to port the runtime to Java.
>>>>>>>>>>>> Extracting the API classes into an own module, porting them to
>>>>>> Java,
>>>>>>>> and
>>>>>>>>>>>> removing the Scala dependency won't be possible without
>> breaking
>>>>>> the
>>>>>>>> API
>>>>>>>>>>>> since a few classes depend on the Scala Table API.
>>>>>>>>>>>>
>>>>>>>>>>>> Best, Fabian
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <trohrmann@apache.org
>>> :
>>>>>>>>>>>>> I think that is a noble and honorable goal and we should
>> strive
>>>>>> for
>>>>>>>> it.
>>>>>>>>>>>>> This, however, must be an iterative process given the sheer
>> size
>>>>>> of
>>>>>>>> the
>>>>>>>>>>>>> code base. I like the approach to define common Java modules
>>>>>> which
>>>>>>>> are
>>>>>>>>>> used
>>>>>>>>>>>>> by more specific Scala modules and slowly moving classes from
>>>>>> Scala
>>>>>>>> to
>>>>>>>>>>>>> Java. Thus +1 for the proposal.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Till
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
>>>>>>>>>> piotr@data-artisans.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I do not have an experience with how scala and java interacts
>>>>>> with
>>>>>>>>>> each
>>>>>>>>>>>>>> other, so I can not fully validate your proposal, but
>> generally
>>>>>>>>>> speaking
>>>>>>>>>>>>> +1
>>>>>>>>>>>>>> from me.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Does it also mean, that we should slowly migrate
>>>>>>> `flink-table-core`
>>>>>>>> to
>>>>>>>>>>>>>> Java? How would you envision it? It would be nice to be able
>> to
>>>>>>> add
>>>>>>>>>> new
>>>>>>>>>>>>>> classes/features written in Java and so that they can coexist
>>>>>> with
>>>>>>>> old
>>>>>>>>>>>>>> Scala code until we gradually switch from Scala to Java.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Piotrek
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <tw...@apache.org>
>>>>>>> wrote:
>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> as you all know, currently the Table & SQL API is
>> implemented
>>>>>> in
>>>>>>>>>> Scala.
>>>>>>>>>>>>>> This decision was made a long-time ago when the initital code
>>>>>> base
>>>>>>>> was
>>>>>>>>>>>>>> created as part of a master's thesis. The community kept
>> Scala
>>>>>>>>>> because of
>>>>>>>>>>>>>> the nice language features that enable a fluent Table API
>> like
>>>>>>>>>>>>>> table.select('field.trim()) and because Scala allows for
>> quick
>>>>>>>>>>>>> prototyping
>>>>>>>>>>>>>> (e.g. multi-line comments for code generation). The
>> committers
>>>>>>>>>> enforced
>>>>>>>>>>>>> not
>>>>>>>>>>>>>> splitting the code-base into two programming languages.
>>>>>>>>>>>>>>> However, nowadays the flink-table module more and more
>> becomes
>>>>>> an
>>>>>>>>>>>>>> important part in the Flink ecosystem. Connectors, formats,
>> and
>>>>>>> SQL
>>>>>>>>>>>>> client
>>>>>>>>>>>>>> are actually implemented in Java but need to interoperate
>> with
>>>>>>>>>>>>> flink-table
>>>>>>>>>>>>>> which makes these modules dependent on Scala. As mentioned in
>>> an
>>>>>>>>>> earlier
>>>>>>>>>>>>>> mail thread, using Scala for API classes also exposes member
>>>>>>>> variables
>>>>>>>>>>>>> and
>>>>>>>>>>>>>> methods in Java that should not be exposed to users [1]. Java
>>> is
>>>>>>>> still
>>>>>>>>>>>>> the
>>>>>>>>>>>>>> most important API language and right now we treat it as a
>>>>>>>>>> second-class
>>>>>>>>>>>>>> citizen. I just noticed that you even need to add Scala if
>> you
>>>>>>> just
>>>>>>>>>> want
>>>>>>>>>>>>> to
>>>>>>>>>>>>>> implement a ScalarFunction because of method clashes between
>>>>>>> `public
>>>>>>>>>>>>> String
>>>>>>>>>>>>>> toString()` and `public scala.Predef.String toString()`.
>>>>>>>>>>>>>>> Given the size of the current code base, reimplementing the
>>>>>>> entire
>>>>>>>>>>>>>> flink-table code in Java is a goal that we might never reach.
>>>>>>>>>> However, we
>>>>>>>>>>>>>> should at least treat the symptoms and have this as a
>> long-term
>>>>>>> goal
>>>>>>>>>> in
>>>>>>>>>>>>>> mind. My suggestion would be to convert user-facing and
>> runtime
>>>>>>>>>> classes
>>>>>>>>>>>>> and
>>>>>>>>>>>>>> split the code base into multiple modules:
>>>>>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
>>>>>>>>>>>>>>> Implemented in Java. Java users can use this. This would
>>>>>> require
>>>>>>> to
>>>>>>>>>>>>>> convert classes like TableEnvironment, Table.
>>>>>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
>>>>>>>>>>>>>>> Implemented in Scala. Scala users can use this.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> flink-table-common
>>>>>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs can use
>>>>>> this.
>>>>>>> It
>>>>>>>>>>>>>> contains interface classes such as descriptors, table sink,
>>>>>> table
>>>>>>>>>> source.
>>>>>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
>>>>>>>>>>>>>> flink-table-runtime}
>>>>>>>>>>>>>>> Implemented in Scala. Contains the current main code base.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> flink-table-runtime
>>>>>>>>>>>>>>> Implemented in Java. This would require to convert classes
>> in
>>>>>>>>>>>>>> o.a.f.table.runtime but would improve the runtime
>> potentially.
>>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
>>>>>>>>>>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
>>>>>>>>>> traits-tp21335.html
>>>


Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by vino yang <ya...@gmail.com>.
Hi Hequn,

I am very glad to hear that you are interested in this work.
As we all know, this effort involves quite a lot of work.
The migration has already begun: I started with the Kafka connector's
dependency on flink-table and moved the related dependencies to
flink-table-common. This work is tracked by FLINK-9461 [1].
I don't know whether it will conflict with what you plan to do, but from
the impact I have observed, it will touch many classes that currently
live in flink-table.
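
To give an idea of the end state: a connector-side class should
eventually compile against flink-table-common (plus the DataStream API)
only. A very simplified, hypothetical sketch (not the actual Kafka code;
method bodies omitted):

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.TableSchema;
import org.apache.flink.table.sources.StreamTableSource;
import org.apache.flink.types.Row;

// The point is the imports: once the table interfaces above live in
// flink-table-common, a connector no longer drags in the Scala-based
// flink-table module.
public class MyKafkaTableSource implements StreamTableSource<Row> {

    @Override
    public DataStream<Row> getDataStream(StreamExecutionEnvironment execEnv) {
        // would create a Kafka consumer and return execEnv.addSource(...)
        throw new UnsupportedOperationException("sketch only");
    }

    @Override
    public TypeInformation<Row> getReturnType() {
        throw new UnsupportedOperationException("sketch only");
    }

    @Override
    public TableSchema getTableSchema() {
        throw new UnsupportedOperationException("sketch only");
    }

    @Override
    public String explainSource() {
        return "MyKafkaTableSource";
    }
}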

*Just a statement to prevent unnecessary conflicts.*

Thanks, vino.

[1]: https://issues.apache.org/jira/browse/FLINK-9461

Hequn Cheng <ch...@gmail.com> wrote on Sat, Nov 24, 2018 at 7:20 PM:

> Hi Timo,
>
> Thanks for the effort and writing up this document. I like the idea to make
> flink-table scala free, so +1 for the proposal!
>
> It's good to make Java the first-class citizen. For a long time, we have
> neglected java so that many features in Table are missed in Java Test
> cases, such as this one[1] I found recently. And I think we may also need
> to migrate our test cases, i.e, add java tests.
>
> This definitely is a big change and will break API compatible. In order to
> bring a smaller impact on users, I think we should go fast when we migrate
> APIs targeted to users. It's better to introduce the user sensitive changes
> within a release. However, it may be not that easy. I can help to
> contribute.
>
> Separation of interface and implementation is a good idea. This may
> introduce a minimum of dependencies or even no dependencies. I saw your
> reply in the google doc. Java8 has already supported static method for
> interfaces, I think we can make use of it?
>
> Best,
> Hequn
>
> [1] https://issues.apache.org/jira/browse/FLINK-11001
>
>
> On Fri, Nov 23, 2018 at 5:36 PM Timo Walther <tw...@apache.org> wrote:
>
> > Hi everyone,
> >
> > thanks for the great feedback so far. I updated the document with the
> > input I got so far
> >
> > @Fabian: I moved the porting of flink-table-runtime classes up in the
> list.
> >
> > @Xiaowei: Could you elaborate what "interface only" means to you? Do you
> > mean a module containing pure Java `interface`s? Or is the validation
> > logic also part of the API module? Are 50+ expression classes part of
> > the API interface or already too implementation-specific?
> >
> > @Xuefu: I extended the document by almost a page to clarify when we
> > should develop in Scala and when in Java. As Piotr said, every new Scala
> > line is instant technical debt.
> >
> > Thanks,
> > Timo
> >
> >
> > Am 23.11.18 um 10:29 schrieb Piotr Nowojski:
> > > Hi Timo,
> > >
> > > Thanks for writing this down +1 from my side :)
> > >
> > >> I'm wondering that whether we can have rule in the interim when Java
> > and Scala coexist that dependency can only be one-way. I found that in
> the
> > current code base there are cases where a Scala class extends Java and
> vise
> > versa. This is quite painful. I'm thinking if we could say that extension
> > can only be from Java to Scala, which will help the situation. However,
> I'm
> > not sure if this is practical.
> > > Xuefu: I’m also not sure what’s the best approach here, probably we
> will
> > have to work it out as we go. One thing to consider is that from now on,
> > every single new code line written in Scala anywhere in Flink-table
> (except
> > of Flink-table-api-scala) is an instant technological debt. From this
> > perspective I would be in favour of tolerating quite big inchonvieneces
> > just to avoid any new Scala code.
> > >
> > > Piotrek
> > >
> > >> On 23 Nov 2018, at 03:25, Zhang, Xuefu <xu...@alibaba-inc.com>
> wrote:
> > >>
> > >> Hi Timo,
> > >>
> > >> Thanks for the effort and the Google writeup. During our external
> > catalog rework, we found much confusion between Java and Scala, and this
> > Scala-free roadmap should greatly mitigate that.
> > >>
> > >> I'm wondering that whether we can have rule in the interim when Java
> > and Scala coexist that dependency can only be one-way. I found that in
> the
> > current code base there are cases where a Scala class extends Java and
> vise
> > versa. This is quite painful. I'm thinking if we could say that extension
> > can only be from Java to Scala, which will help the situation. However,
> I'm
> > not sure if this is practical.
> > >>
> > >> Thanks,
> > >> Xuefu
> > >>
> > >>
> > >> ------------------------------------------------------------------
> > >> Sender:jincheng sun <su...@gmail.com>
> > >> Sent at:2018 Nov 23 (Fri) 09:49
> > >> Recipient:dev <de...@flink.apache.org>
> > >> Subject:Re: [DISCUSS] Long-term goal of making flink-table Scala-free
> > >>
> > >> Hi Timo,
> > >> Thanks for initiating this great discussion.
> > >>
> > >> Currently when using SQL/TableAPI should include many dependence. In
> > >> particular, it is not necessary to introduce the specific
> implementation
> > >> dependencies which users do not care about. So I am glad to see your
> > >> proposal, and hope when we consider splitting the API interface into a
> > >> separate module, so that the user can introduce minimum of
> dependencies.
> > >>
> > >> So, +1 to [separation of interface and implementation; e.g. `Table` &
> > >> `TableImpl`] which you mentioned in the google doc.
> > >> Best,
> > >> Jincheng
> > >>
> > >> Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:
> > >>
> > >>> Hi Timo, thanks for driving this! I think that this is a nice thing
> to
> > do.
> > >>> While we are doing this, can we also keep in mind that we want to
> > >>> eventually have a TableAPI interface only module which users can take
> > >>> dependency on, but without including any implementation details?
> > >>>
> > >>> Xiaowei
> > >>>
> > >>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <fh...@gmail.com>
> > wrote:
> > >>>
> > >>>> Hi Timo,
> > >>>>
> > >>>> Thanks for writing up this document.
> > >>>> I like the new structure and agree to prioritize the porting of the
> > >>>> flink-table-common classes.
> > >>>> Since flink-table-runtime is (or should be) independent of the API
> and
> > >>>> planner modules, we could start porting these classes once the code
> is
> > >>>> split into the new module structure.
> > >>>> The benefits of a Scala-free flink-table-runtime would be a
> Scala-free
> > >>>> execution Jar.
> > >>>>
> > >>>> Best, Fabian
> > >>>>
> > >>>>
> > >>>> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
> > >>>> twalthr@apache.org
> > >>>>> :
> > >>>>> Hi everyone,
> > >>>>>
> > >>>>> I would like to continue this discussion thread and convert the
> > outcome
> > >>>>> into a FLIP such that users and contributors know what to expect in
> > the
> > >>>>> upcoming releases.
> > >>>>>
> > >>>>> I created a design document [1] that clarifies our motivation why
> we
> > >>>>> want to do this, how a Maven module structure could look like, and
> a
> > >>>>> suggestion for a migration plan.
> > >>>>>
> > >>>>> It would be great to start with the efforts for the 1.8 release
> such
> > >>>>> that new features can be developed in Java and major refactorings
> > such
> > >>>>> as improvements to the connectors and external catalog support are
> > not
> > >>>>> blocked.
> > >>>>>
> > >>>>> Please let me know what you think.
> > >>>>>
> > >>>>> Regards,
> > >>>>> Timo
> > >>>>>
> > >>>>> [1]
> > >>>>>
> > >>>>>
> > >>>
> >
> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
> > >>>>>
> > >>>>> Am 02.07.18 um 17:08 schrieb Fabian Hueske:
> > >>>>>> Hi Piotr,
> > >>>>>>
> > >>>>>> thanks for bumping this thread and thanks for Xingcan for the
> > >>> comments.
> > >>>>>> I think the first step would be to separate the flink-table module
> > >>> into
> > >>>>>> multiple sub modules. These could be:
> > >>>>>>
> > >>>>>> - flink-table-api: All API facing classes. Can be later divided
> > >>> further
> > >>>>>> into Java/Scala Table API/SQL
> > >>>>>> - flink-table-planning: involves all planning (basically
> everything
> > >>> we
> > >>>> do
> > >>>>>> with Calcite)
> > >>>>>> - flink-table-runtime: the runtime code
> > >>>>>>
> > >>>>>> IMO, a realistic mid-term goal is to have the runtime module and
> > >>>> certain
> > >>>>>> parts of the planning module ported to Java.
> > >>>>>> The api module will be much harder to port because of several
> > >>>>> dependencies
> > >>>>>> to Scala core classes (the parser framework, tree iterations,
> etc.).
> > >>>> I'm
> > >>>>>> not saying we should not port this to Java, but it is not clear to
> > me
> > >>>>> (yet)
> > >>>>>> how to do it.
> > >>>>>>
> > >>>>>> I think flink-table-runtime should not be too hard to port. The
> code
> > >>>> does
> > >>>>>> not make use of many Scala features, i.e., it's writing very
> > >>> Java-like.
> > >>>>>> Also, there are not many dependencies and operators can be
> > >>> individually
> > >>>>>> ported step-by-step.
> > >>>>>> For flink-table-planning, we can have certain packages that we
> port
> > >>> to
> > >>>>> Java
> > >>>>>> like planning rules or plan nodes. The related classes mostly
> extend
> > >>>>>> Calcite's Java interfaces/classes and would be natural choices for
> > >>>> being
> > >>>>>> ported. The code generation classes will require more effort to
> > port.
> > >>>>> There
> > >>>>>> are also some dependencies in planning on the api module that we
> > >>> would
> > >>>>> need
> > >>>>>> to resolve somehow.
> > >>>>>>
> > >>>>>> For SQL most work when adding new features is done in the planning
> > >>> and
> > >>>>>> runtime modules. So, this separation should already reduce
> > >>>> "technological
> > >>>>>> dept" quite a lot.
> > >>>>>> The Table API depends much more on Scala than SQL.
> > >>>>>>
> > >>>>>> Cheers, Fabian
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xi...@gmail.com>:
> > >>>>>>
> > >>>>>>> Hi all,
> > >>>>>>>
> > >>>>>>> I also think about this problem these days and here are my
> > thoughts.
> > >>>>>>>
> > >>>>>>> 1) We must admit that it’s really a tough task to interoperate
> with
> > >>>> Java
> > >>>>>>> and Scala. E.g., they have different collection types (Scala
> > >>>> collections
> > >>>>>>> v.s. java.util.*) and in Java, it's hard to implement a method
> > which
> > >>>>> takes
> > >>>>>>> Scala functions as parameters. Considering the major part of the
> > >>> code
> > >>>>> base
> > >>>>>>> is implemented in Java, +1 for this goal from a long-term view.
> > >>>>>>>
> > >>>>>>> 2) The ideal solution would be to just expose a Scala API and
> make
> > >>> all
> > >>>>> the
> > >>>>>>> other parts Scala-free. But I am not sure if it could be achieved
> > >>> even
> > >>>>> in a
> > >>>>>>> long-term. Thus as Timo suggested, keep the Scala codes in
> > >>>>>>> "flink-table-core" would be a compromise solution.
> > >>>>>>>
> > >>>>>>> 3) If the community makes the final decision, maybe any new
> > features
> > >>>>>>> should be added in Java (regardless of the modules), in order to
> > >>>> prevent
> > >>>>>>> the Scala codes from growing.
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Xingcan
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
> > >>> piotr@data-artisans.com>
> > >>>>>>> wrote:
> > >>>>>>>> Bumping the topic.
> > >>>>>>>>
> > >>>>>>>> If we want to do this, the sooner we decide, the less code we
> will
> > >>>> have
> > >>>>>>> to rewrite. I have some objections/counter proposals to Fabian's
> > >>>>> proposal
> > >>>>>>> of doing it module wise and one module at a time.
> > >>>>>>>> First, I do not see a problem of having java/scala code even
> > within
> > >>>> one
> > >>>>>>> module, especially not if there are clean boundaries. Like we
> could
> > >>>> have
> > >>>>>>> API in Scala and optimizer rules/logical nodes written in Java in
> > >>> the
> > >>>>> same
> > >>>>>>> module. However I haven’t previously maintained mixed scala/java
> > >>> code
> > >>>>> bases
> > >>>>>>> before, so I might be missing something here.
> > >>>>>>>> Secondly this whole migration might and most like will take
> longer
> > >>>> then
> > >>>>>>> expected, so that creates a problem for a new code that we will
> be
> > >>>>>>> creating. After making a decision to migrate to Java, almost any
> > new
> > >>>>> Scala
> > >>>>>>> line of code will be immediately a technological debt and we will
> > >>> have
> > >>>>> to
> > >>>>>>> rewrite it to Java later.
> > >>>>>>>> Thus I would propose first to state our end goal - modules
> > >>> structure
> > >>>>> and
> > >>>>>>> which parts of modules we want to have eventually Scala-free.
> > >>> Secondly
> > >>>>>>> taking all steps necessary that will allow us to write new code
> > >>>>> complaint
> > >>>>>>> with our end goal. Only after that we should/could focus on
> > >>>>> incrementally
> > >>>>>>> rewriting the old code. Otherwise we could be stuck/blocked for
> > >>> years
> > >>>>>>> writing new code in Scala (and increasing technological debt),
> > >>> because
> > >>>>>>> nobody have found a time to rewrite some non important and not
> > >>>> actively
> > >>>>>>> developed part of some module.
> > >>>>>>>> Piotrek
> > >>>>>>>>
> > >>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fh...@gmail.com>
> > >>> wrote:
> > >>>>>>>>> Hi,
> > >>>>>>>>>
> > >>>>>>>>> In general, I think this is a good effort. However, it won't be
> > >>> easy
> > >>>>>>> and I
> > >>>>>>>>> think we have to plan this well.
> > >>>>>>>>> I don't like the idea of having the whole code base fragmented
> > >>> into
> > >>>>> Java
> > >>>>>>>>> and Scala code for too long.
> > >>>>>>>>>
> > >>>>>>>>> I think we should do this one step at a time and focus on
> > >>> migrating
> > >>>>> one
> > >>>>>>>>> module at a time.
> > >>>>>>>>> IMO, the easiest start would be to port the runtime to Java.
> > >>>>>>>>> Extracting the API classes into an own module, porting them to
> > >>> Java,
> > >>>>> and
> > >>>>>>>>> removing the Scala dependency won't be possible without
> breaking
> > >>> the
> > >>>>> API
> > >>>>>>>>> since a few classes depend on the Scala Table API.
> > >>>>>>>>>
> > >>>>>>>>> Best, Fabian
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <trohrmann@apache.org
> >:
> > >>>>>>>>>
> > >>>>>>>>>> I think that is a noble and honorable goal and we should
> strive
> > >>> for
> > >>>>> it.
> > >>>>>>>>>> This, however, must be an iterative process given the sheer
> size
> > >>> of
> > >>>>> the
> > >>>>>>>>>> code base. I like the approach to define common Java modules
> > >>> which
> > >>>>> are
> > >>>>>>> used
> > >>>>>>>>>> by more specific Scala modules and slowly moving classes from
> > >>> Scala
> > >>>>> to
> > >>>>>>>>>> Java. Thus +1 for the proposal.
> > >>>>>>>>>>
> > >>>>>>>>>> Cheers,
> > >>>>>>>>>> Till
> > >>>>>>>>>>
> > >>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
> > >>>>>>> piotr@data-artisans.com>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi,
> > >>>>>>>>>>>
> > >>>>>>>>>>> I do not have an experience with how scala and java interacts
> > >>> with
> > >>>>>>> each
> > >>>>>>>>>>> other, so I can not fully validate your proposal, but
> generally
> > >>>>>>> speaking
> > >>>>>>>>>> +1
> > >>>>>>>>>>> from me.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Does it also mean, that we should slowly migrate
> > >>>> `flink-table-core`
> > >>>>> to
> > >>>>>>>>>>> Java? How would you envision it? It would be nice to be able
> to
> > >>>> add
> > >>>>>>> new
> > >>>>>>>>>>> classes/features written in Java and so that they can coexist
> > >>> with
> > >>>>> old
> > >>>>>>>>>>> Scala code until we gradually switch from Scala to Java.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Piotrek
> > >>>>>>>>>>>
> > >>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <tw...@apache.org>
> > >>>> wrote:
> > >>>>>>>>>>>> Hi everyone,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> as you all know, currently the Table & SQL API is
> implemented
> > >>> in
> > >>>>>>> Scala.
> > >>>>>>>>>>> This decision was made a long-time ago when the initital code
> > >>> base
> > >>>>> was
> > >>>>>>>>>>> created as part of a master's thesis. The community kept
> Scala
> > >>>>>>> because of
> > >>>>>>>>>>> the nice language features that enable a fluent Table API
> like
> > >>>>>>>>>>> table.select('field.trim()) and because Scala allows for
> quick
> > >>>>>>>>>> prototyping
> > >>>>>>>>>>> (e.g. multi-line comments for code generation). The
> committers
> > >>>>>>> enforced
> > >>>>>>>>>> not
> > >>>>>>>>>>> splitting the code-base into two programming languages.
> > >>>>>>>>>>>> However, nowadays the flink-table module more and more
> becomes
> > >>> an
> > >>>>>>>>>>> important part in the Flink ecosystem. Connectors, formats,
> and
> > >>>> SQL
> > >>>>>>>>>> client
> > >>>>>>>>>>> are actually implemented in Java but need to interoperate
> with
> > >>>>>>>>>> flink-table
> > >>>>>>>>>>> which makes these modules dependent on Scala. As mentioned in
> > an
> > >>>>>>> earlier
> > >>>>>>>>>>> mail thread, using Scala for API classes also exposes member
> > >>>>> variables
> > >>>>>>>>>> and
> > >>>>>>>>>>> methods in Java that should not be exposed to users [1]. Java
> > is
> > >>>>> still
> > >>>>>>>>>> the
> > >>>>>>>>>>> most important API language and right now we treat it as a
> > >>>>>>> second-class
> > >>>>>>>>>>> citizen. I just noticed that you even need to add Scala if
> you
> > >>>> just
> > >>>>>>> want
> > >>>>>>>>>> to
> > >>>>>>>>>>> implement a ScalarFunction because of method clashes between
> > >>>> `public
> > >>>>>>>>>> String
> > >>>>>>>>>>> toString()` and `public scala.Predef.String toString()`.
> > >>>>>>>>>>>> Given the size of the current code base, reimplementing the
> > >>>> entire
> > >>>>>>>>>>> flink-table code in Java is a goal that we might never reach.
> > >>>>>>> However, we
> > >>>>>>>>>>> should at least treat the symptoms and have this as a
> long-term
> > >>>> goal
> > >>>>>>> in
> > >>>>>>>>>>> mind. My suggestion would be to convert user-facing and
> runtime
> > >>>>>>> classes
> > >>>>>>>>>> and
> > >>>>>>>>>>> split the code base into multiple modules:
> > >>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
> > >>>>>>>>>>>> Implemented in Java. Java users can use this. This would
> > >>> require
> > >>>> to
> > >>>>>>>>>>> convert classes like TableEnvironment, Table.
> > >>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
> > >>>>>>>>>>>> Implemented in Scala. Scala users can use this.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> flink-table-common
> > >>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs can use
> > >>> this.
> > >>>> It
> > >>>>>>>>>>> contains interface classes such as descriptors, table sink,
> > >>> table
> > >>>>>>> source.
> > >>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
> > >>>>>>>>>>> flink-table-runtime}
> > >>>>>>>>>>>> Implemented in Scala. Contains the current main code base.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> flink-table-runtime
> > >>>>>>>>>>>> Implemented in Java. This would require to convert classes
> in
> > >>>>>>>>>>> o.a.f.table.runtime but would improve the runtime
> potentially.
> > >>>>>>>>>>>> What do you think?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Timo
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> [1]
> > >>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
> > >>>>>>>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
> > >>>>>>> traits-tp21335.html
> > >>>>>
> >
> >
>

Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Hequn Cheng <ch...@gmail.com>.
Hi Timo,

Thanks for the effort and for writing up this document. I like the idea
of making flink-table Scala-free, so +1 for the proposal!

It's good to make Java the first-class citizen. For a long time, we have
neglected Java, so many Table features are missing from the Java test
cases, such as this one [1] I found recently. I think we also need to
migrate our test cases, i.e., add Java tests.

This is definitely a big change and will break API compatibility. To
reduce the impact on users, I think we should move fast when migrating
the user-facing APIs; it would be better to introduce the user-sensitive
changes within a single release. However, it may not be that easy. I can
help to contribute.

Separation of interface and implementation is a good idea. This would
keep the user-facing dependencies to a minimum, or even avoid them
entirely. I saw your reply in the Google doc: Java 8 already supports
static methods on interfaces, so maybe we can make use of that?
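
Just as a rough sketch of what that could look like (all names here are
placeholders, not a concrete API proposal): the entry point could be a
static method on the interface that discovers the implementation at
runtime, so the API module would not need a compile-time dependency on
the planner/implementation module.

import java.util.ServiceLoader;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical user-facing interface living in an API-only module.
public interface TableEnvironment {

    Table scan(String tableName);

    // Java 8 static method on the interface as the user entry point.
    static TableEnvironment create(StreamExecutionEnvironment env) {
        // The implementation module registers a factory via META-INF/services,
        // so we never reference a TableEnvironmentImpl class here directly.
        TableEnvironmentFactory factory =
                ServiceLoader.load(TableEnvironmentFactory.class).iterator().next();
        return factory.create(env);
    }
}

// Hypothetical table abstraction in the same API module (details omitted).
interface Table {
}

// Hypothetical SPI that the implementation module would provide.
interface TableEnvironmentFactory {
    TableEnvironment create(StreamExecutionEnvironment env);
}

This way user programs would only pull in the interface module, and the
implementation could even be swapped without touching user code.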

Best,
Hequn

[1] https://issues.apache.org/jira/browse/FLINK-11001


On Fri, Nov 23, 2018 at 5:36 PM Timo Walther <tw...@apache.org> wrote:

> Hi everyone,
>
> thanks for the great feedback so far. I updated the document with the
> input I got so far
>
> @Fabian: I moved the porting of flink-table-runtime classes up in the list.
>
> @Xiaowei: Could you elaborate what "interface only" means to you? Do you
> mean a module containing pure Java `interface`s? Or is the validation
> logic also part of the API module? Are 50+ expression classes part of
> the API interface or already too implementation-specific?
>
> @Xuefu: I extended the document by almost a page to clarify when we
> should develop in Scala and when in Java. As Piotr said, every new Scala
> line is instant technical debt.
>
> Thanks,
> Timo
>
>
> Am 23.11.18 um 10:29 schrieb Piotr Nowojski:
> > Hi Timo,
> >
> > Thanks for writing this down +1 from my side :)
> >
> >> I'm wondering that whether we can have rule in the interim when Java
> and Scala coexist that dependency can only be one-way. I found that in the
> current code base there are cases where a Scala class extends Java and vise
> versa. This is quite painful. I'm thinking if we could say that extension
> can only be from Java to Scala, which will help the situation. However, I'm
> not sure if this is practical.
> > Xuefu: I’m also not sure what’s the best approach here, probably we will
> have to work it out as we go. One thing to consider is that from now on,
> every single new code line written in Scala anywhere in Flink-table (except
> of Flink-table-api-scala) is an instant technological debt. From this
> perspective I would be in favour of tolerating quite big inchonvieneces
> just to avoid any new Scala code.
> >
> > Piotrek
> >
> >> On 23 Nov 2018, at 03:25, Zhang, Xuefu <xu...@alibaba-inc.com> wrote:
> >>
> >> Hi Timo,
> >>
> >> Thanks for the effort and the Google writeup. During our external
> catalog rework, we found much confusion between Java and Scala, and this
> Scala-free roadmap should greatly mitigate that.
> >>
> >> I'm wondering that whether we can have rule in the interim when Java
> and Scala coexist that dependency can only be one-way. I found that in the
> current code base there are cases where a Scala class extends Java and vise
> versa. This is quite painful. I'm thinking if we could say that extension
> can only be from Java to Scala, which will help the situation. However, I'm
> not sure if this is practical.
> >>
> >> Thanks,
> >> Xuefu
> >>
> >>
> >> ------------------------------------------------------------------
> >> Sender:jincheng sun <su...@gmail.com>
> >> Sent at:2018 Nov 23 (Fri) 09:49
> >> Recipient:dev <de...@flink.apache.org>
> >> Subject:Re: [DISCUSS] Long-term goal of making flink-table Scala-free
> >>
> >> Hi Timo,
> >> Thanks for initiating this great discussion.
> >>
> >> Currently when using SQL/TableAPI should include many dependence. In
> >> particular, it is not necessary to introduce the specific implementation
> >> dependencies which users do not care about. So I am glad to see your
> >> proposal, and hope when we consider splitting the API interface into a
> >> separate module, so that the user can introduce minimum of dependencies.
> >>
> >> So, +1 to [separation of interface and implementation; e.g. `Table` &
> >> `TableImpl`] which you mentioned in the google doc.
> >> Best,
> >> Jincheng
> >>
> >> Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:
> >>
> >>> Hi Timo, thanks for driving this! I think that this is a nice thing to
> do.
> >>> While we are doing this, can we also keep in mind that we want to
> >>> eventually have a TableAPI interface only module which users can take
> >>> dependency on, but without including any implementation details?
> >>>
> >>> Xiaowei
> >>>
> >>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <fh...@gmail.com>
> wrote:
> >>>
> >>>> Hi Timo,
> >>>>
> >>>> Thanks for writing up this document.
> >>>> I like the new structure and agree to prioritize the porting of the
> >>>> flink-table-common classes.
> >>>> Since flink-table-runtime is (or should be) independent of the API and
> >>>> planner modules, we could start porting these classes once the code is
> >>>> split into the new module structure.
> >>>> The benefits of a Scala-free flink-table-runtime would be a Scala-free
> >>>> execution Jar.
> >>>>
> >>>> Best, Fabian
> >>>>
> >>>>
> >>>> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
> >>>> twalthr@apache.org
> >>>>> :
> >>>>> Hi everyone,
> >>>>>
> >>>>> I would like to continue this discussion thread and convert the
> outcome
> >>>>> into a FLIP such that users and contributors know what to expect in
> the
> >>>>> upcoming releases.
> >>>>>
> >>>>> I created a design document [1] that clarifies our motivation why we
> >>>>> want to do this, how a Maven module structure could look like, and a
> >>>>> suggestion for a migration plan.
> >>>>>
> >>>>> It would be great to start with the efforts for the 1.8 release such
> >>>>> that new features can be developed in Java and major refactorings
> such
> >>>>> as improvements to the connectors and external catalog support are
> not
> >>>>> blocked.
> >>>>>
> >>>>> Please let me know what you think.
> >>>>>
> >>>>> Regards,
> >>>>> Timo
> >>>>>
> >>>>> [1]
> >>>>>
> >>>>>
> >>>
> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
> >>>>>
> >>>>> Am 02.07.18 um 17:08 schrieb Fabian Hueske:
> >>>>>> Hi Piotr,
> >>>>>>
> >>>>>> thanks for bumping this thread and thanks for Xingcan for the
> >>> comments.
> >>>>>> I think the first step would be to separate the flink-table module
> >>> into
> >>>>>> multiple sub modules. These could be:
> >>>>>>
> >>>>>> - flink-table-api: All API facing classes. Can be later divided
> >>> further
> >>>>>> into Java/Scala Table API/SQL
> >>>>>> - flink-table-planning: involves all planning (basically everything
> >>> we
> >>>> do
> >>>>>> with Calcite)
> >>>>>> - flink-table-runtime: the runtime code
> >>>>>>
> >>>>>> IMO, a realistic mid-term goal is to have the runtime module and
> >>>> certain
> >>>>>> parts of the planning module ported to Java.
> >>>>>> The api module will be much harder to port because of several
> >>>>> dependencies
> >>>>>> to Scala core classes (the parser framework, tree iterations, etc.).
> >>>> I'm
> >>>>>> not saying we should not port this to Java, but it is not clear to
> me
> >>>>> (yet)
> >>>>>> how to do it.
> >>>>>>
> >>>>>> I think flink-table-runtime should not be too hard to port. The code
> >>>> does
> >>>>>> not make use of many Scala features, i.e., it's writing very
> >>> Java-like.
> >>>>>> Also, there are not many dependencies and operators can be
> >>> individually
> >>>>>> ported step-by-step.
> >>>>>> For flink-table-planning, we can have certain packages that we port
> >>> to
> >>>>> Java
> >>>>>> like planning rules or plan nodes. The related classes mostly extend
> >>>>>> Calcite's Java interfaces/classes and would be natural choices for
> >>>> being
> >>>>>> ported. The code generation classes will require more effort to
> port.
> >>>>> There
> >>>>>> are also some dependencies in planning on the api module that we
> >>> would
> >>>>> need
> >>>>>> to resolve somehow.
> >>>>>>
> >>>>>> For SQL most work when adding new features is done in the planning
> >>> and
> >>>>>> runtime modules. So, this separation should already reduce
> >>>> "technological
> >>>>>> dept" quite a lot.
> >>>>>> The Table API depends much more on Scala than SQL.
> >>>>>>
> >>>>>> Cheers, Fabian
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xi...@gmail.com>:
> >>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> I also think about this problem these days and here are my
> thoughts.
> >>>>>>>
> >>>>>>> 1) We must admit that it’s really a tough task to interoperate with
> >>>> Java
> >>>>>>> and Scala. E.g., they have different collection types (Scala
> >>>> collections
> >>>>>>> v.s. java.util.*) and in Java, it's hard to implement a method
> which
> >>>>> takes
> >>>>>>> Scala functions as parameters. Considering the major part of the
> >>> code
> >>>>> base
> >>>>>>> is implemented in Java, +1 for this goal from a long-term view.
> >>>>>>>
> >>>>>>> 2) The ideal solution would be to just expose a Scala API and make
> >>> all
> >>>>> the
> >>>>>>> other parts Scala-free. But I am not sure if it could be achieved
> >>> even
> >>>>> in a
> >>>>>>> long-term. Thus as Timo suggested, keep the Scala codes in
> >>>>>>> "flink-table-core" would be a compromise solution.
> >>>>>>>
> >>>>>>> 3) If the community makes the final decision, maybe any new
> features
> >>>>>>> should be added in Java (regardless of the modules), in order to
> >>>> prevent
> >>>>>>> the Scala codes from growing.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Xingcan
> >>>>>>>
> >>>>>>>
> >>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
> >>> piotr@data-artisans.com>
> >>>>>>> wrote:
> >>>>>>>> Bumping the topic.
> >>>>>>>>
> >>>>>>>> If we want to do this, the sooner we decide, the less code we will
> >>>> have
> >>>>>>> to rewrite. I have some objections/counter proposals to Fabian's
> >>>>> proposal
> >>>>>>> of doing it module wise and one module at a time.
> >>>>>>>> First, I do not see a problem of having java/scala code even
> within
> >>>> one
> >>>>>>> module, especially not if there are clean boundaries. Like we could
> >>>> have
> >>>>>>> API in Scala and optimizer rules/logical nodes written in Java in
> >>> the
> >>>>> same
> >>>>>>> module. However I haven’t previously maintained mixed scala/java
> >>> code
> >>>>> bases
> >>>>>>> before, so I might be missing something here.
> >>>>>>>> Secondly this whole migration might and most like will take longer
> >>>> then
> >>>>>>> expected, so that creates a problem for a new code that we will be
> >>>>>>> creating. After making a decision to migrate to Java, almost any
> new
> >>>>> Scala
> >>>>>>> line of code will be immediately a technological debt and we will
> >>> have
> >>>>> to
> >>>>>>> rewrite it to Java later.
> >>>>>>>> Thus I would propose first to state our end goal - modules
> >>> structure
> >>>>> and
> >>>>>>> which parts of modules we want to have eventually Scala-free.
> >>> Secondly
> >>>>>>> taking all steps necessary that will allow us to write new code
> >>>>> complaint
> >>>>>>> with our end goal. Only after that we should/could focus on
> >>>>> incrementally
> >>>>>>> rewriting the old code. Otherwise we could be stuck/blocked for
> >>> years
> >>>>>>> writing new code in Scala (and increasing technological debt),
> >>> because
> >>>>>>> nobody have found a time to rewrite some non important and not
> >>>> actively
> >>>>>>> developed part of some module.
> >>>>>>>> Piotrek
> >>>>>>>>
> >>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fh...@gmail.com>
> >>> wrote:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> In general, I think this is a good effort. However, it won't be
> >>> easy
> >>>>>>> and I
> >>>>>>>>> think we have to plan this well.
> >>>>>>>>> I don't like the idea of having the whole code base fragmented
> >>> into
> >>>>> Java
> >>>>>>>>> and Scala code for too long.
> >>>>>>>>>
> >>>>>>>>> I think we should do this one step at a time and focus on
> >>> migrating
> >>>>> one
> >>>>>>>>> module at a time.
> >>>>>>>>> IMO, the easiest start would be to port the runtime to Java.
> >>>>>>>>> Extracting the API classes into an own module, porting them to
> >>> Java,
> >>>>> and
> >>>>>>>>> removing the Scala dependency won't be possible without breaking
> >>> the
> >>>>> API
> >>>>>>>>> since a few classes depend on the Scala Table API.
> >>>>>>>>>
> >>>>>>>>> Best, Fabian
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <tr...@apache.org>:
> >>>>>>>>>
> >>>>>>>>>> I think that is a noble and honorable goal and we should strive
> >>> for
> >>>>> it.
> >>>>>>>>>> This, however, must be an iterative process given the sheer size
> >>> of
> >>>>> the
> >>>>>>>>>> code base. I like the approach to define common Java modules
> >>> which
> >>>>> are
> >>>>>>> used
> >>>>>>>>>> by more specific Scala modules and slowly moving classes from
> >>> Scala
> >>>>> to
> >>>>>>>>>> Java. Thus +1 for the proposal.
> >>>>>>>>>>
> >>>>>>>>>> Cheers,
> >>>>>>>>>> Till
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
> >>>>>>> piotr@data-artisans.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>> I do not have an experience with how scala and java interacts
> >>> with
> >>>>>>> each
> >>>>>>>>>>> other, so I can not fully validate your proposal, but generally
> >>>>>>> speaking
> >>>>>>>>>> +1
> >>>>>>>>>>> from me.
> >>>>>>>>>>>
> >>>>>>>>>>> Does it also mean, that we should slowly migrate
> >>>> `flink-table-core`
> >>>>> to
> >>>>>>>>>>> Java? How would you envision it? It would be nice to be able to
> >>>> add
> >>>>>>> new
> >>>>>>>>>>> classes/features written in Java and so that they can coexist
> >>> with
> >>>>> old
> >>>>>>>>>>> Scala code until we gradually switch from Scala to Java.
> >>>>>>>>>>>
> >>>>>>>>>>> Piotrek
> >>>>>>>>>>>
> >>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <tw...@apache.org>
> >>>> wrote:
> >>>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>>
> >>>>>>>>>>>> as you all know, currently the Table & SQL API is implemented
> >>> in
> >>>>>>> Scala.
> >>>>>>>>>>> This decision was made a long-time ago when the initital code
> >>> base
> >>>>> was
> >>>>>>>>>>> created as part of a master's thesis. The community kept Scala
> >>>>>>> because of
> >>>>>>>>>>> the nice language features that enable a fluent Table API like
> >>>>>>>>>>> table.select('field.trim()) and because Scala allows for quick
> >>>>>>>>>> prototyping
> >>>>>>>>>>> (e.g. multi-line comments for code generation). The committers
> >>>>>>> enforced
> >>>>>>>>>> not
> >>>>>>>>>>> splitting the code-base into two programming languages.
> >>>>>>>>>>>> However, nowadays the flink-table module more and more becomes
> >>> an
> >>>>>>>>>>> important part in the Flink ecosystem. Connectors, formats, and
> >>>> SQL
> >>>>>>>>>> client
> >>>>>>>>>>> are actually implemented in Java but need to interoperate with
> >>>>>>>>>> flink-table
> >>>>>>>>>>> which makes these modules dependent on Scala. As mentioned in
> an
> >>>>>>> earlier
> >>>>>>>>>>> mail thread, using Scala for API classes also exposes member
> >>>>> variables
> >>>>>>>>>> and
> >>>>>>>>>>> methods in Java that should not be exposed to users [1]. Java
> is
> >>>>> still
> >>>>>>>>>> the
> >>>>>>>>>>> most important API language and right now we treat it as a
> >>>>>>> second-class
> >>>>>>>>>>> citizen. I just noticed that you even need to add Scala if you
> >>>> just
> >>>>>>> want
> >>>>>>>>>> to
> >>>>>>>>>>> implement a ScalarFunction because of method clashes between
> >>>> `public
> >>>>>>>>>> String
> >>>>>>>>>>> toString()` and `public scala.Predef.String toString()`.
> >>>>>>>>>>>> Given the size of the current code base, reimplementing the
> >>>> entire
> >>>>>>>>>>> flink-table code in Java is a goal that we might never reach.
> >>>>>>> However, we
> >>>>>>>>>>> should at least treat the symptoms and have this as a long-term
> >>>> goal
> >>>>>>> in
> >>>>>>>>>>> mind. My suggestion would be to convert user-facing and runtime
> >>>>>>> classes
> >>>>>>>>>> and
> >>>>>>>>>>> split the code base into multiple modules:
> >>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
> >>>>>>>>>>>> Implemented in Java. Java users can use this. This would
> >>> require
> >>>> to
> >>>>>>>>>>> convert classes like TableEnvironment, Table.
> >>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
> >>>>>>>>>>>> Implemented in Scala. Scala users can use this.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> flink-table-common
> >>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs can use
> >>> this.
> >>>> It
> >>>>>>>>>>> contains interface classes such as descriptors, table sink,
> >>> table
> >>>>>>> source.
> >>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
> >>>>>>>>>>> flink-table-runtime}
> >>>>>>>>>>>> Implemented in Scala. Contains the current main code base.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> flink-table-runtime
> >>>>>>>>>>>> Implemented in Java. This would require to convert classes in
> >>>>>>>>>>> o.a.f.table.runtime but would improve the runtime potentially.
> >>>>>>>>>>>> What do you think?
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Timo
> >>>>>>>>>>>>
> >>>>>>>>>>>> [1]
> >>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
> >>>>>>>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
> >>>>>>> traits-tp21335.html
> >>>>>
>
>

Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Timo Walther <tw...@apache.org>.
Hi everyone,

thanks for the great feedback so far. I have updated the document with 
the input I received.

@Fabian: I moved the porting of flink-table-runtime classes up in the list.

@Xiaowei: Could you elaborate on what "interface only" means to you? Do 
you mean a module containing pure Java `interface`s? Or is the validation 
logic also part of the API module? Are the 50+ expression classes part of 
the API interface, or are they already too implementation-specific?

@Xuefu: I extended the document by almost a page to clarify when we 
should develop in Scala and when in Java. As Piotr said, every new Scala 
line is instant technical debt.

Thanks,
Timo


On 23.11.18 at 10:29, Piotr Nowojski wrote:
> Hi Timo,
>
> Thanks for writing this down +1 from my side :)
>
>> I'm wondering that whether we can have rule in the interim when Java and Scala coexist that dependency can only be one-way. I found that in the current code base there are cases where a Scala class extends Java and vise versa. This is quite painful. I'm thinking if we could say that extension can only be from Java to Scala, which will help the situation. However, I'm not sure if this is practical.
> Xuefu: I’m also not sure what’s the best approach here, probably we will have to work it out as we go. One thing to consider is that from now on, every single new code line written in Scala anywhere in Flink-table (except of Flink-table-api-scala) is an instant technological debt. From this perspective I would be in favour of tolerating quite big inchonvieneces just to avoid any new Scala code.
>
> Piotrek
>
>> On 23 Nov 2018, at 03:25, Zhang, Xuefu <xu...@alibaba-inc.com> wrote:
>>
>> Hi Timo,
>>
>> Thanks for the effort and the Google writeup. During our external catalog rework, we found much confusion between Java and Scala, and this Scala-free roadmap should greatly mitigate that.
>>
>> I'm wondering that whether we can have rule in the interim when Java and Scala coexist that dependency can only be one-way. I found that in the current code base there are cases where a Scala class extends Java and vise versa. This is quite painful. I'm thinking if we could say that extension can only be from Java to Scala, which will help the situation. However, I'm not sure if this is practical.
>>
>> Thanks,
>> Xuefu
>>
>>
>> ------------------------------------------------------------------
>> Sender:jincheng sun <su...@gmail.com>
>> Sent at:2018 Nov 23 (Fri) 09:49
>> Recipient:dev <de...@flink.apache.org>
>> Subject:Re: [DISCUSS] Long-term goal of making flink-table Scala-free
>>
>> Hi Timo,
>> Thanks for initiating this great discussion.
>>
>> Currently when using SQL/TableAPI should include many dependence. In
>> particular, it is not necessary to introduce the specific implementation
>> dependencies which users do not care about. So I am glad to see your
>> proposal, and hope when we consider splitting the API interface into a
>> separate module, so that the user can introduce minimum of dependencies.
>>
>> So, +1 to [separation of interface and implementation; e.g. `Table` &
>> `TableImpl`] which you mentioned in the google doc.
>> Best,
>> Jincheng
>>
>> Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:
>>
>>> Hi Timo, thanks for driving this! I think that this is a nice thing to do.
>>> While we are doing this, can we also keep in mind that we want to
>>> eventually have a TableAPI interface only module which users can take
>>> dependency on, but without including any implementation details?
>>>
>>> Xiaowei
>>>
>>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <fh...@gmail.com> wrote:
>>>
>>>> Hi Timo,
>>>>
>>>> Thanks for writing up this document.
>>>> I like the new structure and agree to prioritize the porting of the
>>>> flink-table-common classes.
>>>> Since flink-table-runtime is (or should be) independent of the API and
>>>> planner modules, we could start porting these classes once the code is
>>>> split into the new module structure.
>>>> The benefits of a Scala-free flink-table-runtime would be a Scala-free
>>>> execution Jar.
>>>>
>>>> Best, Fabian
>>>>
>>>>
>>>> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
>>>> twalthr@apache.org
>>>>> :
>>>>> Hi everyone,
>>>>>
>>>>> I would like to continue this discussion thread and convert the outcome
>>>>> into a FLIP such that users and contributors know what to expect in the
>>>>> upcoming releases.
>>>>>
>>>>> I created a design document [1] that clarifies our motivation why we
>>>>> want to do this, how a Maven module structure could look like, and a
>>>>> suggestion for a migration plan.
>>>>>
>>>>> It would be great to start with the efforts for the 1.8 release such
>>>>> that new features can be developed in Java and major refactorings such
>>>>> as improvements to the connectors and external catalog support are not
>>>>> blocked.
>>>>>
>>>>> Please let me know what you think.
>>>>>
>>>>> Regards,
>>>>> Timo
>>>>>
>>>>> [1]
>>>>>
>>>>>
>>> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
>>>>>
>>>>> Am 02.07.18 um 17:08 schrieb Fabian Hueske:
>>>>>> Hi Piotr,
>>>>>>
>>>>>> thanks for bumping this thread and thanks for Xingcan for the
>>> comments.
>>>>>> I think the first step would be to separate the flink-table module
>>> into
>>>>>> multiple sub modules. These could be:
>>>>>>
>>>>>> - flink-table-api: All API facing classes. Can be later divided
>>> further
>>>>>> into Java/Scala Table API/SQL
>>>>>> - flink-table-planning: involves all planning (basically everything
>>> we
>>>> do
>>>>>> with Calcite)
>>>>>> - flink-table-runtime: the runtime code
>>>>>>
>>>>>> IMO, a realistic mid-term goal is to have the runtime module and
>>>> certain
>>>>>> parts of the planning module ported to Java.
>>>>>> The api module will be much harder to port because of several
>>>>> dependencies
>>>>>> to Scala core classes (the parser framework, tree iterations, etc.).
>>>> I'm
>>>>>> not saying we should not port this to Java, but it is not clear to me
>>>>> (yet)
>>>>>> how to do it.
>>>>>>
>>>>>> I think flink-table-runtime should not be too hard to port. The code
>>>> does
>>>>>> not make use of many Scala features, i.e., it's writing very
>>> Java-like.
>>>>>> Also, there are not many dependencies and operators can be
>>> individually
>>>>>> ported step-by-step.
>>>>>> For flink-table-planning, we can have certain packages that we port
>>> to
>>>>> Java
>>>>>> like planning rules or plan nodes. The related classes mostly extend
>>>>>> Calcite's Java interfaces/classes and would be natural choices for
>>>> being
>>>>>> ported. The code generation classes will require more effort to port.
>>>>> There
>>>>>> are also some dependencies in planning on the api module that we
>>> would
>>>>> need
>>>>>> to resolve somehow.
>>>>>>
>>>>>> For SQL most work when adding new features is done in the planning
>>> and
>>>>>> runtime modules. So, this separation should already reduce
>>>> "technological
>>>>>> dept" quite a lot.
>>>>>> The Table API depends much more on Scala than SQL.
>>>>>>
>>>>>> Cheers, Fabian
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xi...@gmail.com>:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I also think about this problem these days and here are my thoughts.
>>>>>>>
>>>>>>> 1) We must admit that it’s really a tough task to interoperate with
>>>> Java
>>>>>>> and Scala. E.g., they have different collection types (Scala
>>>> collections
>>>>>>> v.s. java.util.*) and in Java, it's hard to implement a method which
>>>>> takes
>>>>>>> Scala functions as parameters. Considering the major part of the
>>> code
>>>>> base
>>>>>>> is implemented in Java, +1 for this goal from a long-term view.
>>>>>>>
>>>>>>> 2) The ideal solution would be to just expose a Scala API and make
>>> all
>>>>> the
>>>>>>> other parts Scala-free. But I am not sure if it could be achieved
>>> even
>>>>> in a
>>>>>>> long-term. Thus as Timo suggested, keep the Scala codes in
>>>>>>> "flink-table-core" would be a compromise solution.
>>>>>>>
>>>>>>> 3) If the community makes the final decision, maybe any new features
>>>>>>> should be added in Java (regardless of the modules), in order to
>>>> prevent
>>>>>>> the Scala codes from growing.
>>>>>>>
>>>>>>> Best,
>>>>>>> Xingcan
>>>>>>>
>>>>>>>
>>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
>>> piotr@data-artisans.com>
>>>>>>> wrote:
>>>>>>>> Bumping the topic.
>>>>>>>>
>>>>>>>> If we want to do this, the sooner we decide, the less code we will
>>>> have
>>>>>>> to rewrite. I have some objections/counter proposals to Fabian's
>>>>> proposal
>>>>>>> of doing it module wise and one module at a time.
>>>>>>>> First, I do not see a problem of having java/scala code even within
>>>> one
>>>>>>> module, especially not if there are clean boundaries. Like we could
>>>> have
>>>>>>> API in Scala and optimizer rules/logical nodes written in Java in
>>> the
>>>>> same
>>>>>>> module. However I haven’t previously maintained mixed scala/java
>>> code
>>>>> bases
>>>>>>> before, so I might be missing something here.
>>>>>>>> Secondly this whole migration might and most like will take longer
>>>> then
>>>>>>> expected, so that creates a problem for a new code that we will be
>>>>>>> creating. After making a decision to migrate to Java, almost any new
>>>>> Scala
>>>>>>> line of code will be immediately a technological debt and we will
>>> have
>>>>> to
>>>>>>> rewrite it to Java later.
>>>>>>>> Thus I would propose first to state our end goal - modules
>>> structure
>>>>> and
>>>>>>> which parts of modules we want to have eventually Scala-free.
>>> Secondly
>>>>>>> taking all steps necessary that will allow us to write new code
>>>>> complaint
>>>>>>> with our end goal. Only after that we should/could focus on
>>>>> incrementally
>>>>>>> rewriting the old code. Otherwise we could be stuck/blocked for
>>> years
>>>>>>> writing new code in Scala (and increasing technological debt),
>>> because
>>>>>>> nobody have found a time to rewrite some non important and not
>>>> actively
>>>>>>> developed part of some module.
>>>>>>>> Piotrek
>>>>>>>>
>>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fh...@gmail.com>
>>> wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> In general, I think this is a good effort. However, it won't be
>>> easy
>>>>>>> and I
>>>>>>>>> think we have to plan this well.
>>>>>>>>> I don't like the idea of having the whole code base fragmented
>>> into
>>>>> Java
>>>>>>>>> and Scala code for too long.
>>>>>>>>>
>>>>>>>>> I think we should do this one step at a time and focus on
>>> migrating
>>>>> one
>>>>>>>>> module at a time.
>>>>>>>>> IMO, the easiest start would be to port the runtime to Java.
>>>>>>>>> Extracting the API classes into an own module, porting them to
>>> Java,
>>>>> and
>>>>>>>>> removing the Scala dependency won't be possible without breaking
>>> the
>>>>> API
>>>>>>>>> since a few classes depend on the Scala Table API.
>>>>>>>>>
>>>>>>>>> Best, Fabian
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <tr...@apache.org>:
>>>>>>>>>
>>>>>>>>>> I think that is a noble and honorable goal and we should strive
>>> for
>>>>> it.
>>>>>>>>>> This, however, must be an iterative process given the sheer size
>>> of
>>>>> the
>>>>>>>>>> code base. I like the approach to define common Java modules
>>> which
>>>>> are
>>>>>>> used
>>>>>>>>>> by more specific Scala modules and slowly moving classes from
>>> Scala
>>>>> to
>>>>>>>>>> Java. Thus +1 for the proposal.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Till
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
>>>>>>> piotr@data-artisans.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I do not have an experience with how scala and java interacts
>>> with
>>>>>>> each
>>>>>>>>>>> other, so I can not fully validate your proposal, but generally
>>>>>>> speaking
>>>>>>>>>> +1
>>>>>>>>>>> from me.
>>>>>>>>>>>
>>>>>>>>>>> Does it also mean, that we should slowly migrate
>>>> `flink-table-core`
>>>>> to
>>>>>>>>>>> Java? How would you envision it? It would be nice to be able to
>>>> add
>>>>>>> new
>>>>>>>>>>> classes/features written in Java and so that they can coexist
>>> with
>>>>> old
>>>>>>>>>>> Scala code until we gradually switch from Scala to Java.
>>>>>>>>>>>
>>>>>>>>>>> Piotrek
>>>>>>>>>>>
>>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <tw...@apache.org>
>>>> wrote:
>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>
>>>>>>>>>>>> as you all know, currently the Table & SQL API is implemented
>>> in
>>>>>>> Scala.
>>>>>>>>>>> This decision was made a long-time ago when the initital code
>>> base
>>>>> was
>>>>>>>>>>> created as part of a master's thesis. The community kept Scala
>>>>>>> because of
>>>>>>>>>>> the nice language features that enable a fluent Table API like
>>>>>>>>>>> table.select('field.trim()) and because Scala allows for quick
>>>>>>>>>> prototyping
>>>>>>>>>>> (e.g. multi-line comments for code generation). The committers
>>>>>>> enforced
>>>>>>>>>> not
>>>>>>>>>>> splitting the code-base into two programming languages.
>>>>>>>>>>>> However, nowadays the flink-table module more and more becomes
>>> an
>>>>>>>>>>> important part in the Flink ecosystem. Connectors, formats, and
>>>> SQL
>>>>>>>>>> client
>>>>>>>>>>> are actually implemented in Java but need to interoperate with
>>>>>>>>>> flink-table
>>>>>>>>>>> which makes these modules dependent on Scala. As mentioned in an
>>>>>>> earlier
>>>>>>>>>>> mail thread, using Scala for API classes also exposes member
>>>>> variables
>>>>>>>>>> and
>>>>>>>>>>> methods in Java that should not be exposed to users [1]. Java is
>>>>> still
>>>>>>>>>> the
>>>>>>>>>>> most important API language and right now we treat it as a
>>>>>>> second-class
>>>>>>>>>>> citizen. I just noticed that you even need to add Scala if you
>>>> just
>>>>>>> want
>>>>>>>>>> to
>>>>>>>>>>> implement a ScalarFunction because of method clashes between
>>>> `public
>>>>>>>>>> String
>>>>>>>>>>> toString()` and `public scala.Predef.String toString()`.
>>>>>>>>>>>> Given the size of the current code base, reimplementing the
>>>> entire
>>>>>>>>>>> flink-table code in Java is a goal that we might never reach.
>>>>>>> However, we
>>>>>>>>>>> should at least treat the symptoms and have this as a long-term
>>>> goal
>>>>>>> in
>>>>>>>>>>> mind. My suggestion would be to convert user-facing and runtime
>>>>>>> classes
>>>>>>>>>> and
>>>>>>>>>>> split the code base into multiple modules:
>>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
>>>>>>>>>>>> Implemented in Java. Java users can use this. This would
>>> require
>>>> to
>>>>>>>>>>> convert classes like TableEnvironment, Table.
>>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
>>>>>>>>>>>> Implemented in Scala. Scala users can use this.
>>>>>>>>>>>>
>>>>>>>>>>>>> flink-table-common
>>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs can use
>>> this.
>>>> It
>>>>>>>>>>> contains interface classes such as descriptors, table sink,
>>> table
>>>>>>> source.
>>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
>>>>>>>>>>> flink-table-runtime}
>>>>>>>>>>>> Implemented in Scala. Contains the current main code base.
>>>>>>>>>>>>
>>>>>>>>>>>>> flink-table-runtime
>>>>>>>>>>>> Implemented in Java. This would require to convert classes in
>>>>>>>>>>> o.a.f.table.runtime but would improve the runtime potentially.
>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Timo
>>>>>>>>>>>>
>>>>>>>>>>>> [1]
>>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
>>>>>>>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
>>>>>>> traits-tp21335.html
>>>>>


Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Piotr Nowojski <pi...@data-artisans.com>.
Hi Timo,

Thanks for writing this down +1 from my side :)

> I'm wondering that whether we can have rule in the interim when Java and Scala coexist that dependency can only be one-way. I found that in the current code base there are cases where a Scala class extends Java and vise versa. This is quite painful. I'm thinking if we could say that extension can only be from Java to Scala, which will help the situation. However, I'm not sure if this is practical.

Xuefu: I'm also not sure what the best approach is here; we will probably have to work it out as we go. One thing to consider is that from now on, every single new line of code written in Scala anywhere in flink-table (except for flink-table-api-scala) is instant technological debt. From this perspective I would be in favour of tolerating quite big inconveniences just to avoid any new Scala code.

Piotrek

> On 23 Nov 2018, at 03:25, Zhang, Xuefu <xu...@alibaba-inc.com> wrote:
> 
> Hi Timo,
> 
> Thanks for the effort and the Google writeup. During our external catalog rework, we found much confusion between Java and Scala, and this Scala-free roadmap should greatly mitigate that.
> 
> I'm wondering that whether we can have rule in the interim when Java and Scala coexist that dependency can only be one-way. I found that in the current code base there are cases where a Scala class extends Java and vise versa. This is quite painful. I'm thinking if we could say that extension can only be from Java to Scala, which will help the situation. However, I'm not sure if this is practical.
> 
> Thanks,
> Xuefu
> 
> 
> ------------------------------------------------------------------
> Sender:jincheng sun <su...@gmail.com>
> Sent at:2018 Nov 23 (Fri) 09:49
> Recipient:dev <de...@flink.apache.org>
> Subject:Re: [DISCUSS] Long-term goal of making flink-table Scala-free
> 
> Hi Timo,
> Thanks for initiating this great discussion.
> 
> Currently when using SQL/TableAPI should include many dependence. In
> particular, it is not necessary to introduce the specific implementation
> dependencies which users do not care about. So I am glad to see your
> proposal, and hope when we consider splitting the API interface into a
> separate module, so that the user can introduce minimum of dependencies.
> 
> So, +1 to [separation of interface and implementation; e.g. `Table` &
> `TableImpl`] which you mentioned in the google doc.
> Best,
> Jincheng
> 
> Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:
> 
>> Hi Timo, thanks for driving this! I think that this is a nice thing to do.
>> While we are doing this, can we also keep in mind that we want to
>> eventually have a TableAPI interface only module which users can take
>> dependency on, but without including any implementation details?
>> 
>> Xiaowei
>> 
>> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <fh...@gmail.com> wrote:
>> 
>>> Hi Timo,
>>> 
>>> Thanks for writing up this document.
>>> I like the new structure and agree to prioritize the porting of the
>>> flink-table-common classes.
>>> Since flink-table-runtime is (or should be) independent of the API and
>>> planner modules, we could start porting these classes once the code is
>>> split into the new module structure.
>>> The benefits of a Scala-free flink-table-runtime would be a Scala-free
>>> execution Jar.
>>> 
>>> Best, Fabian
>>> 
>>> 
>>> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
>>> twalthr@apache.org
>>>> :
>>> 
>>>> Hi everyone,
>>>> 
>>>> I would like to continue this discussion thread and convert the outcome
>>>> into a FLIP such that users and contributors know what to expect in the
>>>> upcoming releases.
>>>> 
>>>> I created a design document [1] that clarifies our motivation why we
>>>> want to do this, how a Maven module structure could look like, and a
>>>> suggestion for a migration plan.
>>>> 
>>>> It would be great to start with the efforts for the 1.8 release such
>>>> that new features can be developed in Java and major refactorings such
>>>> as improvements to the connectors and external catalog support are not
>>>> blocked.
>>>> 
>>>> Please let me know what you think.
>>>> 
>>>> Regards,
>>>> Timo
>>>> 
>>>> [1]
>>>> 
>>>> 
>>> 
>> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
>>>> 
>>>> 
>>>> Am 02.07.18 um 17:08 schrieb Fabian Hueske:
>>>>> Hi Piotr,
>>>>> 
>>>>> thanks for bumping this thread and thanks for Xingcan for the
>> comments.
>>>>> 
>>>>> I think the first step would be to separate the flink-table module
>> into
>>>>> multiple sub modules. These could be:
>>>>> 
>>>>> - flink-table-api: All API facing classes. Can be later divided
>> further
>>>>> into Java/Scala Table API/SQL
>>>>> - flink-table-planning: involves all planning (basically everything
>> we
>>> do
>>>>> with Calcite)
>>>>> - flink-table-runtime: the runtime code
>>>>> 
>>>>> IMO, a realistic mid-term goal is to have the runtime module and
>>> certain
>>>>> parts of the planning module ported to Java.
>>>>> The api module will be much harder to port because of several
>>>> dependencies
>>>>> to Scala core classes (the parser framework, tree iterations, etc.).
>>> I'm
>>>>> not saying we should not port this to Java, but it is not clear to me
>>>> (yet)
>>>>> how to do it.
>>>>> 
>>>>> I think flink-table-runtime should not be too hard to port. The code
>>> does
>>>>> not make use of many Scala features, i.e., it's writing very
>> Java-like.
>>>>> Also, there are not many dependencies and operators can be
>> individually
>>>>> ported step-by-step.
>>>>> For flink-table-planning, we can have certain packages that we port
>> to
>>>> Java
>>>>> like planning rules or plan nodes. The related classes mostly extend
>>>>> Calcite's Java interfaces/classes and would be natural choices for
>>> being
>>>>> ported. The code generation classes will require more effort to port.
>>>> There
>>>>> are also some dependencies in planning on the api module that we
>> would
>>>> need
>>>>> to resolve somehow.
>>>>> 
>>>>> For SQL most work when adding new features is done in the planning
>> and
>>>>> runtime modules. So, this separation should already reduce
>>> "technological
>>>>> dept" quite a lot.
>>>>> The Table API depends much more on Scala than SQL.
>>>>> 
>>>>> Cheers, Fabian
>>>>> 
>>>>> 
>>>>> 
>>>>> 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xi...@gmail.com>:
>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> I also think about this problem these days and here are my thoughts.
>>>>>> 
>>>>>> 1) We must admit that it’s really a tough task to interoperate with
>>> Java
>>>>>> and Scala. E.g., they have different collection types (Scala
>>> collections
>>>>>> v.s. java.util.*) and in Java, it's hard to implement a method which
>>>> takes
>>>>>> Scala functions as parameters. Considering the major part of the
>> code
>>>> base
>>>>>> is implemented in Java, +1 for this goal from a long-term view.
>>>>>> 
>>>>>> 2) The ideal solution would be to just expose a Scala API and make
>> all
>>>> the
>>>>>> other parts Scala-free. But I am not sure if it could be achieved
>> even
>>>> in a
>>>>>> long-term. Thus as Timo suggested, keep the Scala codes in
>>>>>> "flink-table-core" would be a compromise solution.
>>>>>> 
>>>>>> 3) If the community makes the final decision, maybe any new features
>>>>>> should be added in Java (regardless of the modules), in order to
>>> prevent
>>>>>> the Scala codes from growing.
>>>>>> 
>>>>>> Best,
>>>>>> Xingcan
>>>>>> 
>>>>>> 
>>>>>>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
>> piotr@data-artisans.com>
>>>>>> wrote:
>>>>>>> Bumping the topic.
>>>>>>> 
>>>>>>> If we want to do this, the sooner we decide, the less code we will
>>> have
>>>>>> to rewrite. I have some objections/counter proposals to Fabian's
>>>> proposal
>>>>>> of doing it module wise and one module at a time.
>>>>>>> First, I do not see a problem of having java/scala code even within
>>> one
>>>>>> module, especially not if there are clean boundaries. Like we could
>>> have
>>>>>> API in Scala and optimizer rules/logical nodes written in Java in
>> the
>>>> same
>>>>>> module. However I haven’t previously maintained mixed scala/java
>> code
>>>> bases
>>>>>> before, so I might be missing something here.
>>>>>>> Secondly this whole migration might and most like will take longer
>>> then
>>>>>> expected, so that creates a problem for a new code that we will be
>>>>>> creating. After making a decision to migrate to Java, almost any new
>>>> Scala
>>>>>> line of code will be immediately a technological debt and we will
>> have
>>>> to
>>>>>> rewrite it to Java later.
>>>>>>> Thus I would propose first to state our end goal - modules
>> structure
>>>> and
>>>>>> which parts of modules we want to have eventually Scala-free.
>> Secondly
>>>>>> taking all steps necessary that will allow us to write new code
>>>> complaint
>>>>>> with our end goal. Only after that we should/could focus on
>>>> incrementally
>>>>>> rewriting the old code. Otherwise we could be stuck/blocked for
>> years
>>>>>> writing new code in Scala (and increasing technological debt),
>> because
>>>>>> nobody have found a time to rewrite some non important and not
>>> actively
>>>>>> developed part of some module.
>>>>>>> Piotrek
>>>>>>> 
>>>>>>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fh...@gmail.com>
>> wrote:
>>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> In general, I think this is a good effort. However, it won't be
>> easy
>>>>>> and I
>>>>>>>> think we have to plan this well.
>>>>>>>> I don't like the idea of having the whole code base fragmented
>> into
>>>> Java
>>>>>>>> and Scala code for too long.
>>>>>>>> 
>>>>>>>> I think we should do this one step at a time and focus on
>> migrating
>>>> one
>>>>>>>> module at a time.
>>>>>>>> IMO, the easiest start would be to port the runtime to Java.
>>>>>>>> Extracting the API classes into an own module, porting them to
>> Java,
>>>> and
>>>>>>>> removing the Scala dependency won't be possible without breaking
>> the
>>>> API
>>>>>>>> since a few classes depend on the Scala Table API.
>>>>>>>> 
>>>>>>>> Best, Fabian
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <tr...@apache.org>:
>>>>>>>> 
>>>>>>>>> I think that is a noble and honorable goal and we should strive
>> for
>>>> it.
>>>>>>>>> This, however, must be an iterative process given the sheer size
>> of
>>>> the
>>>>>>>>> code base. I like the approach to define common Java modules
>> which
>>>> are
>>>>>> used
>>>>>>>>> by more specific Scala modules and slowly moving classes from
>> Scala
>>>> to
>>>>>>>>> Java. Thus +1 for the proposal.
>>>>>>>>> 
>>>>>>>>> Cheers,
>>>>>>>>> Till
>>>>>>>>> 
>>>>>>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
>>>>>> piotr@data-artisans.com>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi,
>>>>>>>>>> 
>>>>>>>>>> I do not have an experience with how scala and java interacts
>> with
>>>>>> each
>>>>>>>>>> other, so I can not fully validate your proposal, but generally
>>>>>> speaking
>>>>>>>>> +1
>>>>>>>>>> from me.
>>>>>>>>>> 
>>>>>>>>>> Does it also mean, that we should slowly migrate
>>> `flink-table-core`
>>>> to
>>>>>>>>>> Java? How would you envision it? It would be nice to be able to
>>> add
>>>>>> new
>>>>>>>>>> classes/features written in Java and so that they can coexist
>> with
>>>> old
>>>>>>>>>> Scala code until we gradually switch from Scala to Java.
>>>>>>>>>> 
>>>>>>>>>> Piotrek
>>>>>>>>>> 
>>>>>>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <tw...@apache.org>
>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hi everyone,
>>>>>>>>>>> 
>>>>>>>>>>> as you all know, currently the Table & SQL API is implemented
>> in
>>>>>> Scala.
>>>>>>>>>> This decision was made a long-time ago when the initital code
>> base
>>>> was
>>>>>>>>>> created as part of a master's thesis. The community kept Scala
>>>>>> because of
>>>>>>>>>> the nice language features that enable a fluent Table API like
>>>>>>>>>> table.select('field.trim()) and because Scala allows for quick
>>>>>>>>> prototyping
>>>>>>>>>> (e.g. multi-line comments for code generation). The committers
>>>>>> enforced
>>>>>>>>> not
>>>>>>>>>> splitting the code-base into two programming languages.
>>>>>>>>>>> However, nowadays the flink-table module more and more becomes
>> an
>>>>>>>>>> important part in the Flink ecosystem. Connectors, formats, and
>>> SQL
>>>>>>>>> client
>>>>>>>>>> are actually implemented in Java but need to interoperate with
>>>>>>>>> flink-table
>>>>>>>>>> which makes these modules dependent on Scala. As mentioned in an
>>>>>> earlier
>>>>>>>>>> mail thread, using Scala for API classes also exposes member
>>>> variables
>>>>>>>>> and
>>>>>>>>>> methods in Java that should not be exposed to users [1]. Java is
>>>> still
>>>>>>>>> the
>>>>>>>>>> most important API language and right now we treat it as a
>>>>>> second-class
>>>>>>>>>> citizen. I just noticed that you even need to add Scala if you
>>> just
>>>>>> want
>>>>>>>>> to
>>>>>>>>>> implement a ScalarFunction because of method clashes between
>>> `public
>>>>>>>>> String
>>>>>>>>>> toString()` and `public scala.Predef.String toString()`.
>>>>>>>>>>> Given the size of the current code base, reimplementing the
>>> entire
>>>>>>>>>> flink-table code in Java is a goal that we might never reach.
>>>>>> However, we
>>>>>>>>>> should at least treat the symptoms and have this as a long-term
>>> goal
>>>>>> in
>>>>>>>>>> mind. My suggestion would be to convert user-facing and runtime
>>>>>> classes
>>>>>>>>> and
>>>>>>>>>> split the code base into multiple modules:
>>>>>>>>>>>> flink-table-java {depends on flink-table-core}
>>>>>>>>>>> Implemented in Java. Java users can use this. This would
>> require
>>> to
>>>>>>>>>> convert classes like TableEnvironment, Table.
>>>>>>>>>>>> flink-table-scala {depends on flink-table-core}
>>>>>>>>>>> Implemented in Scala. Scala users can use this.
>>>>>>>>>>> 
>>>>>>>>>>>> flink-table-common
>>>>>>>>>>> Implemented in Java. Connectors, formats, and UDFs can use
>> this.
>>> It
>>>>>>>>>> contains interface classes such as descriptors, table sink,
>> table
>>>>>> source.
>>>>>>>>>>>> flink-table-core {depends on flink-table-common and
>>>>>>>>>> flink-table-runtime}
>>>>>>>>>>> Implemented in Scala. Contains the current main code base.
>>>>>>>>>>> 
>>>>>>>>>>>> flink-table-runtime
>>>>>>>>>>> Implemented in Java. This would require to convert classes in
>>>>>>>>>> o.a.f.table.runtime but would improve the runtime potentially.
>>>>>>>>>>> 
>>>>>>>>>>> What do you think?
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Regards,
>>>>>>>>>>> 
>>>>>>>>>>> Timo
>>>>>>>>>>> 
>>>>>>>>>>> [1]
>>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
>>>>>>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
>>>>>> traits-tp21335.html
>>>>>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>> 
>> 


Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by "Zhang, Xuefu" <xu...@alibaba-inc.com>.
Hi Timo,

Thanks for the effort and the Google writeup. During our external catalog rework, we found much confusion between Java and Scala, and this Scala-free roadmap should greatly mitigate that.

I'm wondering whether, in the interim while Java and Scala coexist, we can have a rule that dependencies may only go one way. I found that in the current code base there are cases where a Scala class extends a Java class and vice versa. This is quite painful. I'm thinking we could say that extension can only go from Java to Scala, which would help the situation. However, I'm not sure if this is practical.
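
For example, if the rule were that a Scala class may extend a Java class but never the other way around, then the Java side of such a boundary could look like the rough sketch below (the class name is made up just for illustration):

    // Java side of the boundary: no imports of scala.* and no references
    // to Scala-implemented classes, so it stays unaffected when Scala code
    // is removed later.
    public abstract class AbstractPlanNode {

        private final String description;

        protected AbstractPlanNode(String description) {
            this.description = description;
        }

        public String getDescription() {
            return description;
        }

        // Existing Scala plan nodes could extend this class;
        // no Java class would extend a Scala class in return.
    }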

Thanks,
Xuefu


------------------------------------------------------------------
Sender:jincheng sun <su...@gmail.com>
Sent at:2018 Nov 23 (Fri) 09:49
Recipient:dev <de...@flink.apache.org>
Subject:Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Hi Timo,
Thanks for initiating this great discussion.

Currently when using SQL/TableAPI should include many dependence. In
particular, it is not necessary to introduce the specific implementation
dependencies which users do not care about. So I am glad to see your
proposal, and hope when we consider splitting the API interface into a
separate module, so that the user can introduce minimum of dependencies.

So, +1 to [separation of interface and implementation; e.g. `Table` &
`TableImpl`] which you mentioned in the google doc.
Best,
Jincheng

Xiaowei Jiang <xi...@gmail.com> 于2018年11月22日周四 下午10:50写道:

> Hi Timo, thanks for driving this! I think that this is a nice thing to do.
> While we are doing this, can we also keep in mind that we want to
> eventually have a TableAPI interface only module which users can take
> dependency on, but without including any implementation details?
>
> Xiaowei
>
> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <fh...@gmail.com> wrote:
>
> > Hi Timo,
> >
> > Thanks for writing up this document.
> > I like the new structure and agree to prioritize the porting of the
> > flink-table-common classes.
> > Since flink-table-runtime is (or should be) independent of the API and
> > planner modules, we could start porting these classes once the code is
> > split into the new module structure.
> > The benefits of a Scala-free flink-table-runtime would be a Scala-free
> > execution Jar.
> >
> > Best, Fabian
> >
> >
> > Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
> > twalthr@apache.org
> > >:
> >
> > > Hi everyone,
> > >
> > > I would like to continue this discussion thread and convert the outcome
> > > into a FLIP such that users and contributors know what to expect in the
> > > upcoming releases.
> > >
> > > I created a design document [1] that clarifies our motivation why we
> > > want to do this, how a Maven module structure could look like, and a
> > > suggestion for a migration plan.
> > >
> > > It would be great to start with the efforts for the 1.8 release such
> > > that new features can be developed in Java and major refactorings such
> > > as improvements to the connectors and external catalog support are not
> > > blocked.
> > >
> > > Please let me know what you think.
> > >
> > > Regards,
> > > Timo
> > >
> > > [1]
> > >
> > >
> >
> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
> > >
> > >
> > > Am 02.07.18 um 17:08 schrieb Fabian Hueske:
> > > > Hi Piotr,
> > > >
> > > > thanks for bumping this thread and thanks for Xingcan for the
> comments.
> > > >
> > > > I think the first step would be to separate the flink-table module
> into
> > > > multiple sub modules. These could be:
> > > >
> > > > - flink-table-api: All API facing classes. Can be later divided
> further
> > > > into Java/Scala Table API/SQL
> > > > - flink-table-planning: involves all planning (basically everything
> we
> > do
> > > > with Calcite)
> > > > - flink-table-runtime: the runtime code
> > > >
> > > > IMO, a realistic mid-term goal is to have the runtime module and
> > certain
> > > > parts of the planning module ported to Java.
> > > > The api module will be much harder to port because of several
> > > dependencies
> > > > to Scala core classes (the parser framework, tree iterations, etc.).
> > I'm
> > > > not saying we should not port this to Java, but it is not clear to me
> > > (yet)
> > > > how to do it.
> > > >
> > > > I think flink-table-runtime should not be too hard to port. The code
> > does
> > > > not make use of many Scala features, i.e., it's writing very
> Java-like.
> > > > Also, there are not many dependencies and operators can be
> individually
> > > > ported step-by-step.
> > > > For flink-table-planning, we can have certain packages that we port
> to
> > > Java
> > > > like planning rules or plan nodes. The related classes mostly extend
> > > > Calcite's Java interfaces/classes and would be natural choices for
> > being
> > > > ported. The code generation classes will require more effort to port.
> > > There
> > > > are also some dependencies in planning on the api module that we
> would
> > > need
> > > > to resolve somehow.
> > > >
> > > > For SQL most work when adding new features is done in the planning
> and
> > > > runtime modules. So, this separation should already reduce
> > "technological
> > > > dept" quite a lot.
> > > > The Table API depends much more on Scala than SQL.
> > > >
> > > > Cheers, Fabian
> > > >
> > > >
> > > >
> > > > 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xi...@gmail.com>:
> > > >
> > > >> Hi all,
> > > >>
> > > >> I also think about this problem these days and here are my thoughts.
> > > >>
> > > >> 1) We must admit that it’s really a tough task to interoperate with
> > Java
> > > >> and Scala. E.g., they have different collection types (Scala
> > collections
> > > >> v.s. java.util.*) and in Java, it's hard to implement a method which
> > > takes
> > > >> Scala functions as parameters. Considering the major part of the
> code
> > > base
> > > >> is implemented in Java, +1 for this goal from a long-term view.
> > > >>
> > > >> 2) The ideal solution would be to just expose a Scala API and make
> all
> > > the
> > > >> other parts Scala-free. But I am not sure if it could be achieved
> even
> > > in a
> > > >> long-term. Thus as Timo suggested, keep the Scala codes in
> > > >> "flink-table-core" would be a compromise solution.
> > > >>
> > > >> 3) If the community makes the final decision, maybe any new features
> > > >> should be added in Java (regardless of the modules), in order to
> > prevent
> > > >> the Scala codes from growing.
> > > >>
> > > >> Best,
> > > >> Xingcan
> > > >>
> > > >>
> > > >>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
> piotr@data-artisans.com>
> > > >> wrote:
> > > >>> Bumping the topic.
> > > >>>
> > > >>> If we want to do this, the sooner we decide, the less code we will
> > have
> > > >> to rewrite. I have some objections/counter proposals to Fabian's
> > > proposal
> > > >> of doing it module wise and one module at a time.
> > > >>> First, I do not see a problem of having java/scala code even within
> > one
> > > >> module, especially not if there are clean boundaries. Like we could
> > have
> > > >> API in Scala and optimizer rules/logical nodes written in Java in
> the
> > > same
> > > >> module. However I haven’t previously maintained mixed scala/java
> code
> > > bases
> > > >> before, so I might be missing something here.
> > > >>> Secondly this whole migration might and most like will take longer
> > then
> > > >> expected, so that creates a problem for a new code that we will be
> > > >> creating. After making a decision to migrate to Java, almost any new
> > > Scala
> > > >> line of code will be immediately a technological debt and we will
> have
> > > to
> > > >> rewrite it to Java later.
> > > >>> Thus I would propose first to state our end goal - modules
> structure
> > > and
> > > >> which parts of modules we want to have eventually Scala-free.
> Secondly
> > > >> taking all steps necessary that will allow us to write new code
> > > complaint
> > > >> with our end goal. Only after that we should/could focus on
> > > incrementally
> > > >> rewriting the old code. Otherwise we could be stuck/blocked for
> years
> > > >> writing new code in Scala (and increasing technological debt),
> because
> > > >> nobody have found a time to rewrite some non important and not
> > actively
> > > >> developed part of some module.
> > > >>> Piotrek
> > > >>>
> > > >>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fh...@gmail.com>
> wrote:
> > > >>>>
> > > >>>> Hi,
> > > >>>>
> > > >>>> In general, I think this is a good effort. However, it won't be
> easy
> > > >> and I
> > > >>>> think we have to plan this well.
> > > >>>> I don't like the idea of having the whole code base fragmented
> into
> > > Java
> > > >>>> and Scala code for too long.
> > > >>>>
> > > >>>> I think we should do this one step at a time and focus on
> migrating
> > > one
> > > >>>> module at a time.
> > > >>>> IMO, the easiest start would be to port the runtime to Java.
> > > >>>> Extracting the API classes into an own module, porting them to
> Java,
> > > and
> > > >>>> removing the Scala dependency won't be possible without breaking
> the
> > > API
> > > >>>> since a few classes depend on the Scala Table API.
> > > >>>>
> > > >>>> Best, Fabian
> > > >>>>
> > > >>>>
> > > >>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <tr...@apache.org>:
> > > >>>>
> > > >>>>> I think that is a noble and honorable goal and we should strive
> for
> > > it.
> > > >>>>> This, however, must be an iterative process given the sheer size
> of
> > > the
> > > >>>>> code base. I like the approach to define common Java modules
> which
> > > are
> > > >> used
> > > >>>>> by more specific Scala modules and slowly moving classes from
> Scala
> > > to
> > > >>>>> Java. Thus +1 for the proposal.
> > > >>>>>
> > > >>>>> Cheers,
> > > >>>>> Till
> > > >>>>>
> > > >>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
> > > >> piotr@data-artisans.com>
> > > >>>>> wrote:
> > > >>>>>
> > > >>>>>> Hi,
> > > >>>>>>
> > > >>>>>> I do not have an experience with how scala and java interacts
> with
> > > >> each
> > > >>>>>> other, so I can not fully validate your proposal, but generally
> > > >> speaking
> > > >>>>> +1
> > > >>>>>> from me.
> > > >>>>>>
> > > >>>>>> Does it also mean, that we should slowly migrate
> > `flink-table-core`
> > > to
> > > >>>>>> Java? How would you envision it? It would be nice to be able to
> > add
> > > >> new
> > > >>>>>> classes/features written in Java and so that they can coexist
> with
> > > old
> > > >>>>>> Scala code until we gradually switch from Scala to Java.
> > > >>>>>>
> > > >>>>>> Piotrek
> > > >>>>>>
> > > >>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <tw...@apache.org>
> > wrote:
> > > >>>>>>>
> > > >>>>>>> Hi everyone,
> > > >>>>>>>
> > > >>>>>>> as you all know, currently the Table & SQL API is implemented
> in
> > > >> Scala.
> > > >>>>>> This decision was made a long-time ago when the initital code
> base
> > > was
> > > >>>>>> created as part of a master's thesis. The community kept Scala
> > > >> because of
> > > >>>>>> the nice language features that enable a fluent Table API like
> > > >>>>>> table.select('field.trim()) and because Scala allows for quick
> > > >>>>> prototyping
> > > >>>>>> (e.g. multi-line comments for code generation). The committers
> > > >> enforced
> > > >>>>> not
> > > >>>>>> splitting the code-base into two programming languages.
> > > >>>>>>> However, nowadays the flink-table module more and more becomes
> an
> > > >>>>>> important part in the Flink ecosystem. Connectors, formats, and
> > SQL
> > > >>>>> client
> > > >>>>>> are actually implemented in Java but need to interoperate with
> > > >>>>> flink-table
> > > >>>>>> which makes these modules dependent on Scala. As mentioned in an
> > > >> earlier
> > > >>>>>> mail thread, using Scala for API classes also exposes member
> > > variables
> > > >>>>> and
> > > >>>>>> methods in Java that should not be exposed to users [1]. Java is
> > > still
> > > >>>>> the
> > > >>>>>> most important API language and right now we treat it as a
> > > >> second-class
> > > >>>>>> citizen. I just noticed that you even need to add Scala if you
> > just
> > > >> want
> > > >>>>> to
> > > >>>>>> implement a ScalarFunction because of method clashes between
> > `public
> > > >>>>> String
> > > >>>>>> toString()` and `public scala.Predef.String toString()`.
> > > >>>>>>> Given the size of the current code base, reimplementing the
> > entire
> > > >>>>>> flink-table code in Java is a goal that we might never reach.
> > > >> However, we
> > > >>>>>> should at least treat the symptoms and have this as a long-term
> > goal
> > > >> in
> > > >>>>>> mind. My suggestion would be to convert user-facing and runtime
> > > >> classes
> > > >>>>> and
> > > >>>>>> split the code base into multiple modules:
> > > >>>>>>>> flink-table-java {depends on flink-table-core}
> > > >>>>>>> Implemented in Java. Java users can use this. This would
> require
> > to
> > > >>>>>> convert classes like TableEnvironment, Table.
> > > >>>>>>>> flink-table-scala {depends on flink-table-core}
> > > >>>>>>> Implemented in Scala. Scala users can use this.
> > > >>>>>>>
> > > >>>>>>>> flink-table-common
> > > >>>>>>> Implemented in Java. Connectors, formats, and UDFs can use
> this.
> > It
> > > >>>>>> contains interface classes such as descriptors, table sink,
> table
> > > >> source.
> > > >>>>>>>> flink-table-core {depends on flink-table-common and
> > > >>>>>> flink-table-runtime}
> > > >>>>>>> Implemented in Scala. Contains the current main code base.
> > > >>>>>>>
> > > >>>>>>>> flink-table-runtime
> > > >>>>>>> Implemented in Java. This would require to convert classes in
> > > >>>>>> o.a.f.table.runtime but would improve the runtime potentially.
> > > >>>>>>>
> > > >>>>>>> What do you think?
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> Regards,
> > > >>>>>>>
> > > >>>>>>> Timo
> > > >>>>>>>
> > > >>>>>>> [1]
> > > >>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
> > > >>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
> > > >> traits-tp21335.html
> > > >>>>>>
> > > >>
> > >
> > >
> >
>

Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by jincheng sun <su...@gmail.com>.
Hi Timo,
Thanks for initiating this great discussion.

Currently, using SQL/Table API requires pulling in many dependencies. In
particular, users should not have to depend on specific implementation
details they do not care about. So I am glad to see your proposal, and I
hope we will consider splitting the API interfaces into a separate module
so that users only need to declare a minimum of dependencies.

So, +1 to the separation of interface and implementation (e.g. `Table` &
`TableImpl`) that you mentioned in the Google doc.
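
Just to sketch the idea (the method signature is only an example, not a
proposal for the real API):

    // File Table.java -- lives in the user-facing API module.
    public interface Table {
        Table select(String fields);
    }

    // File TableImpl.java -- lives in an internal implementation module
    // that users do not need on their compile classpath.
    public class TableImpl implements Table {
        @Override
        public Table select(String fields) {
            // validation and planning would be triggered here
            return new TableImpl();
        }
    }
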
Best,
Jincheng

Xiaowei Jiang <xi...@gmail.com> wrote on Thu, Nov 22, 2018 at 10:50 PM:

> Hi Timo, thanks for driving this! I think that this is a nice thing to do.
> While we are doing this, can we also keep in mind that we want to
> eventually have a TableAPI interface only module which users can take
> dependency on, but without including any implementation details?
>
> Xiaowei
>
> On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <fh...@gmail.com> wrote:
>
> > Hi Timo,
> >
> > Thanks for writing up this document.
> > I like the new structure and agree to prioritize the porting of the
> > flink-table-common classes.
> > Since flink-table-runtime is (or should be) independent of the API and
> > planner modules, we could start porting these classes once the code is
> > split into the new module structure.
> > The benefits of a Scala-free flink-table-runtime would be a Scala-free
> > execution Jar.
> >
> > Best, Fabian
> >
> >
> > Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
> > twalthr@apache.org
> > >:
> >
> > > Hi everyone,
> > >
> > > I would like to continue this discussion thread and convert the outcome
> > > into a FLIP such that users and contributors know what to expect in the
> > > upcoming releases.
> > >
> > > I created a design document [1] that clarifies our motivation why we
> > > want to do this, how a Maven module structure could look like, and a
> > > suggestion for a migration plan.
> > >
> > > It would be great to start with the efforts for the 1.8 release such
> > > that new features can be developed in Java and major refactorings such
> > > as improvements to the connectors and external catalog support are not
> > > blocked.
> > >
> > > Please let me know what you think.
> > >
> > > Regards,
> > > Timo
> > >
> > > [1]
> > >
> > >
> >
> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
> > >
> > >
> > > Am 02.07.18 um 17:08 schrieb Fabian Hueske:
> > > > Hi Piotr,
> > > >
> > > > thanks for bumping this thread and thanks for Xingcan for the
> comments.
> > > >
> > > > I think the first step would be to separate the flink-table module
> into
> > > > multiple sub modules. These could be:
> > > >
> > > > - flink-table-api: All API facing classes. Can be later divided
> further
> > > > into Java/Scala Table API/SQL
> > > > - flink-table-planning: involves all planning (basically everything
> we
> > do
> > > > with Calcite)
> > > > - flink-table-runtime: the runtime code
> > > >
> > > > IMO, a realistic mid-term goal is to have the runtime module and
> > certain
> > > > parts of the planning module ported to Java.
> > > > The api module will be much harder to port because of several
> > > dependencies
> > > > to Scala core classes (the parser framework, tree iterations, etc.).
> > I'm
> > > > not saying we should not port this to Java, but it is not clear to me
> > > (yet)
> > > > how to do it.
> > > >
> > > > I think flink-table-runtime should not be too hard to port. The code
> > does
> > > > not make use of many Scala features, i.e., it's writing very
> Java-like.
> > > > Also, there are not many dependencies and operators can be
> individually
> > > > ported step-by-step.
> > > > For flink-table-planning, we can have certain packages that we port
> to
> > > Java
> > > > like planning rules or plan nodes. The related classes mostly extend
> > > > Calcite's Java interfaces/classes and would be natural choices for
> > being
> > > > ported. The code generation classes will require more effort to port.
> > > There
> > > > are also some dependencies in planning on the api module that we
> would
> > > need
> > > > to resolve somehow.
> > > >
> > > > For SQL most work when adding new features is done in the planning
> and
> > > > runtime modules. So, this separation should already reduce
> > "technological
> > > > dept" quite a lot.
> > > > The Table API depends much more on Scala than SQL.
> > > >
> > > > Cheers, Fabian
> > > >
> > > >
> > > >
> > > > 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xi...@gmail.com>:
> > > >
> > > >> Hi all,
> > > >>
> > > >> I also think about this problem these days and here are my thoughts.
> > > >>
> > > >> 1) We must admit that it’s really a tough task to interoperate with
> > Java
> > > >> and Scala. E.g., they have different collection types (Scala
> > collections
> > > >> v.s. java.util.*) and in Java, it's hard to implement a method which
> > > takes
> > > >> Scala functions as parameters. Considering the major part of the
> code
> > > base
> > > >> is implemented in Java, +1 for this goal from a long-term view.
> > > >>
> > > >> 2) The ideal solution would be to just expose a Scala API and make
> all
> > > the
> > > >> other parts Scala-free. But I am not sure if it could be achieved
> even
> > > in a
> > > >> long-term. Thus as Timo suggested, keep the Scala codes in
> > > >> "flink-table-core" would be a compromise solution.
> > > >>
> > > >> 3) If the community makes the final decision, maybe any new features
> > > >> should be added in Java (regardless of the modules), in order to
> > prevent
> > > >> the Scala codes from growing.
> > > >>
> > > >> Best,
> > > >> Xingcan
> > > >>
> > > >>
> > > >>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <
> piotr@data-artisans.com>
> > > >> wrote:
> > > >>> Bumping the topic.
> > > >>>
> > > >>> If we want to do this, the sooner we decide, the less code we will
> > have
> > > >> to rewrite. I have some objections/counter proposals to Fabian's
> > > proposal
> > > >> of doing it module wise and one module at a time.
> > > >>> First, I do not see a problem of having java/scala code even within
> > one
> > > >> module, especially not if there are clean boundaries. Like we could
> > have
> > > >> API in Scala and optimizer rules/logical nodes written in Java in
> the
> > > same
> > > >> module. However I haven’t previously maintained mixed scala/java
> code
> > > bases
> > > >> before, so I might be missing something here.
> > > >>> Secondly this whole migration might and most like will take longer
> > then
> > > >> expected, so that creates a problem for a new code that we will be
> > > >> creating. After making a decision to migrate to Java, almost any new
> > > Scala
> > > >> line of code will be immediately a technological debt and we will
> have
> > > to
> > > >> rewrite it to Java later.
> > > >>> Thus I would propose first to state our end goal - modules
> structure
> > > and
> > > >> which parts of modules we want to have eventually Scala-free.
> Secondly
> > > >> taking all steps necessary that will allow us to write new code
> > > complaint
> > > >> with our end goal. Only after that we should/could focus on
> > > incrementally
> > > >> rewriting the old code. Otherwise we could be stuck/blocked for
> years
> > > >> writing new code in Scala (and increasing technological debt),
> because
> > > >> nobody have found a time to rewrite some non important and not
> > actively
> > > >> developed part of some module.
> > > >>> Piotrek
> > > >>>
> > > >>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fh...@gmail.com>
> wrote:
> > > >>>>
> > > >>>> Hi,
> > > >>>>
> > > >>>> In general, I think this is a good effort. However, it won't be
> easy
> > > >> and I
> > > >>>> think we have to plan this well.
> > > >>>> I don't like the idea of having the whole code base fragmented
> into
> > > Java
> > > >>>> and Scala code for too long.
> > > >>>>
> > > >>>> I think we should do this one step at a time and focus on
> migrating
> > > one
> > > >>>> module at a time.
> > > >>>> IMO, the easiest start would be to port the runtime to Java.
> > > >>>> Extracting the API classes into an own module, porting them to
> Java,
> > > and
> > > >>>> removing the Scala dependency won't be possible without breaking
> the
> > > API
> > > >>>> since a few classes depend on the Scala Table API.
> > > >>>>
> > > >>>> Best, Fabian
> > > >>>>
> > > >>>>
> > > >>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <tr...@apache.org>:
> > > >>>>
> > > >>>>> I think that is a noble and honorable goal and we should strive
> for
> > > it.
> > > >>>>> This, however, must be an iterative process given the sheer size
> of
> > > the
> > > >>>>> code base. I like the approach to define common Java modules
> which
> > > are
> > > >> used
> > > >>>>> by more specific Scala modules and slowly moving classes from
> Scala
> > > to
> > > >>>>> Java. Thus +1 for the proposal.
> > > >>>>>
> > > >>>>> Cheers,
> > > >>>>> Till
> > > >>>>>
> > > >>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
> > > >> piotr@data-artisans.com>
> > > >>>>> wrote:
> > > >>>>>
> > > >>>>>> Hi,
> > > >>>>>>
> > > >>>>>> I do not have an experience with how scala and java interacts
> with
> > > >> each
> > > >>>>>> other, so I can not fully validate your proposal, but generally
> > > >> speaking
> > > >>>>> +1
> > > >>>>>> from me.
> > > >>>>>>
> > > >>>>>> Does it also mean, that we should slowly migrate
> > `flink-table-core`
> > > to
> > > >>>>>> Java? How would you envision it? It would be nice to be able to
> > add
> > > >> new
> > > >>>>>> classes/features written in Java and so that they can coexist
> with
> > > old
> > > >>>>>> Scala code until we gradually switch from Scala to Java.
> > > >>>>>>
> > > >>>>>> Piotrek
> > > >>>>>>
> > > >>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <tw...@apache.org>
> > wrote:
> > > >>>>>>>
> > > >>>>>>> Hi everyone,
> > > >>>>>>>
> > > >>>>>>> as you all know, currently the Table & SQL API is implemented
> in
> > > >> Scala.
> > > >>>>>> This decision was made a long-time ago when the initital code
> base
> > > was
> > > >>>>>> created as part of a master's thesis. The community kept Scala
> > > >> because of
> > > >>>>>> the nice language features that enable a fluent Table API like
> > > >>>>>> table.select('field.trim()) and because Scala allows for quick
> > > >>>>> prototyping
> > > >>>>>> (e.g. multi-line comments for code generation). The committers
> > > >> enforced
> > > >>>>> not
> > > >>>>>> splitting the code-base into two programming languages.
> > > >>>>>>> However, nowadays the flink-table module more and more becomes
> an
> > > >>>>>> important part in the Flink ecosystem. Connectors, formats, and
> > SQL
> > > >>>>> client
> > > >>>>>> are actually implemented in Java but need to interoperate with
> > > >>>>> flink-table
> > > >>>>>> which makes these modules dependent on Scala. As mentioned in an
> > > >> earlier
> > > >>>>>> mail thread, using Scala for API classes also exposes member
> > > variables
> > > >>>>> and
> > > >>>>>> methods in Java that should not be exposed to users [1]. Java is
> > > still
> > > >>>>> the
> > > >>>>>> most important API language and right now we treat it as a
> > > >> second-class
> > > >>>>>> citizen. I just noticed that you even need to add Scala if you
> > just
> > > >> want
> > > >>>>> to
> > > >>>>>> implement a ScalarFunction because of method clashes between
> > `public
> > > >>>>> String
> > > >>>>>> toString()` and `public scala.Predef.String toString()`.
> > > >>>>>>> Given the size of the current code base, reimplementing the
> > entire
> > > >>>>>> flink-table code in Java is a goal that we might never reach.
> > > >> However, we
> > > >>>>>> should at least treat the symptoms and have this as a long-term
> > goal
> > > >> in
> > > >>>>>> mind. My suggestion would be to convert user-facing and runtime
> > > >> classes
> > > >>>>> and
> > > >>>>>> split the code base into multiple modules:
> > > >>>>>>>> flink-table-java {depends on flink-table-core}
> > > >>>>>>> Implemented in Java. Java users can use this. This would
> require
> > to
> > > >>>>>> convert classes like TableEnvironment, Table.
> > > >>>>>>>> flink-table-scala {depends on flink-table-core}
> > > >>>>>>> Implemented in Scala. Scala users can use this.
> > > >>>>>>>
> > > >>>>>>>> flink-table-common
> > > >>>>>>> Implemented in Java. Connectors, formats, and UDFs can use
> this.
> > It
> > > >>>>>> contains interface classes such as descriptors, table sink,
> table
> > > >> source.
> > > >>>>>>>> flink-table-core {depends on flink-table-common and
> > > >>>>>> flink-table-runtime}
> > > >>>>>>> Implemented in Scala. Contains the current main code base.
> > > >>>>>>>
> > > >>>>>>>> flink-table-runtime
> > > >>>>>>> Implemented in Java. This would require to convert classes in
> > > >>>>>> o.a.f.table.runtime but would improve the runtime potentially.
> > > >>>>>>>
> > > >>>>>>> What do you think?
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> Regards,
> > > >>>>>>>
> > > >>>>>>> Timo
> > > >>>>>>>
> > > >>>>>>> [1]
> > > >>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
> > > >>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
> > > >> traits-tp21335.html
> > > >>>>>>
> > > >>
> > >
> > >
> >
>

Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Xiaowei Jiang <xi...@gmail.com>.
Hi Timo, thanks for driving this! I think that this is a nice thing to do.
While we are doing this, can we also keep in mind that we eventually want
a Table API interface-only module which users can take a dependency on,
without it including any implementation details?
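
For example, it would be nice if user code could be compiled against that
module alone, roughly like this (assuming a hypothetical interface-only
artifact that exposes a `Table` interface with these methods):

    // Only the interface-only artifact is needed on the compile classpath;
    // no planner or runtime classes appear in user code.
    import org.apache.flink.table.api.Table;

    public class MyPipeline {
        public Table transform(Table input) {
            return input.filter("amount > 10").select("user, amount");
        }
    }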

Xiaowei

On Thu, Nov 22, 2018 at 6:37 PM Fabian Hueske <fh...@gmail.com> wrote:

> Hi Timo,
>
> Thanks for writing up this document.
> I like the new structure and agree to prioritize the porting of the
> flink-table-common classes.
> Since flink-table-runtime is (or should be) independent of the API and
> planner modules, we could start porting these classes once the code is
> split into the new module structure.
> The benefits of a Scala-free flink-table-runtime would be a Scala-free
> execution Jar.
>
> Best, Fabian
>
>
> Am Do., 22. Nov. 2018 um 10:54 Uhr schrieb Timo Walther <
> twalthr@apache.org
> >:
>
> > Hi everyone,
> >
> > I would like to continue this discussion thread and convert the outcome
> > into a FLIP such that users and contributors know what to expect in the
> > upcoming releases.
> >
> > I created a design document [1] that clarifies our motivation why we
> > want to do this, how a Maven module structure could look like, and a
> > suggestion for a migration plan.
> >
> > It would be great to start with the efforts for the 1.8 release such
> > that new features can be developed in Java and major refactorings such
> > as improvements to the connectors and external catalog support are not
> > blocked.
> >
> > Please let me know what you think.
> >
> > Regards,
> > Timo
> >
> > [1]
> >
> >
> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
> >
> >
> > On 02.07.18 17:08, Fabian Hueske wrote:
> > > Hi Piotr,
> > >
> > > thanks for bumping this thread and thanks to Xingcan for the comments.
> > >
> > > I think the first step would be to separate the flink-table module into
> > > multiple sub modules. These could be:
> > >
> > > - flink-table-api: All API facing classes. Can be later divided further
> > > into Java/Scala Table API/SQL
> > > - flink-table-planning: involves all planning (basically everything we
> do
> > > with Calcite)
> > > - flink-table-runtime: the runtime code
> > >
> > > IMO, a realistic mid-term goal is to have the runtime module and
> certain
> > > parts of the planning module ported to Java.
> > > The api module will be much harder to port because of several
> > dependencies
> > > to Scala core classes (the parser framework, tree iterations, etc.).
> I'm
> > > not saying we should not port this to Java, but it is not clear to me
> > (yet)
> > > how to do it.
> > >
> > > I think flink-table-runtime should not be too hard to port. The code
> does
> > > not make use of many Scala features, i.e., it's written very Java-like.
> > > Also, there are not many dependencies and operators can be individually
> > > ported step-by-step.
> > > For flink-table-planning, we can have certain packages that we port to
> > Java
> > > like planning rules or plan nodes. The related classes mostly extend
> > > Calcite's Java interfaces/classes and would be natural choices for
> being
> > > ported. The code generation classes will require more effort to port.
> > There
> > > are also some dependencies in planning on the api module that we would
> > need
> > > to resolve somehow.
> > >
> > > For SQL most work when adding new features is done in the planning and
> > > runtime modules. So, this separation should already reduce
> "technological
> > > debt" quite a lot.
> > > The Table API depends much more on Scala than SQL.
> > >
> > > Cheers, Fabian
> > >
> > >
> > >
> > > 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xi...@gmail.com>:
> > >
> > >> Hi all,
> > >>
> > >> I have also been thinking about this problem these days, and here are my thoughts.
> > >>
> > >> 1) We must admit that it’s really a tough task to interoperate with
> Java
> > >> and Scala. E.g., they have different collection types (Scala
> collections
> > >> v.s. java.util.*) and in Java, it's hard to implement a method which
> > takes
> > >> Scala functions as parameters. Considering the major part of the code
> > base
> > >> is implemented in Java, +1 for this goal from a long-term view.
> > >>
> > >> 2) The ideal solution would be to just expose a Scala API and make all
> > the
> > >> other parts Scala-free. But I am not sure if it could be achieved even
> > in a
> > >> long-term. Thus, as Timo suggested, keeping the Scala code in
> > >> "flink-table-core" would be a compromise solution.
> > >>
> > >> 3) If the community makes the final decision, maybe any new features
> > >> should be added in Java (regardless of the modules), in order to
> prevent
> > >> the Scala code from growing.
> > >>
> > >> Best,
> > >> Xingcan
> > >>
> > >>
> > >>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <pi...@data-artisans.com>
> > >> wrote:
> > >>> Bumping the topic.
> > >>>
> > >>> If we want to do this, the sooner we decide, the less code we will
> have
> > >> to rewrite. I have some objections/counter-proposals to Fabian's
> > proposal
> > >> of doing it module-wise and one module at a time.
> > >>> First, I do not see a problem of having java/scala code even within
> one
> > >> module, especially not if there are clean boundaries. Like we could
> have
> > >> API in Scala and optimizer rules/logical nodes written in Java in the
> > same
> > >> module. However I haven’t previously maintained mixed scala/java code
> > bases
> > >> before, so I might be missing something here.
> > >>> Secondly this whole migration might, and most likely will, take longer
> than
> > >> expected, so that creates a problem for a new code that we will be
> > >> creating. After making a decision to migrate to Java, almost any new
> > Scala
> > >> line of code will be immediately a technological debt and we will have
> > to
> > >> rewrite it to Java later.
> > >>> Thus I would propose first to state our end goal - modules structure
> > and
> > >> which parts of modules we want to have eventually Scala-free. Secondly
> > >> taking all steps necessary that will allow us to write new code
> > compliant
> > >> with our end goal. Only after that we should/could focus on
> > incrementally
> > >> rewriting the old code. Otherwise we could be stuck/blocked for years
> > >> writing new code in Scala (and increasing technological debt), because
> > >> nobody has found the time to rewrite some unimportant and not
> actively
> > >> developed part of some module.
> > >>> Piotrek
> > >>>
> > >>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fh...@gmail.com> wrote:
> > >>>>
> > >>>> Hi,
> > >>>>
> > >>>> In general, I think this is a good effort. However, it won't be easy
> > >> and I
> > >>>> think we have to plan this well.
> > >>>> I don't like the idea of having the whole code base fragmented into
> > Java
> > >>>> and Scala code for too long.
> > >>>>
> > >>>> I think we should do this one step at a time and focus on migrating
> > one
> > >>>> module at a time.
> > >>>> IMO, the easiest start would be to port the runtime to Java.
> > >>>> Extracting the API classes into their own module, porting them to Java,
> > and
> > >>>> removing the Scala dependency won't be possible without breaking the
> > API
> > >>>> since a few classes depend on the Scala Table API.
> > >>>>
> > >>>> Best, Fabian
> > >>>>
> > >>>>
> > >>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <tr...@apache.org>:
> > >>>>
> > >>>>> I think that is a noble and honorable goal and we should strive for
> > it.
> > >>>>> This, however, must be an iterative process given the sheer size of
> > the
> > >>>>> code base. I like the approach to define common Java modules which
> > are
> > >> used
> > >>>>> by more specific Scala modules and slowly moving classes from Scala
> > to
> > >>>>> Java. Thus +1 for the proposal.
> > >>>>>
> > >>>>> Cheers,
> > >>>>> Till
> > >>>>>
> > >>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
> > >> piotr@data-artisans.com>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Hi,
> > >>>>>>
> > >>>>>> I do not have experience with how Scala and Java interact with
> > >> each
> > >>>>>> other, so I cannot fully validate your proposal, but generally
> > >> speaking
> > >>>>> +1
> > >>>>>> from me.
> > >>>>>>
> > >>>>>> Does it also mean that we should slowly migrate
> `flink-table-core`
> > to
> > >>>>>> Java? How would you envision it? It would be nice to be able to
> add
> > >> new
> > >>>>>> classes/features written in Java so that they can coexist with
> > old
> > >>>>>> Scala code until we gradually switch from Scala to Java.
> > >>>>>>
> > >>>>>> Piotrek
> > >>>>>>
> > >>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <tw...@apache.org>
> wrote:
> > >>>>>>>
> > >>>>>>> Hi everyone,
> > >>>>>>>
> > >>>>>>> as you all know, currently the Table & SQL API is implemented in
> > >> Scala.
> > >>>>>> This decision was made a long time ago when the initial code base
> > was
> > >>>>>> created as part of a master's thesis. The community kept Scala
> > >> because of
> > >>>>>> the nice language features that enable a fluent Table API like
> > >>>>>> table.select('field.trim()) and because Scala allows for quick
> > >>>>> prototyping
> > >>>>>> (e.g. multi-line comments for code generation). The committers
> > >> enforced
> > >>>>> not
> > >>>>>> splitting the code-base into two programming languages.
> > >>>>>>> However, nowadays the flink-table module is becoming a more and more
> > >>>>>> important part of the Flink ecosystem. Connectors, formats, and
> SQL
> > >>>>> client
> > >>>>>> are actually implemented in Java but need to interoperate with
> > >>>>> flink-table
> > >>>>>> which makes these modules dependent on Scala. As mentioned in an
> > >> earlier
> > >>>>>> mail thread, using Scala for API classes also exposes member
> > variables
> > >>>>> and
> > >>>>>> methods in Java that should not be exposed to users [1]. Java is
> > still
> > >>>>> the
> > >>>>>> most important API language and right now we treat it as a
> > >> second-class
> > >>>>>> citizen. I just noticed that you even need to add Scala if you
> just
> > >> want
> > >>>>> to
> > >>>>>> implement a ScalarFunction because of method clashes between
> `public
> > >>>>> String
> > >>>>>> toString()` and `public scala.Predef.String toString()`.
> > >>>>>>> Given the size of the current code base, reimplementing the
> entire
> > >>>>>> flink-table code in Java is a goal that we might never reach.
> > >> However, we
> > >>>>>> should at least treat the symptoms and have this as a long-term
> goal
> > >> in
> > >>>>>> mind. My suggestion would be to convert user-facing and runtime
> > >> classes
> > >>>>> and
> > >>>>>> split the code base into multiple modules:
> > >>>>>>>> flink-table-java {depends on flink-table-core}
> > >>>>>>> Implemented in Java. Java users can use this. This would require
> to
> > >>>>>> convert classes like TableEnvironment, Table.
> > >>>>>>>> flink-table-scala {depends on flink-table-core}
> > >>>>>>> Implemented in Scala. Scala users can use this.
> > >>>>>>>
> > >>>>>>>> flink-table-common
> > >>>>>>> Implemented in Java. Connectors, formats, and UDFs can use this.
> It
> > >>>>>> contains interface classes such as descriptors, table sink, table
> > >> source.
> > >>>>>>>> flink-table-core {depends on flink-table-common and
> > >>>>>> flink-table-runtime}
> > >>>>>>> Implemented in Scala. Contains the current main code base.
> > >>>>>>>
> > >>>>>>>> flink-table-runtime
> > >>>>>>> Implemented in Java. This would require to convert classes in
> > >>>>>> o.a.f.table.runtime but would improve the runtime potentially.
> > >>>>>>>
> > >>>>>>> What do you think?
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Regards,
> > >>>>>>>
> > >>>>>>> Timo
> > >>>>>>>
> > >>>>>>> [1]
> > >>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
> > >>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
> > >> traits-tp21335.html
> > >>>>>>
> > >>
> >
> >
>

Re: [DISCUSS] Long-term goal of making flink-table Scala-free

Posted by Fabian Hueske <fh...@gmail.com>.
Hi Timo,

Thanks for writing up this document.
I like the new structure and agree to prioritize the porting of the
flink-table-common classes.
Since flink-table-runtime is (or should be) independent of the API and
planner modules, we could start porting these classes once the code is
split into the new module structure.
The benefit of a Scala-free flink-table-runtime would be a Scala-free
execution Jar.
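
As a rough illustration (the class below is made up, not actual flink-table
code), a typical runtime class only needs core Flink interfaces such as
MapFunction and Row, which is why it could be ported to Java independently of
the API and planner modules:

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.types.Row;

    // Hypothetical runtime operator: upper-cases one String field of a Row.
    // It depends only on core Flink classes, not on the Table API or planner.
    public class UpperCaseField implements MapFunction<Row, Row> {

        private final int fieldIndex;

        public UpperCaseField(int fieldIndex) {
            this.fieldIndex = fieldIndex;
        }

        @Override
        public Row map(Row row) {
            Object value = row.getField(fieldIndex);
            if (value instanceof String) {
                row.setField(fieldIndex, ((String) value).toUpperCase());
            }
            return row;
        }
    }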

Best, Fabian


On Thu, 22 Nov 2018 at 10:54, Timo Walther <twalthr@apache.org
>:

> Hi everyone,
>
> I would like to continue this discussion thread and convert the outcome
> into a FLIP such that users and contributors know what to expect in the
> upcoming releases.
>
> I created a design document [1] that clarifies our motivation why we
> want to do this, how a Maven module structure could look like, and a
> suggestion for a migration plan.
>
> It would be great to start with the efforts for the 1.8 release such
> that new features can be developed in Java and major refactorings such
> as improvements to the connectors and external catalog support are not
> blocked.
>
> Please let me know what you think.
>
> Regards,
> Timo
>
> [1]
>
> https://docs.google.com/document/d/1PPo6goW7tOwxmpFuvLSjFnx7BF8IVz0w3dcmPPyqvoY/edit?usp=sharing
>
>
> On 02.07.18 17:08, Fabian Hueske wrote:
> > Hi Piotr,
> >
> > thanks for bumping this thread and thanks to Xingcan for the comments.
> >
> > I think the first step would be to separate the flink-table module into
> > multiple sub modules. These could be:
> >
> > - flink-table-api: All API facing classes. Can be later divided further
> > into Java/Scala Table API/SQL
> > - flink-table-planning: involves all planning (basically everything we do
> > with Calcite)
> > - flink-table-runtime: the runtime code
> >
> > IMO, a realistic mid-term goal is to have the runtime module and certain
> > parts of the planning module ported to Java.
> > The api module will be much harder to port because of several
> dependencies
> > to Scala core classes (the parser framework, tree iterations, etc.). I'm
> > not saying we should not port this to Java, but it is not clear to me
> (yet)
> > how to do it.
> >
> > I think flink-table-runtime should not be too hard to port. The code does
> > not make use of many Scala features, i.e., it's written very Java-like.
> > Also, there are not many dependencies and operators can be individually
> > ported step-by-step.
> > For flink-table-planning, we can have certain packages that we port to
> Java
> > like planning rules or plan nodes. The related classes mostly extend
> > Calcite's Java interfaces/classes and would be natural choices for being
> > ported. The code generation classes will require more effort to port.
> There
> > are also some dependencies in planning on the api module that we would
> need
> > to resolve somehow.
> >
> > For SQL most work when adding new features is done in the planning and
> > runtime modules. So, this separation should already reduce "technological
> > debt" quite a lot.
> > The Table API depends much more on Scala than SQL.
> >
> > Cheers, Fabian
> >
> >
> >
> > 2018-07-02 16:26 GMT+02:00 Xingcan Cui <xi...@gmail.com>:
> >
> >> Hi all,
> >>
> >> I have also been thinking about this problem these days, and here are my thoughts.
> >>
> >> 1) We must admit that it’s really a tough task to interoperate with Java
> >> and Scala. E.g., they have different collection types (Scala collections
> >> v.s. java.util.*) and in Java, it's hard to implement a method which
> takes
> >> Scala functions as parameters. Considering the major part of the code
> base
> >> is implemented in Java, +1 for this goal from a long-term view.
> >>
> >> 2) The ideal solution would be to just expose a Scala API and make all
> the
> >> other parts Scala-free. But I am not sure if it could be achieved even
> in a
> >> long-term. Thus, as Timo suggested, keeping the Scala code in
> >> "flink-table-core" would be a compromise solution.
> >>
> >> 3) If the community makes the final decision, maybe any new features
> >> should be added in Java (regardless of the modules), in order to prevent
> >> the Scala code from growing.
> >>
> >> Best,
> >> Xingcan
> >>
> >>
> >>> On Jul 2, 2018, at 9:30 PM, Piotr Nowojski <pi...@data-artisans.com>
> >> wrote:
> >>> Bumping the topic.
> >>>
> >>> If we want to do this, the sooner we decide, the less code we will have
> >> to rewrite. I have some objections/counter-proposals to Fabian's
> proposal
> >> of doing it module-wise and one module at a time.
> >>> First, I do not see a problem of having java/scala code even within one
> >> module, especially not if there are clean boundaries. Like we could have
> >> API in Scala and optimizer rules/logical nodes written in Java in the
> same
> >> module. However I haven’t previously maintained mixed scala/java code
> bases
> >> before, so I might be missing something here.
> >>> Secondly this whole migration might, and most likely will, take longer than
> >> expected, so that creates a problem for a new code that we will be
> >> creating. After making a decision to migrate to Java, almost any new
> Scala
> >> line of code will be immediately a technological debt and we will have
> to
> >> rewrite it to Java later.
> >>> Thus I would propose first to state our end goal - modules structure
> and
> >> which parts of modules we want to have eventually Scala-free. Secondly
> >> taking all steps necessary that will allow us to write new code
> compliant
> >> with our end goal. Only after that we should/could focus on
> incrementally
> >> rewriting the old code. Otherwise we could be stuck/blocked for years
> >> writing new code in Scala (and increasing technological debt), because
> >> nobody has found the time to rewrite some unimportant and not actively
> >> developed part of some module.
> >>> Piotrek
> >>>
> >>>> On 14 Jun 2018, at 15:34, Fabian Hueske <fh...@gmail.com> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> In general, I think this is a good effort. However, it won't be easy
> >> and I
> >>>> think we have to plan this well.
> >>>> I don't like the idea of having the whole code base fragmented into
> Java
> >>>> and Scala code for too long.
> >>>>
> >>>> I think we should do this one step at a time and focus on migrating
> one
> >>>> module at a time.
> >>>> IMO, the easiest start would be to port the runtime to Java.
> >>>> Extracting the API classes into their own module, porting them to Java,
> and
> >>>> removing the Scala dependency won't be possible without breaking the
> API
> >>>> since a few classes depend on the Scala Table API.
> >>>>
> >>>> Best, Fabian
> >>>>
> >>>>
> >>>> 2018-06-14 10:33 GMT+02:00 Till Rohrmann <tr...@apache.org>:
> >>>>
> >>>>> I think that is a noble and honorable goal and we should strive for
> it.
> >>>>> This, however, must be an iterative process given the sheer size of
> the
> >>>>> code base. I like the approach to define common Java modules which
> are
> >> used
> >>>>> by more specific Scala modules and slowly moving classes from Scala
> to
> >>>>> Java. Thus +1 for the proposal.
> >>>>>
> >>>>> Cheers,
> >>>>> Till
> >>>>>
> >>>>> On Wed, Jun 13, 2018 at 12:01 PM Piotr Nowojski <
> >> piotr@data-artisans.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I do not have experience with how Scala and Java interact with
> >> each
> >>>>>> other, so I cannot fully validate your proposal, but generally
> >> speaking
> >>>>> +1
> >>>>>> from me.
> >>>>>>
> >>>>>> Does it also mean that we should slowly migrate `flink-table-core`
> to
> >>>>>> Java? How would you envision it? It would be nice to be able to add
> >> new
> >>>>>> classes/features written in Java so that they can coexist with
> old
> >>>>>> Scala code until we gradually switch from Scala to Java.
> >>>>>>
> >>>>>> Piotrek
> >>>>>>
> >>>>>>> On 13 Jun 2018, at 11:32, Timo Walther <tw...@apache.org> wrote:
> >>>>>>>
> >>>>>>> Hi everyone,
> >>>>>>>
> >>>>>>> as you all know, currently the Table & SQL API is implemented in
> >> Scala.
> >>>>>> This decision was made a long time ago when the initial code base
> was
> >>>>>> created as part of a master's thesis. The community kept Scala
> >> because of
> >>>>>> the nice language features that enable a fluent Table API like
> >>>>>> table.select('field.trim()) and because Scala allows for quick
> >>>>> prototyping
> >>>>>> (e.g. multi-line comments for code generation). The committers
> >> enforced
> >>>>> not
> >>>>>> splitting the code-base into two programming languages.
> >>>>>>> However, nowadays the flink-table module is becoming a more and more
> >>>>>> important part of the Flink ecosystem. Connectors, formats, and SQL
> >>>>> client
> >>>>>> are actually implemented in Java but need to interoperate with
> >>>>> flink-table
> >>>>>> which makes these modules dependent on Scala. As mentioned in an
> >> earlier
> >>>>>> mail thread, using Scala for API classes also exposes member
> variables
> >>>>> and
> >>>>>> methods in Java that should not be exposed to users [1]. Java is
> still
> >>>>> the
> >>>>>> most important API language and right now we treat it as a
> >> second-class
> >>>>>> citizen. I just noticed that you even need to add Scala if you just
> >> want
> >>>>> to
> >>>>>> implement a ScalarFunction because of method clashes between `public
> >>>>> String
> >>>>>> toString()` and `public scala.Predef.String toString()`.
> >>>>>>> Given the size of the current code base, reimplementing the entire
> >>>>>> flink-table code in Java is a goal that we might never reach.
> >> However, we
> >>>>>> should at least treat the symptoms and have this as a long-term goal
> >> in
> >>>>>> mind. My suggestion would be to convert user-facing and runtime
> >> classes
> >>>>> and
> >>>>>> split the code base into multiple modules:
> >>>>>>>> flink-table-java {depends on flink-table-core}
> >>>>>>> Implemented in Java. Java users can use this. This would require to
> >>>>>> convert classes like TableEnvironment, Table.
> >>>>>>>> flink-table-scala {depends on flink-table-core}
> >>>>>>> Implemented in Scala. Scala users can use this.
> >>>>>>>
> >>>>>>>> flink-table-common
> >>>>>>> Implemented in Java. Connectors, formats, and UDFs can use this. It
> >>>>>> contains interface classes such as descriptors, table sink, table
> >> source.
> >>>>>>>> flink-table-core {depends on flink-table-common and
> >>>>>> flink-table-runtime}
> >>>>>>> Implemented in Scala. Contains the current main code base.
> >>>>>>>
> >>>>>>>> flink-table-runtime
> >>>>>>> Implemented in Java. This would require to convert classes in
> >>>>>> o.a.f.table.runtime but would improve the runtime potentially.
> >>>>>>>
> >>>>>>> What do you think?
> >>>>>>>
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>>
> >>>>>>> Timo
> >>>>>>>
> >>>>>>> [1]
> >>>>>> http://apache-flink-mailing-list-archive.1008284.n3.
> >>>>> nabble.com/DISCUSS-Convert-main-Table-API-classes-into-
> >> traits-tp21335.html
> >>>>>>
> >>
>
>