You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Nick Burkard (Jira)" <ji...@apache.org> on 2021/01/03 13:28:00 UTC

[jira] [Comment Edited] (FLINK-13414) Add support for Scala 2.13

    [ https://issues.apache.org/jira/browse/FLINK-13414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257763#comment-17257763 ] 

Nick Burkard edited comment on FLINK-13414 at 1/3/21, 1:27 PM:
---------------------------------------------------------------

Hi everyone,

I spent the past few days working on this, building on [~ariskoliopoulos]'s work - there's a lot of work that will be needed to properly migrate to supporting 2.13. I was using [this document|https://docs.scala-lang.org/overviews/core/collections-migration-213.html] provided by Scala's website, plus their [compat library|https://github.com/scala/scala-collection-compat], to handle the migration steps. I wanted to get the community's opinion on a few different points. I think the work can be split into a few different tickets to limit overall impact.

#1 - Scala 2.11 is very old, most modern Scala libraries do not support it anymore.
 The typical support path is 2.12 and 2.13. Supporting 2.11 while also supporting 2.13 will make this migration process more difficult. If there's consensus to drop 2.11 support, perhaps that should come as part of its own ticket first. This will make upgrading libraries to support 2.13 much easier as well. On the subject of 2.12, I noticed `flink-scala-shell` only supports 2.11 and I am unaware if there's interests to continue upgrading it. Wanted to mention it in case anyone else has insight.

#2 - Flink makes extensive use of implicit type conversions.
 A lot of Flink's Scala code, especially in the flink-table submodules, make repeated use of implicit conversions to convert between Java & Scala types. I mention this because `import scala.collection.JavaConversions.__`_ is entirely gone in 2.13, which is what provides said implicit conversions. We can instead start to import scala.collection.convert.ImplicitConversions._, which is still present (albeit deprecated) in 2.13. Worth noting this new import is not present in 2.11, it would need to come after point #1. Again, I think this could be its own ticket so as to limit the impact.

#3 - `TraversableOnce` is gone in 2.13.
 Internal Java code uses `TraversableOnce` for [two|https://github.com/apache/flink/blob/master/flink-scala/src/main/java/org/apache/flink/api/scala/typeutils/TraversableSerializerConfigSnapshot.java] [serializers|https://github.com/apache/flink/blob/master/flink-scala/src/main/java/org/apache/flink/api/scala/typeutils/TraversableSerializerSnapshot.java], with comments explicitly stating neither cannot be written in Scala due to constructor limitations with the language. This poses a problem, since as far as I can tell we can't import the Scala compat library to get access to `IterableOnce` in Java code for cross-compiling. We may have to move these two files into separate java folders that maven conditionally compiles based on Scala version - one for `TraversableOnce` with 2.12 and one for `IterableOnce` with 2.13. I was able to get it compiling with this strategy but am open to other suggestions.

#4 - Collection conversion has changed.
 The collection compat library will mitigate the difficulty in making this change, specifically from using `collection.to[Type]` to using `collection.to(Type)`. Probably safe to also include the change for switching from `mutable.MutableList` to `mutable.ListBuffer` here since its surface area is not as user-facing.

#5 - Code generation has changed.
 Like previously mentioned, the migration from `CanBuildFrom` to `Factory` / `BuildFrom` will be necessary, but shouldn't be large, since it's contained to the module `flink-scala` from what I could tell.

#6 - `Seq` in 2.12 and 2.13 mean different things.
 2.12 refers to the mutable version, while 2.13 refers to the immutable version. This is also present in varargs Scala APIs, such as `env.fromCollection("one", "two")`. This poses a problem, as `Seq` is used in each of the Scala modules and is quite prevalent in user-facing APIs. Although the ambiguity for varargs between versions can't be avoided, we could choose to disambiguate typical `Seq` use for 2.12 and 2.13 by enforcing consistent usage. This is likely where most of the work will reside depending on the strategy chosen. The aforementioned migration doc provides a few options, such as internally shadowing `Seq` to make unqualified use illegal (i.e. only allow `immutable.Seq` or `collection.Seq`, not `Seq`). I did notice some issues with the type serializer though with using qualified versions (`immutable.Seq`), so perhaps we want to consider shading `Seq` to always mean one version? Either way, this is a large change and should be discussed early.

Overall, the changes for points 1-5 are relatively easy when separated, while #6 will likely be a larger change considering how much of the blink table code is written in Scala. Open to any and all suggestions. :)


was (Author: nickburkard):
Hi everyone,

I spent the past few days working on this, building on [~ariskoliopoulos]'s work - there's a lot of work that will be needed to properly migrate to supporting 2.13. I was using [this document|https://docs.scala-lang.org/overviews/core/collections-migration-213.html] provided by Scala's website, plus their [compat library|https://github.com/scala/scala-collection-compat], to handle the migration steps. I wanted to get the community's opinion on a few different points. I think the work can be split into a few different tickets to limit overall impact.

#1 - Scala 2.11 is very old, most modern Scala libraries do not support it anymore.
The typical support path is 2.12 and 2.13. Supporting 2.11 while also supporting 2.13 will make this migration process more difficult. If there's consensus to drop 2.11 support, perhaps that should come as part of its own ticket first. This will make upgrading libraries to support 2.13 much easier as well. On the subject of 2.12, I noticed `flink-scala-shell` only supports 2.11 and I am unaware if there's interests to continue upgrading it. Wanted to mention it in case anyone else has insight.

#2 - Flink makes extensive use of implicit type conversions.
A lot of Flink's Scala code, especially in the flink-table submodules, make repeated use of implicit conversions to convert between Java & Scala types. I mention this because `import scala.collection.JavaConversions._` is entirely gone in 2.13, which is what provides said implicit conversions. We can instead start to import `scala.collection.convert.ImplicitConversions._`, which is still present (albeit deprecated) in 2.13. Worth noting this new import is not present in 2.11, it would need to come after point #1. Again, I think this could be its own ticket so as to limit the impact.

#3 - `TraversableOnce` is gone in 2.13.
Internal Java code uses `TraversableOnce` for [two|https://github.com/apache/flink/blob/master/flink-scala/src/main/java/org/apache/flink/api/scala/typeutils/TraversableSerializerConfigSnapshot.java] [serializers|https://github.com/apache/flink/blob/master/flink-scala/src/main/java/org/apache/flink/api/scala/typeutils/TraversableSerializerSnapshot.java], with comments explicitly stating neither cannot be written in Scala due to constructor limitations with the language. This poses a problem, since as far as I can tell we can't import the Scala compat library to get access to `IterableOnce` in Java code for cross-compiling. We may have to move these two files into separate java folders that maven conditionally compiles based on Scala version - one for `TraversableOnce` with 2.12 and one for `IterableOnce` with 2.13. I was able to get it compiling with this strategy but am open to other suggestions.

#4 - Collection conversion has changed.
The collection compat library will mitigate the difficulty in making this change, specifically from using `collection.to[Type]` to using `collection.to(Type)`. Probably safe to also include the change for switching from `mutable.MutableList` to `mutable.ListBuffer` here since its surface area is not as user-facing.

#5 - Code generation has changed.
Like previously mentioned, the migration from `CanBuildFrom` to `Factory` / `BuildFrom` will be necessary, but shouldn't be large, since it's contained to the module `flink-scala` from what I could tell.

#6 - `Seq` in 2.12 and 2.13 mean different things.
2.12 refers to the mutable version, while 2.13 refers to the immutable version. This is also present in varargs Scala APIs, such as `env.fromCollection("one", "two")`. This poses a problem, as `Seq` is used in each of the Scala modules and is quite prevalent in user-facing APIs. Although the ambiguity for varargs between versions can't be avoided, we could choose to disambiguate typical `Seq` use for 2.12 and 2.13 by enforcing consistent usage. This is likely where most of the work will reside depending on the strategy chosen. The aforementioned migration doc provides a few options, such as internally shadowing `Seq` to make unqualified use illegal (i.e. only allow `immutable.Seq` or `collection.Seq`, not `Seq`). I did notice some issues with the type serializer though with using qualified versions (`immutable.Seq`), so perhaps we want to consider shading `Seq` to always mean one version? Either way, this is a large change and should be discussed early.

Overall, the changes for points 1-5 are relatively easy when separated, while #6 will likely be a larger change considering how much of the blink table code is written in Scala. Open to any and all suggestions. :)

> Add support for Scala 2.13
> --------------------------
>
>                 Key: FLINK-13414
>                 URL: https://issues.apache.org/jira/browse/FLINK-13414
>             Project: Flink
>          Issue Type: New Feature
>          Components: API / Scala
>            Reporter: Chaoran Yu
>            Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)