Posted to dev@spark.apache.org by Igor Dvorzhak <id...@google.com.INVALID> on 2021/12/03 18:04:02 UTC

Re: Scala 3 support approach

Are there any plans to support Scala 3 in the upcoming Spark 3.3 release?

On Sun, Oct 18, 2020 at 11:10 PM Dongjoon Hyun <do...@gmail.com>
wrote:

> Hi, Koert.
>
> We know, welcome, and believe it. However, it's only the Scala community's
> roadmap so far. It doesn't mean Apache Spark officially supports Scala 3.
>
> For example, Apache Spark 3.0.1 supports Scala 2.12.10 but not 2.12.12 due
> to a Scala issue.
>
> In the Apache Spark community, we had better focus on 2.13. After that, we
> will see what is needed for Scala 3.
>
> Bests,
> Dongjoon.
>
> On Sun, Oct 18, 2020 at 1:33 PM Koert Kuipers <ko...@tresata.com> wrote:
>
>> I think Scala 3.0 will be able to use libraries built with Scala 2.13 (as
>> long as they don't use macros).
>>
>> see:
>> https://www.scala-lang.org/2019/12/18/road-to-scala-3.html
>>
>> On Sun, Oct 18, 2020 at 9:54 AM Sean Owen <sr...@apache.org> wrote:
>>
>>> Spark depends on a number of Scala libraries, so needs them all to
>>> support version X before Spark can. This only happened for 2.13 about 4-5
>>> months ago. I wonder if even a fraction of the necessary libraries have 3.0
>>> support yet?
>>>
>>> It can be difficult to test and support multiple Scala versions
>>> simultaneously. 2.11 has already been dropped and 2.13 is coming, but it
>>> might be hard to have a code base that works for 2.12, 2.13, and 3.0.
>>>
>>> So one dependency could be: when can 2.12 be dropped? And with Spark
>>> supporting 2.13 only early next year, and user apps migrating over a year
>>> or more, it seems difficult to do that anytime soon.
>>>
>>> I think Scala 3 support is eventually desirable, so maybe the other way
>>> to resolve that is to show that Scala 3 support doesn't interfere much with
>>> maintenance of 2.12/2.13 support. I am a little bit skeptical of it, just
>>> because the 2.11->2.12 and 2.12->2.13 changes were fairly significant, let
>>> alone 2.13->3.0 I'm sure, but I don't know.
>>>
>>> That is, if we start to have to implement workarounds or parallel code
>>> trees and so on for 3.0 support, and if it can't be completed for a while
>>> to come because of downstream dependencies, then it may not be worth
>>> iterating on in the code base yet, or even considering.
>>>
>>> You can file an umbrella JIRA to track it, yes, with a possible target
>>> of Spark 4.0. Non-intrusive changes can go in anytime. We may not want to
>>> get into major ones until later.
>>>
>>> On Sat, Oct 17, 2020 at 8:49 PM gemelen <ge...@gmail.com> wrote:
>>>
>>>> Hi all!
>>>>
>>>> I'd like to ask for opinions and discuss the following:
>>>> at this moment Spark can, in general, be built with Scala 2.11 and 2.12
>>>> (mostly), and is close to the point of supporting Scala 2.13. On the
>>>> other hand, Scala 3 is entering its pre-release phase (with 3.0.0-M1
>>>> released at the beginning of October).
>>>>
>>>> Previously, Spark's support for the current Scala version was a bit
>>>> behind the desired state, dictated by circumstances. To move things
>>>> differently with Scala 3, I'd like to contribute my efforts (and help
>>>> others, if there are any) to support it as soon as possible (i.e. to
>>>> have the Spark build compile with Scala 3 and to have release artifacts
>>>> when that becomes possible).
>>>>
>>>> I suggest it would require adding an experimental profile to the build
>>>> file, so that further changes to compile, test, run and other tasks could
>>>> be done in an incremental manner (keeping compatibility with the current
>>>> code for versions 2.12 and 2.13, and backporting where possible). I'd
>>>> like to do it that way since I do not represent any company and
>>>> contribute in my own time, and thus cannot guarantee consistent time
>>>> spent on this (so that, whatever happens, such contributions would not
>>>> be left only in a fork repo).
>>>>
>>>> In fact, with the recent changes moving the Spark build to the latest
>>>> SBT, the initial changes are pretty small on the SBT side (about 10 LOC),
>>>> and I was already able to see how the build fails with the Scala 3
>>>> compiler :)
>>>>
>>>> To summarize:
>>>> 1. Is this approach suitable for the project at this moment, such that
>>>> it would be accepted and accounted for in the release schedule (in 2021,
>>>> I assume)?
>>>> 2. How should it be filed: as an umbrella Jira ticket with minor tasks,
>>>> or as a SPIP at first, with a more thorough analysis?
>>>>
>>>
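
As a reference point for the "experimental profile" proposal above, a minimal
SBT sketch of such an opt-in Scala 3 cross-build could look roughly like the
following. This is purely illustrative (the version numbers and settings are
assumptions), not the actual Spark build definition:

    // build.sbt -- hypothetical sketch, not Spark's actual build definition
    lazy val scala212 = "2.12.12"
    lazy val scala213 = "2.13.3"
    lazy val scala3   = "3.0.0-M1"  // the pre-release milestone mentioned above

    // Scala 3 added as an opt-in cross-build target next to 2.12 and 2.13.
    scalaVersion       := scala212
    crossScalaVersions := Seq(scala212, scala213, scala3)

    // Version-specific sources live in scala-<binary version> directories,
    // so incompatible code can be split without forking the whole tree.
    Compile / unmanagedSourceDirectories +=
      (Compile / sourceDirectory).value / s"scala-${scalaBinaryVersion.value}"

Running "sbt ++3.0.0-M1 compile" against such a profile is essentially the
"see how the build fails with the Scala 3 compiler" experiment described above.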

Re: Scala 3 support approach

Posted by Igor Dvorzhak <id...@google.com.INVALID>.
Some people have tried it, and it seems to work but requires some legwork:
https://medium.com/virtuslab/scala-3-and-spark-389f7ecef71b

+Filip, the author of the article: maybe you have an idea of how hard it will
be to officially support Scala 3 in the Spark code base, given that it already
supports Scala 2.13?
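
For anyone who wants to try the approach from the article, the dependency-side
part of the legwork is consuming Spark's existing _2.13 artifacts from a Scala 3
build. The sketch below is only an illustration under assumed versions; it is
not an officially supported setup, and it glosses over the encoder gaps such a
setup has to work around:

    // build.sbt of a hypothetical Scala 3 application using Spark's 2.13 artifacts
    scalaVersion := "3.1.0"  // illustrative Scala 3 version

    libraryDependencies ++= Seq(
      // Resolve the _2.13 artifact even though this project compiles with Scala 3
      // (CrossVersion.for3Use2_13 needs sbt 1.5 or newer).
      ("org.apache.spark" %% "spark-sql" % "3.2.0").cross(CrossVersion.for3Use2_13)
    )

The main remaining friction is that Dataset encoders are derived through Scala 2
runtime reflection (TypeTags), so Scala 3 user code generally needs explicit or
third-party encoders.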

On Fri, Dec 3, 2021 at 10:20 AM Sean Owen <sr...@apache.org> wrote:

> I don't think anyone's tested it or tried it, but if it's pretty
> compatible with 2.13, it may already work, or mostly.
>
> See my answer earlier in this thread, which still stands: if it's not pretty
> compatible with 2.13 and needs a new build, this effectively means dropping
> 2.12 support, as supporting 3 Scala versions is a bit too much at once.
> And the downstream library dependencies are still likely a partial problem.
>
> Have you, or anyone interested in this, tried it out? That's the best way to
> make progress.
> I do not think this would go into any Spark release on the horizon.

Re: Scala 3 support approach

Posted by Sean Owen <sr...@apache.org>.
I don't think anyone's tested it or tried it, but if it's pretty compatible
with 2.13, it may already work, or mostly.

See my answer earlier in this thread, which still stands: if it's not pretty
compatible with 2.13 and needs a new build, this effectively means dropping
2.12 support, as supporting 3 Scala versions is a bit too much at once.
And the downstream library dependencies are still likely a partial problem.

Have you, or anyone interested in this, tried it out? That's the best way to
make progress.
I do not think this would go into any Spark release on the horizon.
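
To make the maintenance cost concrete: cross-building 2.12, 2.13 and 3.x in one
tree usually means per-version compatibility shims along the lines of the
hypothetical pair below (names invented for illustration, not Spark code), each
of which has to be written and kept working for every supported Scala version:

    // src/main/scala-2.12/compat/VarargsCompat.scala
    package compat
    object VarargsCompat {
      // On 2.12, a varargs parameter is received as scala.collection.Seq.
      type VarargsSeq[A] = scala.collection.Seq[A]
    }

    // src/main/scala-2.13/compat/VarargsCompat.scala (a scala-3 copy would look the same)
    package compat
    object VarargsCompat {
      // On 2.13 and later, a varargs parameter is an immutable Seq.
      type VarargsSeq[A] = scala.collection.immutable.Seq[A]
    }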

On Fri, Dec 3, 2021 at 12:04 PM Igor Dvorzhak <id...@google.com> wrote:

> Are there any plans to support Scala 3 in the upcoming Spark 3.3 release?