Posted to dev@spark.apache.org by Sean Owen <so...@cloudera.com> on 2017/03/06 09:18:45 UTC

RFC: removing Scala 2.10

Another call for comments on removal of Scala 2.10 support, if you haven't
already. See

https://github.com/apache/spark/pull/17150
http://issues.apache.org/jira/browse/SPARK-19810

I've heard several votes in support and no specific objections at this
point, but wanted to make another call to check for any doubts before I go
ahead for Spark 2.2.

Re: RFC: removing Scala 2.10

Posted by Reynold Xin <rx...@databricks.com>.
Hit send too soon.

Actually my chart included only clusters on Spark 2.x, i.e. I excluded 1.x.
I also did one with Spark 1.x and saw no substantial difference in the
distribution of Scala versions. On the question of how many "would be
unable to" upgrade to Scala 2.12, I have no way to find out unless I go
talk to every one of them, which is too expensive. My experience with
Scala upgrades, having done a few of them for Spark and for other
projects, is that they are very difficult and frustrating.

On Databricks this is actually not an issue at all because our customers
can manage multiple clusters with different versions of Spark easily
(select an old version of Spark with Scala 2.10 in one click).

As engineers, we all love to delete old code and simplify the build (5,000
lines gone!). In a previous email I said we never deprecated it. After
looking at it more, I realized we did partially deprecate it: we updated
the docs and added a warning in SparkContext, but didn't announce it in
the release notes (mostly my fault). As a result, even I thought Scala
2.10 wasn't deprecated when I saw no mention of it in the release notes.

(Given we had partially deprecated Scala 2.10 support in Spark 2.1, I feel
less strongly about keeping it.)
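(For illustration only: a runtime check of the kind described, warning when
running on Scala 2.10, could be sketched roughly as below. The object and
method names are hypothetical, not Spark's actual SparkContext code.)

```scala
// Hypothetical sketch of a deprecation warning like the one added to
// SparkContext: inspect the Scala binary version and warn on 2.10.
object DeprecationCheck {
  // Extract the binary version, e.g. "2.10" from a full version like "2.10.6".
  def binaryVersion(full: String): String =
    full.split("\\.").take(2).mkString(".")

  // Returns the warning to log, if any, for the given Scala version string.
  def warningFor(fullVersion: String): Option[String] =
    if (binaryVersion(fullVersion) == "2.10")
      Some("WARN: Support for Scala 2.10 is deprecated and may be removed in a future release")
    else
      None

  def main(args: Array[String]): Unit =
    // scala.util.Properties.versionNumberString is the running Scala version.
    warningFor(scala.util.Properties.versionNumberString).foreach(println)
}
```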


Now look at the cost of keeping Scala 2.10: the part that defines Scala
2.10/2.11 support rarely changes, at least until we want to add support for
Scala 2.12 (and we are not adding 2.12 support in Spark 2.2). The actual
cost, which annoys some of us, is just the occasional build break (mostly
due to the use of Option.contains). It looks like this happened roughly
once a month, and each time it took just a few minutes to resolve.
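(As a concrete illustration of that kind of break, not taken from Spark's
code: Option.contains only exists from Scala 2.11 onward, so a line using
it compiles fine on 2.11 but fails the 2.10 build; a version-neutral
equivalent avoids the problem.)

```scala
// Illustrative only: the kind of one-liner that breaks a Scala 2.10 build.
object OptionContainsExample {
  val maybePort: Option[Int] = Some(8080)

  // Compiles on Scala 2.11+, but fails to compile on 2.10
  // (Option.contains does not exist there):
  // val isDefault = maybePort.contains(8080)

  // Cross-version-safe equivalent that compiles on both 2.10 and 2.11:
  val isDefault: Boolean = maybePort.exists(_ == 8080)
}
```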

So the cost seems very low. Perhaps we should just deprecate it more
formally in 2.2, given the whole system is already set up to have it
working, and remove it in the next release.







On Mon, Mar 6, 2017 at 9:23 PM, Reynold Xin <rx...@databricks.com> wrote:

> Actually my chart included only clusters on Spark 2.x, ie I excluded 1.x.
>
>
> On Mon, Mar 6, 2017 at 8:34 PM Stephen Boesch <ja...@gmail.com> wrote:
>
>> Hi Reynold,
>>
>>  This is not necessarily convincing.  Many installations are still on
>> spark 1.X - including at the large company I work at.  When moving to 2.2 -
>> whenever that might happen -  it would be a reasonable expectation to also
>> move off of an old version of scala.  Of the 30% of customers shown I
>> wonder how many are both ( a) on spark 2.X/scala 2.10 *now **and* ( b)
>> would be unable to manage a transition to scala 2.11/2.12 whenever the move
>> to  spark 2.2 were to happen.
>>
>> stephenb
>>
>>
>>
>> 2017-03-06 19:04 GMT-08:00 Reynold Xin <rx...@databricks.com>:
>>
>> For some reason the previous email didn't show up properly. Trying again.
>>
>> ---------- Forwarded message ----------
>> From: *Reynold Xin*
>> Date: Mon, Mar 6, 2017 at 6:37 PM
>> Subject: Re: RFC: removing Scala 2.10
>> To: Sean Owen <so...@cloudera.com>
>> Cc: dev <de...@spark.apache.org>
>>
>>
>> Thanks for sending an email. I was going to +1 but then I figured I
>> should be data driven. I took a look at the distribution of Scala versions
>> across all the clusters Databricks runs (which is a very high number across
>> a variety of tech startups, SMBs, large enterprises, and this is the chart:
>>
>> [image: Inline image 1]
>>
>>
>>
>> Given 30% are still on Scala 2.10, I'd say we should officially deprecate
>> Scala 2.10 in Spark 2.2 and remove the support in a future release (e.g.
>> 2.3). Note that in the past we only deprecated Java 7 / Python 2.6 in 2.0,
>> and didn't do anything with Scala 2.10.
>>
>>
>>
>>
>> On Mon, Mar 6, 2017 at 1:18 AM, Sean Owen <so...@cloudera.com> wrote:
>>
>> Another call for comments on removal of Scala 2.10 support, if you
>> haven't already. See
>>
>> https://github.com/apache/spark/pull/17150
>> http://issues.apache.org/jira/browse/SPARK-19810
>>
>> I've heard several votes in support and no specific objections at this
>> point, but wanted to make another call to check for any doubts before I go
>> ahead for Spark 2.2.
>>
>>
>>
>>
>>

Re: RFC: removing Scala 2.10

Posted by Reynold Xin <rx...@databricks.com>.
Actually my chart included only clusters on Spark 2.x, i.e. I excluded 1.x.


On Mon, Mar 6, 2017 at 8:34 PM Stephen Boesch <ja...@gmail.com> wrote:

> Hi Reynold,
>
>  This is not necessarily convincing.  Many installations are still on
> spark 1.X - including at the large company I work at.  When moving to 2.2 -
> whenever that might happen -  it would be a reasonable expectation to also
> move off of an old version of scala.  Of the 30% of customers shown I
> wonder how many are both ( a) on spark 2.X/scala 2.10 *now **and* ( b)
> would be unable to manage a transition to scala 2.11/2.12 whenever the move
> to  spark 2.2 were to happen.
>
> stephenb
>
>
>
> 2017-03-06 19:04 GMT-08:00 Reynold Xin <rx...@databricks.com>:
>
> For some reason the previous email didn't show up properly. Trying again.
>
> ---------- Forwarded message ----------
> From: *Reynold Xin*
> Date: Mon, Mar 6, 2017 at 6:37 PM
> Subject: Re: RFC: removing Scala 2.10
> To: Sean Owen <so...@cloudera.com>
> Cc: dev <de...@spark.apache.org>
>
>
> Thanks for sending an email. I was going to +1 but then I figured I should
> be data driven. I took a look at the distribution of Scala versions across
> all the clusters Databricks runs (which is a very high number across a
> variety of tech startups, SMBs, large enterprises, and this is the chart:
>
> [image: Inline image 1]
>
>
>
> Given 30% are still on Scala 2.10, I'd say we should officially deprecate
> Scala 2.10 in Spark 2.2 and remove the support in a future release (e.g.
> 2.3). Note that in the past we only deprecated Java 7 / Python 2.6 in 2.0,
> and didn't do anything with Scala 2.10.
>
>
>
>
> On Mon, Mar 6, 2017 at 1:18 AM, Sean Owen <so...@cloudera.com> wrote:
>
> Another call for comments on removal of Scala 2.10 support, if you haven't
> already. See
>
> https://github.com/apache/spark/pull/17150
> http://issues.apache.org/jira/browse/SPARK-19810
>
> I've heard several votes in support and no specific objections at this
> point, but wanted to make another call to check for any doubts before I go
> ahead for Spark 2.2.
>
>
>
>
>

Re: RFC: removing Scala 2.10

Posted by Stephen Boesch <ja...@gmail.com>.
Hi Reynold,

 This is not necessarily convincing. Many installations are still on Spark
1.x, including at the large company I work at. When moving to 2.2,
whenever that might happen, it would be a reasonable expectation to also
move off of an old version of Scala. Of the 30% of customers shown, I
wonder how many are both (a) on Spark 2.x/Scala 2.10 *now* *and* (b)
would be unable to manage a transition to Scala 2.11/2.12 whenever the
move to Spark 2.2 were to happen.

stephenb



2017-03-06 19:04 GMT-08:00 Reynold Xin <rx...@databricks.com>:

> For some reason the previous email didn't show up properly. Trying again.
>
> ---------- Forwarded message ----------
> From: Reynold Xin
> Date: Mon, Mar 6, 2017 at 6:37 PM
> Subject: Re: RFC: removing Scala 2.10
> To: Sean Owen <so...@cloudera.com>
> Cc: dev <de...@spark.apache.org>
>
>
> Thanks for sending an email. I was going to +1 but then I figured I should
> be data driven. I took a look at the distribution of Scala versions across
> all the clusters Databricks runs (which is a very high number across a
> variety of tech startups, SMBs, large enterprises, and this is the chart:
>
> [image: Inline image 1]
>
>
>
> Given 30% are still on Scala 2.10, I'd say we should officially deprecate
> Scala 2.10 in Spark 2.2 and remove the support in a future release (e.g.
> 2.3). Note that in the past we only deprecated Java 7 / Python 2.6 in 2.0,
> and didn't do anything with Scala 2.10.
>
>
>
>
> On Mon, Mar 6, 2017 at 1:18 AM, Sean Owen <so...@cloudera.com> wrote:
>
>> Another call for comments on removal of Scala 2.10 support, if you
>> haven't already. See
>>
>> https://github.com/apache/spark/pull/17150
>> http://issues.apache.org/jira/browse/SPARK-19810
>>
>> I've heard several votes in support and no specific objections at this
>> point, but wanted to make another call to check for any doubts before I go
>> ahead for Spark 2.2.
>>
>
>
>

Fwd: RFC: removing Scala 2.10

Posted by Reynold Xin <rx...@databricks.com>.
For some reason the previous email didn't show up properly. Trying again.

---------- Forwarded message ----------
From: Reynold Xin
Date: Mon, Mar 6, 2017 at 6:37 PM
Subject: Re: RFC: removing Scala 2.10
To: Sean Owen <so...@cloudera.com>
Cc: dev <de...@spark.apache.org>


Thanks for sending an email. I was going to +1 but then I figured I should
be data driven. I took a look at the distribution of Scala versions across
all the clusters Databricks runs (which is a very high number, across a
variety of tech startups, SMBs, and large enterprises), and this is the chart:

[image: Inline image 1]



Given 30% are still on Scala 2.10, I'd say we should officially deprecate
Scala 2.10 in Spark 2.2 and remove the support in a future release (e.g.
2.3). Note that in the past we only deprecated Java 7 / Python 2.6 in 2.0,
and didn't do anything with Scala 2.10.




On Mon, Mar 6, 2017 at 1:18 AM, Sean Owen <so...@cloudera.com> wrote:

> Another call for comments on removal of Scala 2.10 support, if you haven't
> already. See
>
> https://github.com/apache/spark/pull/17150
> http://issues.apache.org/jira/browse/SPARK-19810
>
> I've heard several votes in support and no specific objections at this
> point, but wanted to make another call to check for any doubts before I go
> ahead for Spark 2.2.
>

Re: RFC: removing Scala 2.10

Posted by Reynold Xin <rx...@databricks.com>.
SGTM.

On Sat, Jul 8, 2017 at 12:38 AM Sean Owen <so...@cloudera.com> wrote:

> I'd like to go ahead and merge SPARK-19810 after the Spark 2.2.0 release
> is fully done and dusted. Would anyone object at this stage, to removing
> 2.10 support?
>
> On Tue, Mar 7, 2017 at 3:55 PM Mark Hamstra <ma...@clearstorydata.com>
> wrote:
>
>> Sorry, for some reason I was thinking that we have branch-2.2 cut
>> already. If we're not going to pull Scala 2.10 out of 2.2.0, then we should
>> wait at least until that branch is cut before we pull it out of master --
>> but I'd still argue for not long after that so that the 2.12 work can start.
>>
>> On Tue, Mar 7, 2017 at 7:50 AM, Mark Hamstra <ma...@clearstorydata.com>
>> wrote:
>>
>>> I don't think 2.10 and 2.12 support can coexist, and soon, 2.12 support
>>>> will be important.
>>>>
>>>
>>> Which is why I'm thinking that we should pull 2.10 support out of master
>>> soon -- either immediately or right after 2.2 goes into RC or full release.
>>>
>>> On Tue, Mar 7, 2017 at 2:14 AM, Sean Owen <so...@cloudera.com> wrote:
>>>
>>>> (2.10 was already deprecated for 2.1, so that's done actually.)
>>>>
>>>> Personally I'm fine with leaving in 2.10 support for 2.2. (FWIW CDH is
>>>> Scala 2.11-only for Spark 2.) If there were no voices in support of keeping
>>>> it, might be worth moving on right now, but if there's any substantive
>>>> argument against, I'd also punt it another release.
>>>>
>>>> It's not really driven by cleanup, though that's nice, but 2.12
>>>> support. I don't think 2.10 and 2.12 support can coexist, and soon, 2.12
>>>> support will be important.
>>>>
>>>> How about tagging this for 2.3.0, as well as targeting 2.12 support for
>>>> 2.3.0?
>>>>
>>>> On Tue, Mar 7, 2017 at 2:38 AM Reynold Xin <rx...@databricks.com> wrote:
>>>>
>>>>> Thanks for sending an email. I was going to +1 but then I figured I
>>>>> should be data driven. I took a look at the distribution of Scala versions
>>>>> across all the clusters Databricks runs (which is a very high number across
>>>>> a variety of tech startups, SMBs, large enterprises, and this is the chart:
>>>>>
>>>>> [image: scala-version.png]
>>>>>
>>>>>
>>>>>
>>>>> Given 30% are still on Scala 2.10, I'd say we should officially
>>>>> deprecate Scala 2.10 in Spark 2.2 and remove the support in a future
>>>>> release (e.g. 2.3). Note that in the past we only deprecated Java 7 /
>>>>> Python 2.6 in 2.0, and didn't do anything with Scala 2.10.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Mar 6, 2017 at 1:18 AM, Sean Owen <so...@cloudera.com> wrote:
>>>>>
>>>>> Another call for comments on removal of Scala 2.10 support, if you
>>>>> haven't already. See
>>>>>
>>>>> https://github.com/apache/spark/pull/17150
>>>>> http://issues.apache.org/jira/browse/SPARK-19810
>>>>>
>>>>> I've heard several votes in support and no specific objections at this
>>>>> point, but wanted to make another call to check for any doubts before I go
>>>>> ahead for Spark 2.2.
>>>>>
>>>>>
>>>>>
>>>
>>

Re: RFC: removing Scala 2.10

Posted by Russell Spitzer <ru...@gmail.com>.
Look at the date of the first email with the chart :)

On Sat, Jul 8, 2017 at 4:42 PM Chris Fregly <ch...@fregly.com> wrote:

> @reynold: why does this chart end in february?
>
> we're well into july. what is the recent breakdown?
>
> *Chris Fregly*
> Research Engineer @ *PipelineAI*
> Founder @ *Advanced Spark and TensorFlow Meetup*
> *San Francisco - Chicago - Washington DC - London*
>
> On Jul 8, 2017, 9:01 AM -0700, Sean Owen <so...@cloudera.com>, wrote:
>
> I'd like to go ahead and merge SPARK-19810 after the Spark 2.2.0 release
> is fully done and dusted. Would anyone object at this stage, to removing
> 2.10 support?
>
> On Tue, Mar 7, 2017 at 3:55 PM Mark Hamstra <ma...@clearstorydata.com>
> wrote:
>
>> Sorry, for some reason I was thinking that we have branch-2.2 cut
>> already. If we're not going to pull Scala 2.10 out of 2.2.0, then we should
>> wait at least until that branch is cut before we pull it out of master --
>> but I'd still argue for not long after that so that the 2.12 work can start.
>>
>> On Tue, Mar 7, 2017 at 7:50 AM, Mark Hamstra <ma...@clearstorydata.com>
>> wrote:
>>
>>> I don't think 2.10 and 2.12 support can coexist, and soon, 2.12 support
>>>> will be important.
>>>>
>>>
>>> Which is why I'm thinking that we should pull 2.10 support out of master
>>> soon -- either immediately or right after 2.2 goes into RC or full release.
>>>
>>> On Tue, Mar 7, 2017 at 2:14 AM, Sean Owen <so...@cloudera.com> wrote:
>>>
>>>> (2.10 was already deprecated for 2.1, so that's done actually.)
>>>>
>>>> Personally I'm fine with leaving in 2.10 support for 2.2. (FWIW CDH is
>>>> Scala 2.11-only for Spark 2.) If there were no voices in support of keeping
>>>> it, might be worth moving on right now, but if there's any substantive
>>>> argument against, I'd also punt it another release.
>>>>
>>>> It's not really driven by cleanup, though that's nice, but 2.12
>>>> support. I don't think 2.10 and 2.12 support can coexist, and soon, 2.12
>>>> support will be important.
>>>>
>>>> How about tagging this for 2.3.0, as well as targeting 2.12 support for
>>>> 2.3.0?
>>>>
>>>> On Tue, Mar 7, 2017 at 2:38 AM Reynold Xin <rx...@databricks.com> wrote:
>>>>
>>>>> Thanks for sending an email. I was going to +1 but then I figured I
>>>>> should be data driven. I took a look at the distribution of Scala versions
>>>>> across all the clusters Databricks runs (which is a very high number across
>>>>> a variety of tech startups, SMBs, large enterprises, and this is the chart:
>>>>>
>>>>> [image: scala-version.png]
>>>>>
>>>>>
>>>>>
>>>>> Given 30% are still on Scala 2.10, I'd say we should officially
>>>>> deprecate Scala 2.10 in Spark 2.2 and remove the support in a future
>>>>> release (e.g. 2.3). Note that in the past we only deprecated Java 7 /
>>>>> Python 2.6 in 2.0, and didn't do anything with Scala 2.10.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Mar 6, 2017 at 1:18 AM, Sean Owen <so...@cloudera.com> wrote:
>>>>>
>>>>> Another call for comments on removal of Scala 2.10 support, if you
>>>>> haven't already. See
>>>>>
>>>>> https://github.com/apache/spark/pull/17150
>>>>> http://issues.apache.org/jira/browse/SPARK-19810
>>>>>
>>>>> I've heard several votes in support and no specific objections at this
>>>>> point, but wanted to make another call to check for any doubts before I go
>>>>> ahead for Spark 2.2.
>>>>>
>>>>>
>>>>>
>>>
>>

Re: RFC: removing Scala 2.10

Posted by Chris Fregly <ch...@fregly.com>.
@reynold: why does this chart end in february?

we're well into july. what is the recent breakdown?

Chris Fregly
Research Engineer @ PipelineAI
Founder @ Advanced Spark and TensorFlow Meetup
San Francisco - Chicago - Washington DC - London

On Jul 8, 2017, 9:01 AM -0700, Sean Owen <so...@cloudera.com>, wrote:
> I'd like to go ahead and merge SPARK-19810 after the Spark 2.2.0 release is fully done and dusted. Would anyone object at this stage, to removing 2.10 support?
>
> > On Tue, Mar 7, 2017 at 3:55 PM Mark Hamstra <ma...@clearstorydata.com> wrote:
> > > Sorry, for some reason I was thinking that we have branch-2.2 cut already. If we're not going to pull Scala 2.10 out of 2.2.0, then we should wait at least until that branch is cut before we pull it out of master -- but I'd still argue for not long after that so that the 2.12 work can start.
> > >
> > > > On Tue, Mar 7, 2017 at 7:50 AM, Mark Hamstra <ma...@clearstorydata.com> wrote:
> > > > > > I don't think 2.10 and 2.12 support can coexist, and soon, 2.12 support will be important.
> > > > >
> > > > > Which is why I'm thinking that we should pull 2.10 support out of master soon -- either immediately or right after 2.2 goes into RC or full release.
> > > > >
> > > > > > On Tue, Mar 7, 2017 at 2:14 AM, Sean Owen <so...@cloudera.com> wrote:
> > > > > > > (2.10 was already deprecated for 2.1, so that's done actually.)
> > > > > > >
> > > > > > > Personally I'm fine with leaving in 2.10 support for 2.2. (FWIW CDH is Scala 2.11-only for Spark 2.) If there were no voices in support of keeping it, might be worth moving on right now, but if there's any substantive argument against, I'd also punt it another release.
> > > > > > >
> > > > > > > It's not really driven by cleanup, though that's nice, but 2.12 support. I don't think 2.10 and 2.12 support can coexist, and soon, 2.12 support will be important.
> > > > > > >
> > > > > > > How about tagging this for 2.3.0, as well as targeting 2.12 support for 2.3.0?
> > > > > > >
> > > > > > > > On Tue, Mar 7, 2017 at 2:38 AM Reynold Xin <rx...@databricks.com> wrote:
> > > > > > > > > Thanks for sending an email. I was going to +1 but then I figured I should be data driven. I took a look at the distribution of Scala versions across all the clusters Databricks runs (which is a very high number across a variety of tech startups, SMBs, large enterprises, and this is the chart:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Given 30% are still on Scala 2.10, I'd say we should officially deprecate Scala 2.10 in Spark 2.2 and remove the support in a future release (e.g. 2.3). Note that in the past we only deprecated Java 7 / Python 2.6 in 2.0, and didn't do anything with Scala 2.10.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, Mar 6, 2017 at 1:18 AM, Sean Owen <so...@cloudera.com> wrote:
> > > > > > > > > > Another call for comments on removal of Scala 2.10 support, if you haven't already. See
> > > > > > > > > >
> > > > > > > > > > https://github.com/apache/spark/pull/17150
> > > > > > > > > > http://issues.apache.org/jira/browse/SPARK-19810
> > > > > > > > > >
> > > > > > > > > > I've heard several votes in support and no specific objections at this point, but wanted to make another call to check for any doubts before I go ahead for Spark 2.2.
> > > > > > > > >
> > > > >
> > >

Re: RFC: removing Scala 2.10

Posted by Sean Owen <so...@cloudera.com>.
I'd like to go ahead and merge SPARK-19810 after the Spark 2.2.0 release is
fully done and dusted. Would anyone object at this stage, to removing 2.10
support?

On Tue, Mar 7, 2017 at 3:55 PM Mark Hamstra <ma...@clearstorydata.com> wrote:

> Sorry, for some reason I was thinking that we have branch-2.2 cut already.
> If we're not going to pull Scala 2.10 out of 2.2.0, then we should wait at
> least until that branch is cut before we pull it out of master -- but I'd
> still argue for not long after that so that the 2.12 work can start.
>
> On Tue, Mar 7, 2017 at 7:50 AM, Mark Hamstra <ma...@clearstorydata.com>
> wrote:
>
>> I don't think 2.10 and 2.12 support can coexist, and soon, 2.12 support
>>> will be important.
>>>
>>
>> Which is why I'm thinking that we should pull 2.10 support out of master
>> soon -- either immediately or right after 2.2 goes into RC or full release.
>>
>> On Tue, Mar 7, 2017 at 2:14 AM, Sean Owen <so...@cloudera.com> wrote:
>>
>>> (2.10 was already deprecated for 2.1, so that's done actually.)
>>>
>>> Personally I'm fine with leaving in 2.10 support for 2.2. (FWIW CDH is
>>> Scala 2.11-only for Spark 2.) If there were no voices in support of keeping
>>> it, might be worth moving on right now, but if there's any substantive
>>> argument against, I'd also punt it another release.
>>>
>>> It's not really driven by cleanup, though that's nice, but 2.12 support.
>>> I don't think 2.10 and 2.12 support can coexist, and soon, 2.12 support
>>> will be important.
>>>
>>> How about tagging this for 2.3.0, as well as targeting 2.12 support for
>>> 2.3.0?
>>>
>>> On Tue, Mar 7, 2017 at 2:38 AM Reynold Xin <rx...@databricks.com> wrote:
>>>
>>>> Thanks for sending an email. I was going to +1 but then I figured I
>>>> should be data driven. I took a look at the distribution of Scala versions
>>>> across all the clusters Databricks runs (which is a very high number across
>>>> a variety of tech startups, SMBs, large enterprises, and this is the chart:
>>>>
>>>> [image: scala-version.png]
>>>>
>>>>
>>>>
>>>> Given 30% are still on Scala 2.10, I'd say we should officially
>>>> deprecate Scala 2.10 in Spark 2.2 and remove the support in a future
>>>> release (e.g. 2.3). Note that in the past we only deprecated Java 7 /
>>>> Python 2.6 in 2.0, and didn't do anything with Scala 2.10.
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Mar 6, 2017 at 1:18 AM, Sean Owen <so...@cloudera.com> wrote:
>>>>
>>>> Another call for comments on removal of Scala 2.10 support, if you
>>>> haven't already. See
>>>>
>>>> https://github.com/apache/spark/pull/17150
>>>> http://issues.apache.org/jira/browse/SPARK-19810
>>>>
>>>> I've heard several votes in support and no specific objections at this
>>>> point, but wanted to make another call to check for any doubts before I go
>>>> ahead for Spark 2.2.
>>>>
>>>>
>>>>
>>
>

Re: RFC: removing Scala 2.10

Posted by Mark Hamstra <ma...@clearstorydata.com>.
Sorry, for some reason I was thinking that we have branch-2.2 cut already.
If we're not going to pull Scala 2.10 out of 2.2.0, then we should wait at
least until that branch is cut before we pull it out of master -- but I'd
still argue for not long after that so that the 2.12 work can start.

On Tue, Mar 7, 2017 at 7:50 AM, Mark Hamstra <ma...@clearstorydata.com>
wrote:

> I don't think 2.10 and 2.12 support can coexist, and soon, 2.12 support
>> will be important.
>>
>
> Which is why I'm thinking that we should pull 2.10 support out of master
> soon -- either immediately or right after 2.2 goes into RC or full release.
>
> On Tue, Mar 7, 2017 at 2:14 AM, Sean Owen <so...@cloudera.com> wrote:
>
>> (2.10 was already deprecated for 2.1, so that's done actually.)
>>
>> Personally I'm fine with leaving in 2.10 support for 2.2. (FWIW CDH is
>> Scala 2.11-only for Spark 2.) If there were no voices in support of keeping
>> it, might be worth moving on right now, but if there's any substantive
>> argument against, I'd also punt it another release.
>>
>> It's not really driven by cleanup, though that's nice, but 2.12 support.
>> I don't think 2.10 and 2.12 support can coexist, and soon, 2.12 support
>> will be important.
>>
>> How about tagging this for 2.3.0, as well as targeting 2.12 support for
>> 2.3.0?
>>
>> On Tue, Mar 7, 2017 at 2:38 AM Reynold Xin <rx...@databricks.com> wrote:
>>
>>> Thanks for sending an email. I was going to +1 but then I figured I
>>> should be data driven. I took a look at the distribution of Scala versions
>>> across all the clusters Databricks runs (which is a very high number across
>>> a variety of tech startups, SMBs, large enterprises, and this is the chart:
>>>
>>> [image: scala-version.png]
>>>
>>>
>>>
>>> Given 30% are still on Scala 2.10, I'd say we should officially
>>> deprecate Scala 2.10 in Spark 2.2 and remove the support in a future
>>> release (e.g. 2.3). Note that in the past we only deprecated Java 7 /
>>> Python 2.6 in 2.0, and didn't do anything with Scala 2.10.
>>>
>>>
>>>
>>>
>>> On Mon, Mar 6, 2017 at 1:18 AM, Sean Owen <so...@cloudera.com> wrote:
>>>
>>> Another call for comments on removal of Scala 2.10 support, if you
>>> haven't already. See
>>>
>>> https://github.com/apache/spark/pull/17150
>>> http://issues.apache.org/jira/browse/SPARK-19810
>>>
>>> I've heard several votes in support and no specific objections at this
>>> point, but wanted to make another call to check for any doubts before I go
>>> ahead for Spark 2.2.
>>>
>>>
>>>
>

Re: RFC: removing Scala 2.10

Posted by Mark Hamstra <ma...@clearstorydata.com>.
>
> I don't think 2.10 and 2.12 support can coexist, and soon, 2.12 support
> will be important.
>

Which is why I'm thinking that we should pull 2.10 support out of master
soon -- either immediately or right after 2.2 goes into RC or full release.

On Tue, Mar 7, 2017 at 2:14 AM, Sean Owen <so...@cloudera.com> wrote:

> (2.10 was already deprecated for 2.1, so that's done actually.)
>
> Personally I'm fine with leaving in 2.10 support for 2.2. (FWIW CDH is
> Scala 2.11-only for Spark 2.) If there were no voices in support of keeping
> it, might be worth moving on right now, but if there's any substantive
> argument against, I'd also punt it another release.
>
> It's not really driven by cleanup, though that's nice, but 2.12 support. I
> don't think 2.10 and 2.12 support can coexist, and soon, 2.12 support will
> be important.
>
> How about tagging this for 2.3.0, as well as targeting 2.12 support for
> 2.3.0?
>
> On Tue, Mar 7, 2017 at 2:38 AM Reynold Xin <rx...@databricks.com> wrote:
>
>> Thanks for sending an email. I was going to +1 but then I figured I
>> should be data driven. I took a look at the distribution of Scala versions
>> across all the clusters Databricks runs (which is a very high number across
>> a variety of tech startups, SMBs, large enterprises, and this is the chart:
>>
>> [image: scala-version.png]
>>
>>
>>
>> Given 30% are still on Scala 2.10, I'd say we should officially deprecate
>> Scala 2.10 in Spark 2.2 and remove the support in a future release (e.g.
>> 2.3). Note that in the past we only deprecated Java 7 / Python 2.6 in 2.0,
>> and didn't do anything with Scala 2.10.
>>
>>
>>
>>
>> On Mon, Mar 6, 2017 at 1:18 AM, Sean Owen <so...@cloudera.com> wrote:
>>
>> Another call for comments on removal of Scala 2.10 support, if you
>> haven't already. See
>>
>> https://github.com/apache/spark/pull/17150
>> http://issues.apache.org/jira/browse/SPARK-19810
>>
>> I've heard several votes in support and no specific objections at this
>> point, but wanted to make another call to check for any doubts before I go
>> ahead for Spark 2.2.
>>
>>
>>

Re: RFC: removing Scala 2.10

Posted by Sean Owen <so...@cloudera.com>.
(2.10 was already deprecated for 2.1, so that's done actually.)

Personally I'm fine with leaving in 2.10 support for 2.2. (FWIW CDH is
Scala 2.11-only for Spark 2.) If there were no voices in support of keeping
it, might be worth moving on right now, but if there's any substantive
argument against, I'd also punt it another release.

It's not really driven by cleanup, though that's nice, but 2.12 support. I
don't think 2.10 and 2.12 support can coexist, and soon, 2.12 support will
be important.

How about tagging this for 2.3.0, as well as targeting 2.12 support for
2.3.0?
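(A minimal illustration of why removing 2.10 is largely a build-level
decision: in a generic sbt build, the supported Scala series live in
crossScalaVersions, so dropping one is a one-line change. This is a
generic sketch, not Spark's actual build, which uses Maven profiles and
its own scripts.)

```scala
// build.sbt (sketch only, not Spark's real build):
scalaVersion := "2.11.8"

// Cross-build against both series; deleting "2.10.6" here is the
// build-side half of dropping 2.10 support.
crossScalaVersions := Seq("2.10.6", "2.11.8")
```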

On Tue, Mar 7, 2017 at 2:38 AM Reynold Xin <rx...@databricks.com> wrote:

> Thanks for sending an email. I was going to +1 but then I figured I should
> be data driven. I took a look at the distribution of Scala versions across
> all the clusters Databricks runs (which is a very high number across a
> variety of tech startups, SMBs, large enterprises, and this is the chart:
>
> [image: scala-version.png]
>
>
>
> Given 30% are still on Scala 2.10, I'd say we should officially deprecate
> Scala 2.10 in Spark 2.2 and remove the support in a future release (e.g.
> 2.3). Note that in the past we only deprecated Java 7 / Python 2.6 in 2.0,
> and didn't do anything with Scala 2.10.
>
>
>
>
> On Mon, Mar 6, 2017 at 1:18 AM, Sean Owen <so...@cloudera.com> wrote:
>
> Another call for comments on removal of Scala 2.10 support, if you haven't
> already. See
>
> https://github.com/apache/spark/pull/17150
> http://issues.apache.org/jira/browse/SPARK-19810
>
> I've heard several votes in support and no specific objections at this
> point, but wanted to make another call to check for any doubts before I go
> ahead for Spark 2.2.
>
>
>

Re: RFC: removing Scala 2.10

Posted by Reynold Xin <rx...@databricks.com>.
Thanks for sending an email. I was going to +1 but then I figured I should
be data driven. I took a look at the distribution of Scala versions across
all the clusters Databricks runs (which is a very high number, across a
variety of tech startups, SMBs, and large enterprises), and this is the chart:

[image: Inline image 1]



Given 30% are still on Scala 2.10, I'd say we should officially deprecate
Scala 2.10 in Spark 2.2 and remove the support in a future release (e.g.
2.3). Note that in the past we only deprecated Java 7 / Python 2.6 in 2.0,
and didn't do anything with Scala 2.10.




On Mon, Mar 6, 2017 at 1:18 AM, Sean Owen <so...@cloudera.com> wrote:

> Another call for comments on removal of Scala 2.10 support, if you haven't
> already. See
>
> https://github.com/apache/spark/pull/17150
> http://issues.apache.org/jira/browse/SPARK-19810
>
> I've heard several votes in support and no specific objections at this
> point, but wanted to make another call to check for any doubts before I go
> ahead for Spark 2.2.
>