Posted to dev@systemml.apache.org by Glenn Weidner <gw...@us.ibm.com> on 2016/08/02 18:44:35 UTC

[DISCUSS] Migration to Spark 2.0.0


In the "[DISCUSS] SystemML 0.11 release" thread, native frame support and
API updates such as the new MLContext were identified as the main new
features for the release.  In addition, support for Spark 2.0.0 was targeted.
Note that the code changes required for Spark 2.0.0 are not backward
compatible with earlier Spark versions (e.g., 1.6.2), so I am starting a
separate mail thread for anyone to raise objections or alternatives to
migrating to Spark 2.0.0.

One possible option is to do a release that includes the new Apache SystemML
features before migrating to Spark 2.0.0.  However, it seems better to have
the next Apache SystemML release compatible with the latest Spark version,
2.0.0.  The Apache SystemML 0.10 release from June can be used with earlier
versions of Spark.

Regards,
Glenn

Re: [DISCUSS] Migration to Spark 2.0.0

Posted by Matthias Boehm <mb...@us.ibm.com>.
What was the major reason preventing us from supporting both Spark 1.x and
2.x at the same time?

Regards,
Matthias





Re: [DISCUSS] Migration to Spark 2.0.0

Posted by du...@gmail.com.
Yes, I think this approach sounds great.  To that end, I created a new tag "0.11.0-incubating-preview" that points to a specific commit that contains new features that will be in the 0.11 release with specific support for the Spark 1.x line.


- Mike

--

Mike Dusenberry
GitHub: github.com/dusenberrymw
LinkedIn: linkedin.com/in/mikedusenberry

Sent from my iPhone.


> On Aug 16, 2016, at 4:44 PM, Frederick R Reiss <fr...@us.ibm.com> wrote:
> 
> I think the approach Glenn proposes here is fine.
> 
> Fred
> 
> From: Deron Eriksson <de...@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 08/16/2016 02:41 PM
> Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> 
> 
> 
> 
> Hi Glenn,
> 
> I am fine with this approach. If this approach is taken, I would like to
> set the documentation version in _config.yml to 0.10.x before the project
> is tagged (I recently set it to 0.11).
> 
> Deron
> 
> 
> On Thu, Aug 11, 2016 at 3:40 PM, Glenn Weidner <gw...@us.ibm.com> wrote:
> 
> > I would like to propose an alternative to supporting Spark 2.0 and Spark
> > 1.x within a single stream.
> >
> > 1) Capture a snapshot and establish a label for the current Apache SystemML
> > master, which includes the new features added since the 0.10.0 release.
> >
> > 2) After step 1 is completed, enable master to move forward with support for
> > Spark 2.x only.
> >
> > This is similar to what Fred initially proposed, except that step 1 would not
> > involve a separate release. The 0.11 release of Apache SystemML would be
> > compatible with Spark 2.0 and Scala 2.11.
> >
> > Thanks,
> > Glenn
> >
> > From: Glenn Weidner/Silicon Valley/IBM@IBMUS
> > To: dev@systemml.incubator.apache.org
> > Date: 08/08/2016 03:33 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > As a preliminary experiment in compiling against both Spark 2.0.0 and
> > Spark 1.6.2 from the same code base, I made another set of changes for
> > comparison against the previously proposed changes for [SYSTEMML-776].
> > This experimental set can be viewed here:
> >
> > https://github.com/gweidner/incubator-systemml/commit/0611f0c197e4a0e816b3325093168bc5162d62c0
> >
> > This compiles against Spark 2.0.0 and Spark 1.6.2 except for fit/transform
> > overrides in LogisticRegression.scala due to:
> > SPARK-14500 Accept Dataset[] instead of DataFrame in MLlib APIs
> >
> > Detailed code comments and suggestions to try out can be made in the
> > branch commit instead of this mail thread.
> >
> > Thanks,
> > Glenn
> >
> > From: Deron Eriksson <de...@gmail.com>
> > To: dev@systemml.incubator.apache.org
> > Date: 08/05/2016 02:02 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > I am open to the idea of supporting Spark 2 and Spark<2 concurrently if
> > someone shows that it can be accomplished with minimal inconvenience.
> >
> > However, I would lean towards Fred's approach (Spark 1.6 release followed
> > shortly by a Spark 2 release). If possible, I want to be able to focus most
> > of our efforts towards the future rather than the past.
> >
> > Deron
> >
> >
> > On Thu, Aug 4, 2016 at 10:59 AM, Luciano Resende <lu...@gmail.com>
> > wrote:
> >
> > > That was going to be my suggestion... In Zeppelin, we just introduced
> > > support for different versions of Scala and added support for Spark 2.0
> > > based on profiles and a bit of reflection...
> > >
> > > Do we have to do anything related to Scala versions as well?
> > >
> > > On Thursday, August 4, 2016, Matthias Boehm <mb...@us.ibm.com> wrote:
> > >
> > > > I would recommend investigating whether we could support both the 1.x
> > > > and 2.x lines with a single code base. It seems feasible to refactor
> > > > the code a bit, compile against 2.0 (or with profiles), and run on
> > > > either 1.6 or 2.0. For example, by creating a wrapper that implements
> > > > both Iterable and Iterator, we could overcome the Iterator API change,
> > > > as shown by our LazyIterableIterator, which did not require any change
> > > > in related functions. Btw, we did the same for MRv1 and Yarn by
> > > > ensuring that on MRv1, we don't touch Yarn-related APIs. Similarly on
> > > > Spark, we already support both legacy and >=1.6 memory management. I
> > > > think this kind of platform independence is really valuable, but it
> > > > obviously adds complexity.
> > > >
> > > > Regards,
> > > > Matthias
> > > >
> > > >
> > > > From: Niketan Pansare/Almaden/IBM@IBMUS
> > > > To: dev@systemml.incubator.apache.org
> > > > Date: 08/03/2016 05:15 PM
> > > > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > > > ------------------------------
> > > >
> > > >
> > > >
> > > > I am in favor of having one more release against Spark 1.6. Since the
> > > > default Scala version for Spark 1.6 is 2.10, I recommend either having
> > > > SystemML compiled and released with a Scala 2.10 profile or having two
> > > > release candidates.
> > > >
> > > > Thanks,
> > > >
> > > > Niketan Pansare
> > > > IBM Almaden Research Center
> > > > E-mail: npansar At us.ibm.com
> > > > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> > > >
> > > > From: Frederick R Reiss/Almaden/IBM@IBMUS
> > > > To: dev@systemml.incubator.apache.org
> > > > Date: 08/03/2016 03:58 PM
> > > > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > > > ------------------------------
> > > >
> > > >
> > > >
> > > > While I agree that getting onto Spark 2.0 quickly ought to be a
> > > > priority, there are existing early users of SystemML who are likely
> > > > stuck on Spark 1.6.x for the next few months. Those users could want
> > > > some of the new experimental features since 0.10 (specifically frames,
> > > > the prototype Python DSL, and the new MLContext) and it would be good
> > > > to have a Spark 1.6 branch of our version tree where we can backport
> > > > the debugged versions of these features if needed.
> > > >
> > > > I would recommend that we do one more SystemML release against Spark
> > > > 1.6, then switch the head version of SystemML over to Spark 2.0, then
> > > > immediately perform a second SystemML release. Thoughts?
> > > >
> > > > Fred
> > > >
> > > > From: Deron Eriksson <deroneriksson@gmail.com>
> > > > To: dev@systemml.incubator.apache.org
> > > > Date: 08/02/2016 12:13 PM
> > > > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > > > ------------------------------
> > > >
> > > >
> > > >
> > > > I would definitely be in favor of moving to Spark 2.0 as early as
> > > > possible. This will allow SystemML to be current with cutting edge
> > > > Spark. It would be nice to focus our efforts on the latest Spark.
> > > >
> > > > Deron
> > > >
> > > >
> > > > On Tue, Aug 2, 2016 at 12:05 PM, <dusenberrymw@gmail.com> wrote:
> > > >
> > > > > I'm in favor of moving to Spark 2.0 now, meaning that our upcoming
> > > > > release would include both new features and 2.0 support.  0.10 has
> > > > > plenty of functionality for any existing 1.x users.
> > > > >
> > > > > -Mike
> > > > >
> > > > > --
> > > > >
> > > > > Mike Dusenberry
> > > > > GitHub: github.com/dusenberrymw
> > > > > LinkedIn: linkedin.com/in/mikedusenberry
> > > > >
> > > > > Sent from my iPhone.
> > > > >
> > > > >
> > > > > > On Aug 2, 2016, at 11:44 AM, Glenn Weidner <gweidner@us.ibm.com> wrote:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > In the "[DISCUSS] SystemML 0.11 release" thread, native frame
> > > > > > > support and API updates such as the new MLContext were
> > > > > > > identified as the main new features for the release.  In
> > > > > > > addition, support for Spark 2.0.0 was targeted.
> > > > > > > Note that the code changes required for Spark 2.0.0 are not
> > > > > > > backward compatible with earlier Spark versions (e.g., 1.6.2),
> > > > > > > so I am starting a separate mail thread for anyone to raise
> > > > > > > objections or alternatives to migrating to Spark 2.0.0.
> > > > > > >
> > > > > > > One possible option is to do a release that includes the new
> > > > > > > Apache SystemML features before migrating to Spark 2.0.0.
> > > > > > > However, it seems better to have the next Apache SystemML
> > > > > > > release compatible with the latest Spark version, 2.0.0.  The
> > > > > > > Apache SystemML 0.10 release from June can be used with earlier
> > > > > > > versions of Spark.
> > > > > >
> > > > > > Regards,
> > > > > > Glenn
> > > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > > --
> > > Sent from my Mobile device
> > >
> >
> >
> >
> >
> >
> >
> 
> 
> 
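
For readers unfamiliar with the wrapper idea in Matthias's August 4 message
quoted above: Spark 2.0 changed Java functions such as FlatMapFunction to
return an Iterator where Spark 1.x returned an Iterable. The following is a
minimal Java sketch of an object that is both at once, so a single instance
can be handed to code expecting either type. All names here
(IterableIteratorSketch, IterableIterator, consumeIterable, consumeIterator)
are hypothetical, and this is not the code of SystemML's
LazyIterableIterator, which this thread does not show. Whether such a wrapper
is enough to run one build on both Spark lines is not established here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IterableIteratorSketch {

    // Hypothetical wrapper for illustration only; NOT SystemML's
    // LazyIterableIterator.
    static final class IterableIterator<T> implements Iterable<T>, Iterator<T> {
        private final Iterator<T> source;

        IterableIterator(Iterator<T> source) {
            this.source = source;
        }

        // Iterable view: returns this same object, so the wrapper
        // supports a single pass only.
        @Override
        public Iterator<T> iterator() {
            return this;
        }

        // Iterator view: delegates to the wrapped iterator.
        @Override
        public boolean hasNext() {
            return source.hasNext();
        }

        @Override
        public T next() {
            return source.next();
        }
    }

    // Stand-in for a caller written against an Iterable-returning API.
    static <T> List<T> consumeIterable(Iterable<T> values) {
        List<T> out = new ArrayList<>();
        for (T v : values) {
            out.add(v);
        }
        return out;
    }

    // Stand-in for a caller written against an Iterator-returning API.
    static <T> List<T> consumeIterator(Iterator<T> values) {
        List<T> out = new ArrayList<>();
        while (values.hasNext()) {
            out.add(values.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c");
        // The same wrapper type satisfies both kinds of caller.
        System.out.println(consumeIterable(new IterableIterator<>(data.iterator())));
        System.out.println(consumeIterator(new IterableIterator<>(data.iterator())));
    }
}
```

The trade-off in this sketch is that iterator() returns the wrapper itself,
so the values can be traversed only once; a second traversal sees an
exhausted iterator.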

Re: [DISCUSS] Migration to Spark 2.0.0

Posted by Frederick R Reiss <fr...@us.ibm.com>.
I think the approach Glenn proposes here is fine.

Fred






Re: [DISCUSS] Migration to Spark 2.0.0

Posted by Deron Eriksson <de...@gmail.com>.
Hi Glenn,

I am fine with this approach. If this approach is taken, I would like to
set the documentation version in _config.yml to 0.10.x before the project
is tagged (I recently set it to 0.11).

Deron


On Thu, Aug 11, 2016 at 3:40 PM, Glenn Weidner <gw...@us.ibm.com> wrote:

> I would like to propose an alternative to supporting Spark 2.0 and Spark
> 1.x within single stream.
>
> 1) Capture snapshot and establish label of current Apache SystemML master
> which includes new features added since 0.10.0 release.
>
> 2) After step 1 completed, enable master to move forward with support for
> Spark 2.x only.
>
> This is similar to what Fred initially proposed except step 1 would not
> involve a separate release. The 0.11 release of Apache SystemML would be
> compatible for Spark 2.0 and Scala 2.11.
>
> Thanks,
> Glenn
>
>
> From: Glenn Weidner/Silicon Valley/IBM@IBMUS
> To: dev@systemml.incubator.apache.org
> Date: 08/08/2016 03:33 PM
> Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> ------------------------------
>
>
>
> As a preliminary experiment in attempt to compile against both Spark 2.0.0
> and Spark 1.6.2 from same code base, I made another set of changes for
> comparison against previous proposed changes for [SYSTEMML-776].
> This experimental set can be viewed here:
>
> https://github.com/gweidner/incubator-systemml/commit/0611f0c197e4a0e816b3325093168bc5162d62c0
>
> This compiles against Spark 2.0.0 and Spark 1.6.2 except for fit/transform
> overrides in LogisticRegression.scala due to:
> SPARK-14500 Accept Dataset[] instead of DataFrame in MLlib APIs
>
> Detailed code comments and suggestions to try out can be made in the
> branch commit instead of this mail thread.
>
> Thanks,
> Glenn
>
>
> From: Deron Eriksson <de...@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 08/05/2016 02:02 PM
> Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> ------------------------------
>
>
>
> I am open to the idea of supporting Spark 2 and Spark<2 concurrently if
> someone shows that it can be accomplished with minimal inconvenience.
>
> However, I would lean towards Fred's approach (Spark 1.6 release followed
> shortly by a Spark 2 release). If possible, I want to be able to focus most
> of our efforts towards the future rather than the past.
>
> Deron
>
>
> On Thu, Aug 4, 2016 at 10:59 AM, Luciano Resende <lu...@gmail.com>
> wrote:
>
> > That was going to be my suggestion... In Zeppelin, we just introduced
> > support for different versions of scala and added support for spark 2.0
> > based on profiles and a bit of reflections...
> >
> > Do we have to do anything related to Scala versions as well ?
> >
> > On Thursday, August 4, 2016, Matthias Boehm <mb...@us.ibm.com> wrote:
> >
> > > > I would recommend to start an investigation if we could support
> > > > both the 1.x and 2.x lines with a single code base. It seems
> > > > feasible to refactor the code a bit, compile against 2.0 (or with
> > > > profiles), and run on either 1.6 or 2.0. For example, by creating a
> > > > wrapper that implements both Iterable and Iterator, we could
> > > > overcome the Iterator API change as shown by our
> > > > LazyIterableIterator which did not require any change in related
> > > > functions. Btw, we did the same for MRv1 and Yarn by ensuring that
> > > > on MRv1, we don't touch Yarn related APIs. Similarly on Spark, we
> > > > already support both legacy and >=1.6 memory management. I think
> > > > this kind of platform independence is really valuable but it
> > > > obviously adds complexity.
> > >
> > > Regards,
> > > Matthias
> > >
> > >
> > >
> > > From: Niketan Pansare/Almaden/IBM@IBMUS
> > > To: dev@systemml.incubator.apache.org
> > > <*javascript:_e(%7B%7D,'cvml','dev@systemml.incubator.apache.org')*;>
> > > Date: 08/03/2016 05:15 PM
> > > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > > ------------------------------
> > >
> > >
> > >
> > > > I am in favor of having one more release against Spark 1.6. Since
> > > > default scala version for Spark 1.6 is 2.10, I recommend either
> > > > having SystemML compiled and released with Scala 2.10 profile or
> > > > having two release candidates.
> > >
> > > Thanks,
> > >
> > > Niketan Pansare
> > > IBM Almaden Research Center
> > > E-mail: npansar At us.ibm.com
> > > > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> > >
> > >
> > > From: Frederick R Reiss/Almaden/IBM@IBMUS
> > > To: dev@systemml.incubator.apache.org
> > > <*javascript:_e(%7B%7D,'cvml','dev@systemml.incubator.apache.org')*;>
> > > Date: 08/03/2016 03:58 PM
> > > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > > ------------------------------
> > >
> > >
> > >
> > > > While I agree that getting onto Spark 2.0 quickly ought to be a
> > > > priority, there are existing early users of SystemML who are likely
> > > > stuck on Spark 1.6.x for the next few months. Those users could want
> > > > some of the new experimental features since 0.10 (specifically
> > > > frames, the prototype Python DSL, and the new MLContext) and it
> > > > would be good to have a Spark 1.6 branch of our version tree where
> > > > we can backport the debugged versions of these features if needed.
> > > >
> > > > I would recommend that we do one more SystemML release against
> > > > Spark 1.6, then switch the head version of SystemML over to Spark
> > > > 2.0, then immediately perform a second SystemML release. Thoughts?
> > >
> > > Fred
> > >
> > >
> > > > From: Deron Eriksson <deroneriksson@gmail.com>
> > > To: dev@systemml.incubator.apache.org
> > > <*javascript:_e(%7B%7D,'cvml','dev@systemml.incubator.apache.org')*;>
> > > Date: 08/02/2016 12:13 PM
> > > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > > ------------------------------
> > >
> > >
> > >
> > > > I would definitely be in favor of moving to Spark 2.0 as early as
> > > > possible. This will allow SystemML to be current with cutting edge
> > > > Spark. It would be nice to focus our efforts on the latest Spark.
> > >
> > > Deron
> > >
> > >
> > > > On Tue, Aug 2, 2016 at 12:05 PM, <dusenberrymw@gmail.com> wrote:
> > >
> > > > > I'm in favor of moving to Spark 2.0 now, meaning that our upcoming
> > > > > release would include both new features and 2.0 support.  0.10 has
> > > > > plenty of functionality for any existing 1.x users.
> > > >
> > > > -Mike
> > > >
> > > > --
> > > >
> > > > Mike Dusenberry
> > > > GitHub: github.com/dusenberrymw
> > > > LinkedIn: linkedin.com/in/mikedusenberry
> > > >
> > > > Sent from my iPhone.
> > > >
> > > >
> > > > > > On Aug 2, 2016, at 11:44 AM, Glenn Weidner <gweidner@us.ibm.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > > In the "[DISCUSS] SystemML 0.11 release" thread, native frame
> > > > > > support and API updates such as new MLContext were identified as
> > > > > > main new features for the release.  In addition, support for Spark
> > > > > > 2.0.0 was targeted.  Note code changes required for Spark 2.0.0 are
> > > > > > not backward compatible to earlier Spark versions (e.g., 1.6.2) so
> > > > > > starting separate mail thread for anyone to raise
> > > > > > objections/alternatives for migrating to Spark 2.0.0.
> > > > > >
> > > > > > One possible option is to do a release to include the new Apache
> > > > > > SystemML features before migrating to Spark 2.0.0.  However, it
> > > > > > seems better to have the next Apache SystemML release compatible
> > > > > > with latest Spark version 2.0.0.  The Apache SystemML 0.10 release
> > > > > > from June can be used with earlier versions of Spark.
> > > > >
> > > > > Regards,
> > > > > Glenn
> > > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
> > --
> > Sent from my Mobile device
> >
>
>
>
>
>
>

Re: [DISCUSS] Migration to Spark 2.0.0

Posted by Glenn Weidner <gw...@us.ibm.com>.
I would like to propose an alternative to supporting Spark 2.0 and Spark
1.x within a single stream.

1) Capture a snapshot and establish a label for the current Apache SystemML
master, which includes the new features added since the 0.10.0 release.

2) After step 1 is completed, enable master to move forward with support
for Spark 2.x only.

This is similar to what Fred initially proposed, except that step 1 would
not involve a separate release.  The 0.11 release of Apache SystemML would
be compatible with Spark 2.0 and Scala 2.11.

Thanks,
Glenn



From:	Glenn Weidner/Silicon Valley/IBM@IBMUS
To:	dev@systemml.incubator.apache.org
Date:	08/08/2016 03:33 PM
Subject:	Re: [DISCUSS] Migration to Spark 2.0.0



As a preliminary experiment in attempt to compile against both Spark 2.0.0
and Spark 1.6.2 from same code base, I made another set of changes for
comparison against previous proposed changes for [SYSTEMML-776].
This experimental set can be viewed here:
https://github.com/gweidner/incubator-systemml/commit/0611f0c197e4a0e816b3325093168bc5162d62c0


This compiles against Spark 2.0.0 and Spark 1.6.2 except for fit/transform
overrides in LogisticRegression.scala due to:
SPARK-14500 Accept Dataset[] instead of DataFrame in MLlib APIs

Detailed code comments and suggestions to try out can be made in the branch
commit instead of this mail thread.

Thanks,
Glenn


From: Deron Eriksson <de...@gmail.com>
To: dev@systemml.incubator.apache.org
Date: 08/05/2016 02:02 PM
Subject: Re: [DISCUSS] Migration to Spark 2.0.0



I am open to the idea of supporting Spark 2 and Spark<2 concurrently if
someone shows that it can be accomplished with minimal inconvenience.

However, I would lean towards Fred's approach (Spark 1.6 release followed
shortly by a Spark 2 release). If possible, I want to be able to focus most
of our efforts towards the future rather than the past.

Deron


On Thu, Aug 4, 2016 at 10:59 AM, Luciano Resende <lu...@gmail.com>
wrote:

> That was going to be my suggestion... In Zeppelin, we just introduced
> support for different versions of scala and added support for spark 2.0
> based on profiles and a bit of reflections...
>
> Do we have to do anything related to Scala versions as well ?
>
> On Thursday, August 4, 2016, Matthias Boehm <mb...@us.ibm.com> wrote:
>
> > I would recommend to start an investigation if we could support
> > both the 1.x and 2.x lines with a single code base. It seems
> > feasible to refactor the code a bit, compile against 2.0 (or with
> > profiles), and run on either 1.6 or 2.0. For example, by creating a
> > wrapper that implements both Iterable and Iterator, we could
> > overcome the Iterator API change as shown by our
> > LazyIterableIterator which did not require any change in related
> > functions. Btw, we did the same for MRv1 and Yarn by ensuring that
> > on MRv1, we don't touch Yarn related APIs. Similarly on Spark, we
> > already support both legacy and >=1.6 memory management. I think
> > this kind of platform independence is really valuable but it
> > obviously adds complexity.
> >
> > Regards,
> > Matthias
> >
> >
> >
> > From: Niketan Pansare/Almaden/IBM@IBMUS
> > To: dev@systemml.incubator.apache.org
> > Date: 08/03/2016 05:15 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > I am in favor of having one more release against Spark 1.6. Since
> > default scala version for Spark 1.6 is 2.10, I recommend either
> > having SystemML compiled and released with Scala 2.10 profile or
> > having two release candidates.
> >
> > Thanks,
> >
> > Niketan Pansare
> > IBM Almaden Research Center
> > E-mail: npansar At us.ibm.com
> >
> > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> >
> >
> > From: Frederick R Reiss/Almaden/IBM@IBMUS
> > To: dev@systemml.incubator.apache.org
> > Date: 08/03/2016 03:58 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > While I agree that getting onto Spark 2.0 quickly ought to be a
> > priority, there are existing early users of SystemML who are likely
> > stuck on Spark 1.6.x for the next few months. Those users could want
> > some of the new experimental features since 0.10 (specifically
> > frames, the prototype Python DSL, and the new MLContext) and it
> > would be good to have a Spark 1.6 branch of our version tree where
> > we can backport the debugged versions of these features if needed.
> >
> > I would recommend that we do one more SystemML release against
> > Spark 1.6, then switch the head version of SystemML over to Spark
> > 2.0, then immediately perform a second SystemML release. Thoughts?
> >
> > Fred
> >
> >
> > From: Deron Eriksson <deroneriksson@gmail.com>
> > To: dev@systemml.incubator.apache.org
> > Date: 08/02/2016 12:13 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > I would definitely be in favor of moving to Spark 2.0 as early as
> > possible. This will allow SystemML to be current with cutting edge
> > Spark. It would be nice to focus our efforts on the latest Spark.
> >
> > Deron
> >
> >
> > On Tue, Aug 2, 2016 at 12:05 PM, <dusenberrymw@gmail.com> wrote:
> >
> > > I'm in favor of moving to Spark 2.0 now, meaning that our upcoming
> > > release would include both new features and 2.0 support.  0.10 has
> > > plenty of functionality for any existing 1.x users.
> > >
> > > -Mike
> > >
> > > --
> > >
> > > Mike Dusenberry
> > > GitHub: github.com/dusenberrymw
> > > LinkedIn: linkedin.com/in/mikedusenberry
> > >
> > > Sent from my iPhone.
> > >
> > >
> > > > On Aug 2, 2016, at 11:44 AM, Glenn Weidner <gweidner@us.ibm.com> wrote:
> > > >
> > > >
> > > >
> > > > In the "[DISCUSS] SystemML 0.11 release" thread, native frame
> > > > support and API updates such as new MLContext were identified as
> > > > main new features for the release.  In addition, support for Spark
> > > > 2.0.0 was targeted.  Note code changes required for Spark 2.0.0 are
> > > > not backward compatible to earlier Spark versions (e.g., 1.6.2) so
> > > > starting separate mail thread for anyone to raise
> > > > objections/alternatives for migrating to Spark 2.0.0.
> > > >
> > > > One possible option is to do a release to include the new Apache
> > > > SystemML features before migrating to Spark 2.0.0.  However, it
> > > > seems better to have the next Apache SystemML release compatible
> > > > with latest Spark version 2.0.0.  The Apache SystemML 0.10 release
> > > > from June can be used with earlier versions of Spark.
> > > >
> > > > Regards,
> > > > Glenn
> > >
> >
> >
> >
> >
> >
> >
> >
>
> --
> Sent from my Mobile device
>





Re: [DISCUSS] Migration to Spark 2.0.0

Posted by Glenn Weidner <gw...@us.ibm.com>.
As a preliminary experiment in compiling against both Spark 2.0.0 and
Spark 1.6.2 from the same code base, I made another set of changes for
comparison against the previously proposed changes for [SYSTEMML-776].
This experimental set can be viewed here:
https://github.com/gweidner/incubator-systemml/commit/0611f0c197e4a0e816b3325093168bc5162d62c0

This compiles against Spark 2.0.0 and Spark 1.6.2 except for the
fit/transform overrides in LogisticRegression.scala, due to:
SPARK-14500 Accept Dataset[] instead of DataFrame in MLlib APIs

Detailed code comments and suggestions to try out can be made in the branch
commit instead of this mail thread.

Thanks,
Glenn
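
For readers following the single-code-base idea quoted below (a wrapper
that implements both Iterable and Iterator), here is a minimal Java sketch.
It is not SystemML's LazyIterableIterator; the class names IterableIterator
and IterableIteratorDemo are hypothetical and chosen only for illustration.
The motivation, as far as I understand the API change, is that Spark 1.x
Java flatMap-style functions returned an Iterable while Spark 2.0 expects an
Iterator, so one object that satisfies both interfaces can be handed to
either.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical illustration only; not SystemML's LazyIterableIterator.
// One object that can be returned where an Iterable is expected
// (Spark 1.x style) or where an Iterator is expected (Spark 2.x style).
final class IterableIterator<T> implements Iterable<T>, Iterator<T> {
    private final Iterator<T> source;

    IterableIterator(Iterator<T> source) {
        this.source = source;
    }

    // Iterable view: hands back this same object, so traversal is single-pass.
    @Override
    public Iterator<T> iterator() {
        return this;
    }

    // Iterator view: delegate everything to the wrapped iterator.
    @Override
    public boolean hasNext() {
        return source.hasNext();
    }

    @Override
    public T next() {
        return source.next();
    }

    @Override
    public void remove() {
        source.remove();
    }
}

final class IterableIteratorDemo {
    // Stands in for a caller written against the Iterable-based signature.
    static int sumViaIterable(Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Stands in for a caller written against the Iterator-based signature.
    static int sumViaIterator(Iterator<Integer> values) {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumViaIterable(
            new IterableIterator<Integer>(Arrays.asList(1, 2, 3).iterator())));
        System.out.println(sumViaIterator(
            new IterableIterator<Integer>(Arrays.asList(1, 2, 3).iterator())));
    }
}
```

Note that a wrapper of this kind is single-pass: iterator() returns the
same object, so a second traversal sees no remaining elements.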



From:	Deron Eriksson <de...@gmail.com>
To:	dev@systemml.incubator.apache.org
Date:	08/05/2016 02:02 PM
Subject:	Re: [DISCUSS] Migration to Spark 2.0.0



I am open to the idea of supporting Spark 2 and Spark<2 concurrently if
someone shows that it can be accomplished with minimal inconvenience.

However, I would lean towards Fred's approach (Spark 1.6 release followed
shortly by a Spark 2 release). If possible, I want to be able to focus most
of our efforts towards the future rather than the past.

Deron


On Thu, Aug 4, 2016 at 10:59 AM, Luciano Resende <lu...@gmail.com>
wrote:

> That was going to be my suggestion... In Zeppelin, we just introduced
> support for different versions of scala and added support for spark 2.0
> based on profiles and a bit of reflections...
>
> Do we have to do anything related to Scala versions as well ?
>
> On Thursday, August 4, 2016, Matthias Boehm <mb...@us.ibm.com> wrote:
>
> > I would recommend to start an investigation if we could support
> > both the 1.x and 2.x lines with a single code base. It seems
> > feasible to refactor the code a bit, compile against 2.0 (or with
> > profiles), and run on either 1.6 or 2.0. For example, by creating a
> > wrapper that implements both Iterable and Iterator, we could
> > overcome the Iterator API change as shown by our
> > LazyIterableIterator which did not require any change in related
> > functions. Btw, we did the same for MRv1 and Yarn by ensuring that
> > on MRv1, we don't touch Yarn related APIs. Similarly on Spark, we
> > already support both legacy and >=1.6 memory management. I think
> > this kind of platform independence is really valuable but it
> > obviously adds complexity.
> >
> > Regards,
> > Matthias
> >
> >
> >
> > From: Niketan Pansare/Almaden/IBM@IBMUS
> > To: dev@systemml.incubator.apache.org
> > Date: 08/03/2016 05:15 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > I am in favor of having one more release against Spark 1.6. Since
> > default scala version for Spark 1.6 is 2.10, I recommend either
> > having SystemML compiled and released with Scala 2.10 profile or
> > having two release candidates.
> >
> > Thanks,
> >
> > Niketan Pansare
> > IBM Almaden Research Center
> > E-mail: npansar At us.ibm.com
> >
> > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> >
> >
> > From: Frederick R Reiss/Almaden/IBM@IBMUS
> > To: dev@systemml.incubator.apache.org
> > Date: 08/03/2016 03:58 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > While I agree that getting onto Spark 2.0 quickly ought to be a
> > priority, there are existing early users of SystemML who are likely
> > stuck on Spark 1.6.x for the next few months. Those users could want
> > some of the new experimental features since 0.10 (specifically
> > frames, the prototype Python DSL, and the new MLContext) and it
> > would be good to have a Spark 1.6 branch of our version tree where
> > we can backport the debugged versions of these features if needed.
> >
> > I would recommend that we do one more SystemML release against
> > Spark 1.6, then switch the head version of SystemML over to Spark
> > 2.0, then immediately perform a second SystemML release. Thoughts?
> >
> > Fred
> >
> >
> > From: Deron Eriksson <deroneriksson@gmail.com>
> > To: dev@systemml.incubator.apache.org
> > Date: 08/02/2016 12:13 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > I would definitely be in favor of moving to Spark 2.0 as early as
> > possible. This will allow SystemML to be current with cutting edge
> > Spark. It would be nice to focus our efforts on the latest Spark.
> >
> > Deron
> >
> >
> > On Tue, Aug 2, 2016 at 12:05 PM, <dusenberrymw@gmail.com> wrote:
> >
> > > I'm in favor of moving to Spark 2.0 now, meaning that our upcoming
> > > release would include both new features and 2.0 support.  0.10 has
> > > plenty of functionality for any existing 1.x users.
> > >
> > > -Mike
> > >
> > > --
> > >
> > > Mike Dusenberry
> > > GitHub: github.com/dusenberrymw
> > > LinkedIn: linkedin.com/in/mikedusenberry
> > >
> > > Sent from my iPhone.
> > >
> > >
> > > > On Aug 2, 2016, at 11:44 AM, Glenn Weidner <gweidner@us.ibm.com> wrote:
> > > >
> > > >
> > > >
> > > > In the "[DISCUSS] SystemML 0.11 release" thread, native frame
> > > > support and API updates such as new MLContext were identified as
> > > > main new features for the release.  In addition, support for Spark
> > > > 2.0.0 was targeted.  Note code changes required for Spark 2.0.0 are
> > > > not backward compatible to earlier Spark versions (e.g., 1.6.2) so
> > > > starting separate mail thread for anyone to raise
> > > > objections/alternatives for migrating to Spark 2.0.0.
> > > >
> > > > One possible option is to do a release to include the new Apache
> > > > SystemML features before migrating to Spark 2.0.0.  However, it
> > > > seems better to have the next Apache SystemML release compatible
> > > > with latest Spark version 2.0.0.  The Apache SystemML 0.10 release
> > > > from June can be used with earlier versions of Spark.
> > > >
> > > > Regards,
> > > > Glenn
> > >
> >
> >
> >
> >
> >
> >
> >
>
> --
> Sent from my Mobile device
>



Re: [DISCUSS] Migration to Spark 2.0.0

Posted by Deron Eriksson <de...@gmail.com>.
I am open to the idea of supporting Spark 2 and Spark<2 concurrently if
someone shows that it can be accomplished with minimal inconvenience.

However, I would lean towards Fred's approach (Spark 1.6 release followed
shortly by a Spark 2 release). If possible, I want to be able to focus most
of our efforts towards the future rather than the past.

Deron


On Thu, Aug 4, 2016 at 10:59 AM, Luciano Resende <lu...@gmail.com>
wrote:

> That was going to be my suggestion... In Zeppelin, we just introduced
> support for different versions of scala and added support for spark 2.0
> based on profiles and a bit of reflections...
>
> Do we have to do anything related to Scala versions as well ?
>
> On Thursday, August 4, 2016, Matthias Boehm <mb...@us.ibm.com> wrote:
>
> > I would recommend to start an investigation if we could support
> > both the 1.x and 2.x lines with a single code base. It seems
> > feasible to refactor the code a bit, compile against 2.0 (or with
> > profiles), and run on either 1.6 or 2.0. For example, by creating a
> > wrapper that implements both Iterable and Iterator, we could
> > overcome the Iterator API change as shown by our
> > LazyIterableIterator which did not require any change in related
> > functions. Btw, we did the same for MRv1 and Yarn by ensuring that
> > on MRv1, we don't touch Yarn related APIs. Similarly on Spark, we
> > already support both legacy and >=1.6 memory management. I think
> > this kind of platform independence is really valuable but it
> > obviously adds complexity.
> >
> > Regards,
> > Matthias
> >
> >
> >
> > From: Niketan Pansare/Almaden/IBM@IBMUS
> > To: dev@systemml.incubator.apache.org
> > Date: 08/03/2016 05:15 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > I am in favor of having one more release against Spark 1.6. Since default
> > scala version for Spark 1.6 is 2.10, I recommend either having SystemML
> > compiled and released with Scala 2.10 profile or having two release
> > candidates.
> >
> > Thanks,
> >
> > Niketan Pansare
> > IBM Almaden Research Center
> > E-mail: npansar At us.ibm.com
> > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> >
> >
> > From: Frederick R Reiss/Almaden/IBM@IBMUS
> > To: dev@systemml.incubator.apache.org
> > Date: 08/03/2016 03:58 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > While I agree that getting onto Spark 2.0 quickly ought to be a
> > priority, there are existing early users of SystemML who are likely
> > stuck on Spark 1.6.x for the next few months. Those users could want
> > some of the new experimental features since 0.10 (specifically
> > frames, the prototype Python DSL, and the new MLContext) and it
> > would be good to have a Spark 1.6 branch of our version tree where
> > we can backport the debugged versions of these features if needed.
> >
> > I would recommend that we do one more SystemML release against
> > Spark 1.6, then switch the head version of SystemML over to Spark
> > 2.0, then immediately perform a second SystemML release. Thoughts?
> >
> > Fred
> >
> >
> > From: Deron Eriksson <deroneriksson@gmail.com>
> > To: dev@systemml.incubator.apache.org
> > Date: 08/02/2016 12:13 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > I would definitely be in favor of moving to Spark 2.0 as early as
> > possible. This will allow SystemML to be current with cutting edge
> > Spark. It would be nice to focus our efforts on the latest Spark.
> >
> > Deron
> >
> >
> > On Tue, Aug 2, 2016 at 12:05 PM, <dusenberrymw@gmail.com> wrote:
> >
> > > I'm in favor of moving to Spark 2.0 now, meaning that our upcoming
> > > release would include both new features and 2.0 support.  0.10 has
> > > plenty of functionality for any existing 1.x users.
> > >
> > > -Mike
> > >
> > > --
> > >
> > > Mike Dusenberry
> > > GitHub: github.com/dusenberrymw
> > > LinkedIn: linkedin.com/in/mikedusenberry
> > >
> > > Sent from my iPhone.
> > >
> > >
> > > > On Aug 2, 2016, at 11:44 AM, Glenn Weidner <gweidner@us.ibm.com> wrote:
> > > >
> > > >
> > > >
> > > > In the "[DISCUSS] SystemML 0.11 release" thread, native frame
> > > > support and API updates such as new MLContext were identified as
> > > > main new features for the release.  In addition, support for Spark
> > > > 2.0.0 was targeted.  Note code changes required for Spark 2.0.0 are
> > > > not backward compatible to earlier Spark versions (e.g., 1.6.2) so
> > > > starting separate mail thread for anyone to raise
> > > > objections/alternatives for migrating to Spark 2.0.0.
> > > >
> > > > One possible option is to do a release to include the new Apache
> > > > SystemML features before migrating to Spark 2.0.0.  However, it
> > > > seems better to have the next Apache SystemML release compatible
> > > > with latest Spark version 2.0.0.  The Apache SystemML 0.10 release
> > > > from June can be used with earlier versions of Spark.
> > > >
> > > > Regards,
> > > > Glenn
> > >
> >
> >
> >
> >
> >
> >
> >
>
> --
> Sent from my Mobile device
>

Re: [DISCUSS] Migration to Spark 2.0.0

Posted by Luciano Resende <lu...@gmail.com>.
That was going to be my suggestion... In Zeppelin, we just introduced
support for different versions of Scala and added support for Spark 2.0
based on profiles and a bit of reflection...

Do we have to do anything related to Scala versions as well?
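
To make the "bit of reflection" idea concrete, here is a minimal Java
sketch. It is not Zeppelin's or SystemML's actual code. It assumes that the
presence of org.apache.spark.sql.SparkSession (a class introduced in Spark
2.0) on the classpath is an acceptable marker for the 2.x line; the class
and method names SparkLineProbe, isClassPresent and detectSparkLine are
hypothetical.

```java
// Hypothetical helper for illustration; not part of SystemML or Zeppelin.
final class SparkLineProbe {
    // Assumption: SparkSession first appeared in Spark 2.0, so finding it
    // on the classpath is treated as a signal for the 2.x line.
    static final String SPARK2_MARKER = "org.apache.spark.sql.SparkSession";

    // Returns true if the named class can be located, without initializing it.
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className, false, SparkLineProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Coarse label only; it cannot tell Spark 1.x apart from "no Spark at all".
    static String detectSparkLine() {
        return isClassPresent(SPARK2_MARKER)
            ? "2.x"
            : "pre-2.0 or not on classpath";
    }

    public static void main(String[] args) {
        System.out.println("Detected Spark line: " + detectSparkLine());
    }
}
```

Code that must differ between the two lines could then branch on such a
check and reach version-specific APIs reflectively, while build profiles
select which Spark and Scala versions to compile against.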

On Thursday, August 4, 2016, Matthias Boehm <mb...@us.ibm.com> wrote:

> I would recommend to start an investigation if we could support both the
> 1.x and 2.x lines with a single code base. It seems feasible to refactor
> the code a bit, compile against 2.0 (or with profiles), and run on either
> 1.6 or 2.0. For example, by creating a wrapper that implements both
> Iterable and Iterator, we could overcome the Iterator API change as shown
> by our LazyIterableIterator which did not require any change in related
> functions. Btw, we did the same for MRv1 and Yarn by ensuring that on MRv1,
> we don't touch Yarn related APIs. Similarly on Spark, we already support
> both legacy and >=1.6 memory management. I think this kind of platform
> independence is really valuable but it obviously adds complexity.
>
> Regards,
> Matthias
>
>
> Niketan Pansare---08/03/2016 05:15:21 PM---I am in favor of having one
> more release against Spark 1.6. Since default scala version for Spark 1.
>
> From: Niketan Pansare/Almaden/IBM@IBMUS
> To: dev@systemml.incubator.apache.org
> Date: 08/03/2016 05:15 PM
> Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> ------------------------------
>
>
>
> I am in favor of having one more release against Spark 1.6. Since default
> scala version for Spark 1.6 is 2.10, I recommend either having SystemML
> compiled and released with Scala 2.10 profile or having two release
> candidates.
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
>
> Frederick R Reiss---08/03/2016 03:58:17 PM---While I agree that getting
> onto Spark 2.0 quickly ought to be a priority, there are existing early u
>
> From: Frederick R Reiss/Almaden/IBM@IBMUS
> To: dev@systemml.incubator.apache.org
> Date: 08/03/2016 03:58 PM
> Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> ------------------------------
>
>
>
> While I agree that getting onto Spark 2.0 quickly ought to be a priority,
> there are existing early users of SystemML who are likely stuck on Spark
> 1.6.x for the next few months. Those users could want some of the new
> experimental features since 0.10 (specifically frames, the prototype Python
> DSL, and the new MLContext) and it would be good to have a Spark 1.6 branch
> of our version tree where we can backport the debugged versions of these
> features if needed.
>
> I would recommend that we do one more SystemML release against Spark 1.6,
> then switch the head version of SystemML over to Spark 2.0, then
> immediately perform a second SystemML release. Thoughts?
>
> Fred
>
> Deron Eriksson ---08/02/2016 12:13:07 PM---I would definitely be in favor
> of moving to Spark 2.0 as early as possible. This will allow SystemML
>
> From: Deron Eriksson <deroneriksson@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 08/02/2016 12:13 PM
> Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> ------------------------------
>
>
>
> I would definitely be in favor of moving to Spark 2.0 as early as possible.
> This will allow SystemML to be current with cutting edge Spark. It would be
> nice to focus our efforts on the latest Spark.
>
> Deron
>
>
> On Tue, Aug 2, 2016 at 12:05 PM, <dusenberrymw@gmail.com> wrote:
>
> > I'm in favor of moving to Spark 2.0 now, meaning that our upcoming release
> > would include both new features and 2.0 support.  0.10 has plenty of
> > functionality for any existing 1.x users.
> >
> > -Mike
> >
> > --
> >
> > Mike Dusenberry
> > GitHub: github.com/dusenberrymw
> > LinkedIn: linkedin.com/in/mikedusenberry
> >
> > Sent from my iPhone.
> >
> >
> > > On Aug 2, 2016, at 11:44 AM, Glenn Weidner <gweidner@us.ibm.com> wrote:
> > >
> > >
> > >
> > > In the "[DISCUSS] SystemML 0.11 release" thread, native frame support and
> > > API updates such as new MLContext were identified as main new features for
> > > the release.  In addition, support for Spark 2.0.0 was targeted.
> > > Note code changes required for Spark 2.0.0 are not backward compatible to
> > > earlier Spark versions (e.g., 1.6.2) so starting separate mail thread for
> > > anyone to raise objections/alternatives for migrating to Spark 2.0.0.
> > >
> > > One possible option is to do a release to include the new Apache SystemML
> > > features before migrating to Spark 2.0.0.  However, it seems better to have
> > > the next Apache SystemML release compatible with latest Spark version
> > > 2.0.0.  The Apache SystemML 0.10 release from June can be used with earlier
> > > versions of Spark.
> > >
> > > Regards,
> > > Glenn
> >
>

-- 
Sent from my Mobile device

Re: [DISCUSS] Migration to Spark 2.0.0

Posted by Matthias Boehm <mb...@us.ibm.com>.
I would recommend investigating whether we can support both the 1.x and
2.x lines with a single code base. It seems feasible to refactor the code
a bit, compile against 2.0 (or with profiles), and run on either 1.6 or
2.0. For example, by creating a wrapper that implements both Iterable and
Iterator, we could overcome the Iterator API change, as shown by our
LazyIterableIterator, which did not require any change in related
functions. Btw, we did the same for MRv1 and Yarn by ensuring that on MRv1
we don't touch Yarn-related APIs. Similarly, on Spark we already support
both legacy and >=1.6 memory management. I think this kind of platform
independence is really valuable, but it obviously adds complexity.
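
To make the wrapper idea concrete, here is a rough sketch in Java. It is
not the actual LazyIterableIterator from SystemML; the class name
IterableIteratorWrapper is made up for illustration. The background is that
FlatMapFunction.call in the Spark 1.x Java API returns an Iterable, while
in Spark 2.x it returns an Iterator. An object that is both can be returned
from the same source code in either case. Whether a single compiled binary
would then run unchanged on both lines is a separate question that this
sketch does not answer.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical wrapper (illustrative only): one object that satisfies both
// the Iterable return type (Spark 1.x style) and the Iterator return type
// (Spark 2.x style).
final class IterableIteratorWrapper<T> implements Iterable<T>, Iterator<T> {

    private final Iterator<T> source;

    IterableIteratorWrapper(Iterator<T> source) {
        this.source = source;
    }

    // Single-pass by design: iterator() hands out this same object, so the
    // elements can only be consumed once.
    @Override
    public Iterator<T> iterator() {
        return this;
    }

    @Override
    public boolean hasNext() {
        return source.hasNext();
    }

    @Override
    public T next() {
        return source.next();
    }

    // Declared explicitly so the class also compiles on Java 7, where
    // Iterator.remove() has no default implementation.
    @Override
    public void remove() {
        throw new UnsupportedOperationException("remove is not supported");
    }

    public static void main(String[] args) {
        IterableIteratorWrapper<String> w =
            new IterableIteratorWrapper<>(Arrays.asList("a", "b").iterator());
        for (String s : w) {       // used as an Iterable
            System.out.println(s);
        }
    }
}
```

Because the wrapper is single-pass, a caller that iterates twice would see
an empty second pass. That is usually acceptable for flatMap-style results,
which are consumed once.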

Regards,
Matthias




From:	Niketan Pansare/Almaden/IBM@IBMUS
To:	dev@systemml.incubator.apache.org
Date:	08/03/2016 05:15 PM
Subject:	Re: [DISCUSS] Migration to Spark 2.0.0



I am in favor of having one more release against Spark 1.6. Since default
scala version for Spark 1.6 is 2.10, I recommend either having SystemML
compiled and released with Scala 2.10 profile or having two release
candidates.

Thanks,

Niketan Pansare
IBM Almaden Research Center
E-mail: npansar At us.ibm.com
http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar

Frederick R Reiss---08/03/2016 03:58:17 PM---While I agree that getting
onto Spark 2.0 quickly ought to be a priority, there are existing early u

From: Frederick R Reiss/Almaden/IBM@IBMUS
To: dev@systemml.incubator.apache.org
Date: 08/03/2016 03:58 PM
Subject: Re: [DISCUSS] Migration to Spark 2.0.0



While I agree that getting onto Spark 2.0 quickly ought to be a priority,
there are existing early users of SystemML who are likely stuck on Spark
1.6.x for the next few months. Those users could want some of the new
experimental features since 0.10 (specifically frames, the prototype Python
DSL, and the new MLContext) and it would be good to have a Spark 1.6 branch
of our version tree where we can backport the debugged versions of these
features if needed.

I would recommend that we do one more SystemML release against Spark 1.6,
then switch the head version of SystemML over to Spark 2.0, then
immediately perform a second SystemML release. Thoughts?

Fred

Deron Eriksson ---08/02/2016 12:13:07 PM---I would definitely be in favor
of moving to Spark 2.0 as early as possible. This will allow SystemML

From: Deron Eriksson <de...@gmail.com>
To: dev@systemml.incubator.apache.org
Date: 08/02/2016 12:13 PM
Subject: Re: [DISCUSS] Migration to Spark 2.0.0



I would definitely be in favor of moving to Spark 2.0 as early as possible.
This will allow SystemML to be current with cutting edge Spark. It would be
nice to focus our efforts on the latest Spark.

Deron


On Tue, Aug 2, 2016 at 12:05 PM, <du...@gmail.com> wrote:

> I'm in favor of moving to Spark 2.0 now, meaning that our upcoming release
> would include both new features and 2.0 support.  0.10 has plenty of
> functionality for any existing 1.x users.
>
> -Mike
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On Aug 2, 2016, at 11:44 AM, Glenn Weidner <gw...@us.ibm.com> wrote:
> >
> >
> >
> > In the "[DISCUSS] SystemML 0.11 release" thread, native frame support and
> > API updates such as new MLContext were identified as main new features for
> > the release.  In addition, support for Spark 2.0.0 was targeted.
> > Note code changes required for Spark 2.0.0 are not backward compatible to
> > earlier Spark versions (e.g., 1.6.2) so starting separate mail thread for
> > anyone to raise objections/alternatives for migrating to Spark 2.0.0.
> >
> > One possible option is to do a release to include the new Apache SystemML
> > features before migrating to Spark 2.0.0.  However, it seems better to have
> > the next Apache SystemML release compatible with latest Spark version
> > 2.0.0.  The Apache SystemML 0.10 release from June can be used with earlier
> > versions of Spark.
> >
> > Regards,
> > Glenn
>






Re: [DISCUSS] Migration to Spark 2.0.0

Posted by Niketan Pansare <np...@us.ibm.com>.
I am in favor of having one more release against Spark 1.6. Since the
default Scala version for Spark 1.6 is 2.10, I recommend either compiling
and releasing SystemML with a Scala 2.10 profile or having two release
candidates.

Thanks,

Niketan Pansare
IBM Almaden Research Center
E-mail: npansar At us.ibm.com
http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar



From:	Frederick R Reiss/Almaden/IBM@IBMUS
To:	dev@systemml.incubator.apache.org
Date:	08/03/2016 03:58 PM
Subject:	Re: [DISCUSS] Migration to Spark 2.0.0



While I agree that getting onto Spark 2.0 quickly ought to be a priority,
there are existing early users of SystemML who are likely stuck on Spark
1.6.x for the next few months. Those users could want some of the new
experimental features since 0.10 (specifically frames, the prototype Python
DSL, and the new MLContext) and it would be good to have a Spark 1.6 branch
of our version tree where we can backport the debugged versions of these
features if needed.

I would recommend that we do one more SystemML release against Spark 1.6,
then switch the head version of SystemML over to Spark 2.0, then
immediately perform a second SystemML release. Thoughts?

Fred

Deron Eriksson ---08/02/2016 12:13:07 PM---I would definitely be in favor
of moving to Spark 2.0 as early as possible. This will allow SystemML

From: Deron Eriksson <de...@gmail.com>
To: dev@systemml.incubator.apache.org
Date: 08/02/2016 12:13 PM
Subject: Re: [DISCUSS] Migration to Spark 2.0.0



I would definitely be in favor of moving to Spark 2.0 as early as possible.
This will allow SystemML to be current with cutting edge Spark. It would be
nice to focus our efforts on the latest Spark.

Deron


On Tue, Aug 2, 2016 at 12:05 PM, <du...@gmail.com> wrote:

> I'm in favor of moving to Spark 2.0 now, meaning that our upcoming release
> would include both new features and 2.0 support.  0.10 has plenty of
> functionality for any existing 1.x users.
>
> -Mike
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On Aug 2, 2016, at 11:44 AM, Glenn Weidner <gw...@us.ibm.com> wrote:
> >
> >
> >
> > In the "[DISCUSS] SystemML 0.11 release" thread, native frame support and
> > API updates such as new MLContext were identified as main new features for
> > the release.  In addition, support for Spark 2.0.0 was targeted.
> > Note code changes required for Spark 2.0.0 are not backward compatible to
> > earlier Spark versions (e.g., 1.6.2) so starting separate mail thread for
> > anyone to raise objections/alternatives for migrating to Spark 2.0.0.
> >
> > One possible option is to do a release to include the new Apache SystemML
> > features before migrating to Spark 2.0.0.  However, it seems better to have
> > the next Apache SystemML release compatible with latest Spark version
> > 2.0.0.  The Apache SystemML 0.10 release from June can be used with earlier
> > versions of Spark.
> >
> > Regards,
> > Glenn
>





Re: [DISCUSS] Migration to Spark 2.0.0

Posted by Frederick R Reiss <fr...@us.ibm.com>.
While I agree that getting onto Spark 2.0 quickly ought to be a priority,
there are existing early users of SystemML who are likely stuck on Spark
1.6.x for the next few months. Those users may want some of the new
experimental features added since 0.10 (specifically frames, the prototype
Python DSL, and the new MLContext), and it would be good to have a Spark 1.6
branch of our version tree where we can backport the debugged versions of
these features if needed.

I would recommend that we do one more SystemML release against Spark 1.6,
then switch the head version of SystemML over to Spark 2.0, then
immediately perform a second SystemML release. Thoughts?

Fred



From:	Deron Eriksson <de...@gmail.com>
To:	dev@systemml.incubator.apache.org
Date:	08/02/2016 12:13 PM
Subject:	Re: [DISCUSS] Migration to Spark 2.0.0



I would definitely be in favor of moving to Spark 2.0 as early as possible.
This will allow SystemML to be current with cutting edge Spark. It would be
nice to focus our efforts on the latest Spark.

Deron


On Tue, Aug 2, 2016 at 12:05 PM, <du...@gmail.com> wrote:

> I'm in favor of moving to Spark 2.0 now, meaning that our upcoming release
> would include both new features and 2.0 support.  0.10 has plenty of
> functionality for any existing 1.x users.
>
> -Mike
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On Aug 2, 2016, at 11:44 AM, Glenn Weidner <gw...@us.ibm.com> wrote:
> >
> >
> >
> > In the "[DISCUSS] SystemML 0.11 release" thread, native frame support and
> > API updates such as new MLContext were identified as main new features for
> > the release.  In addition, support for Spark 2.0.0 was targeted.
> > Note code changes required for Spark 2.0.0 are not backward compatible to
> > earlier Spark versions (e.g., 1.6.2) so starting separate mail thread for
> > anyone to raise objections/alternatives for migrating to Spark 2.0.0.
> >
> > One possible option is to do a release to include the new Apache SystemML
> > features before migrating to Spark 2.0.0.  However, it seems better to have
> > the next Apache SystemML release compatible with latest Spark version
> > 2.0.0.  The Apache SystemML 0.10 release from June can be used with earlier
> > versions of Spark.
> >
> > Regards,
> > Glenn
>



Re: [DISCUSS] Migration to Spark 2.0.0

Posted by Deron Eriksson <de...@gmail.com>.
I would definitely be in favor of moving to Spark 2.0 as early as possible.
This will allow SystemML to be current with cutting edge Spark. It would be
nice to focus our efforts on the latest Spark.

Deron


On Tue, Aug 2, 2016 at 12:05 PM, <du...@gmail.com> wrote:

> I'm in favor of moving to Spark 2.0 now, meaning that our upcoming release
> would include both new features and 2.0 support.  0.10 has plenty of
> functionality for any existing 1.x users.
>
> -Mike
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
>
> > On Aug 2, 2016, at 11:44 AM, Glenn Weidner <gw...@us.ibm.com> wrote:
> >
> >
> >
> > In the "[DISCUSS] SystemML 0.11 release" thread, native frame support and
> > API updates such as new MLContext were identified as main new features for
> > the release.  In addition, support for Spark 2.0.0 was targeted.
> > Note code changes required for Spark 2.0.0 are not backward compatible to
> > earlier Spark versions (e.g., 1.6.2) so starting separate mail thread for
> > anyone to raise objections/alternatives for migrating to Spark 2.0.0.
> >
> > One possible option is to do a release to include the new Apache SystemML
> > features before migrating to Spark 2.0.0.  However, it seems better to have
> > the next Apache SystemML release compatible with latest Spark version
> > 2.0.0.  The Apache SystemML 0.10 release from June can be used with earlier
> > versions of Spark.
> >
> > Regards,
> > Glenn
>

Re: [DISCUSS] Migration to Spark 2.0.0

Posted by du...@gmail.com.
I'm in favor of moving to Spark 2.0 now, meaning that our upcoming release would include both new features and 2.0 support.  0.10 has plenty of functionality for any existing 1.x users. 

-Mike

--

Mike Dusenberry
GitHub: github.com/dusenberrymw
LinkedIn: linkedin.com/in/mikedusenberry

Sent from my iPhone.


> On Aug 2, 2016, at 11:44 AM, Glenn Weidner <gw...@us.ibm.com> wrote:
> 
> 
> 
> In the "[DISCUSS] SystemML 0.11 release" thread, native frame support and
> API updates such as new MLContext were identified as main new features for
> the release.  In addition, support for Spark 2.0.0 was targeted.
> Note code changes required for Spark 2.0.0 are not backward compatible to
> earlier Spark versions (e.g., 1.6.2) so starting separate mail thread for
> anyone to raise objections/alternatives for migrating to Spark 2.0.0.
> 
> One possible option is to do a release to include the new Apache SystemML
> features before migrating to Spark 2.0.0.  However, it seems better to have
> the next Apache SystemML release compatible with latest Spark version
> 2.0.0.  The Apache SystemML 0.10 release from June can be used with earlier
> versions of Spark.
> 
> Regards,
> Glenn