Posted to dev@mahout.apache.org by Pat Ferrel <pa...@occamsmachete.com> on 2015/02/26 19:23:07 UTC

Question about Spark versions

Spark releases every few weeks. In the meantime, some users will have chosen a version to stay with for a while. Now that we are moving to 1.2.1, what does that mean for users who are working with the version of Mahout that uses Spark 1.1.0?

Should we be releasing or tagging builds to sync with Spark versions? Otherwise we may be creating a headache for users. I say this because one of my clients is on Spark 1.1.0 and is hesitant to upgrade. Since there has been no release or tag, we are giving no guidance about which point in Mahout to use.

I guess a lightweight thing to do would be to tag every time we move to a new build of Spark and annotate the tag with the Spark version. The harder thing would be to support multiple versions in the poms, like we do for Hadoop. That is probably going to be required at some point, right?

Re: Question about Spark versions

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
The algebraic optimizer binary should be compatible with a pretty wide range of Spark versions. At the very least, the current head is backward compatible with 1.1.x; the only thing that locked it to that is the use of the unpersist API.

Before that, it should have been compatible all the way back to at least 0.9. Spark 0.8.something was the first Spark release I added the minimal necessary patches for in order to run the Mahout optimizer; earlier Spark is definitely not compatible with anything at all.

So technically, if so desired, the current state could be made to run on Spark as early as 0.8 (with recompilation and some minimal hacking), or on 1.1.x with no hacking or recompilation at all.
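
Purely as an illustration of what that kind of minimal hacking could look like -- a hypothetical sketch, not what Mahout actually does -- the newer call can be made reflectively so the same code degrades gracefully on a Spark build that lacks it:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical compatibility shim: invoke unpersist(blocking) via reflection
// so the same binary merely skips the caching hint on a Spark build where
// that signature is absent, instead of dying with a NoSuchMethodError.
object UnpersistCompat {

  def unpersistIfAvailable(rdd: RDD[_]): Unit =
    try {
      rdd.getClass.getMethod("unpersist", classOf[Boolean])
        .invoke(rdd, java.lang.Boolean.FALSE)
      ()
    } catch {
      case _: NoSuchMethodException => // older Spark: leave the blocks cached
    }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("unpersist-compat").setMaster("local[2]"))
    try {
      val rdd = sc.parallelize(1 to 100).cache()
      println(rdd.count())
      unpersistIfAvailable(rdd)
    } finally {
      sc.stop()
    }
  }
}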

I also expect it to be pretty seamlessly forward compatible, at least until a major Scala version jump.

As for the rest of the Spark-specific code -- you probably know better.

Since Mahout does not redistribute Spark dependencies, this also means that no recompilation is technically required to switch from one Spark version to another, as long as SPARK_HOME points to the right directory (in the case of the shell; YMMV with custom embedded apps that manage the classpath differently).
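
For embedded apps, the idea would be to resolve whatever installation SPARK_HOME points at when the launcher assembles its classpath, rather than baking a Spark version into the build. A hypothetical helper along these lines (the name and the lib/ layout are assumptions about a 1.x binary distribution, not anything Mahout ships):

import java.io.File

object SparkHomeJars {

  // Hypothetical launcher-side helper: list the jars under $SPARK_HOME so
  // whatever Spark installation the user points at ends up on the classpath.
  def sparkJars(): Seq[File] = {
    val home = sys.env.getOrElse("SPARK_HOME", sys.error("SPARK_HOME is not set"))
    // 1.x binary distributions keep the assembly under lib/; adjust as needed.
    val libDir = new File(home, "lib")
    Option(libDir.listFiles()).getOrElse(Array.empty[File])
      .filter(_.getName.endsWith(".jar"))
      .toSeq
  }

  def main(args: Array[String]): Unit =
    sparkJars().foreach(jar => println(jar.getAbsolutePath))
}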

So no, no specially tagged builds are required.

Ultimately, if in doubt, the least expensive way to check whether Mahout is still compatible with version N is simply to compile and run the tests against that version. That should hopefully catch incompatibilities.
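
For instance, even a throwaway smoke test along these lines -- hypothetical, not part of Mahout's actual test suite -- compiled and run with the target Spark version's jars on the classpath will surface binary breakage (e.g. a missing method) as an immediate compile failure or NoSuchMethodError:

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical smoke test: run it against Spark version N on the classpath;
// if it compiles and completes, the basic RDD API surface used here is intact.
object SparkCompatSmokeTest {

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("compat-smoke").setMaster("local[2]"))
    try {
      val rdd = sc.parallelize(1 to 100).cache()
      // Exercise a couple of routine calls, including the unpersist()
      // mentioned above as the current compatibility floor.
      assert(rdd.map(_ + 1).reduce(_ + _) == (2 to 101).sum)
      rdd.unpersist()
      println("basic Spark API calls OK")
    } finally {
      sc.stop()
    }
  }
}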


