Posted to dev@spark.apache.org by Michael Heuer <he...@gmail.com> on 2016/09/16 19:03:39 UTC
Re: Spark 1.x/2.x qualifiers in downstream artifact names

On Wed, Aug 24, 2016 at 12:12 PM, Sean Owen <so...@cloudera.com> wrote:

> If you're just varying versions (or things that can be controlled by a
> profile, which is most everything including dependencies), you don't
> need and probably don't want multiple POM files. Even that wouldn't
> mean you can't use classifiers.
>

It is worse (or better) than that: profiles didn't work for us in
combination with Scala 2.10/2.11, so we modify the POM in place as part of
CI and the release process.



> I have seen it used for HBase, core Hadoop. I am not sure I've seen it
> used for Spark 2 vs 1 but no reason it couldn't be. Frequently
> projects would instead declare that as of some version, Spark 2 is
> required, rather than support both. Or shim over an API difference
> with reflection if that's all there was to it. Spark does both of
> those sorts of things itself to avoid having to publish multiple
> variants at all. (Well, except for Scala 2.10 vs 2.11!)
>

We shim over Hadoop changes where necessary, but the Spark changes between
1.x and 2.x are too extensive to shim.

We have since resolved to deploy separate Spark 1.x and 2.x artifactIds as
described below.  Relevant pull requests:

https://github.com/bigdatagenomics/adam/pull/1123
https://github.com/bigdatagenomics/utils/pull/78
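
For readers who want the naming scheme spelled out, here is a minimal
sketch in Python. It assumes the convention from the original message quoted
below: no qualifier for Spark 1.x builds, a "-spark2" qualifier for Spark 2.x
builds, and the usual Scala binary-version suffix. The function names
(artifact_id, coordinates) are hypothetical and exist only for illustration;
they are not part of ADAM or of any build tool.

```python
# Illustrative only: these helpers are hypothetical, not part of ADAM or Maven.

SUPPORTED_SPARK_MAJORS = (1, 2)
SUPPORTED_SCALA_BINARIES = ("2.10", "2.11")


def artifact_id(module: str, spark_major: int, scala_binary: str) -> str:
    """Build an artifactId such as 'adam-core-spark2_2.11'.

    Spark 1.x artifacts carry no Spark qualifier; Spark 2.x artifacts
    get '-spark2' before the Scala binary-version suffix.
    """
    if spark_major not in SUPPORTED_SPARK_MAJORS:
        raise ValueError(f"unsupported Spark major version: {spark_major}")
    if scala_binary not in SUPPORTED_SCALA_BINARIES:
        raise ValueError(f"unsupported Scala binary version: {scala_binary}")
    spark_qualifier = "-spark2" if spark_major == 2 else ""
    return f"adam-{module}{spark_qualifier}_{scala_binary}"


def coordinates(module: str, spark_major: int, scala_binary: str,
                group_id: str = "org.bdgenomics.adam") -> str:
    """Return 'groupId:artifactId' for the given build variant."""
    return f"{group_id}:{artifact_id(module, spark_major, scala_binary)}"


if __name__ == "__main__":
    # List every variant of the three modules mentioned in the thread.
    for module in ("core", "apis", "cli"):
        for spark_major in SUPPORTED_SPARK_MAJORS:
            for scala_binary in SUPPORTED_SCALA_BINARIES:
                print(coordinates(module, spark_major, scala_binary))
```

Putting the Spark qualifier in the artifactId rather than in a Maven
classifier means each variant has its own POM, and therefore its own
dependency list, which is the property we needed.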

Thanks!

   michael



>
> On Wed, Aug 24, 2016 at 6:02 PM, Michael Heuer <he...@gmail.com> wrote:
> > Have you seen any successful applications of this for Spark 1.x/2.x?
> >
> > From the doc "The classifier allows to distinguish artifacts that were
> > built from the same POM but differ in their content."
> >
> > We'd be building from different POMs, since we'd be modifying the Spark
> > dependency version (and presumably any other dependencies that needed the
> > same Spark 1.x/2.x distinction).
> >
> >
> > On Wed, Aug 24, 2016 at 11:49 AM, Sean Owen <so...@cloudera.com> wrote:
> >>
> >> This is also what "classifiers" are for in Maven, to have variations
> >> on one artifact and version. https://maven.apache.org/pom.html
> >>
> >> It has been used to ship code for Hadoop 1 vs 2 APIs.
> >>
> >> In a way it's the same idea as Scala's "_2.xx" naming convention, with
> >> a less unfortunate implementation.
> >>
> >>
> >> On Wed, Aug 24, 2016 at 5:41 PM, Michael Heuer <he...@gmail.com> wrote:
> >> > Hello,
> >> >
> >> > We're a project downstream of Spark and need to provide separate
> >> > artifacts for Spark 1.x and Spark 2.x.  Has any convention been
> >> > established or even proposed for artifact names and/or qualifiers?
> >> >
> >> > We are currently thinking
> >> >
> >> > org.bdgenomics.adam:adam-{core,apis,cli}_2.1[0,1]  for Spark 1.x and
> >> > Scala 2.10 & 2.11
> >> >
> >> >   and
> >> >
> >> > org.bdgenomics.adam:adam-{core,apis,cli}-spark2_2.1[0,1]  for Spark 2.x
> >> > and Scala 2.10 & 2.11
> >> >
> >> > https://github.com/bigdatagenomics/adam/issues/1093
> >> >
> >> >
> >> > Thanks in advance,
> >> >
> >> >    michael
> >
> >
>