Posted to dev@predictionio.apache.org by Pat Ferrel <pa...@actionml.com> on 2016/09/05 16:43:08 UTC

Binary or Source release

This weekend I tracked down all our deps, which required a few scripts to process sbt output. This yielded 166 deps, which implies we need to include 166 licenses and copyright notices in LICENSE.txt. As I read the Apache guidelines, each entry should be the license that goes with the version we include, since the copyright owner or license may have changed in newer versions.

This may be near impossible to maintain by hand if we have frequent dependency upgrades and frequent releases. Donald is looking at automating this, but I’m personally dubious because it requires that all 166 deps have maintained their licenses in artifacts for all versions we might use.
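
As a rough illustration of the kind of script mentioned above, here is a minimal Python sketch that pulls unique dependency coordinates out of build-tool output. It assumes each dependency appears as a `group:artifact:version` token somewhere on a line; that input shape, the `unique_deps` helper, and the sample lines are all hypothetical and are not the scripts actually used.

```python
import re

# Assumed coordinate shape: group:artifact:version (hypothetical input format).
# Each part must start with a letter or digit so tree-drawing characters such
# as "+-" in front of a coordinate are not captured.
PART = r"[A-Za-z0-9][A-Za-z0-9_.\-]*"
COORD = re.compile(rf"({PART}):({PART}):({PART})")


def unique_deps(lines):
    """Return sorted unique (group, artifact, version) tuples found in the lines."""
    found = set()
    for line in lines:
        for match in COORD.finditer(line):
            found.add(match.groups())
    return sorted(found)


# Illustrative sample only; real output will differ.
sample = [
    "[info] org.apache.hadoop:hadoop-common:2.6.0",
    "[info]   +-org.apache.hadoop:hadoop-auth:2.6.0",
    "[info] org.json4s:json4s-native_2.10:3.2.10",
    "[info] org.apache.hadoop:hadoop-common:2.6.0",
]
deps = unique_deps(sample)
for group, artifact, version in deps:
    print(f"{group}:{artifact}:{version}")
print(f"{len(deps)} unique dependencies")
```

Keeping the version in the tuple matters here because the license that applies is the one shipped with the exact version included.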

A source release requires that *only* the source included be reflected in LICENSE.txt. This would be ~0 entries, though I think a couple of things are included.

Several things lead me to favor a source-only release:
1) 166 licenses are needed for binary, ~0 for source. I’d rather we spend time on things that add more value.
2) I have never used the binary release. Any version of a source download and `./make-distribution` works universally.
3) Our install.sh now installs source and builds it for the user. This is good because we can use the same script for unreleased -SNAPSHOT versions sitting in the `develop` branch.
4) Outside of instructions for downloading and installing the binary, which do not yet exist afaik, there would be no obvious way for the user to get the binary.
5) Indirectly, any delay to the release is becoming a serious problem. We haven’t had a well-supported release from the main project in close to a year, and work on new features is being delayed.
6) We can do a source-only release now and be clean of the license issue as far as the IPMC is concerned. We can add binary when we have a better answer to automation. In other words, why hold the release for binary?

Since this decision will affect the project for as long as it is in incubation, I’d like to see what others think. I believe we can release now if we do source-only.

Source only, or source & binary?

Re: Binary or Source release

Posted by Pat Ferrel <pa...@occamsmachete.com>.
Even if we can automate LICENSE.txt creation, I’d still rather not have it as a blocker, since the file is easy to update to reflect source releases and, for the other reasons I listed, I’m not sure of its value.

Removing binary-ready release as a blocker will give us time to automate without the pressure of hurrying for release.

What do others think?



On Sep 5, 2016, at 10:45 AM, Pat Ferrel <pa...@occamsmachete.com> wrote:

Thanks Andy. 

RE “Only need to include one entry with the complete text of a license, everything else can just name the license.” So the copyright notice in the license is not important, only the license type? This is often the only important difference in the license from one dep to another.

Did your automation cover LICENSE.txt creation, or just inclusion in the binary?


On Sep 5, 2016, at 9:59 AM, Andrew Purtell <an...@gmail.com> wrote:

I won't weigh in on the question at hand but I'd like to make a couple of clarifications for what it is worth:

> This yielded 166 deps, so this implies we need to include 166 licenses and copyright notices in LICENSE.txt.

There are some available simplifications:

- Only need to include one entry with the complete text of a license, everything else can just name the license. 

- Where there are multiple artifacts coming from a single project, like Hadoop, only one entry for the project is needed. 
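
The two simplifications above can be sketched in Python as follows. This is only an illustration under assumptions: the `build_license_file` helper, the tuple layout, and the sample data are hypothetical, and it is not the HBase automation. It keeps one line per project (with its license name and copyright notice) and emits the full text of each license once.

```python
def build_license_file(deps, license_texts):
    """Assemble LICENSE content from dependency metadata.

    deps: iterable of (project, artifact, license_name, copyright_notice).
    license_texts: mapping of license_name -> full license text.
    """
    # One entry per project, even if it contributes several artifacts.
    projects = {}
    for project, _artifact, license_name, notice in deps:
        projects.setdefault(project, (license_name, notice))

    lines = ["This product bundles the following third-party projects:", ""]
    for project, (license_name, notice) in projects.items():
        lines.append(f"- {project}: {license_name}. {notice}")

    # Full text of each distinct license appears exactly once.
    used = []
    for license_name, _notice in projects.values():
        if license_name not in used:
            used.append(license_name)
    for name in used:
        lines += ["", "=" * 40, name, "=" * 40, license_texts[name]]
    return "\n".join(lines)


# Illustrative placeholder data, not real license metadata.
deps = [
    ("Apache Hadoop", "hadoop-common", "Apache License 2.0", "Copyright (c) Example Holder A"),
    ("Apache Hadoop", "hadoop-auth", "Apache License 2.0", "Copyright (c) Example Holder A"),
    ("Example Lib", "example-lib", "MIT License", "Copyright (c) Example Holder B"),
    ("Other Lib", "other-lib", "Apache License 2.0", "Copyright (c) Example Holder C"),
]
license_texts = {
    "Apache License 2.0": "<full Apache 2.0 text here>",
    "MIT License": "<full MIT text here>",
}
license_file = build_license_file(deps, license_texts)
print(license_file)
```

With four artifacts from three projects under two licenses, the output has three project lines and two full license texts.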

> Donald is looking at automating this but I’m personally dubious 

As I think I've mentioned before here, we have successfully automated this for HBase (based on automation done by yet other Apache projects), so I hope you'll take my advice and evidence-based assertion that it can be done. Caveat: we use Maven, not SBT, as our build framework.





Re: Binary or Source release

Posted by Donald Szeto <do...@apache.org>.
I agree with doing a source release now while gradually fixing a binary
release.

On Monday, September 5, 2016, Suneel Marthi <sm...@apache.org> wrote:

> It's easy to do what Andy is describing using Maven's assembly plugin in the
> Maven world. I have no experience with sbt so can't speak to how it can be
> done with sbt and would defer that to the experts.
>
> We hit a similar issue with licenses in source and binary on the first Pirk
> release last week. We finally decided to make a source-only first release
> while we are now working on fixing the binary license packaging for the next
> release.
>
>
>
> On Mon, Sep 5, 2016 at 2:05 PM, Andrew Purtell <andrew.purtell@gmail.com>
> wrote:
>
> > It covers LICENSE and NOTICE file generation for both source and binary
> > releases, and inclusion of the resulting files in source archives, binary
> > jars, and binary archives through integration with the maven build and
> > assembly targets.
> >
> > Including the complete text of any given license in LICENSE is important
> > but only needs to be done once. You retain the copyright notice and mention
> > of the license type per dependency. We are just talking about
> > deduplicating, e.g. 100 full texts of the ASLv2 into one.
> >

Re: Binary or Source release

Posted by Suneel Marthi <su...@gmail.com>.
What Andy has outlined below is pretty much the process and steps we have
been following for all the Mahout releases and for the first Pirk release
recently.

Again, both Mahout and Pirk are all Maven, and the relevant plugins and
profiles have been set up and configured to build, package, sign, and push
artifacts to Nexus and then to dist.apache.org.

Worth asking the question again: "Why did Spark migrate to Maven?"

Something to consider post first PIO release.



Re: Binary or Source release

Posted by Andrew Purtell <ap...@apache.org>.
Kam is checking in release candidate artifacts to dev staging SVN by hand
and then promoting them by hand, I believe.

Apache projects cannot "release on GitHub". Please review the release
policy documents we've provided links to in the past.

FWIW, here is the shorthand version of what I do to release HBase release
candidates. This is typical of an Apache project:

   1. Preflight: unit tests, integration tests, check compilation of
   downstream projects, etc.

   2. Build a source tarball assembly using the Maven assembly plugin, and
   save it to a staging directory
   3. Build binaries from source
   3a. Activate Maven target for generating binary tarball, and save it to
   a staging directory
   3b. Activate Maven target for uploading jars to Apache's Nexus. This
   allocates a "staging repository" that testers can use during release
   candidate evaluation.
   4. Sign source and binary artifacts in the staging directory
   5. Generate required MD5 and SHA1 sum files of the source and binary
   artifacts in the staging directory
   6. Commit the source and binary artifacts, signature files, and sum
   files from the staging directory to our dev area on dist.apache.org
   7. Call a vote
   8. Should the vote pass
   8a. 'svn mv' the release candidate files from our staging to release
   areas on dist.apache.org. Mirrors will then pick up the new release
   artifacts.
   8b. Log on to Apache's Nexus and release the staging repository. This
   moves the uploaded artifacts out of staging into general release and they
   will be further propagated to Maven central.
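
Step 5 above (generating the sum files) is independent of the build tool, so it can be sketched in a few lines of Python. This is a minimal illustration, not the HBase release tooling: the `write_checksums` helper, the `.md5`/`.sha1` file naming, the `digest  filename` line layout, and the demo tarball name are all assumptions. Signing (step 4) is left out because it needs a release manager's key.

```python
import hashlib
import tempfile
from pathlib import Path


def write_checksums(artifact, algorithms=("md5", "sha1")):
    """Write one checksum file per algorithm next to the artifact."""
    artifact = Path(artifact)
    written = []
    for algo in algorithms:
        digest = hashlib.new(algo)
        # Read in chunks so large tarballs are not loaded into memory at once.
        with artifact.open("rb") as handle:
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        out = artifact.with_name(artifact.name + "." + algo)
        out.write_text(f"{digest.hexdigest()}  {artifact.name}\n")
        written.append(out)
    return written


# Demo with a stand-in file in a temporary staging directory (hypothetical name).
staging = Path(tempfile.mkdtemp())
tarball = staging / "example-release-src.tar.gz"
tarball.write_bytes(b"hello")
checksum_files = write_checksums(tarball)
print([p.name for p in checksum_files])
```

The resulting files would then be committed alongside the artifacts and signatures as in step 6.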

Only some of this is amenable to further automation.
Because you use SBT I don't know how much of any of this can be done in an
automated fashion, sorry.


On Tue, Sep 6, 2016 at 11:35 AM, Suneel Marthi <su...@gmail.com>
wrote:

> On Tue, Sep 6, 2016 at 1:17 PM, Pat Ferrel <pa...@occamsmachete.com> wrote:
>
> > Ok, if no other objections then we have no blockers for source release.
> >
> > There is a process to get the tar into Apache mirrors, but to release on
> > GitHub all we need to do is merge develop, wait for the Travis tests, and tag
> > master. GitHub can be told to produce a source release tar and host it,
> > and the install.sh can be made to point to it.
> >
>
> Phew!!! Life ain't that easy.
>
> The release first needs to be deployed to staging for PPMC voting and
> validation, followed by IPMC voting and validation.
>
> Only once both of the above votes pass is the release finalized and posted
> to the Apache mirrors, and then the release is tagged in GitHub.
>
> There's a reason for creating all of those staging and deployment sites for
> PIO in https://issues.apache.org/jira/browse/INFRA-12384
>
> In the Maven world most of these tasks can be accomplished by a bunch of
> plugins and profiles: maven-release-plugin, maven-gpg-plugin (for
> producing the MD5 and SHA1 sigs for each artifact), maven-jar-plugin,
> maven-assembly-plugin, etc.
>
> Not sure how it's all done with SBT.
>
> Are folks aware of any Apache project that uses SBT and has had at least one
> release? I believe Gearpump is all SBT; maybe check with Kam Kasravi & Co
> as to what the release process and plugins are.
>
>
>
> > Not sure what the ASF rules are regarding this so maybe the mentors can
> > comment—specifically do we have to use the Apache mirror system?
> >
> >
> > On Sep 5, 2016, at 4:34 PM, Pat Ferrel <pa...@occamsmachete.com> wrote:
> >
> >
> > On Sep 5, 2016, at 1:25 PM, Alex Merritt <em...@apache.org> wrote:
> >
> > Agree we should go source only for this release.
> >
> > On Sep 5, 2016 1:10 PM, "Suneel Marthi" <su...@gmail.com> wrote:
> >
> > > On Mon, Sep 5, 2016 at 2:55 PM, Andrew Purtell <andrew.purtell@gmail.com>
> > > wrote:
> > >
> > >> I also don't have experience with SBT, apologies. I did do some poking
> > >> around on Google and it looks like SBT is well behind Maven in providing
> > >> this type of functionality out of the box or by third party plugin
> > >> (sbt-assembly does some useful and interesting things but is focused
> > >> exclusively on producing über jars). I think that's to be expected given
> > >> the origin story. "Maven is huge and crufty and we want new and simple!"
> > >> "Ok, let's make Simple Build Tool!" Fast forward. No longer simple. Not
> > >> able to do a lot of what Maven can. Years of reinventing the wheel ahead,
> > >> ahoy! Happy to be corrected.
> > >>
> > >
> > > Heh, not to mention that Sbt is just not as flexible as maven in being able
> > > to handle different phases and cycles of build and deployment.
> > >
> > >
> > >> Doing what I've described looks achievable by programming what is needed
> > >> in SBT's DSL. Source only releases for a while maybe? Or work up LICENSE
> > >> and NOTICE files by hand and figure out how to break release builds if
> > >> dependencies change and the metadata hasn't been updated by hand?
> > >>
> > >> I was wondering why Spark went with Maven for their build of reference.
> > >>
> > >> My little rant on SBT aside I am NOT suggesting you replace SBT with
> > >> Maven. That would be in my opinion an unfortunate use of developer
> > >> bandwidth better put to task getting the current software with current
> > >> build system out the door in a first Apache release.
> > >>
> > >>
> > >
> > > +1 and we all seem to agree for a quick source-only first release.
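
The idea raised above of breaking release builds when dependencies change but the hand-maintained metadata has not could look roughly like this in Python. It is only a sketch under assumptions: the helper names, the plain substring match against the LICENSE text, and the sample dependency names are hypothetical, and nothing here is an existing SBT or Maven feature.

```python
def missing_license_entries(dependencies, license_text):
    """Return sorted dependency names that are not mentioned in the LICENSE text.

    Uses a plain substring match, so a short name contained in a longer one
    (for example "hadoop" inside "hadoop-common") would count as present.
    """
    return sorted(dep for dep in set(dependencies) if dep not in license_text)


def check_or_fail(dependencies, license_text):
    """Abort with a non-zero exit status if any dependency lacks an entry."""
    missing = missing_license_entries(dependencies, license_text)
    if missing:
        raise SystemExit("LICENSE is missing entries for: " + ", ".join(missing))


# Illustrative data: "new-lib" was added to the build but not to LICENSE.
resolved = ["hadoop-common", "json4s-native", "new-lib"]
license_text = (
    "hadoop-common (Apache License 2.0)\n"
    "json4s-native (Apache License 2.0)\n"
)
missing = missing_license_entries(resolved, license_text)
print(missing)
```

Calling `check_or_fail(resolved, license_text)` from a release script would stop the build until the LICENSE file is updated.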
> > >
> > >
> > >>>>>> On Sep 5, 2016, at 9:43 AM, Pat Ferrel <pa...@actionml.com> wrote:
> > >>>>>>
> > >>>>>> This weekend I tracked down all out deps, which required a few
> > > scripts
> > >>>> to process sbt output. This yielded 166 deps, so this implies we
> need
> > > to
> > >>>> include 166 licenses and copyright notices in LICENSE.txt. As I read
> > > the
> > >>>> Apache guidelines this should be the license that goes with the
> > > version
> > >> we
> > >>>> include since the copyright owner of license may have changed in
> newer
> > >>>> versions.
> > >>>>>>
> > >>>>>> This may be near impossible to maintain by hand if we have
> frequent
> > >>>> dependency upgrades and frequent releases. Donald is looking at
> > >> automating
> > >>>> this but I’m personally dubious about this because it require all
> 166
> > >> deps
> > >>>> have maintained their licenses in artifacts for all versions we
> might
> > >> use.
> > >>>>>>
> > >>>>>> A source release requires that *only* the source included be
> > > reflected
> > >>>> in LICENSES.txt. This would be ~0, I think a couple things are
> > > included.
> > >>>>>>
> > >>>>>> Several things lead me to favor a source-only release:
> > >>>>>> 1) 166 licenses needed for binary ~0 needed for source—I’d rather
> we
> > >>>> spend time on things that add more value
> > >>>>>> 2) I have never used the binary release. Any version of a source
> > >>>> download and `./make-dirstribution` works universally.
> > >>>>>> 3) our install.sh now installs source and builds it for the user.
> > > This
> > >>>> is good because we can use the same script for unreleased -SNAPSHOT
> > >>>> versions sitting in the `develop` branch.
> > >>>>>> 4) outside of instructions for downloading and installing the
> binary
> > >>>> that do not yet exist afaik, there would be no obvious way for the
> > > user
> > >> to
> > >>>> get the binary.
> > >>>>>> 5) indirectly any delay to release is getting to be a serious
> > > problem.
> > >>>> We haven’t had a well supported release from the main project since
> > >> close
> > >>>> on a year ago and work on new features is being delayed.
> > >>>>>> 6) we can do a source only release now and be clean of the license
> > >>>> issue as far as the IPMC is concerned. We can add binary when we
> have
> > > a
> > >>>> better answer to automation. In other words why hold the release for
> > >> binary?
> > >>>>>>
> > >>>>>> Since this decision will affect the project for as long as it is
> in
> > >>>> incubation. I’d like to see what others think. I believe we can
> > > release
> > >> now
> > >>>> if we do source-only.
> > >>>>>>
> > >>>>>> Source only, or source & binary?
> > >>>>
> > >>
> > >
> >
> >
> >
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: Binary or Source release

Posted by Suneel Marthi <su...@gmail.com>.
On Tue, Sep 6, 2016 at 1:17 PM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> Ok, if no other objections then we have no blockers for source release.
>
> There is a process to get the tar into Apache mirrors, but to release on
> GitHub all we need to do is merge develop, wait for the Travis tests, and
> tag master. GitHub can be told to produce a source release tar and host
> it, and install.sh can be made to point to it.
>

Phew!!! Life ain't that easy.

The release first needs to be deployed to staging for PPMC voting and
validation, followed by IPMC voting and validation.

Only once both of those votes pass is the release finalized and posted to
the Apache mirrors, and then the release is tagged in GitHub.

There's a reason for creating all of those staging and deployment sites for
PIO in https://issues.apache.org/jira/browse/INFRA-12384

In the Maven world most of these tasks can be accomplished by a bunch of
plugins and profiles: maven-release-plugin, maven-gpg-plugin (for
producing the GPG signature of each artifact, which is published alongside
its MD5 and SHA1 checksums), maven-jar-plugin, maven-assembly-plugin, etc.
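
For a build that does not go through Maven, the checksum files can be produced with a few lines of script. A minimal sketch in Python, assuming the release artifact is a single tarball on disk; the function name `write_checksums` is hypothetical, and the detached GPG signature would still have to be created separately with gpg.

```python
import hashlib
from pathlib import Path


def write_checksums(artifact, algorithms=("md5", "sha1")):
    """Write one checksum file per algorithm next to the artifact.

    Returns a dict mapping algorithm name to hex digest.
    """
    artifact = Path(artifact)
    digests = {}
    for name in algorithms:
        h = hashlib.new(name)
        with artifact.open("rb") as f:
            # Read in chunks so a large tarball is not loaded into memory at once.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        digests[name] = h.hexdigest()
        # e.g. release.tar.gz -> release.tar.gz.md5
        out = artifact.with_name(artifact.name + "." + name)
        out.write_text(f"{digests[name]}  {artifact.name}\n")
    return digests
```

The checksum files use the "digest, two spaces, file name" layout so they can be checked with the usual md5sum/sha1sum style tools.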

Not sure how it's all done with SBT.

Are folks aware of any Apache project that uses Sbt and has had at least one
release? I believe Gearpump is all Sbt, so maybe check with Kam Kasravi & Co
about what their release process and plugins are.



> Not sure what the ASF rules are regarding this, so maybe the mentors can
> comment. Specifically, do we have to use the Apache mirror system?
>
>

Re: Binary or Source release

Posted by Pat Ferrel <pa...@occamsmachete.com>.
Ok, if no other objections then we have no blockers for source release.

There is a process to get the tar into Apache mirrors, but to release on GitHub all we need to do is merge develop, wait for the Travis tests, and tag master. GitHub can be told to produce a source release tar and host it, and install.sh can be made to point to it.

Not sure what the ASF rules are regarding this, so maybe the mentors can comment. Specifically, do we have to use the Apache mirror system?




Re: Binary or Source release

Posted by Alex Merritt <em...@apache.org>.
Agree we should go source only for this release.

On Sep 5, 2016 1:10 PM, "Suneel Marthi" <su...@gmail.com> wrote:

> On Mon, Sep 5, 2016 at 2:55 PM, Andrew Purtell <an...@gmail.com>
> wrote:
>
> > I also don't have experience with SBT, apologies. I did do some poking
> > around on Google and it looks like SBT is well behind Maven in providing
> > this type of functionality out of the box or by third party plugin
> > (sbt-assembly does some useful and interesting things but is focused
> > exclusively on producing über jars). I think that's to be expected given
> > the origin story. "Maven is huge and crufty and we want new and simple!"
> > "Ok, let's make Simple Build Tool!" Fast forward. No longer simple. Not
> > able to do a lot of what Maven can. Years of reinventing the wheel ahead,
> > ahoy! Happy to be corrected.
> >
>
> Heh, not to mention that Sbt is just not as flexible as maven in being able
> to handle different phases and cycles of build and deployment.
>
>
> > Doing what I've described looks achievable by programming what is needed
> > in SBT's DSL. Source only releases for a while maybe? Or work up LICENSE
> > and NOTICE files by hand and figure out how to break release builds if
> > dependencies change and the metadata hasn't been updated by hand?
> >
> > I was wondering why Spark went with Maven for their build of reference.
> >
> > My little rant on SBT aside I am NOT suggesting you replace SBT with
> > Maven. That would be in my opinion an unfortunate use of developer
> > bandwidth better put to task getting the current software with current
> > build system out the door in a first Apache release.
> >
>
> +1 and we all seem to agree for a quick source-only first release.
>
>
> > > On Sep 5, 2016, at 11:23 AM, Suneel Marthi <sm...@apache.org> wrote:
> > >
> > > Its easy to do what Andy is describing using maven's assembly plugin in
> > the
> > > maven world. I have no experience with sbt so can't speak to how it can
> > be
> > > done with Sbt and would defer that to the experts.
> > >
> > > We hit a similar issue with licenses in source and binary on the first
> > Pirk
> > > release last week. We finally decided to make a source-only first
> release
> > > while we r now working on fixing the binary license packaging for the
> > next
> > > release.
> > >
> > >
> > >
> > > On Mon, Sep 5, 2016 at 2:05 PM, Andrew Purtell <
> andrew.purtell@gmail.com
> > >
> > > wrote:
> > >
> > >> It covers LICENSE and NOTICE file generation for both source and
> binary
> > >> releases, and inclusion of the resulting files in source archives,
> > binary
> > >> jars, and binary archives through integration with the maven build and
> > >> assembly targets.
> > >>
> > >> Including the complete text of any given license in LICENSE is
> important
> > >> but only needs to be done once. You retain the copyright notice and
> > mention
> > >> of the license type per dependency. We are just talking about
> > >> deduplicating, eg 100 full texts of the ASLv2 into one.
> > >>
> > >>> On Sep 5, 2016, at 10:45 AM, Pat Ferrel <pa...@occamsmachete.com>
> wrote:
> > >>>
> > >>> Thanks Andy.
> > >>>
> > >>> RE “Only need to include one entry with the complete text of a
> license,
> > >> everything else can just name the license.” So the copyright notice in
> > the
> > >> license is not important, only the license type? This is often the
> only
> > >> important difference in the license from one dep to another.
> > >>>
> > >>> It sounds like your automation covered LICENSE.txt creation? or just
> > >> inclusion in the binary?
> > >>>
> > >>>
> > >>>> On Sep 5, 2016, at 9:59 AM, Andrew Purtell <
> andrew.purtell@gmail.com>
> > >>> wrote:
> > >>>
> > >>> I won't weigh in on the question at hand but I'd like to make a
> couple
> > >> of clarifications for what it is worth:
> > >>>
> > >>>> This yielded 166 deps, so this implies we need to include 166
> licenses
> > >> and copyright notices in LICENSE.txt.
> > >>>
> > >>> There are some available simplifications:
> > >>>
> > >>> - Only need to include one entry with the complete text of a license,
> > >> everything else can just name the license.
> > >>>
> > >>> - Where there are multiple artifacts coming from a single project,
> like
> > >> Hadoop, only one entry for the project is needed.
> > >>>
> > >>>> Donald is looking at automating this but I’m personally dubious
> > >>>
> > >>> As I think I've mentioned before here we have successfully automated
> > >> this for HBase (based on automation done by yet other Apache projects)
> > so I
> > >> hope you'll take my advice and evidence based assertion it can be
> done.
> > >> Caveat: we use maven not SBT as build framework.
> > >>>
> > >>>
> > >>>>
> > >>>> On Sep 5, 2016, at 9:43 AM, Pat Ferrel <pa...@actionml.com> wrote:
> > >>>>
> > >>>> This weekend I tracked down all out deps, which required a few
> scripts
> > >> to process sbt output. This yielded 166 deps, so this implies we need
> to
> > >> include 166 licenses and copyright notices in LICENSE.txt. As I read
> the
> > >> Apache guidelines this should be the license that goes with the
> version
> > we
> > >> include since the copyright owner of license may have changed in newer
> > >> versions.
> > >>>>
> > >>>> This may be near impossible to maintain by hand if we have frequent
> > >> dependency upgrades and frequent releases. Donald is looking at
> > automating
> > >> this but I’m personally dubious about this because it require all 166
> > deps
> > >> have maintained their licenses in artifacts for all versions we might
> > use.
> > >>>>
> > >>>> A source release requires that *only* the source included be
> reflected
> > >> in LICENSES.txt. This would be ~0, I think a couple things are
> included.
> > >>>>
> > >>>> Several things lead me to favor a source-only release:
> > >>>> 1) 166 licenses needed for binary ~0 needed for source—I’d rather we
> > >> spend time on things that add more value
> > >>>> 2) I have never used the binary release. Any version of a source
> > >> download and `./make-dirstribution` works universally.
> > >>>> 3) our install.sh now installs source and builds it for the user.
> This
> > >> is good because we can use the same script for unreleased -SNAPSHOT
> > >> versions sitting in the `develop` branch.
> > >>>> 4) outside of instructions for downloading and installing the binary
> > >> that do not yet exist afaik, there would be no obvious way for the
> user
> > to
> > >> get the binary.
> > >>>> 5) indirectly any delay to release is getting to be a serious
> problem.
> > >> We haven’t had a well supported release from the main project since
> > close
> > >> on a year ago and work on new features is being delayed.
> > >>>> 6) we can do a source only release now and be clean of the license
> > >> issue as far as the IPMC is concerned. We can add binary when we have
> a
> > >> better answer to automation. In other words why hold the release for
> > binary?
> > >>>>
> > >>>> Since this decision will affect the project for as long as it is in
> > >> incubation. I’d like to see what others think. I believe we can
> release
> > now
> > >> if we do source-only.
> > >>>>
> > >>>> Source only, or source & binary?
> > >>
> >
>

Re: Binary or Source release

Posted by Suneel Marthi <su...@gmail.com>.
On Mon, Sep 5, 2016 at 2:55 PM, Andrew Purtell <an...@gmail.com>
wrote:

> I also don't have experience with SBT, apologies. I did do some poking
> around on Google and it looks like SBT is well behind Maven in providing
> this type of functionality out of the box or by third party plugin
> (sbt-assembly does some useful and interesting things but is focused
> exclusively on producing über jars). I think that's to be expected given
> the origin story. "Maven is huge and crufty and we want new and simple!"
> "Ok, let's make Simple Build Tool!" Fast forward. No longer simple. Not
> able to do a lot of what Maven can. Years of reinventing the wheel ahead,
> ahoy! Happy to be corrected.
>

Heh, not to mention that Sbt is just not as flexible as Maven when it comes
to handling the different phases and cycles of build and deployment.


> Doing what I've described looks achievable by programming what is needed
> in SBT's DSL. Source only releases for a while maybe? Or work up LICENSE
> and NOTICE files by hand and figure out how to break release builds if
> dependencies change and the metadata hasn't been updated by hand?
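
The second option, breaking the release build when dependencies change without the metadata being updated, does not need much machinery. A rough sketch in Python, assuming the build can dump its resolved dependencies as `group:artifact:version` strings and that the reviewed set is kept in a checked-in list; the function names and the string format are hypothetical, not something sbt provides.

```python
def check_license_manifest(resolved, reviewed):
    """Compare resolved dependencies with the reviewed manifest.

    Both arguments are iterables of "group:artifact:version" strings.
    Returns (missing, stale): dependencies with no reviewed entry, and
    reviewed entries that are no longer part of the build.
    """
    resolved, reviewed = set(resolved), set(reviewed)
    missing = sorted(resolved - reviewed)
    stale = sorted(reviewed - resolved)
    return missing, stale


def require_clean(resolved, reviewed):
    """Raise SystemExit with a readable message if the manifest is out of date."""
    missing, stale = check_license_manifest(resolved, reviewed)
    if missing or stale:
        lines = ["LICENSE/NOTICE metadata is out of date:"]
        lines += [f"  not reviewed: {m}" for m in missing]
        lines += [f"  no longer used: {s}" for s in stale]
        raise SystemExit("\n".join(lines))
```

Keeping the version in each key means a plain version bump also fails the check, which matches the point that the license has to be the one shipped with the version actually included.
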
>
> I was wondering why Spark went with Maven for their build of reference.
>
> My little rant on SBT aside I am NOT suggesting you replace SBT with
> Maven. That would be in my opinion an unfortunate use of developer
> bandwidth better put to task getting the current software with current
> build system out the door in a first Apache release.
>

+1, and we all seem to agree on a quick source-only first release.


> > On Sep 5, 2016, at 11:23 AM, Suneel Marthi <sm...@apache.org> wrote:
> >
> > Its easy to do what Andy is describing using maven's assembly plugin in
> the
> > maven world. I have no experience with sbt so can't speak to how it can
> be
> > done with Sbt and would defer that to the experts.
> >
> > We hit a similar issue with licenses in source and binary on the first
> Pirk
> > release last week. We finally decided to make a source-only first release
> > while we r now working on fixing the binary license packaging for the
> next
> > release.
> >
> >
> >
> > On Mon, Sep 5, 2016 at 2:05 PM, Andrew Purtell <andrew.purtell@gmail.com
> >
> > wrote:
> >
> >> It covers LICENSE and NOTICE file generation for both source and binary
> >> releases, and inclusion of the resulting files in source archives,
> binary
> >> jars, and binary archives through integration with the maven build and
> >> assembly targets.
> >>
> >> Including the complete text of any given license in LICENSE is important
> >> but only needs to be done once. You retain the copyright notice and
> mention
> >> of the license type per dependency. We are just talking about
> >> deduplicating, eg 100 full texts of the ASLv2 into one.
> >>
> >>> On Sep 5, 2016, at 10:45 AM, Pat Ferrel <pa...@occamsmachete.com> wrote:
> >>>
> >>> Thanks Andy.
> >>>
> >>> RE “Only need to include one entry with the complete text of a license,
> >> everything else can just name the license.” So the copyright notice in
> the
> >> license is not important, only the license type? This is often the only
> >> important difference in the license from one dep to another.
> >>>
> >>> It sounds like your automation covered LICENSE.txt creation? or just
> >> inclusion in the binary?
> >>>
> >>>
> >>>> On Sep 5, 2016, at 9:59 AM, Andrew Purtell <an...@gmail.com>
> >>> wrote:
> >>>
> >>> I won't weigh in on the question at hand but I'd like to make a couple
> >> of clarifications for what it is worth:
> >>>
> >>>> This yielded 166 deps, so this implies we need to include 166 licenses
> >> and copyright notices in LICENSE.txt.
> >>>
> >>> There are some available simplifications:
> >>>
> >>> - Only need to include one entry with the complete text of a license,
> >> everything else can just name the license.
> >>>
> >>> - Where there are multiple artifacts coming from a single project, like
> >> Hadoop, only one entry for the project is needed.
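
The two simplifications above can be sketched in a few lines. This is only an illustration in Python, not the HBase automation: the `Dep` record, its field names, and the `summarize` function are hypothetical, and real copyright notices would have to come from the artifacts themselves.

```python
from collections import namedtuple

# Hypothetical record: one row per resolved artifact.
Dep = namedtuple("Dep", "artifact project license copyright")


def summarize(deps):
    """Collapse artifacts into one entry per project and list distinct licenses.

    Returns (entries, licenses): entries is a sorted list of
    (project, license, copyright) tuples, one per project; licenses is the
    sorted list of license names whose full text needs to appear only once.
    """
    entries = {}
    for d in deps:
        # Keep the first artifact seen for each project.
        entries.setdefault(d.project, (d.project, d.license, d.copyright))
    licenses = sorted({lic for _, lic, _ in entries.values()})
    return sorted(entries.values()), licenses
```

The sketch assumes every artifact of a project carries the same license; a project that mixes licenses would need one entry per (project, license) pair instead.
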
> >>>
> >>>> Donald is looking at automating this but I’m personally dubious
> >>>
> >>> As I think I've mentioned before here we have successfully automated
> >> this for HBase (based on automation done by yet other Apache projects)
> so I
> >> hope you'll take my advice and evidence based assertion it can be done.
> >> Caveat: we use maven not SBT as build framework.
> >>>
> >>>
> >>>>
> >>>> On Sep 5, 2016, at 9:43 AM, Pat Ferrel <pa...@actionml.com> wrote:
> >>>>
> >>>> This weekend I tracked down all out deps, which required a few scripts
> >> to process sbt output. This yielded 166 deps, so this implies we need to
> >> include 166 licenses and copyright notices in LICENSE.txt. As I read the
> >> Apache guidelines this should be the license that goes with the version
> we
> >> include since the copyright owner of license may have changed in newer
> >> versions.
> >>>>
> >>>> This may be near impossible to maintain by hand if we have frequent
> >> dependency upgrades and frequent releases. Donald is looking at
> automating
> >> this but I’m personally dubious about this because it require all 166
> deps
> >> have maintained their licenses in artifacts for all versions we might
> use.
> >>>>
> >>>> A source release requires that *only* the source included be reflected
> >> in LICENSES.txt. This would be ~0, I think a couple things are included.
> >>>>
> >>>> Several things lead me to favor a source-only release:
> >>>> 1) 166 licenses needed for binary ~0 needed for source—I’d rather we
> >> spend time on things that add more value
> >>>> 2) I have never used the binary release. Any version of a source
> >> download and `./make-dirstribution` works universally.
> >>>> 3) our install.sh now installs source and builds it for the user. This
> >> is good because we can use the same script for unreleased -SNAPSHOT
> >> versions sitting in the `develop` branch.
> >>>> 4) outside of instructions for downloading and installing the binary
> >> that do not yet exist afaik, there would be no obvious way for the user
> to
> >> get the binary.
> >>>> 5) indirectly any delay to release is getting to be a serious problem.
> >> We haven’t had a well supported release from the main project since
> close
> >> on a year ago and work on new features is being delayed.
> >>>> 6) we can do a source only release now and be clean of the license
> >> issue as far as the IPMC is concerned. We can add binary when we have a
> >> better answer to automation. In other words why hold the release for
> binary?
> >>>>
> >>>> Since this decision will affect the project for as long as it is in
> >> incubation. I’d like to see what others think. I believe we can release
> now
> >> if we do source-only.
> >>>>
> >>>> Source only, or source & binary?
> >>
>

Re: Binary or Source release

Posted by Andrew Purtell <an...@gmail.com>.
I also don't have experience with SBT, apologies. I did some poking around on Google, and it looks like SBT is well behind Maven in providing this type of functionality, either out of the box or through a third-party plugin (sbt-assembly does some useful and interesting things but is focused exclusively on producing über jars). I think that's to be expected given the origin story. "Maven is huge and crufty and we want new and simple!" "Ok, let's make Simple Build Tool!" Fast forward. No longer simple. Not able to do a lot of what Maven can. Years of reinventing the wheel ahead, ahoy! Happy to be corrected.

Doing what I've described looks achievable by programming what is needed in SBT's DSL. Source-only releases for a while, maybe? Or work up the LICENSE and NOTICE files by hand and figure out how to break release builds if dependencies change and the metadata hasn't been updated to match?
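
As a rough illustration of the "break the release build" idea, here is a minimal Python sketch. It assumes the build can emit its resolved dependencies as `group:artifact:version` strings and that a hand-maintained list of the same coordinates sits next to LICENSE; the function names and the input format are hypothetical, not part of SBT or of any existing script.

```python
def check_license_metadata(resolved_deps, documented_deps):
    """Compare resolved dependencies with the hand-maintained list.

    Both arguments are iterables of "group:artifact:version" strings
    (an assumed format). Returns (missing, stale): dependencies with no
    documented entry, and documented entries that are no longer used.
    """
    resolved = set(resolved_deps)
    documented = set(documented_deps)
    missing = sorted(resolved - documented)
    stale = sorted(documented - resolved)
    return missing, stale


def verify_or_fail(resolved_deps, documented_deps):
    """Abort with a non-zero exit status if the metadata is out of date."""
    missing, stale = check_license_metadata(resolved_deps, documented_deps)
    if missing or stale:
        lines = [f"  missing LICENSE entry: {d}" for d in missing]
        lines += [f"  stale LICENSE entry:   {d}" for d in stale]
        raise SystemExit("License metadata out of date:\n" + "\n".join(lines))
```

Because the version is part of each coordinate, a dependency upgrade alone is enough to fail the check, which forces someone to re-examine the license and copyright notice for the version actually shipped.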

I was wondering why Spark went with Maven for their build of reference. 

My little rant on SBT aside, I am NOT suggesting you replace SBT with Maven. That would be, in my opinion, an unfortunate use of developer bandwidth, which is better spent getting the current software, with the current build system, out the door in a first Apache release.

> On Sep 5, 2016, at 11:23 AM, Suneel Marthi <sm...@apache.org> wrote:
> 
> It's easy to do what Andy is describing using Maven's assembly plugin in the
> Maven world. I have no experience with sbt so can't speak to how it can be
> done with sbt and would defer that to the experts.
> 
> We hit a similar issue with licenses in source and binary on the first Pirk
> release last week. We finally decided to make a source-only first release
> while we are now working on fixing the binary license packaging for the next
> release.

Re: Binary or Source release

Posted by Suneel Marthi <sm...@apache.org>.
It's easy to do what Andy is describing using Maven's assembly plugin in the
Maven world. I have no experience with sbt so can't speak to how it can be
done with sbt and would defer that to the experts.

We hit a similar issue with licenses in source and binary on the first Pirk
release last week. We finally decided to make a source-only first release
while we are now working on fixing the binary license packaging for the next
release.



On Mon, Sep 5, 2016 at 2:05 PM, Andrew Purtell <an...@gmail.com>
wrote:

> It covers LICENSE and NOTICE file generation for both source and binary
> releases, and inclusion of the resulting files in source archives, binary
> jars, and binary archives through integration with the maven build and
> assembly targets.
>
> Including the complete text of any given license in LICENSE is important
> but only needs to be done once. You retain the copyright notice and mention
> of the license type per dependency. We are just talking about
> deduplicating, eg 100 full texts of the ASLv2 into one.

Re: Binary or Source release

Posted by Andrew Purtell <an...@gmail.com>.
It covers LICENSE and NOTICE file generation for both source and binary releases, and inclusion of the resulting files in source archives, binary jars, and binary archives through integration with the Maven build and assembly targets.

Including the complete text of any given license in LICENSE is important, but it only needs to be done once. You retain the copyright notice and a mention of the license type per dependency. We are just talking about deduplicating, e.g. collapsing 100 full texts of the ASLv2 into one.
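
To make the deduplication concrete, here is a minimal Python sketch. It is not the HBase automation, only an illustration under assumed inputs: each dependency is a dict with hypothetical `name`, `copyright`, and `license` keys, and `license_texts` maps a license name to its full text.

```python
from collections import defaultdict


def render_license(deps, license_texts):
    """Build LICENSE content with one full license text per license type.

    deps: list of dicts with 'name', 'copyright', 'license' keys
          (a hypothetical structure chosen for this sketch).
    license_texts: dict mapping license name -> full license text.
    """
    by_license = defaultdict(list)
    for dep in deps:
        by_license[dep["license"]].append(dep)

    sections = []
    for license_name in sorted(by_license):
        entries = sorted(by_license[license_name], key=lambda d: d["name"])
        lines = [f"Components under the {license_name}:", ""]
        # Each dependency keeps its own copyright notice.
        for dep in entries:
            lines.append(f"  {dep['name']}: {dep['copyright']}")
        # The full license text is appended once per license type.
        lines += ["", license_texts[license_name]]
        sections.append("\n".join(lines))
    return "\n\n".join(sections)
```

With this shape, a hundred ASLv2 dependencies contribute a hundred copyright lines but only one copy of the license text.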

> On Sep 5, 2016, at 10:45 AM, Pat Ferrel <pa...@occamsmachete.com> wrote:
> 
> Thanks Andy. 
> 
> RE “Only need to include one entry with the complete text of a license, everything else can just name the license.” So the copyright notice in the license is not important, only the license type? This is often the only important difference in the license from one dep to another.
> 
> It sounds like your automation covered LICENSE.txt creation? or just inclusion in the binary?

Re: Binary or Source release

Posted by Pat Ferrel <pa...@occamsmachete.com>.
Thanks Andy. 

RE “Only need to include one entry with the complete text of a license, everything else can just name the license.” So the copyright notice in the license is not important, only the license type? This is often the only important difference in the license from one dep to another.

It sounds like your automation covered LICENSE.txt creation. Or did it just cover inclusion in the binary?


On Sep 5, 2016, at 9:59 AM, Andrew Purtell <an...@gmail.com> wrote:

I won't weigh in on the question at hand but I'd like to make a couple of clarifications for what it is worth:

> This yielded 166 deps, so this implies we need to include 166 licenses and copyright notices in LICENSE.txt.

There are some available simplifications:

- Only need to include one entry with the complete text of a license, everything else can just name the license. 

- Where there are multiple artifacts coming from a single project, like Hadoop, only one entry for the project is needed. 

> Donald is looking at automating this but I’m personally dubious 

As I think I've mentioned before here we have successfully automated this for HBase (based on automation done by yet other Apache projects) so I hope you'll take my advice and evidence based assertion it can be done. Caveat: we use maven not SBT as build framework. 




Re: Binary or Source release

Posted by Andrew Purtell <an...@gmail.com>.
I won't weigh in on the question at hand but I'd like to make a couple of clarifications for what it is worth:

> This yielded 166 deps, so this implies we need to include 166 licenses and copyright notices in LICENSE.txt.

There are some available simplifications:

- Only need to include one entry with the complete text of a license, everything else can just name the license. 

- Where there are multiple artifacts coming from a single project, like Hadoop, only one entry for the project is needed. 
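
The second simplification can be sketched in a few lines of Python. This assumes dependencies are available as `group:artifact:version` strings and that the group ID is a good enough proxy for "project"; both are assumptions for illustration, and the function name is hypothetical.

```python
from collections import defaultdict


def group_by_project(coordinates):
    """Map each group ID to the sorted artifact names published under it.

    coordinates: iterable of "group:artifact:version" strings (assumed format).
    One LICENSE entry per returned key would cover all of its artifacts.
    """
    projects = defaultdict(set)
    for coord in coordinates:
        # Only the first two fields are needed; the version is ignored here.
        group_id, artifact_id = coord.split(":")[:2]
        projects[group_id].add(artifact_id)
    return {group: sorted(artifacts) for group, artifacts in sorted(projects.items())}
```

How far this shrinks a list of 166 depends on how many artifacts share a project, which is not something this sketch can tell you in advance.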

> Donald is looking at automating this but I’m personally dubious 

As I think I've mentioned here before, we have successfully automated this for HBase (based on automation done by other Apache projects), so I hope you'll accept my evidence-based assertion that it can be done. Caveat: we use Maven, not SBT, as our build framework.

