Posted to legal-discuss@apache.org by Henri Yandell <ba...@apache.org> on 2017/10/02 05:01:27 UTC

MXNet: "Copyright Contributors"

Another podling question.

The MXNet code previously stated:

  Copyright (c) 2015 by Contributors

This is pretty vague and raises the question of what to put in MXNet source
headers going forward. It might even be a tautology; something is
always copyright to its contributors (where contributors is open source
speak for authors).

Per the previous email, it's likely that not all contributors will have
signed an ICLA with Apache. Do we keep that one-liner in every file, or do we
put something in the NOTICE along the lines of:

"MXNet was previously published at https://github.com/dmlc/mxnet and was
Copyright (c) 2015 by Contributors"

Or put that in every source file.

It's a curious question imo. The current Apache source header says:

"Licensed to the Apache Software Foundation (ASF) under one
 or more contributor license agreements.  See the NOTICE file
 distributed with this work for additional information
 regarding copyright ownership."

Where contributor license agreements include ICLAs, CCLAs, Software Grants
and Clause 5 of the Apache License 2.0; at least in so far as contributed
directly to Apache. It also possibly includes ICLAs/CCLAs signed with third
parties who then signed a Software Grant, and arguably Clause 5 of the
Apache License 2.0 to those third parties. MXNet is fun (aka normal for
most open source projects) in that it's unclear who that third party would
have been; presumably the individual who created the dmlc GitHub
organization as there is no legal entity there.

All of which is to show that I've thought about this a bit, with no perfect
answer. We are not going to get an ICLA from every one of the
"Contributors". We could leave that generic and confusing text in each
source file, but I feel this is an allowable case for moving that copyright
statement into the NOTICE per my proposed text of:

"MXNet was previously published at https://github.com/dmlc/mxnet and was
Copyright (c) 2015 by Contributors"

Do folk on the Legal list feel this is sane, or that I'm barking mad? :)

Thanks,

Hen

Re: MXNet: "Copyright Contributors"

Posted by Alex Harui <ah...@adobe.com.INVALID>.
Hi Henri,

Sorry, I was looking at the wrong repo.  Looks like there isn't already a file containing contributors like a Contributing.md.

IMO, because the code is already ALv2, an SGA from every committer is not required because most of the grants in SGA are already in ALv2.  You just need an email acknowledgement that they agree that their lines of code can be moved to the ASF and the header changed.  So ASF Secretary doesn't have to be involved, just dev@mxnet.

Until every line in a file has been permitted, it is best to treat it as ALv2 third-party code and thus not change the header.  But I don't think this has to be perfect.  Also, AIUI, headers and copyright statements are just helpful reminders, not legally required.  And IIRC, a copyright owner can opt not to have their name in the header or NOTICE without giving up copyright ownership.  So, from the peanut gallery, I'd say the process is:


  1.  Email all 500+ committers asking for a reply email giving permission to move the code and change the header to the ASF header and not place any copyright in NOTICE.
  2.  Keep on coding.

Every once in a while, a volunteer can take some time to see if all lines in a file are signed for and change the header to the ASF header.  If you get replies asking to have their copyright in NOTICE, try to talk them out of it; if you can't, put them in NOTICE.  IMO, it shouldn't matter that much if a file could have an ASF header but doesn't.  And I don't think your users are going to care if it is the ASF header or the 3rd-party ALv2 header.

My 2 cents,
-Alex


From: Henri Yandell <ba...@apache.org>
Reply-To: "legal-discuss@apache.org" <le...@apache.org>
Date: Tuesday, October 3, 2017 at 10:47 AM
To: ASF Legal Discuss <le...@apache.org>
Subject: Re: MXNet: "Copyright Contributors"

Sidenotes:

* Noting that the 'move' is from https://github.com/dmlc to https://github.com/apache; OpenHub is Black Duck's rename for Ohloh :)

* Also, copyright assignment agreements are very much in the minority.

* On the Copyright statement; I think it'll be simpler to add 'Copyright Contributors' to every file. I don't see a need to list all 500+ contributors in a CONTRIBUTORS.md, I'm not aware of that being a standard for Apache projects. I also don't see a need to review every file to slowly remove such a meaningless statement. If it can't be removed generally, might as well add it everywhere.

[Note that this email thread is Copyright 2017 Contributors. I hope nobody removes this line when replying otherwise Contributors will be unhappy ;) ]

---

On SGA signing:

We're now up to 573 Author emails, but presumably commits after the repo transferred on 7/28 don't need an SGA/ICLA. That reduces the number to 533 authors and 5757 commits.  27 of the top 100 contributors (by GitHub's list) have signed ICLAs (no SGAs that I know of). 22 of the top 34 have signed, i.e. the ICLAs are clustered at the top of the contribution list. Looking at commit logs, the 34th contributor by count has 30 commits, the contributor in 69th place has 10, the contributor in 113th place has 5, and 296 contributors have 1 commit.

Estimating that half of the 296 1-commit contributors were trivial, that gives ~358 people to ask for ICLAs for.
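
The ~358 figure appears to follow from the counts above once the 27 known ICLA signers are also subtracted. A minimal sketch of that arithmetic, with illustrative variable names; subtracting the signers is an assumption inferred from the numbers rather than something stated outright:

```python
# Counts taken from the paragraphs above.
authors_before_transfer = 533   # authors with commits before the 7/28 transfer
icla_signers = 27               # top-100 contributors who have signed ICLAs
one_commit_authors = 296        # contributors with exactly one commit

# Assumption from the text: half of the one-commit contributions are trivial.
trivial_one_commit = one_commit_authors // 2

# Contributors whose commits still need a triviality check.
to_review = authors_before_transfer - icla_signers

# Contributors left to ask once the trivial ones are dropped.
to_ask = to_review - trivial_one_commit

print(to_review, to_ask)
```

Under these assumptions `to_review` comes out at the 506 used in step 0 and `to_ask` at the ~358 quoted here.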

Note that most of those individuals' emails in the git logs are GitHub noreply emails. So at least 300 people will need to be contacted via either a) writing a script to scrape their email address, or b) opening an issue on GitHub and tagging them. About half (209 of 437) of Apache public GitHub members list an email address so perhaps we'd get lucky and get 150 of those people's email addresses. So:

0) Do a triviality check on the commits of 506 contributors. This can fail fast as a contributor needs only one contribution that's non-trivial to move to step 1.

1) Campaign #1: Email 150 people. "Hi Contributor, We noticed you have contributed at least one commit to the MXNet project at github.com/dmlc/mxnet. The project has moved to the Apache Software Foundation. Please could you give us your permission to transfer this project to Apache by printing up this SGA form <link> and sending it to secretary@. Have a great day, Apache".
1.1) Sub-campaign: Identify the email bounces. Attempt to figure out the GitHub login for these emails via the GitHub API and add them to Campaign #2.

2) Campaign #2: Open an Issue on GitHub. Paste the above email with 150+ GitHub @logins so they (hopefully) get an email from GitHub.

For both #1 and #2, check with secretary@ to see who has signed what. Rinse and repeat for 6 months.

3) After 6 months, identify every line of code from a non-trivial contribution that is not covered by an ICLA or SGA. Rewrite the code via a clean room (perhaps by appealing to an employer to pay a contractor, or perhaps by appealing to committers from other Apache projects).

4) Party. We now have permission to have moved the 'community' from one account on GitHub to another account on GitHub. We can sit back and bask in the value of the year we just spent getting that permission.
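
The split between Campaign #1 and Campaign #2 above could be sketched as follows. This is a minimal sketch, assuming the author emails have already been pulled from the git log (for example with `git log --format=%ae`); the function name and the sample addresses are hypothetical. It relies on GitHub's noreply address format, where the login is the local part, optionally prefixed by a numeric ID and a plus sign.

```python
import re

# GitHub noreply addresses look like "login@users.noreply.github.com" or
# "12345+login@users.noreply.github.com".
NOREPLY = re.compile(
    r"^(?:\d+\+)?([^@+]+)@users\.noreply\.github\.com$", re.IGNORECASE
)

def split_campaigns(author_emails):
    """Return (emails for Campaign #1, GitHub logins for Campaign #2)."""
    direct, logins = set(), set()
    for email in author_emails:
        email = email.strip()
        match = NOREPLY.match(email)
        if match:
            logins.add(match.group(1))   # contact by @mention in an issue
        else:
            direct.add(email)            # contact by email
    return sorted(direct), sorted(logins)

# Hypothetical sample addresses, for illustration only.
emails = [
    "alice@example.com",
    "bob@users.noreply.github.com",
    "12345+carol@users.noreply.github.com",
]
direct, logins = split_campaigns(emails)
print(direct)
print(logins)
```

Bounced addresses from Campaign #1 would still have to be mapped to logins separately, as step 1.1 describes.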

This is a full relicensing campaign. They are horribly intensive, even when the contributor only has to reply with "Yes" instead of having to sign a legal document.

Is this really legal-discuss@'s advice?

Hen

On Tue, Oct 3, 2017 at 8:42 AM, Alex Harui <ah...@adobe.com.invalid> wrote:
Hi Henri,

Sometimes those agreements assign copyright and avoid the issue of reassigning an existing contributors agreement.

AIUI, the ASF is more interested in communities than code, so if source code is already under ALv2, the SGA or some agreement is showing that each copyright holder agrees to move the "Home" of that community to ASF which allows us to put the ASF header on those files.  If there is some other benefit or legal aspect to signing the SGA I'm definitely interested in what that is.  For MXNet, I think that is an agreement to stop making changes on OpenHub and start making changes in ASF repos.

For every contributor you can't track down, there is a risk that they may object "I don't want to move to the ASF, I want to stay at OpenHub", but since the code is already ALv2, they really can't stop MXNet from using their code in an ASF repo.  It occurs to me, though, if you want to be super picky, that you can't use Git Blame to determine the set of contributors.  If you commit:

  for (i = 0; i < n; i++)

And I later change that to:

  for (i = 0; i < n -1; i++)

I believe you still own some of that line of code but Git Blame will only show me.  So I think you have to ask every committer who ever committed.  Even if you prove their code was eventually fully replaced or discarded, you don't really know if their code influenced other code that wasn't.

So, if you agree with my line of thinking, then some of these MXNet files are going to fall into the typical scenarios where some committer wants to inject ALv2 code found outside the ASF into a file in an ASF repo or an ASF committer wants to patch some non-ASF ALv2 code and bundle it in an ASF release.  Some files coming from OpenHub may have every line signed for and thus can have the header replaced, but if not every line is signed for, then you have mixed code and you get to make a judgement call about the header.  If it is mostly signed for, it is the first scenario, if it is not mostly signed for, it is the second scenario.

As to the actual words in the header or NOTICE, IMO, if a file is completely signed for and thus you can change the header, I would suggest moving the copyright to NOTICE as "Copyright Contributors (see Contributors.md)"  And try to make sure that Contributors.md is reasonably up to date.

My 2 cents,
-Alex


From: <hy...@gmail.com> on behalf of Henri Yandell <he...@yandell.org>
Reply-To: "legal-discuss@apache.org" <le...@apache.org>
Date: Tuesday, October 3, 2017 at 1:22 AM
To: ASF Legal Discuss <le...@apache.org>

Subject: Re: MXNet: "Copyright Contributors"

Alex: Ignoring the question of whether such an agreement could be reassigned to Apache; no there wasn't :)

Craig: I would agree with you on the copyright statement if it included authors' names. As it is, I'm tempted to have every file contain Copyright Contributors as it's as true for Apache as it is for pre-Apache; it's a nonsensical statement.  (Leaving the SGA/CLA discussion for the other thread).

Hen


On Mon, Oct 2, 2017 at 10:39 PM, Alex Harui <ah...@adobe.com.invalid> wrote:
Was there any sort of contributors agreement signed by folks before committing code to wherever the code lived before Apache?

-Alex

From: Craig Russell <ap...@gmail.com>
Reply-To: "legal-discuss@apache.org" <le...@apache.org>
Date: Monday, October 2, 2017 at 3:34 PM
To: "legal-discuss@apache.org" <le...@apache.org>
Subject: Re: MXNet: "Copyright Contributors"

Hi Henri,

This is indeed a tricky situation.

One of Apache's main operating principles is that downstream consumers can rest assured that Apache has done due diligence to establish provenance for the code base. In this case, establishing provenance for each line of code appears to be difficult.

But we have to try. It seems that the podling committers need to find out the long list of contributors and what they contributed. The git history should be sufficient for that, including all pull requests.

If we have "trivial" contributions that would not affect the viability of the project if the contributor subsequently decided that they wanted to withdraw their consent to use them, I'm ok with including those contributions.

But we should have clear IP grant (via either SGA or ICLA) from everyone who contributed major functionality to the project before it arrived here.

Finally, if we have significant code whose authorship is questionable, we should not change the header from "Copyright (c) 2015 by Contributors" to the Apache header, since we have no legal basis for the Apache header. If the code is under the Apache license, we can distribute that code under the terms of the license including NOTICE requirements. This is the least desirable path.

Craig

Craig L Russell
Secretary, Apache Software Foundation
clr@apache.org http://db.apache.org/jdo




Re: MXNet: "Copyright Contributors"

Posted by Henri Yandell <ba...@apache.org>.
Sidenotes:

* Noting that the 'move' is from https://github.com/dmlc to
https://github.com/apache; OpenHub is Black Duck's rename for Ohloh :)

* Also, copyright assignment agreements are very much in the minority.

* On the Copyright statement; I think it'll be simpler to add 'Copyright
Contributors' to every file. I don't see a need to list all 500+
contributors in a CONTRIBUTORS.md, I'm not aware of that being a standard
for Apache projects. I also don't see a need to review every file to slowly
remove such a meaningless statement. If it can't be removed generally,
might as well add it everywhere.

[Note that this email thread is Copyright 2017 Contributors. I hope nobody
removes this line when replying otherwise Contributors will be unhappy ;) ]

---

On SGA signing:

We're now up to 573 Author emails, but presumably commits after the repo
transferred on 7/28 don't need an SGA/ICLA. That reduces the number to 533
authors and 5757 commits.  27 of the top 100 contributors (by GitHub's
list) have signed ICLAs (no SGAs that I know of). 22 of the top 34 have
signed, i.e. the ICLAs are clustered at the top of the contribution list.
Looking at commit logs, the 34th contributor by count has 30 commits, the
contributor in 69th place has 10, the contributor in 113th place has 5, and
296 contributors have 1 commit.

Estimating that half of the 296 1-commit contributors were trivial, that
gives ~358 people to ask for ICLAs for.

Note that most of those individuals' emails in the git logs are GitHub
noreply emails. So at least 300 people will need to be contacted via either
a) writing a script to scrape their email address, or b) opening an issue
on GitHub and tagging them. About half (209 of 437) of Apache public GitHub
members list an email address so perhaps we'd get lucky and get 150 of
those people's email addresses. So:

0) Do a triviality check on the commits of 506 contributors. This can fail
fast as a contributor needs only one contribution that's non-trivial to
move to step 1.

1) Campaign #1: Email 150 people. "Hi Contributor, We noticed you have
contributed at least one commit to the MXNet project at
github.com/dmlc/mxnet. The project has moved to the Apache Software
Foundation. Please could you give us your permission to transfer this
project to Apache by printing up this SGA form <link> and sending it to
secretary@. Have a great day, Apache".
1.1) Sub-campaign: Identify the email bounces. Attempt to figure out the
GitHub login for these emails via the GitHub API and add them to Campaign
#2.

2) Campaign #2: Open an Issue on GitHub. Paste the above email with 150+
GitHub @logins so they (hopefully) get an email from GitHub.

For both #1 and #2, check with secretary@ to see who has signed what. Rinse
and repeat for 6 months.

3) After 6 months, identify every line of code from a non-trivial
contribution that is not covered by an ICLA or SGA. Rewrite the code via a
clean room (perhaps by appealing to an employer to pay a contractor, or
perhaps by appealing to committers from other Apache projects).

4) Party. We now have permission to have moved the 'community' from one
account on GitHub to another account on GitHub. We can sit back and bask in
the value of the year we just spent getting that permission.
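
Step 0's fail-fast check could be sketched roughly as follows. This is only an illustration: whether a contribution is "trivial" is a judgement call, and the line-count heuristic, the function names and the sample data below are all hypothetical placeholders for that judgement.

```python
def needs_permission(commits, is_trivial):
    """True as soon as one non-trivial commit is found (any() stops early)."""
    return any(not is_trivial(commit) for commit in commits)

def small_change(commit, max_lines=5):
    """Placeholder heuristic: a commit touching few lines counts as trivial."""
    return commit["lines_changed"] <= max_lines

# Hypothetical commit history keyed by contributor.
history = {
    "alice": [{"lines_changed": 1}, {"lines_changed": 240}, {"lines_changed": 2}],
    "bob": [{"lines_changed": 1}],
}

# Contributors who move on to step 1.
to_contact = [
    author for author, commits in history.items()
    if needs_permission(commits, small_change)
]
print(to_contact)
```

A contributor with many commits is usually decided after looking at only one or two of them, which is what keeps step 0 cheap.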

This is a full relicensing campaign. They are horribly intensive, even when
the contributor only has to reply with "Yes" instead of having to sign a
legal document.

Is this really legal-discuss@'s advice?

Hen

On Tue, Oct 3, 2017 at 8:42 AM, Alex Harui <ah...@adobe.com.invalid> wrote:

> Hi Henri,
>
> Sometimes those agreements assign copyright and avoid the issue of
> reassigning an existing contributors agreement.
>
> AIUI, the ASF is more interested in communities than code, so if source
> code is already under ALv2, the SGA or some agreement is showing that each
> copyright holder agrees to move the "Home" of that community to ASF which
> allows us to put the ASF header on those files.  If there is some other
> benefit or legal aspect to signing the SGA I'm definitely interested in
> what that is.  For MXNet, I think that is an agreement to stop making
> changes on OpenHub and start making changes in ASF repos.
>
> For every contributor you can't track down, there is a risk that they may
> object "I don't want to move to the ASF, I want to stay at OpenHub", but
> since the code is already ALv2, they really can't stop MXNet from using
> their code in an ASF repo.  It occurs to me, though, if you want to be
> super picky, that you can't use Git Blame to determine the set of
> contributors.  If you commit:
>
>   for (i = 0; i < n; i++)
>
> And I later change that to:
>
>   for (i = 0; i < n -1; i++)
>
> I believe you still own some of that line of code but Git Blame will only
> show me.  So I think you have to ask every committer who ever committed.
> Even if you prove their code was eventually fully replaced or discarded,
> you don't really know if their code influenced other code that wasn't.
>
> So, if you agree with my line of thinking, then some of these MXNet files are
> going to fall into the typical scenarios where some committer wants to
> inject ALv2 code found outside the ASF into a file in an ASF repo or an ASF
> committer wants to patch some non-ASF ALv2 code and bundle it in an ASF
> release.  Some files coming from OpenHub may have every line signed for and
> thus can have the header replaced, but if not every line is signed for,
> then you have mixed code and you get to make a judgement call about the
> header.  If it is mostly signed for, it is the first scenario, if it is not
> mostly signed for, it is the second scenario.
>
> As to the actual words in the header or NOTICE, IMO, if a file is
> completely signed for and thus you can change the header, I would suggest
> moving the copyright to NOTICE as "Copyright Contributors (see
> Contributors.md)"  And try to make sure that Contributors.md is reasonably
> up to date.
>
> My 2 cents,
> -Alex
>
>
> From: <hy...@gmail.com> on behalf of Henri Yandell <he...@yandell.org>
> Reply-To: "legal-discuss@apache.org" <le...@apache.org>
> Date: Tuesday, October 3, 2017 at 1:22 AM
> To: ASF Legal Discuss <le...@apache.org>
>
> Subject: Re: MXNet: "Copyright Contributors"
>
> Alex: Ignoring the question of whether such an agreement could be
> reassigned to Apache; no there wasn't :)
>
> Craig: I would agree with you on the copyright statement if it included
> authors' names. As it is, I'm tempted to have every file contain Copyright
> Contributors as it's as true for Apache as it is for pre-Apache; it's a
> nonsensical statement.  (Leaving the SGA/CLA discussion for the other
> thread).
>
> Hen
>
>
> On Mon, Oct 2, 2017 at 10:39 PM, Alex Harui <ah...@adobe.com.invalid>
> wrote:
>
>> Was there any sort of contributors agreement signed by folks before
>> committing code to wherever the code lived before Apache?
>>
>> -Alex
>>
>> From: Craig Russell <ap...@gmail.com>
>> Reply-To: "legal-discuss@apache.org" <le...@apache.org>
>> Date: Monday, October 2, 2017 at 3:34 PM
>> To: "legal-discuss@apache.org" <le...@apache.org>
>> Subject: Re: MXNet: "Copyright Contributors"
>>
>> Hi Henri,
>>
>> This is indeed a tricky situation.
>>
>> One of Apache's main operating principles is that downstream consumers
>> can rest assured that Apache has done due diligence to establish provenance
>> for the code base. In this case, establishing provenance for each line of
>> code appears to be difficult.
>>
>> But we have to try. It seems that the podling committers need to find out
>> the long list of contributors and what they contributed. The git history
>> should be sufficient for that, including all pull requests.
>>
>> If we have "trivial" contributions that would not affect the viability of
>> the project if the contributor subsequently decided that they wanted to
>> withdraw their consent to use them, I'm ok with including those
>> contributions.
>>
>> But we should have clear IP grant (via either SGA or ICLA) from everyone
>> who contributed major functionality to the project before it arrived here.
>>
>> Finally, if we have significant code whose authorship is questionable, we
>> should not change the header from "Copyright (c) 2015 by Contributors"
>> to the Apache header, since we have no legal basis for the Apache header.
>> If the code is under the Apache license, we can distribute that code under
>> the terms of the license including NOTICE requirements. This is the least
>> desirable path.
>>
>> Craig
>>
>> Craig L Russell
>> Secretary, Apache Software Foundation
>> clr@apache.org http://db.apache.org/jdo
>>
>>
>

Re: MXNet: "Copyright Contributors"

Posted by Alex Harui <ah...@adobe.com.INVALID>.
Hi Henri,

Sometimes those agreements assign copyright and avoid the issue of reassigning an existing contributors agreement.

AIUI, the ASF is more interested in communities than code, so if source code is already under ALv2, the SGA or some agreement is showing that each copyright holder agrees to move the "Home" of that community to ASF which allows us to put the ASF header on those files.  If there is some other benefit or legal aspect to signing the SGA I'm definitely interested in what that is.  For MXNet, I think that is an agreement to stop making changes on OpenHub and start making changes in ASF repos.

For every contributor you can't track down, there is a risk that they may object "I don't want to move to the ASF, I want to stay at OpenHub", but since the code is already ALv2, they really can't stop MXNet from using their code in an ASF repo.  It occurs to me, though, if you want to be super picky, that you can't use Git Blame to determine the set of contributors.  If you commit:

  for (i = 0; i < n; i++)

And I later change that to:

  for (i = 0; i < n -1; i++)

I believe you still own some of that line of code but Git Blame will only show me.  So I think you have to ask every committer who ever committed.  Even if you prove their code was eventually fully replaced or discarded, you don't really know if their code influenced other code that wasn't.
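
The point about Git Blame can be illustrated with a toy model. This is a minimal sketch, not how git stores history: a line is represented as a list of (author, text) versions, and both helper functions are hypothetical.

```python
# Each entry is (author, text of the line after that author's commit).
line_history = [
    ("you", "for (i = 0; i < n; i++)"),
    ("me", "for (i = 0; i < n -1; i++)"),
]

def blame(history):
    """Blame-style attribution: only the most recent author of the line."""
    return history[-1][0]

def all_authors(history):
    """Everyone who ever committed a version of the line, oldest first."""
    seen = []
    for author, _text in history:
        if author not in seen:
            seen.append(author)
    return seen

print(blame(line_history))
print(all_authors(line_history))
```

Blame-style attribution reports only the last author, while the full history still includes the original one, which is why the set of people to ask has to come from the whole commit log.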

So, if you agree with my line of thinking, then some of these MXNet files are going to fall into the typical scenarios where some committer wants to inject ALv2 code found outside the ASF into a file in an ASF repo or an ASF committer wants to patch some non-ASF ALv2 code and bundle it in an ASF release.  Some files coming from OpenHub may have every line signed for and thus can have the header replaced, but if not every line is signed for, then you have mixed code and you get to make a judgement call about the header.  If it is mostly signed for, it is the first scenario, if it is not mostly signed for, it is the second scenario.
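
As a rough sketch of that per-file decision: the function name and the labels are hypothetical, and the 0.5 cut-off for "mostly" is an arbitrary assumption, since the text above leaves it as a judgement call.

```python
def header_scenario(signed_lines, total_lines, mostly=0.5):
    """Classify a file by how many of its lines are signed for."""
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    if signed_lines == total_lines:
        # Every line is signed for: the header can be replaced.
        return "replace header"
    if signed_lines / total_lines > mostly:
        # Mostly signed for: like ALv2 code injected into an ASF file.
        return "scenario 1"
    # Not mostly signed for: like non-ASF ALv2 code patched by ASF committers.
    return "scenario 2"

print(header_scenario(100, 100))
print(header_scenario(80, 100))
print(header_scenario(20, 100))
```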

As to the actual words in the header or NOTICE, IMO, if a file is completely signed for and thus you can change the header, I would suggest moving the copyright to NOTICE as "Copyright Contributors (see Contributors.md)"  And try to make sure that Contributors.md is reasonably up to date.

My 2 cents,
-Alex


From: <hy...@gmail.com> on behalf of Henri Yandell <he...@yandell.org>
Reply-To: "legal-discuss@apache.org" <le...@apache.org>
Date: Tuesday, October 3, 2017 at 1:22 AM
To: ASF Legal Discuss <le...@apache.org>
Subject: Re: MXNet: "Copyright Contributors"

Alex: Ignoring the question of whether such an agreement could be reassigned to Apache; no there wasn't :)

Craig: I would agree with you on the copyright statement if it included authors' names. As it is, I'm tempted to have every file contain Copyright Contributors as it's as true for Apache as it is for pre-Apache; it's a nonsensical statement.  (Leaving the SGA/CLA discussion for the other thread).

Hen


On Mon, Oct 2, 2017 at 10:39 PM, Alex Harui <ah...@adobe.com.invalid> wrote:
Was there any sort of contributors agreement signed by folks before committing code to wherever the code lived before Apache?

-Alex

From: Craig Russell <ap...@gmail.com>
Reply-To: "legal-discuss@apache.org" <le...@apache.org>
Date: Monday, October 2, 2017 at 3:34 PM
To: "legal-discuss@apache.org" <le...@apache.org>
Subject: Re: MXNet: "Copyright Contributors"

Hi Henri,

This is indeed a tricky situation.

One of Apache's main operating principles is that downstream consumers can rest assured that Apache has done due diligence to establish provenance for the code base. In this case, establishing provenance for each line of code appears to be difficult.

But we have to try. It seems that the podling committers need to find out the long list of contributors and what they contributed. The git history should be sufficient for that, including all pull requests.

If we have "trivial" contributions that would not affect the viability of the project if the contributor subsequently decided that they wanted to withdraw their consent to use them, I'm ok with including those contributions.

But we should have clear IP grant (via either SGA or ICLA) from everyone who contributed major functionality to the project before it arrived here.

Finally, if we have significant code whose authorship is questionable, we should not change the header from "Copyright (c) 2015 by Contributors" to the Apache header, since we have no legal basis for the Apache header. If the code is under the Apache license, we can distribute that code under the terms of the license including NOTICE requirements. This is the least desirable path.

Craig

On Oct 1, 2017, at 10:01 PM, Henri Yandell <ba...@apache.org> wrote:

Another podling question.

The MXNet code previously stated:

  Copyright (c) 2015 by Contributors

This is pretty vague and begs the question of what to put in MXNet source headers going forth. It might even perhaps be a tautology; something is always copyright to its contributors (where contributors is open source speak for authors).

Per the previous email, it's likely that not all contributors will have signed an ICLA with Apache. Do we keep that one liner in every file, do we put something in the NOTICE along the lines of:

"MXNet was previously published at https://github.com/dmlc/mxnet and was Copyright (c) 2015 by Contributors"

Or put that in every source file.

It's a curious question imo. The current Apache source header says:

"Licensed to the Apache Software Foundation (ASF) under one
 or more contributor license agreements.  See the NOTICE file
 distributed with this work for additional information
 regarding copyright ownership."

Where contributor license agreements include ICLAs, CCLAs, Software Grants and Clause 5 of the Apache License 2.0; at least in so far as contributed directly to Apache. It also possibly includes ICLAs/CCLAs signed with third parties who then signed a Software Grant, and arguably Clause 5 of the Apache License 2.0 to those third parties. MXNet is fun (aka normal for most open source projects) in that it's unclear who that third party would have been; presumably the individual who created the dmlc GitHub organization as there is no legal entity there.

All of which is to show that I've thought about this a bit, with no perfect answer. We are not going to get an ICLA from every one of the "Contributors". We could leave that generic and confusing text in each source file, but I feel this is an allowable case for moving that copyright statement into the NOTICE per my proposed text of:

"MXNet was previously published at https://github.com/dmlc/mxnet and was Copyright (c) 2015 by Contributors"

Do folk on the Legal list feel this is sane, or that I'm barking mad? :)

Thanks,

Hen

Craig L Russell
Secretary, Apache Software Foundation
clr@apache.org http://db.apache.org/jdo



Re: MXNet: "Copyright Contributors"

Posted by Henri Yandell <he...@yandell.org>.
Alex: Ignoring the question of whether such an agreement could be
reassigned to Apache; no there wasn't :)

Craig: I would agree with you on the copyright statement if it included
authors' names. As it is, I'm tempted to have every file contain Copyright
Contributors as it's as true for Apache as it is for pre-Apache; it's a
nonsensical statement.  (Leaving the SGA/CLA discussion for the other
thread).

Hen


On Mon, Oct 2, 2017 at 10:39 PM, Alex Harui <ah...@adobe.com.invalid>
wrote:

> Was there any sort of contributors agreement signed by folks before
> committing code to wherever the code lived before Apache?
>
> -Alex
>
> From: Craig Russell <ap...@gmail.com>
> Reply-To: "legal-discuss@apache.org" <le...@apache.org>
> Date: Monday, October 2, 2017 at 3:34 PM
> To: "legal-discuss@apache.org" <le...@apache.org>
> Subject: Re: MXNet: "Copyright Contributors"
>
> Hi Henri,
>
> This is indeed a tricky situation.
>
> One of Apache's main operating principles is that downstream consumers can
> rest assured that Apache has done due diligence to establish provenance for
> the code base. In this case, establishing provenance for each line of code
> appears to be difficult.
>
> But we have to try. It seems that the podling committers need to find out
> the long list of contributors and what they contributed. The git history
> should be sufficient for that, including all pull requests.
>
> If we have "trivial" contributions that would not affect the viability of
> the project if the contributor subsequently decided that they wanted to
> withdraw their consent to use them, I'm ok with including those
> contributions.
>
> But we should have clear IP grant (via either SGA or ICLA) from everyone
> who contributed major functionality to the project before it arrived here.
>
> Finally, if we have significant code whose authorship is questionable, we
> should not change the header from "Copyright (c) 2015 by Contributors"
> to the Apache header, since we have no legal basis for the Apache header.
> If the code is under the Apache license, we can distribute that code under
> the terms of the license including NOTICE requirements. This is the least
> desirable path.
>
> Craig
>
> On Oct 1, 2017, at 10:01 PM, Henri Yandell <ba...@apache.org> wrote:
>
> Another podling question.
>
> The MXNet code previously stated:
>
>   Copyright (c) 2015 by Contributors
>
> This is pretty vague and begs the question of what to put in MXNet source
> headers going forth. It might even perhaps be a tautology; something is
> always copyright to its contributors (where contributors is open source
> speak for authors).
>
> Per the previous email, it's likely that not all contributors will have
> signed an ICLA with Apache. Do we keep that one liner in every file, do we
> put something in the NOTICE along the lines of:
>
> "MXNet was previously published at https://github.com/dmlc/mxnet and was
> Copyright (c) 2015 by Contributors"
>
> Or put that in every source file.
>
> It's a curious question imo. The current Apache source header says:
>
> "Licensed to the Apache Software Foundation (ASF) under one
>  or more contributor license agreements.  See the NOTICE file
>  distributed with this work for additional information
>  regarding copyright ownership."
>
> Where contributor license agreements include ICLAs, CCLAs, Software Grants
> and Clause 5 of the Apache License 2.0; at least in so far as contributed
> directly to Apache. It also possibly includes ICLAs/CCLAs signed with third
> parties who then signed a Software Grant, and arguably Clause 5 of the
> Apache License 2.0 to those third parties. MXNet is fun (aka normal for
> most open source projects) in that it's unclear who that third party would
> have been; presumably the individual who created the dmlc GitHub
> organization as there is no legal entity there.
>
> All of which is to show that I've thought about this a bit, with no
> perfect answer. We are not going to get an ICLA from every one of the
> "Contributors". We could leave that generic and confusing text in each
> source file, but I feel this is an allowable case for moving that copyright
> statement into the NOTICE per my proposed text of:
>
> "MXNet was previously published at https://github.com/dmlc/mxnet and was
> Copyright (c) 2015 by Contributors"
>
> Do folk on the Legal list feel this is sane, or that I'm barking mad? :)
>
> Thanks,
>
> Hen
>
>
> Craig L Russell
> Secretary, Apache Software Foundation
> clr@apache.org http://db.apache.org/jdo
>
>

Re: MXNet: "Copyright Contributors"

Posted by Alex Harui <ah...@adobe.com.INVALID>.
Was there any sort of contributors agreement signed by folks before committing code to wherever the code lived before Apache?

-Alex

From: Craig Russell <ap...@gmail.com>
Reply-To: "legal-discuss@apache.org" <le...@apache.org>
Date: Monday, October 2, 2017 at 3:34 PM
To: "legal-discuss@apache.org" <le...@apache.org>
Subject: Re: MXNet: "Copyright Contributors"

Hi Henri,

This is indeed a tricky situation.

One of Apache's main operating principles is that downstream consumers can rest assured that Apache has done due diligence to establish provenance for the code base. In this case, establishing provenance for each line of code appears to be difficult.

But we have to try. It seems that the podling committers need to find out the long list of contributors and what they contributed. The git history should be sufficient for that, including all pull requests.

If we have "trivial" contributions that would not affect the viability of the project if the contributor subsequently decided that they wanted to withdraw their consent to use them, I'm ok with including those contributions.

But we should have clear IP grant (via either SGA or ICLA) from everyone who contributed major functionality to the project before it arrived here.

Finally, if we have significant code whose authorship is questionable, we should not change the header from "Copyright (c) 2015 by Contributors" to the Apache header, since we have no legal basis for the Apache header. If the code is under the Apache license, we can distribute that code under the terms of the license including NOTICE requirements. This is the least desirable path.

Craig

On Oct 1, 2017, at 10:01 PM, Henri Yandell <ba...@apache.org> wrote:

Another podling question.

The MXNet code previously stated:

  Copyright (c) 2015 by Contributors

This is pretty vague and begs the question of what to put in MXNet source headers going forth. It might even perhaps be a tautology; something is always copyright to its contributors (where contributors is open source speak for authors).

Per the previous email, it's likely that not all contributors will have signed an ICLA with Apache. Do we keep that one liner in every file, do we put something in the NOTICE along the lines of:

"MXNet was previously published at https://github.com/dmlc/mxnet and was Copyright (c) 2015 by Contributors"

Or put that in every source file.

It's a curious question imo. The current Apache source header says:

"Licensed to the Apache Software Foundation (ASF) under one
 or more contributor license agreements.  See the NOTICE file
 distributed with this work for additional information
 regarding copyright ownership."

Where contributor license agreements include ICLAs, CCLAs, Software Grants and Clause 5 of the Apache License 2.0; at least in so far as contributed directly to Apache. It also possibly includes ICLAs/CCLAs signed with third parties who then signed a Software Grant, and arguably Clause 5 of the Apache License 2.0 to those third parties. MXNet is fun (aka normal for most open source projects) in that it's unclear who that third party would have been; presumably the individual who created the dmlc GitHub organization as there is no legal entity there.

All of which is to show that I've thought about this a bit, with no perfect answer. We are not going to get an ICLA from every one of the "Contributors". We could leave that generic and confusing text in each source file, but I feel this is an allowable case for moving that copyright statement into the NOTICE per my proposed text of:

"MXNet was previously published at https://github.com/dmlc/mxnet and was Copyright (c) 2015 by Contributors"

Do folk on the Legal list feel this is sane, or that I'm barking mad? :)

Thanks,

Hen

Craig L Russell
Secretary, Apache Software Foundation
clr@apache.org http://db.apache.org/jdo


Re: MXNet: "Copyright Contributors"

Posted by Craig Russell <ap...@gmail.com>.
Hi Henri,

This is indeed a tricky situation. 

One of Apache's main operating principles is that downstream consumers can rest assured that Apache has done due diligence to establish provenance for the code base. In this case, establishing provenance for each line of code appears to be difficult. 

But we have to try. It seems that the podling committers need to find out the long list of contributors and what they contributed. The git history should be sufficient for that, including all pull requests.
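
A first pass at that list could be pulled from the commit history with something like the Python sketch below. It assumes a local clone and relies on `git shortlog -s -n -e`, which prints a commit count, a tab, and `Name <email>` per author. The function names are made up for illustration. Commit authorship will not capture everything: squashed or rebased pull requests may credit only the person who merged them, so pull request records would still need a separate check.

```python
import re
import subprocess

# One shortlog summary line: leading spaces, commit count, tab, "Name <email>".
SHORTLOG_LINE = re.compile(r"^\s*(\d+)\t(.*?)\s*<([^>]*)>$")


def parse_shortlog(text):
    """Turn `git shortlog -s -n -e` output into (commits, name, email) tuples."""
    contributors = []
    for line in text.splitlines():
        match = SHORTLOG_LINE.match(line)
        if match:
            count, name, email = match.groups()
            contributors.append((int(count), name, email))
    return contributors


def list_contributors(repo_path="."):
    """Run git shortlog over HEAD in repo_path and parse the result."""
    output = subprocess.run(
        ["git", "shortlog", "-s", "-n", "-e", "HEAD"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_shortlog(output)
```

The resulting list can then be compared by hand against the ICLAs and SGAs on file; the same person may appear under several names or email addresses.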

If we have "trivial" contributions that would not affect the viability of the project if the contributor subsequently decided that they wanted to withdraw their consent to use them, I'm ok with including those contributions. 

But we should have clear IP grant (via either SGA or ICLA) from everyone who contributed major functionality to the project before it arrived here.

Finally, if we have significant code whose authorship is questionable, we should not change the header from "Copyright (c) 2015 by Contributors" to the Apache header, since we have no legal basis for the Apache header. If the code is under the Apache license, we can distribute that code under the terms of the license including NOTICE requirements. This is the least desirable path.

Craig

> On Oct 1, 2017, at 10:01 PM, Henri Yandell <ba...@apache.org> wrote:
> 
> Another podling question.
> 
> The MXNet code previously stated:
> 
>   Copyright (c) 2015 by Contributors
> 
> This is pretty vague and begs the question of what to put in MXNet source headers going forth. It might even perhaps be a tautology; something is always copyright to its contributors (where contributors is open source speak for authors). 
> 
> Per the previous email, it's likely that not all contributors will have signed an ICLA with Apache. Do we keep that one liner in every file, do we put something in the NOTICE along the lines of:
> 
> "MXNet was previously published at https://github.com/dmlc/mxnet and was Copyright (c) 2015 by Contributors"
> 
> Or put that in every source file.
> 
> It's a curious question imo. The current Apache source header says:
> 
> "Licensed to the Apache Software Foundation (ASF) under one
>  or more contributor license agreements.  See the NOTICE file
>  distributed with this work for additional information
>  regarding copyright ownership."
> 
> Where contributor license agreements include ICLAs, CCLAs, Software Grants and Clause 5 of the Apache License 2.0; at least in so far as contributed directly to Apache. It also possibly includes ICLAs/CCLAs signed with third parties who then signed a Software Grant, and arguably Clause 5 of the Apache License 2.0 to those third parties. MXNet is fun (aka normal for most open source projects) in that it's unclear who that third party would have been; presumably the individual who created the dmlc GitHub organization as there is no legal entity there.
> 
> All of which is to show that I've thought about this a bit, with no perfect answer. We are not going to get an ICLA from every one of the "Contributors". We could leave that generic and confusing text in each source file, but I feel this is an allowable case for moving that copyright statement into the NOTICE per my proposed text of:
> 
> "MXNet was previously published at https://github.com/dmlc/mxnet and was Copyright (c) 2015 by Contributors"
> 
> Do folk on the Legal list feel this is sane, or that I'm barking mad? :)
> 
> Thanks,
> 
> Hen

Craig L Russell
Secretary, Apache Software Foundation
clr@apache.org http://db.apache.org/jdo