Posted to general@incubator.apache.org by Todd Lipcon <to...@apache.org> on 2016/01/04 23:50:55 UTC

File headers for third party utility code

Hi all,

I'm working on verifying licenses and copyrights, etc, in Apache Kudu
(incubating). There is one area I wanted to confirm the right way to
document in our LICENSE/NOTICE files:

Kudu makes use of a lot of open source utility code borrowed from (or
adapted from) other open source projects. In particular, we've borrowed a
lot of code from Chromium's "base" module[1] which is licensed under a BSD
3-clause license[2]. We also have some code from Google Supersonic[3]
licensed under the Apache License[4]. The majority of this borrowed code is
under a 'gutil' directory in the Kudu tree[5]. We also have some small
amounts of code borrowed from LevelDB under the BSD license[6].

Given that all of the borrowed code is under the Apache or BSD licenses,
the inclusion of the code is completely allowable under the license terms.
The only question is the best way to document the inclusion to best follow
established ASF practices. My understanding is that we should:

1) Maintain the original copyright notices and license headers in the files.
2) In the cases that we've made non-trivial changes to the source, we
should additionally add the ASF copyright notice at the top of the file,
and amend the original copyright statement with the words "Some portions"
as we've done for example in cache.cc[7].
3) In all files (regardless of whether we've made changes), we should add
the Apache license header above any existing license headers, while
maintaining the existing one.
4) In the LICENSE file, we should make note of the included code and its
copyrights as we have done here[8].

I'm aware that there is a notion that Apache projects should ask for
permission to borrow code rather than forking communities. However, this is
all very generic utility code with no standalone project community or
releases. In fact, many of these files can already be found copy-pasted
into many different open source projects beyond just Chromium or LevelDB.
So, I don't think there are any viable alternatives to copy-pasting.

Thanks
-Todd

[1] https://chromium.googlesource.com/chromium/src/base/+/master/
[2] https://chromium.googlesource.com/chromium/src/+/master/LICENSE
[3] https://github.com/google/supersonic
[4] https://github.com/google/supersonic/blob/master/LICENSE
[5] https://github.com/cloudera/kudu/tree/master/src/kudu/gutil
[6] https://github.com/google/leveldb/blob/master/LICENSE
[7] https://github.com/cloudera/kudu/blob/master/src/kudu/util/cache.cc#L15
[8] https://github.com/cloudera/kudu/blob/master/LICENSE.txt#L205

Re: File headers for third party utility code

Posted by Todd Lipcon <to...@cloudera.com>.
On Tue, Jan 5, 2016 at 2:39 AM, Steve Loughran <st...@hortonworks.com>
wrote:

>
> One thing to try here is to contribute all changes back to the original
> author(s), at least as patch submissions, giving them the option to
> incorporate them.
>
>
Sure, I agree that makes sense in the case that you're borrowing someone
else's code and using it in the same way that they do. In our case, we're
not fixing bugs or making improvements, but rather adapting the code to a
new context in ways that wouldn't make sense in the original one.

For most of the third-party software that we bundle, we pull in the original
sources and apply a few local patches[1]. In those cases we've submitted
the patches upstream, and we typically drop each patch when we upgrade to
the next version of the dependency.

-Todd
[1] https://github.com/cloudera/kudu/tree/master/thirdparty/patches

Re: File headers for third party utility code

Posted by Steve Loughran <st...@hortonworks.com>.
One thing to try here is to contribute all changes back to the original author(s), at least as patch submissions, giving them the option to incorporate them.

The usual benefits of single-source/limited-diff OSS codebases apply: it's good community practice, and it stops your project from unintentionally adopting a "let's fork everything" process.

That doesn't mean you can't keep the source in git/svn (for .py libraries, this is what you end up doing anyway), only that your project should avoid taking on all the maintenance.

-Steve

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org


Re: File headers for third party utility code

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Mon, Jan 4, 2016 at 3:02 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
> On Mon, Jan 4, 2016 at 2:50 PM, Todd Lipcon <to...@apache.org> wrote:
>> Hi all,
>>
>> I'm working on verifying licenses and copyrights, etc, in Apache Kudu
>> (incubating). There is one area I wanted to confirm the right way to
>> document in our LICENSE/NOTICE files:
>>
>> Kudu makes use of a lot of open source utility code borrowed from (or
>> adapted from) other open source projects. In particular, we've borrowed a
>> lot of code from Chromium's "base" module[1] which is licensed under a BSD
>> 3-clause license[2]. We also have some code from Google Supersonic[3]
>> licensed under the Apache License[4]. The majority of this borrowed code is
>> under a 'gutil' directory in the Kudu tree[5]. We also have some small
>> amounts of code borrowed from LevelDB under the BSD license[6].
>
> I am in exactly the same boat dealing with Copyright statements of code
> borrowed from Postgres.
>
>> Given that all of the borrowed code is under the Apache or BSD licenses,
>> the inclusion of the code is completely allowable under the license terms.
>> The only question is the best way to document the inclusion to best follow
>> established ASF practices. My understanding is that we should:
>>
>> 1) Maintain the original copyright notices and license headers in the files.
>
> Correct. Unless you're a copyright holder or an authorized representative
> of one, you're not allowed to touch existing copyright notices in files.

+1

>> 2) In the cases that we've made non-trivial changes to the source, we
>> should additionally add the ASF copyright notice at the top of the file,
>> and amend the original copyright statement with the words "Some portions"
>> as we've done for example in cache.cc[7].
>
> I don't think you need to do that, but you do need an ASF license header.
>
> I don't think ASF encourages "Portions Copyright ... ASF" statements
> on individual files.

See below...

>> 3) In all files (regardless of whether we've made changes), we should add
>> the Apache license header above any existing license headers, while
>> maintaining the existing one.
>
> Correct and it should also solve #2

Alex got to this link before I did, which answers the question more
authoritatively:

    http://www.apache.org/legal/src-headers.html#3party

As to where to draw the line between major and minor modifications, I don't
recall having seen a conversation about that on either general@incubator or
legal-discuss@apache in the last several years.  (Though maybe it happened and
memory fails.)  I suggest bringing this topic up on legal-discuss@apache.

>> 4) In the LICENSE file, we should make note of the included code and its
>> copyrights as we have done here[8].
>
> I tend to be in the camp that values simplicity of the LICENSE file. IOW,
> this needs to be a succinct communication of what the overall license
> for the source bundle is, and that's ALv2.

Perhaps I'm misinterpreting, but this advice seems to be "The LICENSE file
should contain the ALv2 and nothing more."

The position that only the ALv2 should go into LICENSE is legally defensible,
but it contradicts both ASF Release Policy and other documentation such as the
Licensing How-To[1].

    http://www.apache.org/legal/release-policy#license-file

    When a package bundles code under several licenses, the LICENSE file MUST
    contain details of all these licenses. For each component which is not
    Apache licensed, details of the component MUST be appended to the LICENSE
    file....
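
The "appended to the LICENSE file" requirement can be sketched as a small
Python helper. This is only an illustration under stated assumptions: the
function name, the tuple layout, and the wording of each appended entry are
hypothetical, and the component paths are taken from the links earlier in
this thread.

```python
def build_license(alv2_text, components):
    """Return LICENSE text: the ALv2 followed by one entry per bundled,
    non-Apache-licensed component (hypothetical helper).

    components: iterable of (name, license_id, location) tuples.
    """
    sections = [alv2_text.rstrip()]
    for name, license_id, location in components:
        if license_id == "Apache-2.0":
            # Apache-licensed components are covered by the ALv2 text itself.
            continue
        sections.append(
            f"{location}: bundled from {name}, licensed under {license_id}."
        )
    return "\n\n".join(sections) + "\n"


if __name__ == "__main__":
    print(build_license(
        "Apache License\nVersion 2.0, January 2004",
        [
            ("Chromium base", "BSD-3-Clause", "src/kudu/gutil"),
            ("LevelDB", "BSD-3-Clause", "src/kudu/util/cache.cc"),
            ("Supersonic", "Apache-2.0", "src/kudu/gutil"),
        ],
    ))
```

Whether each entry carries the full third-party license text or a pointer to
it is left open here; the sketch only shows where the entries go.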

I think it's important that the Incubator avoid offering guidance on such
matters which conflicts with policy, even if the position is defensible.
Arguing about this stuff is boring and time-sucking.  The more formulaic we
can make licensing, the less of a drain it will be on our volunteers.

> You do need to update NOTICE files accordingly though.

This is hard to disagree with, but ambiguous -- so please bear with me while I
restate with different emphasis...

There are a handful of things which absolutely need to go in NOTICE.
Anything else needs to be kept out.

Here is a non-exhaustive list of things which people seem quite tempted to put
into NOTICE files but which don't belong there:

*   Details of non-bundled dependencies
*   Details of dependencies which are bundled with a convenience binary but
    not with the official source release
*   Details of bundled MIT-licensed dependencies
*   Details of bundled BSD2-licensed/BSD3-licensed dependencies
*   Details of bundled ALv2-licensed dependencies which do not themselves
    provide a NOTICE file.
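
The list above can be read as a filter. A minimal Python sketch, assuming
each dependency is described by a license identifier and two flags; the
`Dependency` type, its field names, and `notice_action` are hypothetical,
and because the list is non-exhaustive the fallback is "review" rather than
"include":

```python
from dataclasses import dataclass

# Licenses from the list above whose bundled dependencies stay out of NOTICE.
NO_NOTICE_LICENSES = {"MIT", "BSD-2-Clause", "BSD-3-Clause"}


@dataclass
class Dependency:
    name: str
    license_id: str               # e.g. "BSD-3-Clause" or "Apache-2.0"
    bundled_in_source: bool       # shipped inside the official source release
    has_own_notice: bool = False  # upstream provides its own NOTICE file


def notice_action(dep):
    """Return "omit" if the list above excludes dep from NOTICE, else "review"."""
    if not dep.bundled_in_source:
        # Non-bundled, or bundled only with a convenience binary.
        return "omit"
    if dep.license_id in NO_NOTICE_LICENSES:
        return "omit"
    if dep.license_id == "Apache-2.0" and not dep.has_own_notice:
        return "omit"
    # Not covered by the list: a person has to decide.
    return "review"
```

For example, `notice_action(Dependency("some-lib", "BSD-3-Clause", True))`
returns `"omit"`, while a bundled ALv2 dependency that ships its own NOTICE
file returns `"review"`.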

Please folks, keep LICENSE and NOTICE correct but minimal, so that downstream
consumers are spared from having to deal with needlessly complicated
licensing.

Marvin Humphrey

[1] http://www.apache.org/dev/licensing-howto



Re: File headers for third party utility code

Posted by Alex Harui <ah...@adobe.com>.

On 1/4/16, 3:41 PM, "Todd Lipcon" <to...@cloudera.com> wrote:
>
>Right, I guess I'm not sure what qualifies minor vs major. In some cases,
>we've done trivial edits like putting things in a "kudu" namespace or
>removing some portability code. In other cases, we've made more substantial
>alterations to fit our codebase (eg
>https://github.com/cloudera/kudu/commit/0ee3218b9edcd7e5e9d450307bc22d0eadfb53be
>) but still kept the overall API/design. At what point do we go ahead and
>add the Apache License header?

The way I think of it is that every line of code has a home.  The home for
3rd-party code is not in the ASF repo or the source package you downloaded
from Apache, so we want to warn folks that more care is needed when
mucking in a particular file/folder.  When there are lines of code whose
home is the ASF mixed with code whose home is not at the ASF, IMO, that
still needs to be pointed out in LICENSE until the amount of non-ASF code
is reduced to the point where mucking with that code isn't going to matter
to the home community.

The uber ASF license in the LICENSE file grants permission to all lines
whose home is the ASF regardless of what the header looks like.  I would
put annotations in the source code about what code in a mixed file does
have a home at the ASF if I thought it would be helpful.  I would not add
all of that information to the LICENSE.

An attorney for my employer said that these headers are just sign posts
provided as a convenience to the consumer.  The code is owned and has a
home regardless of header.  The header and any other annotations in a
source file are just to save the consumer time in figuring out who owns
what and where the "canonical copy" lives.  The LICENSE file is IMO, also
a sign post.

So, when grabbing a release for use, LICENSE ought to give me a quick idea
of the ingredients (which is why I prefer pointers to 3rd party licenses
vs whole copies of licenses).  Then if I feel the need to go change some
source code, the headers and other annotations in that file give me a
warning that the contents may not all have a home at the ASF.  After that
it is a judgement call as to whether it is worth making the file more
bloated at the line-level to define the boundaries of the mixing or make
it an exercise of the consumer to figure it out.  But at least they've
been warned.

My 2 cents,
-Alex



Re: File headers for third party utility code

Posted by Todd Lipcon <to...@cloudera.com>.
On Mon, Jan 4, 2016 at 3:29 PM, Alex Harui <ah...@adobe.com> wrote:

>
>
> On 1/4/16, 3:02 PM, "shaposhnik@gmail.com on behalf of Roman Shaposhnik"
> <shaposhnik@gmail.com on behalf of roman@shaposhnik.org> wrote:
> >
> >> 2) In the cases that we've made non-trivial changes to the source, we
> >> should additionally add the ASF copyright notice at the top of the file,
> >> and amend the original copyright statement with the words "Some portions"
> >> as we've done for example in cache.cc[7].
> >
> >I don't think you need to do that, but you do need an ASF license header.
> >
> >I don't think ASF encourages "Portions Copyright ... ASF" statements
> >on individual files.
> >
> >> 3) In all files (regardless of whether we've made changes), we should add
> >> the Apache license header above any existing license headers, while
> >> maintaining the existing one.
> >
> >Correct and it should also solve #2
>
> Doesn't #3 from http://www.apache.org/legal/src-headers.html#3party
> contradict this?
>
>
> 0. The term "third-party work" refers to a work not submitted directly to
> the ASF by the copyright owner or owner's agent.
> 1. Do not modify or remove any copyright notices or licenses within
> third-party works.
> 2. Do ensure that every third-party work includes its associated license,
> even if that requires adding a copy of the license from the third-party
> download site into the distribution.
> 3. Do not add the standard Apache License header to the top of third-party
> source files.
> 4. Minor modifications/additions to third-party source files should
> typically be licensed under the same terms as the rest of the
> third-party source for convenience.
> 5. Major modifications/additions to third-party should be dealt with on a
> case-by-case basis by the PMC.
>

Right, I guess I'm not sure what qualifies minor vs major. In some cases,
we've done trivial edits like putting things in a "kudu" namespace or
removing some portability code. In other cases, we've made more substantial
alterations to fit our codebase (eg
https://github.com/cloudera/kudu/commit/0ee3218b9edcd7e5e9d450307bc22d0eadfb53be
) but still kept the overall API/design. At what point do we go ahead and
add the Apache License header?

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera

Re: File headers for third party utility code

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
On Mon, Jan 4, 2016 at 3:29 PM, Alex Harui <ah...@adobe.com> wrote:
>
>
> On 1/4/16, 3:02 PM, "shaposhnik@gmail.com on behalf of Roman Shaposhnik"
> <shaposhnik@gmail.com on behalf of roman@shaposhnik.org> wrote:
>>
>>> 2) In the cases that we've made non-trivial changes to the source, we
>>> should additionally add the ASF copyright notice at the top of the file,
>>> and amend the original copyright statement with the words "Some portions"
>>> as we've done for example in cache.cc[7].
>>
>>I don't think you need to do that, but you do need an ASF license header.
>>
>>I don't think ASF encourages "Portions Copyright ... ASF" statements
>>on individual files.
>>
>>> 3) In all files (regardless of whether we've made changes), we should add
>>> the Apache license header above any existing license headers, while
>>> maintaining the existing one.
>>
>>Correct and it should also solve #2
>
> Doesn't #3 from http://www.apache.org/legal/src-headers.html#3party
> contradict this?

Hm. This is tricky. Now that I've re-read the language of the ASF license
header, I'm not sure anymore. I *think* the language there should allow
you to slap said header on code under a compatible license.

Besides, the alternative is much messier: every time somebody touches
that file he/she needs to decide whether it is time for an ASF header
or not.

I *think* (but I'd love for old-timers to chime in and correct me) that #3-5
were written from a thou-shalt-not-fork-communities perspective.

WDYT?

Thanks,
Roman.



Re: File headers for third party utility code

Posted by Alex Harui <ah...@adobe.com>.

On 1/4/16, 3:02 PM, "shaposhnik@gmail.com on behalf of Roman Shaposhnik"
<shaposhnik@gmail.com on behalf of roman@shaposhnik.org> wrote:
>
>> 2) In the cases that we've made non-trivial changes to the source, we
>> should additionally add the ASF copyright notice at the top of the file,
>> and amend the original copyright statement with the words "Some portions"
>> as we've done for example in cache.cc[7].
>
>I don't think you need to do that, but you do need an ASF license header.
>
>I don't think ASF encourages "Portions Copyright ... ASF" statements
>on individual files.
>
>> 3) In all files (regardless of whether we've made changes), we should add
>> the Apache license header above any existing license headers, while
>> maintaining the existing one.
>
>Correct and it should also solve #2

Doesn't #3 from http://www.apache.org/legal/src-headers.html#3party
contradict this?


0. The term "third-party work" refers to a work not submitted directly to
the ASF by the copyright owner or owner's agent.
1. Do not modify or remove any copyright notices or licenses within
third-party works.
2. Do ensure that every third-party work includes its associated license,
even if that requires adding a copy of the license from the third-party
download site into the distribution.
3. Do not add the standard Apache License header to the top of third-party
source files.
4. Minor modifications/additions to third-party source files should
typically be licensed under the same terms as the rest of the
third-party source for convenience.
5. Major modifications/additions to third-party should be dealt with on a
case-by-case basis by the PMC.
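
Rules 1 and 3 in the list above lend themselves to a mechanical check. A
minimal Python sketch, assuming a third-party file can be recognized by a
dated copyright line near the top and the standard ASF header by its
opening phrase; `classify_header`, the 30-line window, and the return
labels are hypothetical choices, not part of any ASF tooling:

```python
import re

# Opening phrase of the standard ASF source header.
ASF_MARKER = "Licensed to the Apache Software Foundation"
# A copyright notice with a year, e.g. "Copyright (c) 2011 The LevelDB Authors".
# The ASF header's phrase "regarding copyright ownership" carries no year,
# so it does not match.
NOTICE_RE = re.compile(r"copyright\s+(\(c\)\s*)?\d{4}", re.IGNORECASE)


def classify_header(source_text, window=30):
    """Classify a source file by the first `window` lines of its text."""
    head = source_text.splitlines()[:window]
    has_asf = any(ASF_MARKER in line for line in head)
    has_third_party = any(
        NOTICE_RE.search(line) and "Apache Software Foundation" not in line
        for line in head
    )
    if has_asf and has_third_party:
        # Conflicts with rule 3 unless the PMC decided otherwise under rule 5.
        return "mixed"
    if has_third_party:
        return "third-party"  # rule 1: leave the existing notice untouched
    if has_asf:
        return "asf"
    return "none"
```

Files that come back as "mixed" are the ones the question in this thread is
about; the sketch only finds them and does not decide what to do with them.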

Thanks,

-Alex



Re: File headers for third party utility code

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
On Mon, Jan 4, 2016 at 2:50 PM, Todd Lipcon <to...@apache.org> wrote:
> Hi all,
>
> I'm working on verifying licenses and copyrights, etc, in Apache Kudu
> (incubating). There is one area I wanted to confirm the right way to
> document in our LICENSE/NOTICE files:
>
> Kudu makes use of a lot of open source utility code borrowed from (or
> adapted from) other open source projects. In particular, we've borrowed a
> lot of code from Chromium's "base" module[1] which is licensed under a BSD
> 3-clause license[2]. We also have some code from Google Supersonic[3]
> licensed under the Apache License[4]. The majority of this borrowed code is
> under a 'gutil' directory in the Kudu tree[5]. We also have some small
> amounts of code borrowed from LevelDB under the BSD license[6].

I am in exactly the same boat dealing with Copyright statements of code
borrowed from Postgres.

> Given that all of the borrowed code is under the Apache or BSD licenses,
> the inclusion of the code is completely allowable under the license terms.
> The only question is the best way to document the inclusion to best follow
> established ASF practices. My understanding is that we should:
>
> 1) Maintain the original copyright notices and license headers in the files.

Correct. Unless you're a copyright holder or an authorized representative
of one, you're not allowed to touch existing copyright notices in files.

> 2) In the cases that we've made non-trivial changes to the source, we
> should additionally add the ASF copyright notice at the top of the file,
> and amend the original copyright statement with the words "Some portions"
> as we've done for example in cache.cc[7].

I don't think you need to do that, but you do need an ASF license header.

I don't think ASF encourages "Portions Copyright ... ASF" statements
on individual files.

> 3) In all files (regardless of whether we've made changes), we should add
> the Apache license header above any existing license headers, while
> maintaining the existing one.

Correct and it should also solve #2

> 4) In the LICENSE file, we should make note of the included code and its
> copyrights as we have done here[8].

I tend to be in the camp that values simplicity of the LICENSE file. IOW,
this needs to be a succinct communication of what the overall license
for the source bundle is, and that's ALv2.

You do need to update NOTICE files accordingly though.

Thanks,
Roman.
