Posted to legal-discuss@apache.org by Thomas Weise <th...@apache.org> on 2016/08/30 20:51:30 UTC

Use of third party text content in source release

Hi,

We recently ran into potential copyright issues with an Apache Apex source
release. I'm looking for an opinion regarding inclusion of the following:

Content from Tweets (this was used as test data):

https://github.com/apache/apex-malhar/blob/v3.5.0-RC1/demos/highlevelapi/src/test/resources/sampletweets.txt

Content from Project Gutenberg (again, used as test data):

https://github.com/apache/apex-malhar/blob/v3.5.0-RC1/library/src/test/resources/wordcount.txt

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever.  You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.gutenberg.org

Blog feed from DataTorrent (RSS test data):

https://github.com/apache/apex-malhar/blob/v3.5.0-RC1/contrib/src/test/resources/com/datatorrent/contrib/romesyndication/datatorrent_feed.rss

Would appreciate any feedback on whether the data can be used or not and
also any pointers to further information for the community to avoid similar
issues in the future.

Thanks!
Thomas

Re: Use of third party text content in source release

Posted by Justin Mclean <ju...@classsoftware.com>.
Hi,

> What do you think of 1C, Justin?

Again, IANAL, but I think it is likely that it could apply, i.e. remove the Gutenberg license and keep the original text.

> We're in the US, and this file would seem to be in the public domain

It would be distributed outside of the US, but it seems likely it is in the public domain worldwide, whatever that means. The longest copyright terms worldwide that I’m aware of are death of author + 100 years.

Thanks,
Justin
---------------------------------------------------------------------
To unsubscribe, e-mail: legal-discuss-unsubscribe@apache.org
For additional commands, e-mail: legal-discuss-help@apache.org


Re: Use of third party text content in source release

Posted by Henri Yandell <ba...@apache.org>.
Per 1C, it would seem that the Tale of Two Cities file could have Gutenberg
references removed and be included.

Looking at the Gutenberg license, some items that jump out as interesting:

1.E.6 is a requirement to include a copy or link to a copy of the original
(let's say) source when using in a (let's call it) binary. I think that's
on the line for our category A criteria, but one I could see being on the
okay side.

1.E.7 and 1.E.8 add complexity for commercial works. I imagine any use
would want to keep a lot of distance between the Gutenberg licensed work
and the Apache licensed work to keep things clear and simple for our users.
The 1.F.6 indemnification clause might cause users to raise eyebrows too.

What do you think of 1C, Justin? We're in the US, and this file would seem
to be in the public domain, though I'm not sure what the full process is to
confirm that, as the file itself doesn't explicitly make that clear.

Hen







On Thu, Sep 1, 2016 at 1:04 AM, Justin Mclean <ju...@classsoftware.com>
wrote:

> Hi,
>
> IANAL, but it seems to me that the Gutenberg license [1] may place further
> restrictions on the content and thus not be compatible with Apache License
> v2. These include: you are not automatically given the ability to
> redistribute, you can’t make changes to the content, and if it's charged
> for you need to pay royalties. Also, the copyright status may vary from
> country to country outside of the US.
>
> Thanks,
> Justin
>
> 1. http://www.gutenberg.org/wiki/Gutenberg:The_Project_Gutenberg_License
>

Re: Use of third party text content in source release

Posted by Justin Mclean <ju...@classsoftware.com>.
Hi,

IANAL, but it seems to me that the Gutenberg license [1] may place further restrictions on the content and thus not be compatible with Apache License v2. These include: you are not automatically given the ability to redistribute, you can’t make changes to the content, and if it's charged for you need to pay royalties. Also, the copyright status may vary from country to country outside of the US.

Thanks,
Justin

1. http://www.gutenberg.org/wiki/Gutenberg:The_Project_Gutenberg_License

Re: Use of third party text content in source release

Posted by Roman Shaposhnik <rv...@apache.org>.
FWIW: when Bigtop found itself in a similar situation with Movielens we ended
up removing it and going through the extra mechanics of wget'ing before
the build/testrun.
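
The fetch-before-test approach mentioned above could be sketched roughly as follows. This is only an illustration, not what Bigtop actually does: the helper name ensure_test_data, its fetch parameter, and the placeholder URL in the usage note are all assumptions introduced here. The idea is that the data set is never checked into the source release; it is downloaded once, on demand, before the tests run.

```python
import os
import tempfile
import urllib.request


def ensure_test_data(url, dest, fetch=urllib.request.urlretrieve):
    """Download url to dest unless dest already exists; return dest.

    Hypothetical helper: `fetch` is injectable so the download step can be
    replaced (for example in offline builds or in tests).
    """
    if os.path.exists(dest):
        return dest  # already fetched by an earlier run

    target_dir = os.path.dirname(dest) or "."
    os.makedirs(target_dir, exist_ok=True)

    # Download to a temporary name first so an interrupted transfer
    # never leaves a truncated file at dest.
    fd, tmp = tempfile.mkstemp(dir=target_dir)
    os.close(fd)
    try:
        fetch(url, tmp)
        os.replace(tmp, dest)
    finally:
        if os.path.exists(tmp):
            os.remove(tmp)
    return dest
```

A test setup step might then call something like ensure_test_data("https://example.invalid/dataset.txt", "target/test-data/dataset.txt"), where both the URL and the path are placeholders. The trade-off is that the build or test run then needs network access the first time.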

In general, I find tracking licensing of data sets even tougher than code.

Thanks,
Roman.

On Tue, Aug 30, 2016 at 1:51 PM, Thomas Weise <th...@apache.org> wrote:
> Hi,
>
> We recently ran into potential copyright issues with an Apache Apex source
> release. I'm looking for an opinion regarding inclusion of the following:
>
> Content from Tweets (this was used as test data):
>
> https://github.com/apache/apex-malhar/blob/v3.5.0-RC1/demos/highlevelapi/src/test/resources/sampletweets.txt
>
> Content from Project Gutenberg (again, used as test data):
>
> https://github.com/apache/apex-malhar/blob/v3.5.0-RC1/library/src/test/resources/wordcount.txt
>
> This eBook is for the use of anyone anywhere at no cost and with
> almost no restrictions whatsoever.  You may copy it, give it away or
> re-use it under the terms of the Project Gutenberg License included
> with this eBook or online at www.gutenberg.org
>
> Blog feed from DataTorrent (RSS test data):
>
> https://github.com/apache/apex-malhar/blob/v3.5.0-RC1/contrib/src/test/resources/com/datatorrent/contrib/romesyndication/datatorrent_feed.rss
>
> Would appreciate any feedback on whether the data can be used or not and
> also any pointers to further information for the community to avoid similar
> issues in the future.
>
> Thanks!
> Thomas
>
