Posted to legal-discuss@apache.org by Grant Ingersoll <gs...@apache.org> on 2007/04/25 03:01:59 UTC

Wikipedia content, GNU Free Documentation License and Apache

We on Lucene Java are _considering_ mirroring a specific version of
Wikipedia's collection on our zones account, so that people can
download it to run specific tests related to search performance and
quality.  It is important that we use a specific version so that the
tests are repeatable.  Wikimedia doesn't archive versions for long
enough for its links to be reliable, so we cannot just grab the
latest version from Wikipedia.  See [1] for more details if
interested.

The content, according to
http://en.wikipedia.org/wiki/Wikipedia:Text_of_the_GFDL, is licensed
under the GNU Free Documentation License.  More details can be found
at [2].

I guess the question is, is this all right?
http://en.wikipedia.org/wiki/Wikipedia:Copyrights seems to say this
would be OK, but IANAL and all that stuff.  Does anyone have
experience with how the ASF deals with the GFDL in this kind of
situation?  I suppose we could look into hosting off site, too, but
we would rather it be controlled by us.

Thanks,
Grant


[1] http://www.gossamer-threads.com/lists/lucene/java-dev/47760

[2] http://en.wikipedia.org/wiki/Wikipedia:Database_download

---------------------------------------------------------------------
DISCLAIMER: Discussions on this list are informational and educational
only.  Statements made on this list are not privileged, do not
constitute legal advice, and do not necessarily reflect the opinions
and policies of the ASF.  See <http://www.apache.org/licenses/> for
official ASF policies and documents.
---------------------------------------------------------------------
To unsubscribe, e-mail: legal-discuss-unsubscribe@apache.org
For additional commands, e-mail: legal-discuss-help@apache.org


Fwd: Wikipedia content, GNU Free Documentation License and Apache

Posted by Grant Ingersoll <gs...@apache.org>.
FYI from legal-discuss.  I think I will give it a little more time to
percolate and then put it up on zones.  The other option is to host it
off site somewhere, but I would prefer not to do that.

Begin forwarded message:

> From: "Justin Erenkrantz" <ju...@erenkrantz.com>
> Date: April 25, 2007 12:23:26 AM EDT
> To: "Grant Ingersoll" <gs...@apache.org>
> Cc: legal-discuss@apache.org
> Subject: Re: Wikipedia content, GNU Free Documentation License and  
> Apache
>
> On 4/24/07, Grant Ingersoll <gs...@apache.org> wrote:
>> We on Lucene Java are _considering_ mirroring a specific version of
>> Wikipedia's collection for testing and benchmarking purposes on our
>> zones account for people to download to run specific tests related to
>> search performance and quality.  It is important that we use a
>> specific version so that it is repeatable. WikiMedia doesn't archive
>> for long enough for their links to be reliable, therefore we cannot
>> just grab the latest version from Wikipedia.  See [1] for more
>> details if interested.
>>
>> The content, according to
>> http://en.wikipedia.org/wiki/Wikipedia:Text_of_the_GFDL, is licensed
>> under the GNU Free Documentation License.  More details can be found
>> at [2].
>>
>> I guess the question is, is this all right?
>> http://en.wikipedia.org/wiki/Wikipedia:Copyrights seems to say this
>> would be OK, but IANAL and all that stuff.  Anyone have experience
>> with how ASF deals with the GFDL in this kind of situation?  I suppose
>> we could look into hosting off site, too, but would rather it be
>> controlled by us.
>
> As long as you do not distribute the Wikipedia database in a Lucene
> release and just have a copy hosted on your Lucene zone or something
> similar so that committers can get at it easily, I don't see a
> particular problem here.  If you make changes to the documentation or
> whatever, you would just need to follow the rules of GFDL
> (http://www.gnu.org/licenses/fdl.html).
>
> BTW, would you need to commit the database to Subversion?  (I'd
> certainly hope not.)
>
> HTH.  -- justin



---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: Wikipedia content, GNU Free Documentation License and Apache

Posted by Grant Ingersoll <gs...@apache.org>.
It will be an explicit, non-default Ant target as part of a
submodule.  It won't even be part of the main test run.
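
A rough sketch of what such a target might look like, purely for
illustration.  The project name, target names, property names, and the
URL below are hypothetical placeholders, not anything that exists in
Lucene today; the only real pieces are the standard Ant tasks (mkdir,
get, bunzip2).

```xml
<!-- Hypothetical sketch: all names and the URL are placeholders. -->
<project name="benchmark-data" default="compile" basedir=".">

  <!-- Assumed location of the pinned dump; replace with the real mirror. -->
  <property name="wiki.dump.url"
            value="http://example.org/path/to/pinned-wikipedia-dump.xml.bz2"/>
  <property name="wiki.dump.dir" location="${basedir}/temp"/>

  <!-- Default target: never touches the network. -->
  <target name="compile"/>

  <!-- Not the default and not a dependency of any other target,
       so it only runs when invoked by name. -->
  <target name="get-wikipedia"
          description="Download and unpack the pinned Wikipedia dump (large download)">
    <mkdir dir="${wiki.dump.dir}"/>
    <!-- usetimestamp skips the download if the local copy is up to date. -->
    <get src="${wiki.dump.url}"
         dest="${wiki.dump.dir}/wikipedia.xml.bz2"
         usetimestamp="true"/>
    <!-- Unpack only; the content itself is not modified. -->
    <bunzip2 src="${wiki.dump.dir}/wikipedia.xml.bz2"
             dest="${wiki.dump.dir}/wikipedia.xml"/>
  </target>
</project>
```

With a layout like this, a plain "ant" or a test run would not fetch
anything; someone would have to run "ant get-wikipedia" explicitly.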

I can ask on infrastructure if there is a better place to host it.

Thanks,
Grant


On Apr 25, 2007, at 3:21 PM, Roy T. Fielding wrote:

> On Apr 25, 2007, at 8:12 AM, Grant Ingersoll wrote:
>
>> No, we definitely would not put it in SVN, but the ANT task to  
>> download it would be part of the release.  All it would do is  
>> download from the site and unpack it.  We would in no way change  
>> what is in the content other than unpacking it.
>
> Umm, how big is it?  We don't want people downloading big hunks of
> content from a zone just because they do a 'build test' without a
> clear understanding of what is about to happen.
>
> Is there some reason you can't distribute it as an overlay package
> that can be optionally downloaded by developers who intend to do a
> larger test?
>
> ....Roy


Re: Wikipedia content, GNU Free Documentation License and Apache

Posted by "Roy T. Fielding" <fi...@gbiv.com>.
On Apr 25, 2007, at 8:12 AM, Grant Ingersoll wrote:

> No, we definitely would not put it in SVN, but the ANT task to  
> download it would be part of the release.  All it would do is  
> download from the site and unpack it.  We would in no way change  
> what is in the content other than unpacking it.

Umm, how big is it?  We don't want people downloading big hunks of
content from a zone just because they do a 'build test' without a
clear understanding of what is about to happen.

Is there some reason you can't distribute it as an overlay package
that can be optionally downloaded by developers who intend to do a
larger test?

....Roy



Re: Wikipedia content, GNU Free Documentation License and Apache

Posted by Grant Ingersoll <gs...@apache.org>.
No, we definitely would not put it in SVN, but the Ant task to
download it would be part of the release.  All it would do is
download from the site and unpack it.  We would in no way change what
is in the content other than unpacking it.

-Grant

On Apr 25, 2007, at 12:23 AM, Justin Erenkrantz wrote:

> On 4/24/07, Grant Ingersoll <gs...@apache.org> wrote:
>> We on Lucene Java are _considering_ mirroring a specific version of
>> Wikipedia's collection for testing and benchmarking purposes on our
>> zones account for people to download to run specific tests related to
>> search performance and quality.  It is important that we use a
>> specific version so that it is repeatable. WikiMedia doesn't archive
>> for long enough for their links to be reliable, therefore we cannot
>> just grab the latest version from Wikipedia.  See [1] for more
>> details if interested.
>>
>> The content, according to
>> http://en.wikipedia.org/wiki/Wikipedia:Text_of_the_GFDL, is licensed
>> under the GNU Free Documentation License.  More details can be found
>> at [2].
>>
>> I guess the question is, is this all right?
>> http://en.wikipedia.org/wiki/Wikipedia:Copyrights seems to say this
>> would be OK, but IANAL and all that stuff.  Anyone have experience
>> with how ASF deals with the GFDL in this kind of situation?  I suppose
>> we could look into hosting off site, too, but would rather it be
>> controlled by us.
>
> As long as you do not distribute the Wikipedia database in a Lucene
> release and just have a copy hosted on your Lucene zone or something
> similar so that committers can get at it easily, I don't see a
> particular problem here.  If you make changes to the documentation or
> whatever, you would just need to follow the rules of GFDL
> (http://www.gnu.org/licenses/fdl.html).
>
> BTW, would you need to commit the database to Subversion?  (I'd
> certainly hope not.)
>
> HTH.  -- justin





Re: Wikipedia content, GNU Free Documentation License and Apache

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On 4/24/07, Grant Ingersoll <gs...@apache.org> wrote:
> We on Lucene Java are _considering_ mirroring a specific version of
> Wikipedia's collection for testing and benchmarking purposes on our
> zones account for people to download to run specific tests related to
> search performance and quality.  It is important that we use a
> specific version so that it is repeatable. WikiMedia doesn't archive
> for long enough for their links to be reliable, therefore we cannot
> just grab the latest version from Wikipedia.  See [1] for more
> details if interested.
>
> The content, according to
> http://en.wikipedia.org/wiki/Wikipedia:Text_of_the_GFDL, is licensed
> under the GNU Free Documentation License.  More details can be found
> at [2].
>
> I guess the question is, is this all right?
> http://en.wikipedia.org/wiki/Wikipedia:Copyrights seems to say this
> would be OK, but IANAL and all that stuff.  Anyone have experience
> with how ASF deals with the GFDL in this kind of situation?  I suppose
> we could look into hosting off site, too, but would rather it be
> controlled by us.

As long as you do not distribute the Wikipedia database in a Lucene
release and just have a copy hosted on your Lucene zone or something
similar so that committers can get at it easily, I don't see a
particular problem here.  If you make changes to the documentation or
whatever, you would just need to follow the rules of GFDL
(http://www.gnu.org/licenses/fdl.html).

BTW, would you need to commit the database to Subversion?  (I'd
certainly hope not.)

HTH.  -- justin
