Posted to dev@ctakes.apache.org by "Miller, Timothy" <Ti...@childrens.harvard.edu> on 2012/08/01 13:01:01 UTC

licensing question

Hi all,
There was some chatter last week about resources potentially being downloaded via maven for license compatibility reasons.  Just wondering if that brings about the possibility of using external libraries that are not apache-licensed that would also be auto-downloaded under certain maven build commands.  Specifically I was thinking of the GPL-licensed berkeley parser which I've used to get significantly higher accuracy than the opennlp parser we currently wrap in our constituency parser module.
Thanks
Tim Miller

Re: licensing question

Posted by Jörn Kottmann <ko...@gmail.com>.
Hello,

it depends on the texts you train it on; usually it's a gray area. There are
corpora which are very restrictive in this regard and only allow usage for
research, but that conflicts with the Apache License.

As far as I know, copyright laws on the source text do not really apply here,
because the models just contain statistics or bigrams but no original text.

Anyway, if you train on your own text and then release the model under AL 2.0,
it's safe to include it and distribute it.

At OpenNLP we decided not to distribute any models which are trained on
restricted corpora at Apache without discussing it on the legal list first.
But we never spoke to them, and I personally much prefer the idea of producing
open training data which is Apache friendly (e.g. based on Wikinews or
Wikipedia).

HTH,
Jörn

On 10/04/2012 06:39 PM, Chen, Pei wrote:
> Hi Jorn,
> If we trained a model and included it as a resource within the ASF repo, just wanted to confirm if that's acceptable in ASF even though it's in a binary format?
> Were there any issues for openNLP with including trained models?
>
> Thanks,
> Pei
>
>> -----Original Message-----
>> From: Jörn Kottmann [mailto:kottmann@gmail.com]
>> Sent: Wednesday, August 01, 2012 8:01 AM
>> To: ctakes-dev@incubator.apache.org
>> Subject: Re: licensing question
>>
>> On 08/01/2012 01:01 PM, Miller, Timothy wrote:
>>> There was some chatter last week about resources potentially being
>> downloaded via maven for license compatibility reasons.  Just wondering if
>> that brings about the possibility of using external libraries that are not
>> apache-licensed that would also be auto-downloaded under certain maven
>> build commands.  Specifically I was thinking of the GPL-licensed berkeley
>> parser which I've used to get significantly higher accuracy than the opennlp
>> parser we currently wrap in our constituency parser module.
>>
>> Making scripts or maven build commands which download stuff is fine, but it
>> might turn out to be quite limiting for your users who need the freedom of
>> the AL. That will be a problem if Berkeley is the only option.
>>
>> The HBase people for example have an optional dependency on LZO which is
>> GPL, and people there just need to install and download it themselves.
>> See here:
>> http://hbase.apache.org/book/lzo.compression.html
>>
>> Speaking as an OpenNLP committer now, it would of course be nice to make
>> our parser better.
>> If you want to work on that we will be happy to get some patches.
>>
>> Jörn


RE: licensing question

Posted by "Chen, Pei" <Pe...@childrens.harvard.edu>.
Hi Jorn,
If we trained a model and included it as a resource within the ASF repo, just wanted to confirm if that's acceptable in ASF even though it's in a binary format?
Were there any issues for openNLP with including trained models?

Thanks,
Pei

> -----Original Message-----
> From: Jörn Kottmann [mailto:kottmann@gmail.com]
> Sent: Wednesday, August 01, 2012 8:01 AM
> To: ctakes-dev@incubator.apache.org
> Subject: Re: licensing question
> 
> On 08/01/2012 01:01 PM, Miller, Timothy wrote:
> > There was some chatter last week about resources potentially being
> downloaded via maven for license compatibility reasons.  Just wondering if
> that brings about the possibility of using external libraries that are not
> apache-licensed that would also be auto-downloaded under certain maven
> build commands.  Specifically I was thinking of the GPL-licensed berkeley
> parser which I've used to get significantly higher accuracy than the opennlp
> parser we currently wrap in our constituency parser module.
> 
> Making scripts or maven build commands which download stuff is fine, but it
> might turn out to be quite limiting for your users who need the freedom of
> the AL. That will be a problem if Berkeley is the only option.
> 
> The HBase people for example have an optional dependency on LZO which is
> GPL, and people there just need to install and download it themselves.
> See here:
> http://hbase.apache.org/book/lzo.compression.html
> 
> Speaking as an OpenNLP committer now, it would of course be nice to make
> our parser better.
> If you want to work on that we will be happy to get some patches.
> 
> Jörn

Re: licensing question

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hey Steve,

On Aug 1, 2012, at 11:08 AM, Steven Bethard wrote:

> On Aug 1, 2012, at 11:57 AM, Mattmann, Chris A (388J) wrote:
>> +1 to making OpenNLP better and eating the ASF dogfood, great response Jörn.
> 
> Also +1 on implementing the Berkeley parsing model in OpenNLP, but practically speaking, that's a *ton* of work and I don't think anyone is going to do that any time soon. Jörn, please correct me if I'm wrong.
> 
> What would Apache think about setting things up so that by default the OpenNLP parser was used, but making it easy to substitute the Berkeley parser if a downstream user wants to (and can accept the license requirements, and can add the dependency, etc.)?

If the OpenNLP parser is set by default but a user can change it if they want the LGPL one, that's fine I think. Just as long
as the LGPL dep isn't the default, and/or something that a downstream user "really" wants all the time. In that case,
e.g., if 98% of your users just throw out the OpenNLP thing and automatically switch to the Berkeley one, then
we'd probably have to address that.

But sounds good for now.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Re: licensing question

Posted by Steven Bethard <st...@Colorado.EDU>.
On Aug 1, 2012, at 11:57 AM, Mattmann, Chris A (388J) wrote:
> +1 to making OpenNLP better and eating the ASF dogfood, great response Jörn.

Also +1 on implementing the Berkeley parsing model in OpenNLP, but practically speaking, that's a *ton* of work and I don't think anyone is going to do that any time soon. Jörn, please correct me if I'm wrong.

What would Apache think about setting things up so that by default the OpenNLP parser was used, but making it easy to substitute the Berkeley parser if a downstream user wants to (and can accept the license requirements, and can add the dependency, etc.)?
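
A rough sketch of what that default-plus-substitute setup could look like in code follows. It is only an illustration under assumptions: the names ParserSelection, ConstituencyParser and DefaultParser and the "parser.impl" system property are hypothetical, and none of them come from cTAKES, OpenNLP or the Berkeley parser. The idea is that the Apache-licensed default is always available, and an alternative is only loaded by name if the downstream user has put it on the classpath themselves.

```java
/** Hypothetical sketch; these types are not part of cTAKES or OpenNLP. */
public class ParserSelection {

    /** Minimal stand-in for a constituency parser wrapper. */
    public interface ConstituencyParser {
        String name();
    }

    /** Placeholder for the Apache-licensed default that ships with the project. */
    public static final class DefaultParser implements ConstituencyParser {
        @Override
        public String name() {
            return "default";
        }
    }

    /**
     * Returns an instance of the class named by className if it is on the
     * classpath, has a no-argument constructor and implements
     * ConstituencyParser; otherwise falls back to the default parser.
     */
    public static ConstituencyParser select(String className) {
        if (className == null || className.isEmpty()) {
            return new DefaultParser();
        }
        try {
            Class<?> cls = Class.forName(className);
            return (ConstituencyParser) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            // Optional jar not installed, or the class is the wrong type:
            // keep the default rather than failing.
            return new DefaultParser();
        }
    }

    public static void main(String[] args) {
        // The property name "parser.impl" is an assumption made for this sketch.
        ConstituencyParser parser = select(System.getProperty("parser.impl"));
        System.out.println("Using parser: " + parser.name());
    }
}
```

With something like this, the project itself never has to compile against or ship the alternative parser; a user who accepts its license would add the jar and point the property at a wrapper class of their own.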

Steve

> 
> Cheers,
> Chris
> 
> On Aug 1, 2012, at 5:00 AM, Jörn Kottmann wrote:
> 
>> On 08/01/2012 01:01 PM, Miller, Timothy wrote:
>>> There was some chatter last week about resources potentially being downloaded via maven for license compatibility reasons.  Just wondering if that brings about the possibility of using external libraries that are not apache-licensed that would also be auto-downloaded under certain maven build commands.  Specifically I was thinking of the GPL-licensed berkeley parser which I've used to get significantly higher accuracy than the opennlp parser we currently wrap in our constituency parser module.
>> 
>> Making scripts or maven build commands which download stuff is fine, but it might
>> turn out to be quite limiting for your users who need the freedom of the AL. That will be
>> a problem if Berkeley is the only option.
>> 
>> The HBase people for example have an optional dependency on LZO which is GPL,
>> and people there just need to install and download it themselves.
>> See here:
>> http://hbase.apache.org/book/lzo.compression.html
>> 
>> Speaking as an OpenNLP committer now, it would of course be nice to make our parser better.
>> If you want to work on that we will be happy to get some patches.
>> 
>> Jörn
> 


Re: licensing question

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi VJ,

On Aug 1, 2012, at 11:23 AM, vijay garla wrote:

> Hi,
> 
> There are a lot of open source projects that have gpl or lgpl licenses,
> many of which offer functionality for which there is no equivalent project
> with an apache license.

Well, that may be true, but I think it's best to deal with specifics and not
generalizations. If you have a specific list of those projects that you'd
like to use in cTAKES, we should discuss that here.

>  The issue is much broader than just the berkeley
> parser, and reimplementing every non Apache-license library would not be
> feasible in any reasonable time frame.

Re-implementing was one option I suggested. Another was convincing that
community to either relicense, or to consider dual-licensing in an ALv2
compatible way. See:

http://www.apache.org/legal/resolved.html#category-a

>  I'm curious as to how apache
> projects that rely on Java EE - e.g. jetty and tomcat - deal with the
> licensing issue: they must redistribute the Java EE api libraries, which I
> believe are not under the Apache license.

Well, Apache Geronimo is a fully ALv2 licensed implementation of the Java EE
JCP spec. Also Apache itself drove many of the JCP processes and eventual implementations.
JCPs and spec APIs aren't all ALv2 licensed, but as long as they are category-a
or compat with ALv2 per legal resolved, we're fine.

> 
> In particular, I don't know of any java based machine learning toolkits
> that would fit the bill (correct me if I'm wrong, but mahout is designed
> for map reduce).  Libsvm is essential to some of cTAKES' annotators.

Well, the key here is that Apache projects and software developed here
at the foundation cannot redistribute upstream LGPL or non ALv2 and category-A
compatible licenses with our software. 

> 
> Other libraries that cTAKES uses are public domain (LVG, JAMA); I assume
> these don't pose an issue for redistribution.

Well it depends on what public domain means. If their license is compatible with
http://www.apache.org/legal/resolved.html#category-a then we are fine. If not, we
have to discuss options for replacing those dependencies somehow.

> 
> Finally, users will have to download the SNOMED-CT/RXNORM dictionaries from
> an external web site; why not bundle the non-apache libraries for
> redistribution there?

That's possible in some ways, but we can't splinter the community, and from an
Apache cTAKES perspective, we have to discuss a workable solution for the
project within the guidelines of the foundation. That may involve linking to 
external sites, that may involve hosting some things elsewhere that aren't
absolutely required by our users and so on and so forth.

Cheers,
Chris



Re: licensing question

Posted by Steven Bethard <st...@Colorado.EDU>.
On Aug 1, 2012, at 12:30 PM, Savova, Guergana wrote:
> LibSVM is BSD license which is compatible with Apache. The non-compatible licenses are GPL, LGPL (http://www.apache.org/legal/3party.html).
> 
> Mallet is CPL I believe. By ASF policy, CPL falls in the category of reciprocal licenses. I am not quite sure how reciprocal licensed tools work with Apache licensed tools.

Mallet claims to be CPL, but depends on Trove, which is LGPL. So Mallet is iffy.

Steve

> 
> --Guergana
> 
> -----Original Message-----
> From: vijay garla [mailto:vngarla@gmail.com] 
> Sent: Wednesday, August 01, 2012 2:24 PM
> To: ctakes-dev@incubator.apache.org
> Subject: Re: licensing question
> 
> Hi,
> 
> There are a lot of open source projects that have gpl or lgpl licenses, many of which offer functionality for which there is no equivalent project with an apache license.  The issue is much broader than just the berkeley parser, and reimplementing every non Apache-license library would not be feasible in any reasonable time frame.  I'm curious as to how apache projects that rely on Java EE - e.g. jetty and tomcat - deal with the licensing issue: they must redistribute the Java EE api libraries, which I believe are not on the apache license.
> 
> In particular, I don't know of any java based machine learning toolkits that would fit the bill (correct me if I'm wrong, but mahout is designed for map reduce).  Libsvm is essential to some of cTAKES' annotators.
> 
> Other libraries that cTAKES uses are public domain (LVG, JAMA); I assume these don't pose an issue for redistribution.
> 
> Finally, users will have to download the SNOMED-CT/RXNORM dictionaries from an external web site; why not bundle the non-apache libraries for redistribution there?
> 
> -vj
> 
> On Wed, Aug 1, 2012 at 1:57 PM, Mattmann, Chris A (388J) < chris.a.mattmann@jpl.nasa.gov> wrote:
> 
>> +1 to making OpenNLP better and eating the ASF dogfood, great response
>> Jörn.
>> 
>> Cheers,
>> Chris
>> 
>> On Aug 1, 2012, at 5:00 AM, Jörn Kottmann wrote:
>> 
>>> On 08/01/2012 01:01 PM, Miller, Timothy wrote:
>>>> There was some chatter last week about resources potentially being
>> downloaded via maven for license compatibility reasons.  Just 
>> wondering if that brings about the possibility of using external 
>> libraries that are not apache-licensed that would also be 
>> auto-downloaded under certain maven build commands.  Specifically I 
>> was thinking of the GPL-licensed berkeley parser which I've used to 
>> get significantly higher accuracy than the opennlp parser we currently wrap in our constituency parser module.
>>> 
>>> Making scripts or maven build commands which download stuff is fine, 
>>> but
>> it might
>>> turn out to be quite limiting for your users who need the freedom
>>> of
>> the AL. That will be
>>> a problem if Berkeley is the only option.
>>> 
>>> The HBase people for example have an optional dependency on LZO 
>>> which is
>> GPL,
>>> and people there just need to install and download it themselves.
>>> See here:
>>> http://hbase.apache.org/book/lzo.compression.html
>>> 
>>> Speaking as an OpenNLP committer now, it would of course be nice to 
>>> make
>> our parser better.
>>> If you want to work on that we will be happy to get some patches.
>>> 
>>> Jörn
>> 


Re: licensing question

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi Guergana,

I think CPL is a question that is still up for discussion on a case-by-case basis with the Apache legal
committee: for example, I guess JUnit is CPL, at least per this email:

http://s.apache.org/YqI

We should probably raise a few specific questions at: legal-discuss@apache.org and 
https://issues.apache.org/jira/browse/LEGAL and see where it gets us.

Cheers,
Chris

On Aug 1, 2012, at 11:30 AM, Savova, Guergana wrote:

> LibSVM is BSD license which is compatible with Apache. The non-compatible licenses are GPL, LGPL (http://www.apache.org/legal/3party.html).
> 
> Mallet is CPL I believe. By ASF policy, CPL falls in the category of reciprocal licenses. I am not quite sure how reciprocal licensed tools work with Apache licensed tools.
> 
> --Guergana




RE: licensing question

Posted by "Savova, Guergana" <Gu...@childrens.harvard.edu>.
LibSVM is BSD license which is compatible with Apache. The non-compatible licenses are GPL, LGPL (http://www.apache.org/legal/3party.html).

Mallet is CPL I believe. By ASF policy, CPL falls in the category of reciprocal licenses. I am not quite sure how reciprocal licensed tools work with Apache licensed tools.

--Guergana

-----Original Message-----
From: vijay garla [mailto:vngarla@gmail.com] 
Sent: Wednesday, August 01, 2012 2:24 PM
To: ctakes-dev@incubator.apache.org
Subject: Re: licensing question

Hi,

There are a lot of open source projects that have gpl or lgpl licenses, many of which offer functionality for which there is no equivalent project with an apache license.  The issue is much broader than just the berkeley parser, and reimplementing every non Apache-license library would not be feasible in any reasonable time frame.  I'm curious as to how apache projects that rely on Java EE - e.g. jetty and tomcat - deal with the licensing issue: they must redistribute the Java EE api libraries, which I believe are not on the apache license.

In particular, I don't know of any java based machine learning toolkits that would fit the bill (correct me if I'm wrong, but mahout is designed for map reduce).  Libsvm is essential to some of cTAKES' annotators.

Other libraries that cTAKES uses are public domain (LVG, JAMA); I assume these don't pose an issue for redistribution.

Finally, users will have to download the SNOMED-CT/RXNORM dictionaries from an external web site; why not bundle the non-apache libraries for redistribution there?

-vj

On Wed, Aug 1, 2012 at 1:57 PM, Mattmann, Chris A (388J) < chris.a.mattmann@jpl.nasa.gov> wrote:

> +1 to making OpenNLP better and eating the ASF dogfood, great response
> Jörn.
>
> Cheers,
> Chris
>
> On Aug 1, 2012, at 5:00 AM, Jörn Kottmann wrote:
>
> > On 08/01/2012 01:01 PM, Miller, Timothy wrote:
> >> There was some chatter last week about resources potentially being
> downloaded via maven for license compatibility reasons.  Just 
> wondering if that brings about the possibility of using external 
> libraries that are not apache-licensed that would also be 
> auto-downloaded under certain maven build commands.  Specifically I 
> was thinking of the GPL-licensed berkeley parser which I've used to 
> get significantly higher accuracy than the opennlp parser we currently wrap in our constituency parser module.
> >
> > Making scripts or maven build commands which download stuff is fine, 
> > but
> it might
> > turn out to be quite limiting for your users who need the freedom
> > of
> the AL. That will be
> > a problem if Berkeley is the only option.
> >
> > The HBase people for example have an optional dependency on LZO 
> > which is
> GPL,
> > and people there just need to install and download it themselves.
> > See here:
> > http://hbase.apache.org/book/lzo.compression.html
> >
> > Speaking as an OpenNLP committer now, it would of course be nice to 
> > make
> our parser better.
> > If you want to work on that we will be happy to get some patches.
> >
> > Jörn
>

Re: licensing question

Posted by vijay garla <vn...@gmail.com>.
Hi,

There are a lot of open source projects that have gpl or lgpl licenses,
many of which offer functionality for which there is no equivalent project
with an apache license.  The issue is much broader than just the berkeley
parser, and reimplementing every non Apache-license library would not be
feasible in any reasonable time frame.  I'm curious as to how apache
projects that rely on Java EE - e.g. jetty and tomcat - deal with the
licensing issue: they must redistribute the Java EE api libraries, which I
believe are not under the Apache license.

In particular, I don't know of any java based machine learning toolkits
that would fit the bill (correct me if I'm wrong, but mahout is designed
for map reduce).  Libsvm is essential to some of cTAKES' annotators.

Other libraries that cTAKES uses are public domain (LVG, JAMA); I assume
these don't pose an issue for redistribution.

Finally, users will have to download the SNOMED-CT/RXNORM dictionaries from
an external web site; why not bundle the non-apache libraries for
redistribution there?

-vj

On Wed, Aug 1, 2012 at 1:57 PM, Mattmann, Chris A (388J) <
chris.a.mattmann@jpl.nasa.gov> wrote:

> +1 to making OpenNLP better and eating the ASF dogfood, great response
> Jörn.
>
> Cheers,
> Chris
>
> On Aug 1, 2012, at 5:00 AM, Jörn Kottmann wrote:
>
> > On 08/01/2012 01:01 PM, Miller, Timothy wrote:
> >> There was some chatter last week about resources potentially being
> downloaded via maven for license compatibility reasons.  Just wondering if
> that brings about the possibility of using external libraries that are not
> apache-licensed that would also be auto-downloaded under certain maven
> build commands.  Specifically I was thinking of the GPL-licensed berkeley
> parser which I've used to get significantly higher accuracy than the
> opennlp parser we currently wrap in our constituency parser module.
> >
> > Making scripts or maven build commands which download stuff is fine, but
> it might
> > turn out to be quite limiting for your users who need the freedom of
> the AL. That will be
> > a problem if Berkeley is the only option.
> >
> > The HBase people for example have an optional dependency on LZO which is
> GPL,
> > and people there just need to install and download it themselves.
> > See here:
> > http://hbase.apache.org/book/lzo.compression.html
> >
> > Speaking as an OpenNLP committer now, it would of course be nice to make
> our parser better.
> > If you want to work on that we will be happy to get some patches.
> >
> > Jörn
>

Re: licensing question

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
+1 to making OpenNLP better and eating the ASF dogfood, great response Jörn.

Cheers,
Chris

On Aug 1, 2012, at 5:00 AM, Jörn Kottmann wrote:

> On 08/01/2012 01:01 PM, Miller, Timothy wrote:
>> There was some chatter last week about resources potentially being downloaded via maven for license compatibility reasons.  Just wondering if that brings about the possibility of using external libraries that are not apache-licensed that would also be auto-downloaded under certain maven build commands.  Specifically I was thinking of the GPL-licensed berkeley parser which I've used to get significantly higher accuracy than the opennlp parser we currently wrap in our constituency parser module.
> 
> Making scripts or maven build commands which download stuff is fine, but it might
> turn out to be quite limiting for your users who need the freedom of the AL. That will be
> a problem if Berkeley is the only option.
> 
> The HBase people for example have an optional dependency on LZO which is GPL,
> and people there just need to install and download it themselves.
> See here:
> http://hbase.apache.org/book/lzo.compression.html
> 
> Speaking as an OpenNLP committer now, it would of course be nice to make our parser better.
> If you want to work on that we will be happy to get some patches.
> 
> Jörn




Re: licensing question

Posted by Tim Miller <ti...@childrens.harvard.edu>.
Hi Jörn,
Thanks very much for the info.  It sounds like at the very least it 
could be an optional thing that we can make available.  I will pay close 
attention if/when discussions about using maven start happening.

The opennlp parser is great but the berkeley parser has a really nifty 
way of learning grammars.  I would love to build a parser that can run 
with that grammar and replicate the good performance but that is a big 
project.  I will let the opennlp team know if I make any progress on that.

Tim

On 08/01/2012 08:00 AM, Jörn Kottmann wrote:
> On 08/01/2012 01:01 PM, Miller, Timothy wrote:
>> There was some chatter last week about resources potentially being 
>> downloaded via maven for license compatibility reasons.  Just 
>> wondering if that brings about the possibility of using external 
>> libraries that are not apache-licensed that would also be 
>> auto-downloaded under certain maven build commands.  Specifically I 
>> was thinking of the GPL-licensed berkeley parser which I've used to 
>> get significantly higher accuracy than the opennlp parser we 
>> currently wrap in our constituency parser module.
>
> Making scripts or maven build commands which download stuff is fine, 
> but it might
> turn out to be quite limiting for your users who need the freedom of
> the AL. That will be
> a problem if Berkeley is the only option.
>
> The HBase people for example have an optional dependency on LZO which 
> is GPL,
> and people there just need to install and download it themselves.
> See here:
> http://hbase.apache.org/book/lzo.compression.html
>
> Speaking as an OpenNLP committer now, it would of course be nice to 
> make our parser better.
> If you want to work on that we will be happy to get some patches.
>
> Jörn



Re: licensing question

Posted by Jörn Kottmann <ko...@gmail.com>.
On 08/01/2012 01:01 PM, Miller, Timothy wrote:
> There was some chatter last week about resources potentially being downloaded via maven for license compatibility reasons.  Just wondering if that brings about the possibility of using external libraries that are not apache-licensed that would also be auto-downloaded under certain maven build commands.  Specifically I was thinking of the GPL-licensed berkeley parser which I've used to get significantly higher accuracy than the opennlp parser we currently wrap in our constituency parser module.

Making scripts or Maven build commands which download stuff is fine, but it
might turn out to be quite limiting for your users who need the freedom of
the AL. That will be a problem if Berkeley is the only option.

The HBase people, for example, have an optional dependency on LZO, which is
GPL, and people there just need to download and install it themselves.
See here:
http://hbase.apache.org/book/lzo.compression.html

Speaking as an OpenNLP committer now, it would of course be nice to make
our parser better.
If you want to work on that we will be happy to get some patches.

Jörn

Re: licensing question

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hey Tim,

In general, the ASF doesn't like to present "surprises" to users downstream. The use of a parser that is better, and
that users will want to download to provide better accuracy, but that isn't by default included, is one such example to
me of a surprise that we don't want to have wired into the product.

There is a discussion going on right now on the Apache legal-discuss@apache.org list, related to CloudStack,
where they are talking about this.

My 2c from a mentor's perspective: stay away from LGPL and try as much as possible to stay with Category A 
ASF compatible licenses from the legal resolved page. If there is something better that's LGPL, either rewrite it
here (if possible), or convince the authors of that library to relicense their product (has worked sometimes in the
past), or think of another creative solution that doesn't involve LGPL :)

Cheers,
Chris

On Aug 1, 2012, at 4:01 AM, Miller, Timothy wrote:

> Hi all,
> There was some chatter last week about resources potentially being downloaded via maven for license compatibility reasons.  Just wondering if that brings about the possibility of using external libraries that are not apache-licensed that would also be auto-downloaded under certain maven build commands.  Specifically I was thinking of the GPL-licensed berkeley parser which I've used to get significantly higher accuracy than the opennlp parser we currently wrap in our constituency parser module.
> Thanks
> Tim Miller

