Posted to dev@opennlp.apache.org by Rodrigo Agerri <ra...@apache.org> on 2015/10/05 16:45:44 UTC

Re: mallet addon

Hi,

On Tue, Sep 29, 2015 at 3:41 PM, Joern Kottmann <ko...@gmail.com> wrote:
> We can also move
> it to the sandbox, releasing it at Apache might be more difficult since
> mallet pulls in incompatible licensed dependencies. But maybe that changed,
> we can check.

Mallet is released under the Common Public License:

http://opensource.org/licenses/cpl1.0.php

but, as you mentioned, it pulls in several dependencies that are
LGPL. These are the dependencies:

<dependency>
  <groupId>org.beanshell</groupId>
  <artifactId>bsh</artifactId>
  <version>2.0b4</version>
</dependency>

This version is LGPL; however, later versions are under the Apache License 2.0:

https://github.com/beanshell/beanshell

<dependency>
  <groupId>jgrapht</groupId>
  <artifactId>jgrapht</artifactId>
  <version>0.6.0</version>
</dependency>

That version was also LGPL, but JGraphT has since been dual-licensed under EPL 1.0:

https://github.com/jgrapht/jgrapht/wiki/Relicensing

which could also be included in Apache License 2.0 projects:

http://www.apache.org/legal/resolved.html

<dependency>
  <groupId>net.sf.jwordnet</groupId>
  <artifactId>jwnl</artifactId>
  <version>1.4_rc3</version>
</dependency>

BSD license, but this library has already been discussed here.

<dependency>
  <groupId>net.sf.trove4j</groupId>
  <artifactId>trove4j</artifactId>
  <version>2.0.2</version>
</dependency>

LGPL-ed.

<dependency>
  <groupId>com.googlecode.matrix-toolkits-java</groupId>
  <artifactId>mtj</artifactId>
  <version>0.9.14</version>
</dependency>

also LGPL

Rodrigo

Re: mallet addon

Posted by Joern Kottmann <ko...@gmail.com>.
Hello,

I updated the code and afterwards spent some time evaluating it again. The
maxent training is very close to our maxent classifier. I also checked the
training code again and it looks good to me, but it would be nice if you
could review it.

Mallet has a couple of other classifiers; it should be trivial to
expose them all to OpenNLP.

Jörn

On Tue, Oct 20, 2015 at 9:12 AM, Rodrigo Agerri <ro...@ehu.eus>
wrote:

> Hello,
>
> Thanks. I thought I had an idea why the CRF was not obtaining good
> results with the OpenNLP default features, e.g.,
>
> http://lingpipe-blog.com/2006/11/22/why-do-you-hate-crfs/
>
> but if the results are also worse with Maxent, that is intriguing. I will
> look at the Mallet implementation to see if I can find something.
>
> R

Re: mallet addon

Posted by Rodrigo Agerri <ro...@ehu.eus>.
Hello,

Thanks. I thought I had an idea why the CRF was not obtaining good
results with the OpenNLP default features, e.g.,

http://lingpipe-blog.com/2006/11/22/why-do-you-hate-crfs/

but if the results are also worse with Maxent, that is intriguing. I will
look at the Mallet implementation to see if I can find something.

R



On Mon, Oct 12, 2015 at 4:07 PM, Joern Kottmann <ko...@gmail.com> wrote:
> Hello,
>
> I fixed up the code a bit. The performance is not really good. Do you have
> any idea why that could be?
>
> Neither the maxent nor the CRF gets good evaluation numbers on NER.
>
> I will push the changes and then you can experiment with it too.
>
> Jörn

Re: mallet addon

Posted by Joern Kottmann <ko...@gmail.com>.
Hello,

I fixed up the code a bit. The performance is not really good. Do you have
any idea why that could be?

Neither the maxent nor the CRF gets good evaluation numbers on NER.

I will push the changes and then you can experiment with it too.

Jörn

