Posted to dev@opennlp.apache.org by Rodrigo Agerri <ra...@apache.org> on 2015/10/05 16:45:44 UTC
Re: mallet addon
Hi,
On Tue, Sep 29, 2015 at 3:41 PM, Joern Kottmann <ko...@gmail.com> wrote:
> We can also move
> it to the sandbox, releasing it at Apache might be more difficult since
> mallet pulls in incompatible licensed dependencies. But maybe that changed,
> we can check.
Mallet is released under the Common Public License 1.0
http://opensource.org/licenses/cpl1.0.php
but, as you mentioned, it pulls in several dependencies, most of which are
LGPL-licensed. These are the dependencies:
<dependency>
  <groupId>org.beanshell</groupId>
  <artifactId>bsh</artifactId>
  <version>2.0b4</version>
</dependency>
This version is LGPL; however, later versions are under the Apache License 2.0
https://github.com/beanshell/beanshell
<dependency>
  <groupId>jgrapht</groupId>
  <artifactId>jgrapht</artifactId>
  <version>0.6.0</version>
</dependency>
That version was also LGPL, but JGraphT has since been dual-licensed under LGPL and EPL 1.0
https://github.com/jgrapht/jgrapht/wiki/Relicensing
and EPL 1.0 dependencies can also be included in Apache License 2.0 projects
http://www.apache.org/legal/resolved.html
<dependency>
  <groupId>net.sf.jwordnet</groupId>
  <artifactId>jwnl</artifactId>
  <version>1.4_rc3</version>
</dependency>
BSD-licensed, but this library has already been discussed on this list.
<dependency>
  <groupId>net.sf.trove4j</groupId>
  <artifactId>trove4j</artifactId>
  <version>2.0.2</version>
</dependency>
LGPL-ed.
<dependency>
  <groupId>com.googlecode.matrix-toolkits-java</groupId>
  <artifactId>mtj</artifactId>
  <version>0.9.14</version>
</dependency>
Also LGPL.
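As a rough sketch of how the LGPL-only transitive dependencies could be kept out of a build, Maven's standard <exclusions> element can be placed on the Mallet dependency. The cc.mallet:mallet:2.0.7 coordinates below are an assumption (use whatever the addon actually declares), and this is an untested illustration: excluding a jar does not settle the licensing question by itself, and any Mallet code path that needs an excluded library will fail at runtime.

```xml
<!-- Sketch only: the Mallet coordinates and version are assumed, not taken from the addon's POM. -->
<dependency>
  <groupId>cc.mallet</groupId>
  <artifactId>mallet</artifactId>
  <version>2.0.7</version>
  <exclusions>
    <!-- LGPL at the versions listed above, with no relicensed alternative mentioned -->
    <exclusion>
      <groupId>net.sf.trove4j</groupId>
      <artifactId>trove4j</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.googlecode.matrix-toolkits-java</groupId>
      <artifactId>mtj</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Running mvn dependency:tree before and after such a change shows which of the libraries are still pulled in.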
Rodrigo
Re: mallet addon
Posted by Joern Kottmann <ko...@gmail.com>.
Hello,
I updated the code and afterwards spent some time evaluating it again. The
maxent training is very close to our maxent classifier. I also checked the
training code again and it looks good to me, but it would be nice if you
could review it.
There are a couple of other classifiers in Mallet; it should be trivial to
expose them all to OpenNLP.
Jörn
On Tue, Oct 20, 2015 at 9:12 AM, Rodrigo Agerri <ro...@ehu.eus> wrote:
> Hello,
>
> Thanks. I thought I had an idea why CRF does not obtain good results
> with the OpenNLP default features, e.g.,
>
> http://lingpipe-blog.com/2006/11/22/why-do-you-hate-crfs/
>
> but if the results are also worse with Maxent, that is intriguing. I will
> look at the Mallet implementation to see if I can find something out.
>
> R
>
>
>
> On Mon, Oct 12, 2015 at 4:07 PM, Joern Kottmann <ko...@gmail.com> wrote:
> > Hello,
> >
> > I fixed up the code a bit. The evaluation performance is not really good.
> > Do you have any idea why that could be?
> >
> > Neither the maxent nor the CRF gets good evaluation numbers on NER.
> >
> > I will push the changes and then you can experiment with it too.
> >
> > Jörn
Re: mallet addon
Posted by Rodrigo Agerri <ro...@ehu.eus>.
Hello,
Thanks. I thought I had an idea why CRF does not obtain good results
with the OpenNLP default features, e.g.,
http://lingpipe-blog.com/2006/11/22/why-do-you-hate-crfs/
but if the results are also worse with Maxent, that is intriguing. I will
look at the Mallet implementation to see if I can find something out.
R
On Mon, Oct 12, 2015 at 4:07 PM, Joern Kottmann <ko...@gmail.com> wrote:
> Hello,
>
> I fixed up the code a bit. The evaluation performance is not really good.
> Do you have any idea why that could be?
>
> Neither the maxent nor the CRF gets good evaluation numbers on NER.
>
> I will push the changes and then you can experiment with it too.
>
> Jörn
Re: mallet addon
Posted by Joern Kottmann <ko...@gmail.com>.
Hello,
I fixed up the code a bit. The evaluation performance is not really good.
Do you have any idea why that could be?
Neither the maxent nor the CRF gets good evaluation numbers on NER.
I will push the changes and then you can experiment with it too.
Jörn