Posted to users@opennlp.apache.org by Encolpe Degoute <en...@free.fr> on 2012/06/25 23:34:58 UTC

Development of french Open NLP models

Hello,

During the last IKS Workshop, several companies expressed interest in
developing French NLP models.
I volunteer to work on this and to coordinate all efforts on the topic.
All released models will be open source. We should discuss the exact licence.
Contact me directly to be involved in development and tests.

Another team is supposed to work on Italian support.

Regards

-- 
Encolpe DEGOUTE
http://encolpe.degoute.free.fr/
Free software, ice hockey and other cerebral activities


Re: Development of french Open NLP models

Posted by Olivier Grisel <ol...@ensta.org>.
2012/6/26 Bertrand Delacretaz <bd...@apache.org>:
> Hi,
>
> On Mon, Jun 25, 2012 at 11:34 PM, Encolpe Degoute
> <en...@free.fr> wrote:
>> ...During the last IKS Workshop several companies showed their interest in
>> developing French NLP models.
>> I propose myself to work and to coordinate all efforts on this topic.
>> All released models will be opensource...
>
> I cannot speak for OpenNLP but I'm sure Stanbol would be happy to host
> your models if that helps - a neutral place like the Apache Software
> Foundation is probably good for such efforts.
>
> Maybe OpenNLP is a better place if it uses the models directly, which
> Apache project does not really matter to me but it would IMO be a very
> good idea to have those models at the ASF.

Hi Encolpe,

You can start from the work that was started last year and is summed up
in this blog post:
http://dev.blogs.nuxeo.com/2011/01/mining-wikipedia-with-hadoop-and-pig-for-natural-language-processing.html

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

Re: Development of french Open NLP models

Posted by Emmanuel Hugonnet <em...@silverpeas.com>.
Hi,
I don't know how I could help, but I would definitely like to
participate in your effort for French.
Cheers,
Emmanuel
On 26/06/2012 10:21, Bertrand Delacretaz wrote:
> Hi,
>
> On Mon, Jun 25, 2012 at 11:34 PM, Encolpe Degoute
> <en...@free.fr> wrote:
>> ...During the last IKS Workshop several companies showed their interest in
>> developing French NLP models.
>> I propose myself to work and to coordinate all efforts on this topic.
>> All released models will be opensource...
> I cannot speak for OpenNLP but I'm sure Stanbol would be happy to host
> your models if that helps - a neutral place like the Apache Software
> Foundation is probably good for such efforts.
>
> Maybe OpenNLP is a better place if it uses the models directly, which
> Apache project does not really matter to me but it would IMO be a very
> good idea to have those models at the ASF.
>
>> We should discuss the exact licence....
> There's this http://www.apache.org/licenses/LICENSE-2.0 ;-)
>
>> Contact me directly to be involved in development and tests...
> IMO it would be cool if all this can happen in the open.
>
> Thanks for the initiative!
> -Bertrand

-- 
Emmanuel Hugonnet
Architect
emmanuel.hugonnet@silverpeas.org

Phone  +33 (0) 476 093 161

Silverpeas
http://www.silverpeas.com
http://www.twitter.com/silverpeas
1 place Firmin Gautier
38000 Grenoble - France




Re: Development of french Open NLP models

Posted by Jörn Kottmann <ko...@gmail.com>.
Hello,

I started to work on a guide which explains how to use
the existing tools, but I still need to make progress to get
it into a useful state. So far it only explains how
to install the Corpus Server.

It's in our wiki:
https://cwiki.apache.org/OPENNLP/labeling-wikinews-articles-with-the-corpus-server-and-the-uima-cas-editor.html

The next steps will be to explain how to get the Wikinews data loaded, how to
open it in the CAS Editor, and how to configure the Eclipse plugin.

We really need help with the web-based labeling tools and the tagging
server, and a bit later also with labeling data.

If you would like to participate, it should not be hard for you to get
involved.

Jörn

On 07/05/2012 11:25 AM, florent andré wrote:
> Hi,
>
> Simple and shared annotation tool will be really the way to go imo. 
> Thanks for starting this.
>
> I see this :
> https://cwiki.apache.org/OPENNLP/opennlp-annotations.html
>
> And some code here :
> https://svn.apache.org/repos/asf/opennlp/sandbox/
> (corpus-server-*, caseditor-*)
>
> Could it be possible to have some bootstraping information to try and 
> give a hand to that ?
>
> What is the TODO list to get a working tool ?
>
> Thanks for that !
> ++
>
> On 06/26/2012 10:32 AM, Jörn Kottmann wrote:
>> On 06/26/2012 10:21 AM, Bertrand Delacretaz wrote:
>>> I cannot speak for OpenNLP but I'm sure Stanbol would be happy to host
>>> your models if that helps - a neutral place like the Apache Software
>>> Foundation is probably good for such efforts.
>>
>> We would like to offer models over at OpenNLP, the reason we do
>> not distribute them is that we are restricted in terms of the license
>> here at Apache for the models we have today.
>>
>> It would be really great to have training data under an Open Source 
>> license
>> we can use to produce models under AL 2.0.
>>
>> We started to work on annotation tooling over at OpenNLP but are slow
>> because
>> we lack resources, would be nice to have people to help out with that.
>> Wikinews seems to be an interesting source of data and is available in
>> many languages.
>>
>> Jörn
>



Re: Development of french Open NLP models

Posted by florent andré <fl...@4sengines.com>.
Hi,

A simple, shared annotation tool would really be the way to go, IMO.
Thanks for starting this.

I see this:
https://cwiki.apache.org/OPENNLP/opennlp-annotations.html

And some code here:
https://svn.apache.org/repos/asf/opennlp/sandbox/
(corpus-server-*, caseditor-*)

Would it be possible to get some bootstrapping information so that we can
try it out and lend a hand?

What is on the TODO list to get a working tool?

Thanks for that!
++

On 06/26/2012 10:32 AM, Jörn Kottmann wrote:
> On 06/26/2012 10:21 AM, Bertrand Delacretaz wrote:
>> I cannot speak for OpenNLP but I'm sure Stanbol would be happy to host
>> your models if that helps - a neutral place like the Apache Software
>> Foundation is probably good for such efforts.
>
> We would like to offer models over at OpenNLP, the reason we do
> not distribute them is that we are restricted in terms of the license
> here at Apache for the models we have today.
>
> It would be really great to have training data under an Open Source license
> we can use to produce models under AL 2.0.
>
> We started to work on annotation tooling over at OpenNLP but are slow
> because
> we lack resources, would be nice to have people to help out with that.
> Wikinews seems to be an interesting source of data and is available in
> many languages.
>
> Jörn


Re: Development of french Open NLP models

Posted by Jörn Kottmann <ko...@gmail.com>.
On 06/26/2012 10:21 AM, Bertrand Delacretaz wrote:
> I cannot speak for OpenNLP but I'm sure Stanbol would be happy to host
> your models if that helps - a neutral place like the Apache Software
> Foundation is probably good for such efforts.

We would like to offer models over at OpenNLP. The reason we do
not distribute them is that we are restricted in terms of the license
here at Apache for the models we have today.

It would be really great to have training data under an open source license
that we can use to produce models under AL 2.0.

We started to work on annotation tooling over at OpenNLP but are slow
because we lack resources. It would be nice to have people help out with that.
Wikinews seems to be an interesting source of data and is available in
many languages.

Jörn

Re: Development of french Open NLP models

Posted by Bertrand Delacretaz <bd...@apache.org>.
Hi,

On Mon, Jun 25, 2012 at 11:34 PM, Encolpe Degoute
<en...@free.fr> wrote:
> ...During the last IKS Workshop several companies showed their interest in
> developing French NLP models.
> I propose myself to work and to coordinate all efforts on this topic.
> All released models will be opensource...

I cannot speak for OpenNLP but I'm sure Stanbol would be happy to host
your models if that helps - a neutral place like the Apache Software
Foundation is probably good for such efforts.

Maybe OpenNLP is a better place if it uses the models directly. Which
Apache project hosts them does not really matter to me, but IMO it would
be a very good idea to have those models at the ASF.

> We should discuss the exact licence....

There's this http://www.apache.org/licenses/LICENSE-2.0 ;-)

> Contact me directly to be involved in development and tests...

IMO it would be cool if all this could happen in the open.

Thanks for the initiative!
-Bertrand
