Posted to dev@stanbol.apache.org by Aingaran Pillai <ap...@zaizi.com> on 2011/02/14 15:43:48 UTC

Multilingual Entity Extraction

Hi,

Is support for Entity Extraction in other languages planned? E.g. French, German, etc.

Regards,
Ainga


Re: Multilingual Entity Extraction

Posted by valentina presutti <va...@cnr.it>.
Our group is working on releasing a new FISE engine that extracts DBpedia entities and handles Italian.
A demo of the tool is available at [1]. You can select either English or Italian.

Val

[1] http://150.146.88.63/PhpMoreInfo/


------------------------------------------------------------

Valentina Presutti
Semantic Technology Laboratory (STLab)
Institute for Cognitive Science and Technology (ISTC)
National Research Council (CNR)
Via Nomentana 56, Rome - Italy

icq# 122838754
msn vpresutti@hotmail.it
skype bluvale


Re: Multilingual Entity Extraction

Posted by Olivier Grisel <ol...@ensta.org>.
2011/2/14 Aingaran Pillai <ap...@zaizi.com>:
> Hi,
>
> Is there any support planned to support Entity Extraction in other languages? E.g. French, German, etc.

Yes, it is planned. There is some cooperation underway with the
upstream OpenNLP project to build new statistical language models from
various freely redistributable corpora. I have also started some
proof-of-concept tools:

  http://blogs.nuxeo.com/dev/2011/01/mining-wikipedia-with-hadoop-and-pig-for-natural-language-processing.html

On the Stanbol side, we need to upgrade to OpenNLP 1.5 ASAP and
stop hard-coding the model loading:

  https://issues.apache.org/jira/browse/STANBOL-13
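
To illustrate what un-hard-coding the model loading could look like, here
is a minimal sketch that derives a model file location from a language code
and an entity type instead of a fixed English path. Everything in it is an
assumption made for illustration: the ModelLocator class, the
"opennlp.models.dir" system property and the "<lang>-ner-<type>.bin" file
naming pattern are hypothetical choices, not the Stanbol or OpenNLP API, and
the sketch only resolves paths without loading any model.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical helper: not part of Stanbol or OpenNLP.
final class ModelLocator {
    private final Path modelDir;

    ModelLocator(Path modelDir) {
        this.modelDir = modelDir;
    }

    // Builds a file name such as "fr-ner-person.bin".
    // The "<lang>-ner-<type>.bin" pattern is an assumed convention.
    static String fileName(String language, String entityType) {
        if (language == null || language.trim().isEmpty()) {
            throw new IllegalArgumentException("language must not be empty");
        }
        if (entityType == null || entityType.trim().isEmpty()) {
            throw new IllegalArgumentException("entityType must not be empty");
        }
        return language.trim().toLowerCase(Locale.ROOT)
                + "-ner-"
                + entityType.trim().toLowerCase(Locale.ROOT)
                + ".bin";
    }

    // Resolves the model file inside the configured directory.
    Path resolve(String language, String entityType) {
        return modelDir.resolve(fileName(language, entityType));
    }

    public static void main(String[] args) {
        // "opennlp.models.dir" is a made-up property name; default is ./models.
        String dir = System.getProperty("opennlp.models.dir", "models");
        ModelLocator locator = new ModelLocator(Paths.get(dir));
        for (String lang : new String[] {"en", "fr", "de"}) {
            System.out.println(locator.resolve(lang, "person"));
        }
    }
}
```

With something along these lines, supporting a new language would mostly be
a matter of dropping the matching model file into the configured directory,
assuming a model for that language exists.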

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel