Posted to dev@stanbol.apache.org by Bertrand Delacretaz <bd...@apache.org> on 2011/02/14 11:36:31 UTC

Offline testing and launcher (was: Hudson build became unstable...)

Hi,

On Sun, Feb 13, 2011 at 12:22 PM, Olivier Grisel
<ol...@ensta.org> wrote:
> 2011/2/13 Rupert Westenthaler <ru...@gmail.com>:
>> Hi
>>
>> after some digging I found the reason for this is the unavailability
>> of the geonames.org web service
>>...
>> I suggest deactivating all engines that depend on external services
>> for the integration builds. However, I have no idea how to do that
>> other than creating a separate Sling launcher configuration that excludes
>> such engines.
>
> A launcher for offline Stanbol is a good idea anyway: it would be
> useful for people pre-loading the Entityhub with precomputed entity
> dumps and using only enhancements related to this local base without
> relying on an internet connection to remote services and / or
> knowledge bases.

How about using a global stanbol.offline.mode system property instead
of a different launcher?

We can then create a component that registers an OnlineMode service
only if that property is not true.

Then, components that are only active if online just need an
@Reference to that OnlineMode service, and they'll all go away if the
property is set to true.

-Bertrand
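
The proposal above can be illustrated with a minimal sketch. It is a plain-Java simulation, not the OSGi Declarative Services API: the property name stanbol.offline.mode and the OnlineMode service come from the message, while the class OfflineModeSketch, the Component type, the needsOnline flag (standing in for a mandatory @Reference) and the two engine names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Plain-Java simulation of the proposal. This is NOT the OSGi / Declarative
// Services API: all class and method names below are hypothetical.
final class OfflineModeSketch {

    // Property name taken from the proposal.
    static final String OFFLINE_PROPERTY = "stanbol.offline.mode";

    // Marker service: it exists only while the instance is considered online.
    static final class OnlineMode { }

    // Stand-in for a component; needsOnline models a mandatory @Reference to OnlineMode.
    static final class Component {
        final String name;
        final boolean needsOnline;

        Component(String name, boolean needsOnline) {
            this.name = name;
            this.needsOnline = needsOnline;
        }
    }

    // Returns an OnlineMode unless the property is "true"; null means "not registered".
    static OnlineMode registerOnlineMode(Properties props) {
        boolean offline = Boolean.parseBoolean(props.getProperty(OFFLINE_PROPERTY));
        return offline ? null : new OnlineMode();
    }

    // A component is active if it does not need OnlineMode, or if OnlineMode is registered.
    static List<String> activeComponents(List<Component> components, OnlineMode onlineMode) {
        List<String> active = new ArrayList<>();
        for (Component c : components) {
            if (!c.needsOnline || onlineMode != null) {
                active.add(c.name);
            }
        }
        return active;
    }

    public static void main(String[] args) {
        List<Component> components = new ArrayList<>();
        components.add(new Component("RemoteLookupEngine", true));  // hypothetical engine
        components.add(new Component("LocalOnlyEngine", false));    // hypothetical engine
        OnlineMode onlineMode = registerOnlineMode(System.getProperties());
        System.out.println("Active: " + activeComponents(components, onlineMode));
    }
}
```

Running the sketch with -Dstanbol.offline.mode=true should list only the component without the OnlineMode dependency; without the property, both are listed. In a real OSGi container the framework would do this bookkeeping itself, by deactivating components whose mandatory reference is unsatisfied.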

Re: Multilingual Entity Extraction

Posted by valentina presutti <va...@cnr.it>.
In our group we are working to release a new FISE engine that extracts DBpedia entities and handles Italian.
A demo of the tool is available at [1]. You can select either English or Italian.

Val

[1] http://150.146.88.63/PhpMoreInfo/

On Feb 14, 2011, at 3:52 PM, Olivier Grisel wrote:

> 2011/2/14 Aingaran Pillai <ap...@zaizi.com>:
>> Hi,
>> 
>> Is support planned for entity extraction in other languages, e.g. French or German?
> 
> Yes, it is planned. There is some cooperation underway with the
> upstream OpenNLP project to build new statistical language models from
> various freely redistributable corpora. I have also started on some
> proof-of-concept tools:
> 
>  http://blogs.nuxeo.com/dev/2011/01/mining-wikipedia-with-hadoop-and-pig-for-natural-language-processing.html
> 
> On the Stanbol side, we need to upgrade to OpenNLP 1.5 asap and
> un-hard-code the model loading:
> 
>  https://issues.apache.org/jira/browse/STANBOL-13
> 
> -- 
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel


------------------------------------------------------------

Valentina Presutti
Semantic Technology Laboratory (STLab)
Institute for Cognitive Science and Technology (ISTC)
National Research Council (CNR)
Via Nomentana 56, Rome - Italy


Re: Multilingual Entity Extraction

Posted by Olivier Grisel <ol...@ensta.org>.
2011/2/14 Aingaran Pillai <ap...@zaizi.com>:
> Hi,
>
> Is support planned for entity extraction in other languages, e.g. French or German?

Yes, it is planned. There is some cooperation underway with the
upstream OpenNLP project to build new statistical language models from
various freely redistributable corpora. I have also started on some
proof-of-concept tools:

  http://blogs.nuxeo.com/dev/2011/01/mining-wikipedia-with-hadoop-and-pig-for-natural-language-processing.html

On the Stanbol side, we need to upgrade to OpenNLP 1.5 asap and
un-hard-code the model loading:

  https://issues.apache.org/jira/browse/STANBOL-13

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

Multilingual Entity Extraction

Posted by Aingaran Pillai <ap...@zaizi.com>.
Hi,

Is support planned for entity extraction in other languages, e.g. French or German?

Regards,
Ainga

Re: Offline testing and launcher (was: Hudson build became unstable...)

Posted by Bertrand Delacretaz <bd...@apache.org>.
On Mon, Feb 14, 2011 at 11:36 AM, Bertrand Delacretaz
<bd...@apache.org> wrote:
>> 2011/2/13 Rupert Westenthaler <ru...@gmail.com>:
>>> ...I suggest deactivating all engines that depend on external services
>>> for the integration builds. However, I have no idea how to do that
>>> other than creating a separate Sling launcher configuration that excludes
>>> such engines....

I have implemented offline support, see STANBOL-87.

For now I have only disabled the LocationEnhancementEngine in offline
mode; to disable more engines, simply add an @Reference to the
OnlineMode service to them.

If needed, we can also create replacement services that are active in
offline mode by making them dependent on the OfflineMode service.

-Bertrand
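
The pairing of OnlineMode and OfflineMode described above could look roughly like the following sketch. It is a plain-Java simulation rather than the STANBOL-87 code or the OSGi service registry, and it assumes that OfflineMode is registered exactly when OnlineMode is not. OnlineMode, OfflineMode and LocationEnhancementEngine are names from the message; ModeServicesSketch, its methods and LocalLocationEngine are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java simulation, NOT the OSGi service registry. All names except
// OnlineMode, OfflineMode and LocationEnhancementEngine are hypothetical.
final class ModeServicesSketch {

    static final class OnlineMode { }
    static final class OfflineMode { }

    // Assumption: exactly one of the two marker services is registered at a time.
    static Set<Class<?>> registeredModeServices(boolean offline) {
        Set<Class<?>> services = new HashSet<>();
        services.add(offline ? OfflineMode.class : OnlineMode.class);
        return services;
    }

    // Returns the names of components whose required service is registered.
    static List<String> activeComponents(Map<String, Class<?>> requirements,
                                         Set<Class<?>> services) {
        List<String> active = new ArrayList<>();
        for (Map.Entry<String, Class<?>> e : requirements.entrySet()) {
            if (services.contains(e.getValue())) {
                active.add(e.getKey());
            }
        }
        return active;
    }

    // Maps each component name to the mode service it would reference.
    static Map<String, Class<?>> exampleRequirements() {
        Map<String, Class<?>> requirements = new LinkedHashMap<>();
        requirements.put("LocationEnhancementEngine", OnlineMode.class);
        requirements.put("LocalLocationEngine", OfflineMode.class); // hypothetical replacement
        return requirements;
    }

    public static void main(String[] args) {
        // Boolean.getBoolean reads the named system property; unset means false.
        boolean offline = Boolean.getBoolean("stanbol.offline.mode");
        System.out.println(activeComponents(exampleRequirements(),
                                            registeredModeServices(offline)));
    }
}
```

With this arrangement an online-only engine and its offline replacement never run together, because each depends on a different marker service and only one of those is available.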

Re: Offline testing and launcher (was: Hudson build became unstable...)

Posted by Olivier Grisel <ol...@ensta.org>.
2011/2/14 Bertrand Delacretaz <bd...@apache.org>:
> Hi,
>
> On Sun, Feb 13, 2011 at 12:22 PM, Olivier Grisel
> <ol...@ensta.org> wrote:
>> 2011/2/13 Rupert Westenthaler <ru...@gmail.com>:
>>> Hi
>>>
>>> after some digging I found the reason for this is the unavailability
>>> of the geonames.org web service
>>>...
>>> I suggest deactivating all engines that depend on external services
>>> for the integration builds. However, I have no idea how to do that
>>> other than creating a separate Sling launcher configuration that excludes
>>> such engines.
>>
>> A launcher for offline Stanbol is a good idea anyway: it would be
>> useful for people pre-loading the Entityhub with precomputed entity
>> dumps and using only enhancements related to this local base without
>> relying on an internet connection to remote services and / or
>> knowledge bases.
>
> How about using a global stanbol.offline.mode system property instead
> of a different launcher?
>
> We can then create a component that registers an OnlineMode service
> only if that property is not true.
>
> Then, components that are only active if online just need an
> @Reference to that OnlineMode service, and they'll all go away if the
> property is set to true.

Why not. We could also repurpose the "lite" launcher to include only
standalone bundles (those that don't require online access).

Both solutions are fine with me. We could even implement them both at
the same time.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel