Posted to dev@stanbol.apache.org by Alberto Musetti <mu...@cs.unibo.it> on 2012/03/01 15:30:38 UTC

Re: Add refactor engine in enhancer/bundlelist

Hi Rupert, all

On 29 Feb 2012, at 19:14, Rupert Westenthaler wrote:

> Hi Alberto
> 
> 
> On 29.02.2012, at 18:48, Alberto Musetti wrote:
>> Hi all,
>> 
>> I would like to add the refactor engine in enhancer/bundlelist.
>> It will be activated in the launcher full and full-war, 
>> but not in stable, because rules and ontonet are not available there.
>> 
>> May I add the refactor engine?
>> 
> I think the full launcher should include all engines that are managed by the Stanbol Community. So in principle a +1 from my side.
> 
> Can I ask two questions:
> 
> (1) What would that mean for the memory footprint? I usually run Stanbol with -Xmx512m. Would that still work with the refactor engine?

Yes, it would still work.
Concerning performance and memory usage, we are adding full support for Clerezza
in the Refactor engine, which currently spends effort and memory on graph transformations.

> 
> (2) Would the engine be active by default? Do you plan to ship the full launcher with a default configuration or would a user need to manually configure an instance?
> 

There is a default configuration that runs at bundle start-up,
i.e., the refactoring for the SEO demo.

Best,
Alberto

> best
> Rupert
> 


Re: Add refactor engine in enhancer/bundlelist

Posted by Alessandro Adamou <ad...@cs.unibo.it>.
Hi Rupert, yes I had already reported our discussion internally but it's 
always better to discuss it on stanbol-dev.

On 3/3/12 10:13 AM, Rupert Westenthaler wrote:
> As the goal is to build the first release candidate Mo-Du next week, the intention here is to find the best way to include the RefactorEngine in the full launcher. For this, the idea was to
>
> * keep the Refactor engine in the full launcher. This engine is an important feature of Apache Stanbol and should be available by default.
> * remove the default configuration of the engine that uses the SEO (Search Engine Optimization) recipe. This means that the Refactor engine is not active in the default configuration (similar to the KeywordExtractionEngine).
> * keep the SEO recipe in the default configuration. This ensures that users who want to try this use case do not need to load/init the recipe.

This was the content of your recent commit, right? Personally, that is fine
with me.

> * set the default values (the values in the @Property annotations) so that they work for the SEO recipe. Users who want to test/use the SEO use case then only need to go to the configuration tab of the OSGI Web Console, click the [+] of the RefactorEngine and then [OK].

+1

> Longer Term plan (after the 0.9 Release)
> ---------
>
> Alessandro mentioned some points about adding native support for Clerezza graphs to the Rule component. Alessandro/Alberto, maybe you can add more information here.

Andrea should correct me if I'm wrong, but I'm told there's work under
the hood on exporting Stanbol rules as ConstructQuery Clerezza objects
natively, avoiding the ordeal of Jena rule conversion.

> I also had the idea to try using an IndexedMGraph in-memory graph instead of the file-based Jena TDB. This could considerably boost performance. However, one would need to validate the memory requirements.

As I told you, I do need to measure the memory footprints of many
of our implementations. I've checked the Oracle VisualVM you mentioned,
but if there is some cool profiler in Apache I'd be happy to know. We'll
come back to this anyway.

In the meantime I've just started to use the IndexedMGraph for in-memory 
work in OntoNet.
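
A minimal sketch of a rough footprint check for an in-memory structure, using only the standard Runtime API. It is not Stanbol or Clerezza code: the class FootprintSketch, its methods, and the list of string triples standing in for an in-memory graph are all hypothetical, and the heap delta is only an approximation because Runtime.gc() is a hint to the JVM.

```java
import java.util.ArrayList;
import java.util.List;

public class FootprintSketch {

    /** Heap currently in use, after asking the JVM to collect garbage (a hint, not a guarantee). */
    public static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        rt.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Hypothetical stand-in for an in-memory graph: a list of subject/predicate/object triples. */
    public static List<String[]> buildTriples(int count) {
        List<String[]> triples = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            triples.add(new String[] {
                "urn:example:s" + i, "urn:example:p", "urn:example:o" + i });
        }
        return triples;
    }

    public static void main(String[] args) {
        long before = usedHeap();
        List<String[]> triples = buildTriples(100_000);
        long after = usedHeap();
        // The delta is a rough estimate only; repeat the measurement to smooth out GC noise.
        System.out.printf("triples=%d, approx. heap delta=%d KiB, max heap=%d MiB%n",
                triples.size(), (after - before) / 1024,
                Runtime.getRuntime().maxMemory() / (1024 * 1024));
    }
}
```

Running it with the same -Xmx setting as the launcher (e.g. java -Xmx512m FootprintSketch) shows the delta against the available maximum heap; a profiler such as VisualVM gives a far more detailed picture.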

> Build another use case that is simpler than SEO, with the goal of including it in the default configuration. SEO includes >60 rather complex rule definitions. That is cool for showing the power of this component. For the default configuration I would like to have a simple use case that needs just ~5 rules. This could also include a 5 min tutorial on how to create these 5 rules and a 15 min tutorial that extends this default configuration with some additional rules to include another feature. Maybe the combination of IPTC, rNews, TopicEngine and RefactorEngine could be such an example.

It could, but as an educated guess the IPTC/rNews combination might
require a bit more than 5 rules, so maybe just a subset of it.

best

Alessandro




-- 
M.Sc. Alessandro Adamou

Alma Mater Studiorum - Università di Bologna
Department of Computer Science
Mura Anteo Zamboni 7, 40127 Bologna - Italy

Semantic Technology Laboratory (STLab)
Institute for Cognitive Science and Technology (ISTC)
National Research Council (CNR)
Via Nomentana 56, 00161 Rome - Italy


"As for the charges against me, I am unconcerned. I am beyond their timid, lying morality, and so I am beyond caring."
(Col. Walter E. Kurtz)

Not sent from my iSnobTechDevice


Re: Add refactor engine in enhancer/bundlelist

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Alberto, Alessandro, all

Yesterday I had a discussion with Alessandro about the Refactor engine. The goal of this mail is to share what we discussed with the whole Stanbol community.

Alessandro, can you please check that I have understood everything correctly and correct me where necessary.

Refactor Engine and the 0.9 Stanbol Release
---------

As the goal is to build the first release candidate Mo-Du next week, the intention here is to find the best way to include the RefactorEngine in the full launcher. For this, the idea was to

* keep the Refactor engine in the full launcher. This engine is an important feature of Apache Stanbol and should be available by default.
* remove the default configuration of the engine that uses the SEO (Search Engine Optimization) recipe. This means that the Refactor engine is not active in the default configuration (similar to the KeywordExtractionEngine).
* keep the SEO recipe in the default configuration. This ensures that users who want to try this use case do not need to load/init the recipe.
* set the default values (the values in the @Property annotations) so that they work for the SEO recipe. Users who want to test/use the SEO use case then only need to go to the configuration tab of the OSGI Web Console, click the [+] of the RefactorEngine and then [OK].


Longer Term plan (after the 0.9 Release)
---------

Alessandro mentioned some points about adding native support for Clerezza graphs to the Rule component. Alessandro/Alberto, maybe you can add more information here.

I also had the idea to try using an IndexedMGraph in-memory graph instead of the file-based Jena TDB. This could considerably boost performance. However, one would need to validate the memory requirements. I do not know if the above point is a prerequisite for this.

Build another use case that is simpler than SEO, with the goal of including it in the default configuration. SEO includes >60 rather complex rule definitions. That is cool for showing the power of this component. For the default configuration I would like to have a simple use case that needs just ~5 rules. This could also include a 5 min tutorial on how to create these 5 rules and a 15 min tutorial that extends this default configuration with some additional rules to include another feature. Maybe the combination of IPTC, rNews, TopicEngine and RefactorEngine could be such an example.

best
Rupert Westenthaler



Re: Add refactor engine in enhancer/bundlelist

Posted by Alberto Musetti <mu...@cs.unibo.it>.
Hi Rupert, all
I'm working on it, but I'm proceeding slowly because I cannot reproduce the error.

I'm sorry
Alberto



Re: Add refactor engine in enhancer/bundlelist

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Alberto

I spent some time looking into "Build failed in Jenkins: stanbol-trunk-1.6 #772"

The most interesting file is the error.log file of the Stanbol
instance used for the integration test.

   https://builds.apache.org/job/stanbol-trunk-1.6/ws/trunk/integration-tests/target/launchdir/sling/logs/error.log

For the first few calls to the enhancer everything looks fine (~5sec
for the Refactor engine).

But later - after time code "01.03.2012 19:49:21.04" (use this to
search in the file) something starts to go wrong with Jena TDB.

[Thread-36] >> seo_refactoring  and
[Thread-12] >> looks like a Jena TDB daemon

are the only active ones (looks a little like a deadlock).

You can safely ignore the [DataFileTrackingDaemon]; it only checks
every 5sec for Resources of the DataFileProvider.

I have really no idea what is going on here. Maybe you can make more
sense of this
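
One way to test the deadlock suspicion, sketched with the standard java.lang.management API. The class DeadlockCheckSketch and its method are hypothetical names, not part of Stanbol or Jena, and the check only sees threads of the JVM it runs in, so it would have to be executed inside the Stanbol instance. It reports only deadlocks on Java monitors and ownable synchronizers; two threads that merely wait for each other in some other way would not show up.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class DeadlockCheckSketch {

    /** Describes deadlocked threads of this JVM; returns an empty string if none are detected. */
    public static String describeDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is detected
        if (ids == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread terminated between the two calls
            }
            sb.append(info.getThreadName())
              .append(" waits for ").append(info.getLockName())
              .append(" held by ").append(info.getLockOwnerName())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String report = describeDeadlocks();
        System.out.println(report.isEmpty() ? "no deadlock detected" : report);
    }
}
```

A thread dump taken with the JDK's jstack tool against the running process gives similar information without any code changes.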

best
Rupert




-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen