Posted to dev@stanbol.apache.org by Andrea Ciapetti <an...@gmail.com> on 2012/02/29 16:08:24 UTC

Etcware early adoption proposal

Hello IKS team and Apache Stanbol mailing list,



We recently submitted an early adoption proposal for Apache Stanbol,
and we'd like to post a brief summary of it, together with a short profile
of our company, to introduce ourselves to the community.

The complete version of the proposal is available on the IKS blog (
http://wiki.iks-project.eu/index.php/Etcware_Proposal).



*Company Profile*

Etcware s.r.l. is an SME (Small and Medium Enterprise) based in Rome, Italy,
founded in 2007 by highly skilled ITC professionals. We develop web portals
and content management solutions for the Public Administration and private
customers, using the Liferay, OpenCMS and Drupal platforms. We
focus on productizing and reusing implemented solutions and on performing
feasibility studies for complex scenarios.

Our company has also acquired significant competence in the use of
semantic technologies and standards. Building on our experience in an Italian
research project, we have developed a product for SKOS thesaurus publishing
and management, named SKOSware (http://www.skosware.it).


*Early adoption proposal*

Recently we developed a Liferay-based solution for a Public
Administration institution (Garante per la Protezione dei Dati Personali,
the Italian Data Protection Authority), in which we deployed an innovative
semantic search solution based on SKOSware. In this architecture we modified
the Liferay core to allow manual metadata enrichment of contents and
documents with concepts drawn from one or more SKOS thesauri. This
allows us to perform searches and refinements based on dynamic,
hierarchically organized facets. The facet structure follows the SKOS
thesaurus organization. Metadata enrichments are published inside the HTML
pages as RDFa snippets, while geo-localization and chrono-references are
used to place contents on a map and on a timeline.

Our vision is to integrate Stanbol in place of our manual metadata
enrichment for the CMS contents. This will allow us to add further
content enrichment through Stanbol engines. Moreover, content enrichment
and tagging will become mostly automatic in this way. The Stanbol integration
in our Liferay solution will be “loosely coupled” to allow easy porting
to the next version of the CMS, and to enable a maximum degree of reuse of
our semantic customization.

*The solution will be integrated in the Italian Data Protection Authority
portal as a demo*, running Stanbol enhancement engines on their document
corpus of 12,000 items, 2,000 of which are already manually enriched
with metadata.

Our plan to integrate Stanbol is based on the following steps:

   1. Thesauri selected from SKOSware are imported into Stanbol to create a
   base custom knowledge domain.
   2. The content editor creates or updates contents and documents on
   Liferay. These contents are enriched through Stanbol enhancement engines
   in an editing post-process event.
   3. The Liferay administrator launches Stanbol automatic metadata
   enrichment for all contents and documents (batch enrichment process).
   4. The end-user searches contents and documents by using full-text
   search or tag-cloud-based search and refines the results or expands the
   search scope to similar or related contents (behind the scenes, SKOS
   thesaurus concepts and semantic relations are used to define the related
   contents).
   5. As the end-user views portal contents, terms similar to SKOS concepts
   (skos:prefLabel or skos:altLabel are used for entity highlighting) are
   automatically decorated and their description is shown on some specific GUI
   event (like mouseover).
   6. Inference rules and semantic reasoning will be used to complete and
   enrich the domain knowledge base, thus suggesting additional concepts and
   OWL relations.
   7. Optional use of some IKS VIE widgets on the frontend presentation
   layer.
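
As a rough illustration of steps 2 and 3 above, the sketch below builds the
HTTP request that a Liferay post-process hook or a batch job could send to
the Stanbol Enhancer. It is only a sketch under assumptions: the URL
http://localhost:8080/enhancer (a local Stanbol launcher), the RDF/XML
response format, and the helper names build_enhancer_request and enrich_all
are illustrative and not part of our proposal or of Liferay.

```python
import urllib.request

# Assumption: a local Stanbol launcher exposing the Enhancer at this URL.
STANBOL_ENHANCER_URL = "http://localhost:8080/enhancer"


def build_enhancer_request(text, base_url=STANBOL_ENHANCER_URL):
    """Build (but do not send) a POST request asking Stanbol to enhance `text`."""
    return urllib.request.Request(
        base_url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            # Assumption: enhancement results are requested as RDF/XML.
            "Accept": "application/rdf+xml",
        },
        method="POST",
    )


def enrich_all(contents, send):
    """Batch enrichment (step 3): call `send` once per content item.

    `contents` maps a content id to its plain text; `send` is any callable
    that takes a Request and returns the enhancement result.
    """
    return {cid: send(build_enhancer_request(text)) for cid, text in contents.items()}
```

To actually contact a running server, `send` could be a small wrapper around
urllib.request.urlopen; keeping it as a parameter lets the same code serve
both the single-document hook and the batch process.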



Best regards.


     -Andrea

------------------------------------------------
Andrea Ciapetti
Etcware srl
mail: a.ciapetti@etcware.it
mail: andrea.ciapetti@gmail.com
mobile: +39 320 6197534
------------------------------------------------

Re: Etcware early adoption proposal

Posted by Rupert Westenthaler <ru...@gmail.com>.
On 05.03.2012, at 21:23, Andrea Ciapetti wrote:

> Currently this is not possible as only the main "/entityhub" interface provides full CRUD ("read/write"). ReferencedSties (/entityhub/site/{name}) are read only.
> However this is no technical limitation but a design decision that can be changed.
> 
> If we change ReferencedSites to support read/write you could use the same interface as provided by "/entityhub/entity" also on referenced sites. This would allow use cases as described above because you than would have the possibility to add/update/delete single Concepts of a SKOS thesaurus in the Entityhub.
> 
> If you are interested we could create a JIRA issue about that.
> 
> 
> Hi Rupert, we are definitely interested in this feature.
> Do you think is it possible to schedule the change in the time period of our early adoption project?

Yes. 

> 
> Best Andrea


Re: Etcware early adoption proposal

Posted by Andrea Ciapetti <an...@gmail.com>.
>
> Currently this is not possible as only the main "/entityhub" interface
> provides full CRUD ("read/write"). ReferencedSties (/entityhub/site/{name})
> are read only.
> However this is no technical limitation but a design decision that can be
> changed.
>
> If we change ReferencedSites to support read/write you could use the same
> interface as provided by "/entityhub/entity" also on referenced sites. This
> would allow use cases as described above because you than would have the
> possibility to add/update/delete single Concepts of a SKOS thesaurus in the
> Entityhub.
>
> If you are interested we could create a JIRA issue about that.
>
>
*Hi Rupert, we are definitely interested in this feature.*
*Do you think it is possible to schedule the change within the time frame of
our early adoption project?*

*Best Andrea*

Re: Etcware early adoption proposal

Posted by Rupert Westenthaler <ru...@gmail.com>.
On 02.03.2012, at 16:30, Andrea Ciapetti wrote:
> On Fri, Mar 2, 2012 at 15:50, Rupert Westenthaler <ru...@gmail.com> wrote:
> 
> [...]
> 
> login with guest/guest does not work for me. I will try it again @home maybe it is a problem with the Company firewall.
> 
> Ouch! Guest user has been disabled (perhaps for security reasons).
> I've made an user specifically for you (rupert/westenthaler). This should work as expected.
> 

Now it works. I will have a look!

> 
> No. I was only thinking that it would be nicer if you could update the thesaurus in the Apache Entityhub after every change made within SKOSware (keep it in sync) rather than using the bulk import as supported by the  Generic RDF indexer. This would allow use cases where the user adds a new term to the Thesaurus and can immediately used it with the Stanbol Enhancer.
> 
> Sounds interesting. Which kind of functionalities should be implemented for this to work?
> 

Currently this is not possible, as only the main "/entityhub" interface provides full CRUD ("read/write"). ReferencedSites (/entityhub/site/{name}) are read-only.
However, this is not a technical limitation but a design decision that can be changed.

If we change ReferencedSites to support read/write, you could use the same interface as provided by "/entityhub/entity" on referenced sites as well. This would allow use cases as described above, because you would then be able to add/update/delete single Concepts of a SKOS thesaurus in the Entityhub.

If you are interested we could create a JIRA issue about that.
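
To make the idea more concrete, here is a minimal sketch of how a
SKOSware-side change could be mapped to a request against the
"/entityhub/entity" interface. Several things here are assumptions and not
confirmed API: that the entity id is passed in an `id` query parameter, that
the payload is RDF/XML, and that create/update/delete map to
POST/PUT/DELETE. The host and the helper names entity_request and
sync_change are hypothetical.

```python
import urllib.parse
import urllib.request

# Assumption: local Stanbol launcher; adjust host and port as needed.
ENTITYHUB_ENTITY_URL = "http://localhost:8080/entityhub/entity"


def entity_request(method, concept_uri, rdf_xml=None, endpoint=ENTITYHUB_ENTITY_URL):
    """Build (but do not send) a create/update/delete request for one concept.

    Assumptions: the entity id travels in an `id` query parameter and the
    payload is RDF/XML. `endpoint` could later point at a writable
    referenced site if that feature gets implemented.
    """
    url = endpoint + "?" + urllib.parse.urlencode({"id": concept_uri})
    data = rdf_xml.encode("utf-8") if rdf_xml is not None else None
    headers = {"Content-Type": "application/rdf+xml"} if data is not None else {}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def sync_change(change, concept_uri, rdf_xml=None):
    """Map a thesaurus change ('added', 'updated', 'deleted') to a request."""
    methods = {"added": "POST", "updated": "PUT", "deleted": "DELETE"}
    return entity_request(methods[change], concept_uri, rdf_xml)
```

Called after every edit in the thesaurus tool, something like sync_change
would keep single Concepts in sync without a bulk re-import.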

best
Rupert

> >
> >
> > Thanks again for all your useful suggestions. We will give a try to annotate.js soon. It seems very cool!
> > Do you think that we can use it also in page presentation or is it specific for rich html editors?
> >
> 
> I am no expert in that topic. I would rather directly ask the VIE community via
> 
>    http://groups.google.com/group/viejs
> 
> OK. Thanks. 
> 
> best
> Rupert
> 
> Best 
> 
>     -Andrea
> 


Re: Etcware early adoption proposal

Posted by Andrea Ciapetti <an...@gmail.com>.
On Fri, Mar 2, 2012 at 15:50, Rupert Westenthaler <
rupert.westenthaler@gmail.com> wrote:

>
> [...]
>
> login with guest/guest does not work for me. I will try it again @home
> maybe it is a problem with the Company firewall.
>

*Ouch! The guest user has been disabled (perhaps for security reasons).*
*I've created a user specifically for you (rupert/westenthaler). This should
work as expected.*


> > Under the scenes, the PHP frontend access the REST interface (try for
> example
> http://www.skosware.it/rest/scheme/picoscheme/concept/abitazioni.rdf?namespace=http://culturaitalia.it/pico/thesaurus/4.1 or
> http://www.skosware.it/rest/scheme/picoscheme.rdf for some significative
> examples).
> >
> This link works
>

*The REST interface does not require credentials.*


>
> No. I was only thinking that it would be nicer if you could update the
> thesaurus in the Apache Entityhub after every change made within SKOSware
> (keep it in sync) rather than using the bulk import as supported by the
>  Generic RDF indexer. This would allow use cases where the user adds a new
> term to the Thesaurus and can immediately used it with the Stanbol Enhancer.
>

*Sounds interesting. What kind of functionality would need to be implemented
for this to work?*


> >
> >
> > Thanks again for all your useful suggestions. We will give a try to
> annotate.js soon. It seems very cool!
> > Do you think that we can use it also in page presentation or is it
> specific for rich html editors?
> >
>
> I am no expert in that topic. I would rather directly ask the VIE
> community via
>
>    http://groups.google.com/group/viejs


*OK. Thanks.*

> best
> Rupert


*Best*

*    -Andrea*

Re: Etcware early adoption proposal

Posted by Rupert Westenthaler <ru...@gmail.com>.
On 02.03.2012, at 13:10, Andrea Ciapetti wrote:

> was not able to look this up, because the website seems do be down at the moment.
> 
> Hi Rupert, the web site is up. Please use http://www.skosware.it/swphp for accessing the admin interface of SKOSware (our fault, we should redirect the user to the admin interface and not simply show the Apache default page ;-). Use the credentials guest/guest to login. The service is in beta test, so please forgive us for any problems found on some specific concepts or thesaurus.

Login with guest/guest does not work for me. I will try again from home; maybe it is a problem with the company firewall.

> Under the scenes, the PHP frontend access the REST interface (try for example http://www.skosware.it/rest/scheme/picoscheme/concept/abitazioni.rdf?namespace=http://culturaitalia.it/pico/thesaurus/4.1 or http://www.skosware.it/rest/scheme/picoscheme.rdf for some significative examples).
> 
This link works 

> Some technical details, if you're interested: REST interface is implemented with RestEasy platform (http://www.jboss.org/resteasy, it's an interesting alternative to Jersey in my opinion) on JBoss 6. RDF repository used for storing triples is Allegrograph (http://www.franz.com/agraph/allegrograph/, free edition flavour).
>  

As an Apache project we would rather switch to "http://incubator.apache.org/wink/", and I think there was already a discussion about that.
>  
> >   1. Thesauri selected from SKOSware are imported into Stanbol to create a
> >   base custom knowledge domain.
> 
> Currently this is only possible by using the Entityhub Indexing tool. This is OK for one-time imports and sporadic updates, but it might not be sufficient for your use case. So if you have additional requirements we might need to add some new functionality.
> 
> We have already imported the Garante SKOS thesaurus with the Generic RDF indexer and the concepts are shown nicely from the enhancement engines, too. Are we doing something wrong?
> 

No. I was only thinking that it would be nicer if you could update the thesaurus in the Apache Entityhub after every change made within SKOSware (keeping it in sync) rather than using the bulk import supported by the Generic RDF indexer. This would allow use cases where the user adds a new term to the thesaurus and can immediately use it with the Stanbol Enhancer.

> 
> 
> Thanks again for all your useful suggestions. We will give a try to annotate.js soon. It seems very cool!
> Do you think that we can use it also in page presentation or is it specific for rich html editors?
>  

I am no expert on that topic. I would suggest asking the VIE community directly via

    http://groups.google.com/group/viejs


best
Rupert

Re: Etcware early adoption proposal

Posted by Alessandro Adamou <ad...@cs.unibo.it>.
On 3/2/12 1:10 PM, Andrea Ciapetti wrote:
>>    6. Inference rules and semantic reasoning will be used to complete and
>>>    enrich the domain knowledge base, thus suggesting additional concepts
>> and
>>>    OWL relations.
>> Stanbol includes support for rules and reasoning. However I am not an
>> expert with that.
>>
>> Simple reasoning things can be also implemented by using LDPath directly
>> on the Entityhub. [...]
>>
> *We hope that Alessandro Adamou and David Riccitelli can give us a helpful
> hand on this hard task ;-)*
> *Anyway probably LDPath is enough for starting.*

Certainly glad to help from my side - ontology management

Alessandro

-- 
M.Sc. Alessandro Adamou

Alma Mater Studiorum - Università di Bologna
Department of Computer Science
Mura Anteo Zamboni 7, 40127 Bologna - Italy

Semantic Technology Laboratory (STLab)
Institute for Cognitive Science and Technology (ISTC)
National Research Council (CNR)
Via Nomentana 56, 00161 Rome - Italy


"As for the charges against me, I am unconcerned. I am beyond their timid, lying morality, and so I am beyond caring."
(Col. Walter E. Kurtz)

Not sent from my iSnobTechDevice


Re: Etcware early adoption proposal

Posted by Andrea Ciapetti <an...@gmail.com>.
>
> was not able to look this up, because the website seems do be down at the
> moment.
>

*Hi Rupert, the web site is up. Please use *http://www.skosware.it/swphp *to
access the admin interface of SKOSware (our fault, we should redirect
the user to the admin interface and not simply show the Apache default page
;-). Use the credentials guest/guest to log in. The service is in beta
test, so please forgive any problems you find with specific concepts
or thesauri.*

*Behind the scenes, the PHP frontend accesses the REST interface (try, for
example, *
http://www.skosware.it/rest/scheme/picoscheme/concept/abitazioni.rdf?namespace=http://culturaitalia.it/pico/thesaurus/4.1
 *or *http://www.skosware.it/rest/scheme/picoscheme.rdf *for some
representative examples).*

*Some technical details, if you're interested: the REST interface is
implemented with the RESTEasy platform (*http://www.jboss.org/resteasy, *it's
an interesting alternative to Jersey in my opinion) on JBoss 6. The RDF
repository used for storing triples is AllegroGraph (*
http://www.franz.com/agraph/allegrograph/, *free edition).*


>
> nice use case. Do you also plan to use extracted enhancements as
> suggestions to extend the thesaurus managed by your tool? I am personally
> interested in the feasibility of such use cases.
>
>
*Yes, exactly. We are thinking of using "Rules" in the background for thesaurus
enrichment, but we are open to suggestions.*


> >   1. Thesauri selected from SKOSware are imported into Stanbol to create
> a
> >   base custom knowledge domain.
>
> Currently this is only possible by using the Entityhub Indexing tool. This
> is OK for one-time imports and sporadic updates, but it might not be
> sufficient for your use case. So if you have additional requirements we
> might need to add some new functionality.
>

*We have already imported the Garante SKOS thesaurus with the Generic RDF
indexer, and the concepts are also shown nicely by the enhancement engines.
Are we doing something wrong?*


> You might also want to have a look at EnhancementChains. This would allow
> you to configure multiple enhancement endpoints that use a different set of
> Thesauri. The documentation of the Stanbol Enhancer [1] provides more
> information on that
>
> [1] http://incubator.apache.org/stanbol/docs/trunk/enhancer/


*Thanks a lot for the suggestion. We are going to study this section
in depth.*


> The Stanbol Contenthub allows to build Semantic Indexes based on
> Enhancement Results. This components basically allows you to configure you
> own SemanticIndex layout (by using LDPath[2]). Queries to the managed
> semantic index directly use Apache Solr [3].
> Note that we do plan to provide considerable improvements to this
> components in the coming months [4].
>
> [2] http://code.google.com/p/ldpath/
> [3]  http://lucene.apache.org/solr/
> [4] https://issues.apache.org/jira/browse/STANBOL-471


*We have tried LDPath and it already seems very powerful for our simple
needs. We plan to keep rebuilding from the SVN trunk periodically to pick up
new updates. Thanks.*


> >   5. As the end-user views portal contents, terms similar to SKOS
> concepts
> >   (skos:prefLabel or skos:altLabel are used for entity highlighting) are
> >   automatically decorated and their description is shown on some
> specific GUI
> >   event (like mouseover).
>
> You might be interested in nano "annotate.js" [5]. Try the demo at [6]
>
> [5] https://github.com/szabyg/annotate.js
> [6] http://dev.iks-project.eu:8081/enhancervie
>

*Thanks again for all your useful suggestions. We will give annotate.js
a try soon. It seems very cool!*
*Do you think that we can also use it in page presentation, or is it
specific to rich HTML editors?*


>

>   6. Inference rules and semantic reasoning will be used to complete and
> >   enrich the domain knowledge base, thus suggesting additional concepts
> and
> >   OWL relations.
>
> Stanbol includes support for rules and reasoning. However I am not an
> expert with that.
>
> Simple reasoning things can be also implemented by using LDPath directly
> on the Entityhub. [...]
>

*We hope that Alessandro Adamou and David Riccitelli can give us a helping
hand with this hard task ;-)*
*Anyway, LDPath is probably enough to start with.*


>
> >   7. Optional use of some IKS VIE widgets on the frontend presentation
> >   layer.
>
> Ok you are already aware of VIE ^^
>
> best
> Rupert
>
>

*Best regards and thank you again.*

*    -Andrea*



-- 
--------------------------------------------------
Andrea Ciapetti
Etcware srl

Email: a.ciapetti@etcware.it
          andrea.ciapetti@gmail.com
Mobile: +39 320 6197534
--------------------------------------------------

Re: Etcware early adoption proposal

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Andrea


On 29.02.2012, at 16:08, Andrea Ciapetti wrote:
> 
> Our company has also acquired significant competences in the usage of
> semantic technologies and standards. After the experience in an Italian
> research project, we have developed a product for SKOS thesaurus publishing
> and management, named SKOSware (http://www.skosware.it).
> 
I was not able to look this up because the website seems to be down at the moment.
> 
> Our vision is to integrate Stanbol in place of our manual metadata
> enrichment for the CMS contents. This will allow us to add additional
> content enrichment through Stanbol engines. Moreover, content enrichment
> and tagging will become mostly automatic in this way. Stanbol integration
> in our Liferay solution will be “loosely coupled” to allow an easy porting
> in the next version of the CMS, and to enable a maximum degree of reuse of
> our semantic customization.

Nice use case. Do you also plan to use extracted enhancements as suggestions to extend the thesaurus managed by your tool? I am personally interested in the feasibility of such use cases.

> *The solution will be integrated in the Italian data protection Authority
> portal as a demo*, running Stanbol enhancement engines on their document
> corpus composed by 12.000 items, 2.000 of which already manually enriched
> with metadata.
> 
> Our plan to integrate Stanbol is based on the following steps:
> 
>   1. Thesauri selected from SKOSware are imported into Stanbol to create a
>   base custom knowledge domain.

Currently this is only possible by using the Entityhub Indexing tool. This is OK for one-time imports and sporadic updates, but it might not be sufficient for your use case. So if you have additional requirements we might need to add some new functionality.

You might also want to have a look at EnhancementChains. This would allow you to configure multiple enhancement endpoints, each using a different set of thesauri. The documentation of the Stanbol Enhancer [1] provides more information on that.

[1] http://incubator.apache.org/stanbol/docs/trunk/enhancer/

>   2. The Content editor creates or updates contents and documents on
>   Liferay. These contents are enriched through Stanbol enhancement engines,
>   on editing post-process event.
>   3. The Liferay administrator launches Stanbol automatic metadata
>   enrichment for all contents and documents (batch enrichment process).
>   4. The End-user searches contents and documents by using full-text
>   search or tag-cloud-based search and refines the results or expands the
>   search scope on similar or related contents (under the scenes SKOS
>   thesaurus concepts and semantic relations are used to define the related
>   contents).

The Stanbol Contenthub allows you to build Semantic Indexes based on Enhancement Results. This component basically allows you to configure your own SemanticIndex layout (by using LDPath [2]). Queries to the managed semantic index directly use Apache Solr [3].
Note that we plan to provide considerable improvements to this component in the coming months [4].

[2] http://code.google.com/p/ldpath/
[3]  http://lucene.apache.org/solr/
[4] https://issues.apache.org/jira/browse/STANBOL-471

>   5. As the end-user views portal contents, terms similar to SKOS concepts
>   (skos:prefLabel or skos:altLabel are used for entity highlighting) are
>   automatically decorated and their description is shown on some specific GUI
>   event (like mouseover).

You might be interested in nano "annotate.js" [5]. Try the demo at [6]

[5] https://github.com/szabyg/annotate.js
[6] http://dev.iks-project.eu:8081/enhancervie

>   6. Inference rules and semantic reasoning will be used to complete and
>   enrich the domain knowledge base, thus suggesting additional concepts and
>   OWL relations.

Stanbol includes support for rules and reasoning. However I am not an expert with that.

Simple reasoning can also be implemented by using LDPath directly on the Entityhub. You can try

    http://dev.iks-project.eu:8081/entityhub/site/gemet/find

search e.g. for 

    nuclear

and use the following LDPath

skos:prefLabel;
skos:altLabel;
skos:hiddenLabel;
rdfs:label = (skos:prefLabel | skos:altLabel | skos:hiddenLabel);
skos:notation;

skos:inScheme;

skos:broader = (skos:broader | ^skos:narrower);
skos:broaderTransitive = (skos:broader | ^skos:narrower)+;

skos:narrower = (^skos:broader | skos:narrower);
skos:narrowerTransitive = (^skos:broader | skos:narrower)+;

skos:related = (skos:related | skos:relatedMatch);
skos:relatedMatch;
skos:exactMatch = (skos:exactMatch)+;
skos:closeMatch = (skos:closeMatch | (skos:exactMatch)+);
skos:broaderMatch = (^skos:narrowMatch | skos:broaderMatch);
skos:narrowMatch = (skos:narrowMatch | ^skos:broaderMatch);


This will provide you with SKOS concepts that match the search term, and will also include information such as the transitive closure for broaderTransitive and narrowerTransitive.
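
To illustrate what a path such as "(skos:broader | ^skos:narrower)+" computes,
here is a small Python sketch of the same idea over an in-memory graph. It is
not how LDPath or the Entityhub implement it; the two dictionaries and the
concept names are made-up toy data, not GEMET content.

```python
# Toy data (hypothetical): concept -> concepts it lists as skos:broader,
# and concept -> concepts it lists as skos:narrower.
BROADER = {"nuclear reactor": ["nuclear facility"]}
NARROWER = {"industrial facility": ["nuclear facility"]}


def broader_of(concept, broader, narrower):
    """One step of (skos:broader | ^skos:narrower)."""
    direct = set(broader.get(concept, ()))
    # ^skos:narrower: every concept that names `concept` as its narrower term.
    inverse = {c for c, children in narrower.items() if concept in children}
    return direct | inverse


def broader_transitive(concept, broader, narrower):
    """All concepts reachable by one or more such steps (the `+` operator)."""
    seen = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for parent in broader_of(current, broader, narrower):
            if parent not in seen:
                seen.add(parent)
                frontier.append(parent)
    return seen
```

In this toy graph, broader_transitive("nuclear reactor", BROADER, NARROWER)
collects both "nuclear facility" (stated directly) and "industrial facility"
(found only through the inverse skos:narrower link), which is why the path
combines both directions.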

>   7. Optional use of some IKS VIE widgets on the frontend presentation
>   layer.

Ok you are already aware of VIE ^^

best
Rupert