Posted to dev@stanbol.apache.org by Rupert Westenthaler <ru...@gmail.com> on 2013/03/18 10:04:15 UTC

Enhancer API extensions / changes

Hi all,

The intention of this mail is to inform all Stanbol Enhancer users
about upcoming additions to the Stanbol Enhancer that will also
involve (incompatible) API changes to the EnhancementEngine interface.

This mail only provides an overview of those changes and their
rationale. For detailed information please have a look at the
discussions in [1], [2] and also [4].

Changes will be applied to the trunk (Stanbol Enhancer version
0.11.0-SNAPSHOT).


EnhancementEngine API change
========================

While most of those changes will only affect lower-level APIs, there
will also be a change to the EnhancementEngine API. Users with custom
EnhancementEngines will therefore need to adapt their implementations.
As described in the last comment of [1], the API of EnhancementEngine
will change to

    int canEnhance(ContentItem ci,
        Map<String, Object> enhancementContext)
    void computeEnhancements(ContentItem ci,
        Map<String, Object> enhancementContext)

The enhancementContext will contain request-specific configurations.
EnhancementEngines that support those will need to consider them in
addition to the configuration passed to the activate(..) method of the
component.

Typical usage examples:

* passing user name and password for an external service
* passing the document password for protected rich text documents to the TikaEngine

However, this will also allow advanced use cases like passing the user
and group information needed to consider ACLs for EntityLinking (as
described in [3]).

In addition it will allow configurations to be moved from the Stanbol
instance to the enhancement request. This is desirable for use cases
such as the one described in [4], where you want to use the same
Stanbol configuration on multiple hosts to do load balancing.
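
To illustrate the "in addition to" part, here is a minimal sketch of an
engine that prefers a request-specific value from the enhancementContext
and falls back to the configuration it received on activation. The nested
ContentItem and EnhancementEngine types are simplified stand-ins so the
example compiles on its own. The property keys, the return values of
canEnhance and the rule that the request value wins are assumptions made
for this sketch, not decisions taken in [1].

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: the nested types are simplified stand-ins, not the Stanbol classes. */
final class RequestConfigSketch {

    /** Hypothetical stand-in for Stanbol's ContentItem. */
    interface ContentItem {
        String getUri();
    }

    /** The two method signatures proposed in this mail. */
    interface EnhancementEngine {
        int canEnhance(ContentItem ci, Map<String, Object> enhancementContext);
        void computeEnhancements(ContentItem ci, Map<String, Object> enhancementContext);
    }

    /** Engine calling an external service that needs credentials. */
    static final class ExternalServiceEngine implements EnhancementEngine {
        // Hypothetical property keys, used for both configuration sources.
        static final String USER = "service.user";
        static final String PASSWORD = "service.password";
        // Return values for canEnhance, assumed for this sketch.
        static final int CANNOT_ENHANCE = 0;
        static final int ENHANCE_SYNCHRONOUS = 1;

        private final Map<String, Object> componentConfig;
        private String lastUser; // recorded so the effect is observable

        /** Takes the configuration the component received on activation. */
        ExternalServiceEngine(Map<String, Object> componentConfig) {
            this.componentConfig = new HashMap<>(componentConfig);
        }

        /** Request-specific value if present, else the component configuration. */
        private Object lookup(String key, Map<String, Object> enhancementContext) {
            Object value = enhancementContext == null ? null : enhancementContext.get(key);
            return value != null ? value : componentConfig.get(key);
        }

        @Override
        public int canEnhance(ContentItem ci, Map<String, Object> enhancementContext) {
            boolean hasCredentials = lookup(USER, enhancementContext) != null
                    && lookup(PASSWORD, enhancementContext) != null;
            return hasCredentials ? ENHANCE_SYNCHRONOUS : CANNOT_ENHANCE;
        }

        @Override
        public void computeEnhancements(ContentItem ci, Map<String, Object> enhancementContext) {
            // A real engine would call the external service here.
            lastUser = String.valueOf(lookup(USER, enhancementContext));
        }

        String getLastUser() {
            return lastUser;
        }
    }
}
```

An engine that does not support request-specific configuration would
simply never read the second parameter.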

EnhancementJob
=============

This new class will be used to represent an enhancement job. It will
contain the ContentItem, the enhancement chain, the execution metadata
and the enhancement context.

The ExecutionMetadata, currently stored as a ContentPart of the
ContentItem, will move over to the EnhancementJob. Note that this means
that EnhancementEngines will no longer be able to access this
information.

The "enhancementContext" passed to an EnhancementEngine will be
created by merging EnhancementChain-level properties with
EnhancementEngine-specific properties. This means that properties
defined on Chain level will be visible to all EnhancementEngines
called by a chain, while EnhancementEngine-level properties will only
be visible to a specific Engine.

EnhancementEngine-level properties are supported to allow diverging
configurations for multiple instances of the same EnhancementEngine
implementation being called in the same enhancement chain. A typical
example is multiple EntityLinking engines for different vocabularies
(e.g. dbpediaLinking and geonamesLinking).
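
The merge described above could look roughly like the following sketch.
The class and method names and the property keys in the usage example are
hypothetical. The sketch also assumes that an engine-specific property
overrides a chain-level property with the same key, which this mail does
not state.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of building the enhancementContext for one engine of a chain. */
final class EnhancementContextMerge {

    private EnhancementContextMerge() {}

    /**
     * @param engineName       name of the engine the context is built for
     * @param chainProperties  properties visible to all engines of the chain
     * @param engineProperties engine name to properties visible only to that engine
     * @return a new map holding the merged enhancementContext
     */
    static Map<String, Object> contextFor(String engineName,
            Map<String, Object> chainProperties,
            Map<String, Map<String, Object>> engineProperties) {
        // Start with everything defined on chain level.
        Map<String, Object> context = new HashMap<>(chainProperties);
        Map<String, Object> specific = engineProperties.get(engineName);
        if (specific != null) {
            // Assumption: the engine level wins on a key clash.
            context.putAll(specific);
        }
        return context;
    }
}
```

With a chain-level property "language" and a separate "site" property for
dbpediaLinking and for geonamesLinking, both engines would see "language"
but each would only see its own "site".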

The EnhancementJobManager will be responsible for processing
EnhancementJobs. Therefore the API will be changed to take an
EnhancementJob instead of a ContentItem.
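
As a rough illustration only, the following sketch bundles the four parts
named above into one object and lets a job manager work on it. Everything
here is simplified and hypothetical: the chain is a plain list of engine
names, the execution metadata is a plain list of executed engines, and
the method names are assumptions. The actual interface is still to be
defined (see the work plan below).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: simplified stand-ins, not the Stanbol classes. */
final class EnhancementJobSketch {

    /** Hypothetical stand-in for Stanbol's ContentItem. */
    interface ContentItem {
        String getUri();
    }

    /** Holds the content item, chain, execution metadata and context. */
    static final class EnhancementJob {
        private final ContentItem contentItem;
        private final List<String> chain; // engine names in execution order
        private final Map<String, Object> enhancementContext;
        private final List<String> executionMetadata = new ArrayList<>();

        EnhancementJob(ContentItem contentItem, List<String> chain,
                Map<String, Object> enhancementContext) {
            this.contentItem = contentItem;
            this.chain = Collections.unmodifiableList(new ArrayList<>(chain));
            this.enhancementContext =
                    Collections.unmodifiableMap(new HashMap<>(enhancementContext));
        }

        ContentItem getContentItem() { return contentItem; }
        List<String> getChain() { return chain; }
        Map<String, Object> getEnhancementContext() { return enhancementContext; }

        /** Read-only view; engines never see this, only the job manager does. */
        List<String> getExecutionMetadata() {
            return Collections.unmodifiableList(executionMetadata);
        }

        void markExecuted(String engineName) {
            executionMetadata.add(engineName);
        }
    }

    /** The manager takes an EnhancementJob instead of a ContentItem. */
    interface EnhancementJobManager {
        void enhanceContent(EnhancementJob job);
    }

    /** Trivial manager that walks the chain in order. */
    static final class SequentialJobManager implements EnhancementJobManager {
        @Override
        public void enhanceContent(EnhancementJob job) {
            for (String engineName : job.getChain()) {
                // A real manager would look up the engine and call
                // computeEnhancements(..) with the job's ContentItem and context.
                job.markExecuted(engineName);
            }
        }
    }
}
```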

/enhancer/task RESTful service
=======================

This endpoint will allow EnhancementJobs to be submitted. The proposal is:

1. a new endpoint is added at /enhancer/task
2. the endpoint takes a Task Request (interface to be defined)
3. the Task Request will allow to post:
    * content or URL submission
    * per-call engine parameters
    * per-call EnhancementChain definitions
4. it supports synchronous operation and possibly asynchronous
execution with a callback URI

[5] suggests using JSON for the definition of such tasks, but in
principle the definition could also be expressed in RDF.

As pointed out in [6], it would also be possible to extend the current
"MultiPart ContentItem support" to achieve the same functionality.

WorkPlan
=======

1. change the API of the EnhancementEngine interface. At first, empty
maps will be passed as enhancementContext (STANBOL-488). Adapt all
EnhancementEngine implementations so that they ignore the additional
parameter.
2. define the EnhancementJob interface and implement it based on the
EnhancementJob class of the o.a.s.enhancer.jobmanager.event module
3. specify and implement the "/enhancer/task RESTful service"
4. adapt the "MultiPart ContentItem support" to support EnhancementJobs
5. adapt existing EnhancementEngines to support the enhancementContext
(where applicable)

best
Rupert

[1] https://issues.apache.org/jira/browse/STANBOL-488
[2] http://markmail.org/message/hnwdw7o6bxt6pwbe
[3] https://github.com/nuxeo/nuxeo-solr/tree/master/architecture
[4] http://markmail.org/message/wba4ztzkkhvahcyg
[5] http://markmail.org/message/zqztwjhndwj74jqv (part of [2])
[6] http://markmail.org/message/bslhb7ojexdbv56l (part of [2])
--
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen

Re: Enhancer API extensions / changes

Posted by Fabian Christ <ch...@googlemail.com>.
Hi,

In order to preserve a compatible version before the breaking changes
are introduced, I created branches for enhancer-0.10.1 and
enhancement-engines-0.10.1 [1]. The trunk will be upgraded to
enhancer-0.11.0 and enhancement-engines-0.11.0, and the breaking
changes will be introduced on the trunk in the near future.

Once we have a new commons and entityhub release, we can also release
the enhancer-0.10.1 stuff from the branches.

[1] https://issues.apache.org/jira/browse/STANBOL-982



-- 
Fabian
http://twitter.com/fctwitt