Posted to dev@stanbol.apache.org by Tanneguy DULONG <ta...@arisem.com> on 2012/01/17 16:41:03 UTC

RE : Feedback about stanbol-414 specification

Hi all,

@Florent
FYI, Talend did a pretty good job of packaging Camel within their Talend Open Studio (TOS) ESB.
TOS is open source and provides a documented Eclipse RCP graphical design tool for routes (among other nice features).

I have not tested the Camel integration, but I found the ETL part to have a high level of polish (better than some commercial software, actually).
If, as I surmise, some chain prototyping could be done from there, it could save you a lot of time.

Hope it helps

Tanneguy




________________________________________
From: florent andré [florent.andre-dev@4sengines.com]
Sent: Tuesday, 17 January 2012 00:52
To: stanbol-dev@incubator.apache.org
Subject: Feedback about stanbol-414 specification

Hi Rupert, *

First, thanks a lot for this first draft definition.

I really like the idea of an RDF graph description of "enhancement
chain" and "engine".

Here are my points:

°°°°°° Enterprise integration patterns (EIP) and Apache Camel °°°°°°

My main remark is that the draft does not use a well-known, well-defined
pattern: the enterprise integration patterns [1].
Behind this "big name", it is all about transferring messages between
"processing units".
Camel is a very generic framework that implements most of the EIPs [2],
where messages and processing units can be almost anything.
Applied to Stanbol, we can consider a ContentItem as the message and
Engines as the processing units.
Cherry on the cake: Camel takes care not only of messages and processing
units but also of the machinery that orchestrates them (polling, ordering,
grouping, error management, ...), and provides pretty simple ways to
manage it.
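
To make the mapping concrete, here is a minimal sketch in plain Python. It
uses neither the Camel nor the Stanbol API: the ContentItem class, the two
toy engines and run_route are all hypothetical names invented for the
illustration. It only shows the idea of a message travelling through a
sequence of processing units, with error handling kept in one place.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ContentItem:
    """Hypothetical stand-in for a Stanbol ContentItem: the EIP 'message'."""
    text: str
    enhancements: List[str] = field(default_factory=list)


# An engine (the EIP 'processing unit') is modelled as a plain function.
Engine = Callable[[ContentItem], ContentItem]


def language_engine(item: ContentItem) -> ContentItem:
    # Toy engine: always claims the text is English.
    item.enhancements.append("lang=en")
    return item


def entity_engine(item: ContentItem) -> ContentItem:
    # Toy engine: spots a single hard-coded entity.
    if "Paris" in item.text:
        item.enhancements.append("entity=Paris")
    return item


def run_route(item: ContentItem,
              engines: List[Engine],
              on_error: Optional[Callable[[Engine, Exception], None]] = None
              ) -> ContentItem:
    """Pass the message through each engine in order.

    If on_error is given, a failing engine is reported and skipped;
    otherwise the exception propagates to the caller.
    """
    for engine in engines:
        try:
            item = engine(item)
        except Exception as exc:
            if on_error is None:
                raise
            on_error(engine, exc)
    return item


result = run_route(ContentItem("A trip to Paris"),
                   [language_engine, entity_engine])
print(result.enhancements)
```

In a real setup the loop and the error policy would be the framework's job;
the sketch only shows why engines can stay unaware of each other.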

Enough of my Camel "sales pitch" :). I will just say that I will
really try to commit the first version of a Camel enhancer this week.

By the way, as far as I know, Camel doesn't provide a graph-to-route
("route" is Camel's term for chain) or route-to-graph utility... but there
are well-defined DSLs (Spring [3], Scala, ...), so this can be a clue.

°°°°°° Forward building of chain °°°°°°

In your proposal, the chain is built in a "forward" manner:
you know that A comes before B because B depends on A (property ep:dependsOn).

I don't really like this way of defining a chain (though that may be mostly
my personal taste), for two main reasons:
- For a human, building, and even more so reading, understanding and
forming a mental picture of a chain built this way is pretty difficult,
and the difficulty increases with chain complexity. Forward processing is
not a natural way of thinking about chains.
- A chain is about processing data, information, messages, and usually
information comes from one point and goes to another... IMO describing a
chain is more about describing the path of the message than the inner
structure of the chain.
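
A small sketch of the difference, in plain Python. The engine names are
hypothetical and the dictionary only mimics the ep:dependsOn property; it
is not how the draft stores it. With the dependency form, the execution
order has to be computed (here with a topological sort from the standard
library); with the path form, the order is simply what is written.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Dependency form: each engine lists the engines it depends on,
# mimicking ep:dependsOn. Engine names are made up for the example.
depends_on = {
    "linking": {"ner"},
    "ner": {"tokenizer"},
    "tokenizer": {"langid"},
    "langid": set(),
}


def execution_order(dependencies):
    """Derive a valid execution order from the 'depends on' relation."""
    return list(TopologicalSorter(dependencies).static_order())


# Path form: the same chain, written as the route the message follows.
path = ["langid", "tokenizer", "ner", "linking"]

print(execution_order(depends_on))
print(path)
```

Both describe the same chain, but only the second one can be read from
left to right without doing the sort in your head.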

°°°°°° Missing features °°°°°°

There are IMO two main features missing from this definition:
1) No way to link chains to each other ("chain linking")
2) No way to select engines (or subchains) depending on a condition
("selector")

Let's illustrate these features with an example:

Imagine we have these 4 chains already defined:
- MusicChain : defines a chain with music-specific engines (thesaurus,
ws, etc.)
- FoodChain : defines a chain with food-specific engines
- PizzaChain : the best chain for pizza
- otherStuffChain : chain for the rest

So far so good, but now I have some content and no idea what it is about...
I can submit it to all chains (not optimal), or to one random chain
(with the risk of putting a restaurant story in the musicChain)...

So let's define a CategorisationChain.
This chain has, for example, the topic engine and a generic dbpedia enhancer.
At the end of the chain we have a graph that gives us a pretty good idea
of the content's nature.

Now, with the "chain linking" and "selector" features we can define an
"UltimateBigChain" like this:

from(input_file) --> categorisationChain
    --> if (graph has "music") --> musicChain.
    --> elseif (graph has "food") --> foodChain
            --> if (graph has "pizza") --> pizzaChain.
    --> otherwise() --> otherStuffChain.
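
The route above can be sketched in plain Python to show that "chain
linking" and "selector" are two small combinators. Everything here is
hypothetical: the ContentItem class, the keyword-based categorisation and
the helper names (link, select, when, has, tagging_chain) are assumptions
made for the illustration, not Stanbol or Camel API.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """Hypothetical content item: raw text plus a set standing in for the graph."""
    text: str
    graph: set = field(default_factory=set)


def tagging_chain(name):
    """Build a dummy chain that only records that it ran."""
    def chain(item):
        item.graph.add("processed-by:" + name)
        return item
    return chain


def categorisation_chain(item):
    # Toy categorisation: keyword spotting instead of real topic engines.
    for topic in ("music", "food", "pizza"):
        if topic in item.text.lower():
            item.graph.add(topic)
    item.graph.add("processed-by:categorisationChain")
    return item


def link(*chains):
    """Chain linking: run the given chains one after the other."""
    def linked(item):
        for chain in chains:
            item = chain(item)
        return item
    return linked


def select(branches, otherwise):
    """Selector: run the chain of the first matching predicate, else 'otherwise'."""
    def selector(item):
        for predicate, chain in branches:
            if predicate(item):
                return chain(item)
        return otherwise(item)
    return selector


def when(predicate, chain):
    """Run the chain only if the predicate holds; otherwise pass the item through."""
    return select([(predicate, chain)], lambda item: item)


def has(topic):
    """Predicate: does the item's graph mention this topic?"""
    return lambda item: topic in item.graph


ultimate_big_chain = link(
    categorisation_chain,
    select(
        [
            (has("music"), tagging_chain("musicChain")),
            (has("food"), link(tagging_chain("foodChain"),
                               when(has("pizza"), tagging_chain("pizzaChain")))),
        ],
        otherwise=tagging_chain("otherStuffChain"),
    ),
)

item = ultimate_big_chain(ContentItem("Good food: a review of the best pizza in town"))
print(sorted(item.graph))
```

Because a linked chain and a selector are themselves chains, they can be
nested freely, which is exactly what the pizza branch does.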

My two cents...
++

[1] : http://www.enterpriseintegrationpatterns.com/toc.html
[2] : http://camel.apache.org/eip.html
[3] : http://camel.apache.org/schema/spring/camel-spring-2.9.0.xsd