Posted to dev@stanbol.apache.org by florent andré <fl...@4sengines.com> on 2014/05/01 13:05:31 UTC

Community bonding period started

Hi there !

As you may have noticed, the GSoC community bonding period began some time ago.

Regarding the Camel/Stanbol integration [1], Antonio's good proposal was
accepted! Congrats!
So Antonio, now the bonding has to start! :)

From my point of view, a good way to bond the community to this
integration could be to create sub-issues of STANBOL-1008, which can be
considered the main issue. That way we can see the specific actions you
will take, discuss specific parts in the related sub-issue, and get a
global overview by looking at the parent issue.

Antonio, what do you think? Can you do that?

As a side note, I remembered this morning this mail exchange [2], which
may give you pointers or ideas for Camel routes / flowcharts that are
"easy to set up through REST".

Happy bonding !
++


[1] Be warned: I don't know whether anyone can access it:
https://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/adperezmorales3/5629499534213120

[2] 
http://mail-archives.apache.org/mod_mbox/incubator-stanbol-dev/201206.mbox/%3C4FDFC494.3090309@4sengines.com%3E

Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi Rupert

On Tue, May 27, 2014 at 8:51 AM, Rupert Westenthaler <
rupert.westenthaler@gmail.com> wrote:

> On Mon, May 26, 2014 at 6:58 PM, Antonio David Perez Morales
> <ap...@zaizi.com> wrote:
> > Hi Florent, Rupert and all
> >
> > I have already created two new issues as sub-tasks of STANBOL-1008 ([1]
> > and [2]).
> > The first one aims to integrate Florent's current approach into
> > Stanbol 1.0 to see whether it works well.
> > The second one is about adding support for new routes deployed either as
> > bundles, as XML files placed in a specific folder (containing routes and
> > loaded dynamically) or (if necessary) via a new REST endpoint receiving
> > XML route files to be loaded (or removed).
> >
>
> Make sure to evaluate the Apache Sling Installer [3] and/or the Felix
> File Installer [4]; both can be extended to support custom artifacts
> (such as XML route files).
>

I will take a look at them (before writing a dedicated folder monitor
with the Java WatchService) after testing (and adapting if necessary)
Florent's current code against the 1.0 version.

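For reference, a minimal sketch of what a WatchService-based folder monitor could look like if the installers turn out not to fit. The class name, the ".xml" suffix and the printed actions are assumptions for illustration only; none of this is existing Stanbol or Camel code, and a real monitor would hand the file to the Camel context instead of printing.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.Locale;

// Hypothetical helper, not part of Stanbol or Camel.
class RouteFolderWatcher {

    // Assumption: route definitions are XML files; the suffix is illustrative.
    static boolean isRouteFile(Path file) {
        Path name = file.getFileName();
        return name != null
                && name.toString().toLowerCase(Locale.ROOT).endsWith(".xml");
    }

    // Blocks and reports route files that appear in or disappear from 'dir'.
    static void watch(Path dir) throws IOException, InterruptedException {
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            dir.register(watcher,
                    StandardWatchEventKinds.ENTRY_CREATE,
                    StandardWatchEventKinds.ENTRY_DELETE);
            while (true) {
                WatchKey key = watcher.take(); // waits for the next batch of events
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                        continue; // events were lost; a real monitor would rescan
                    }
                    Path file = dir.resolve((Path) event.context());
                    if (!isRouteFile(file)) {
                        continue; // ignore anything that is not a route file
                    }
                    if (event.kind() == StandardWatchEventKinds.ENTRY_CREATE) {
                        System.out.println("load routes from " + file);
                    } else {
                        System.out.println("remove routes of " + file);
                    }
                }
                if (!key.reset()) {
                    break; // the directory is no longer accessible
                }
            }
        }
    }
}
```

Calling RouteFolderWatcher.watch(dir) blocks the calling thread, so in an OSGi bundle it would have to run in its own thread; that is one reason the existing installers may be the simpler option.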

> > I think this would be good progress for the midterm. This way (and
> > leveraging the engine Camel component created by Florent) we would have
> > covered the current Enhancement Chain execution process using Camel
> > routes. In fact it would be more powerful, because all the existing Camel
> > components could be used in the routes to perform advanced (or parallel)
> > processing.
> >
> > What do you think, guys?
> >
> > Taking into account that the second part of the GSoC is longer than the
> > first one, I would like to open a discussion about the new Camel
> > components to be created in the second part for use in routes (apart
> > from improving the engine component already developed). As discussed in
> > previous messages, some interesting components could be:
> >  - chain: In order to create routes based on existing chains
>
> +1
>
> >  - store: To store the result in EntityHub or another store
>
> As the result of Enhancement Routes is content + metadata, I cannot
> see what you want to "store" in the Entityhub, which is about managing
> Entities.
>

What I meant is that we can create a component to deal with the EntityHub
(getting and putting entities) and use that component in other routes
(inside or outside Stanbol).

> >  - entityhub: To query/update the entityhub component
>
> Maybe. If you can come up with a good use case ^^
>
> >  - contenthub: To develop a new content-hub using chain/engine components
> > and a solr/elasticsearch/whatever component (solr and elasticsearch
> > components already exist in Camel)
>
> IMO implementing a new Contenthub-like component is outside the scope
> of this GSoC project. However, if there are already Solr/Elasticsearch
> components, that would be really useful.
>

Yes Rupert, it was only an idea about what we can achieve by integrating
Camel and developing new components.

> >  - other components
> >
> > But I would like to sort them by importance, based on interesting
> > use cases like:
> >  - Fetch from: local folder / rss / mail / ...
> >  - Enhance with engine 1
> >  - Depending on the result of this engine, go to:
> >    - Chain 1
> >    - or to Chain 2 and 3 and merge the results
> >  - Output the result to: email / ftp / ... and contenthub
> >
> > Do you have any interesting use cases in mind which would be good to have
> > in Stanbol? That way, we can decide which components should be
> > developed first.
>
> A strong use case - possibly Enterprise Integration-like - would be cool.
>

Perfect, so I will start with the first issue, integrating the current
trial into 1.0, and then take a look at [3] and [4] to see whether they are
suitable for deploying new routes in XML (for Java routes we will use bundles).

After that, we can decide which components to develop first.

Regards


Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi all stanbolers

Continuing with the GSoC work, I have implemented a new component to index
content in Siren. This component extends the previously developed Stanbol
Solr component (which in turn builds on the Camel Solr component) to index
the content along with its enhancements in Siren, using the format Siren
expects. Unlike the Stanbol Solr component (which flattened the enhancement
properties, losing the relation between each property value and the entity
it belongs to), this format lets us keep the structure of the content with
all its enhancements as children, improving the results of subsequent
queries to the index.
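
To illustrate the difference between the flattened and the nested form, here is a small sketch using plain Java collections. It is not the Siren or Solr document API, and the class, field names ("label", "type") and sample entities are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: plain Java collections, not the Siren or Solr document API.
class FlatVsNestedDemo {

    // Builds one entity as a property -> value map.
    static Map<String, String> entity(String label, String type) {
        Map<String, String> e = new LinkedHashMap<>();
        e.put("label", label);
        e.put("type", type);
        return e;
    }

    // Flattening: every property becomes one multi-valued field of the document.
    static Map<String, List<String>> flatten(List<Map<String, String>> entities) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        for (Map<String, String> e : entities) {
            for (Map.Entry<String, String> p : e.entrySet()) {
                doc.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
            }
        }
        return doc;
    }

    // Flat match: each field is checked independently of the others.
    static boolean matchesFlat(Map<String, List<String>> doc, String label, String type) {
        List<String> none = Collections.emptyList();
        return doc.getOrDefault("label", none).contains(label)
                && doc.getOrDefault("type", none).contains(type);
    }

    // Nested match: both values must belong to the same child entity.
    static boolean matchesNested(List<Map<String, String>> entities, String label, String type) {
        for (Map<String, String> e : entities) {
            if (label.equals(e.get("label")) && type.equals(e.get("type"))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Map<String, String>> entities = Arrays.asList(
                entity("Paris", "Place"), entity("Bob Marley", "Person"));
        // Query: is there a Person labelled "Paris"?
        System.out.println("flat:   " + matchesFlat(flatten(entities), "Paris", "Person"));
        System.out.println("nested: " + matchesNested(entities, "Paris", "Person"));
    }
}
```

The flat document answers true for a Person labelled "Paris" (a false positive, because label and type come from different entities), while the nested form answers false.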

I also implemented the web part of the workflow endpoint. Like the enhancer
web part, it lets you enter a text to be sent to a specific route, and also
upload routes using a form. There are many things to improve here, but as a
first step I think it is good to have an easier way to upload and test
routes (we also have other mechanisms to add routes, like installing bundles
generated by the route archetype or placing a route file in the Stanbol
fileinstall directory). Keep in mind that, unlike with the enhancer, we can
have routes whose output is sent somewhere else (like Solr) or which are
triggered by some event (like a message sent to an ActiveMQ queue), so
only routes using a direct endpoint as their first step can be tested
through the web interface.
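
As a rough sketch of that restriction: a check on the consumer ("from") URI of each route is enough to decide whether it can be offered in the web form. The class, the route ids and the non-direct URIs below are hypothetical, and the actual workflow component may decide this differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the real workflow component may decide this differently.
class WebTestableRoutes {

    // True if the route's consumer ("from") URI uses Camel's direct scheme.
    static boolean startsWithDirect(String fromUri) {
        return fromUri != null && fromUri.startsWith("direct:");
    }

    // Returns the ids of the routes that could be offered in the web form.
    static List<String> testable(Map<String, String> fromUriByRouteId) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> route : fromUriByRouteId.entrySet()) {
            if (startsWithDirect(route.getValue())) {
                ids.add(route.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, String> routes = new LinkedHashMap<>();
        routes.put("stanbolsolr", "direct://stanbolsolr");
        routes.put("queueEnhancer", "activemq:queue:content"); // made-up route
        routes.put("nightly", "timer://nightly?period=60000"); // made-up route
        System.out.println(testable(routes)); // only the direct route remains
    }
}
```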

From here on, I will spend the remaining GSoC time improving things
and writing documentation. But if you think that some further
development is needed for any of your use cases (like a new component or
whatever), please let me know so that I can re-schedule the rest of the time.

Regards


On Fri, Jul 25, 2014 at 12:23 PM, Antonio David Perez Morales <
aperez@zaizi.com> wrote:

> Hi people
>
> Following up on the last mail and continuing the work on the semantic search
> use case (for which a Stanbol Solr component was already implemented to
> provide functionality similar to the old contenthub component), I
> have decided to give it a try and implement a Siren [1] component for the
> Stanbol workflow component. Siren is an extension of Solr that allows storing
> semi-structured documents, which fits perfectly with the idea of storing
> documents along with their related entities in order to allow subsequent
> semantic searches.
>
> The problem with the old content hub component (and also with the
> new Stanbol Solr component) is that all the semantic information of a
> document is stored in flat form in the same Solr document (useful for
> some kinds of searches), making it impossible to relate the extracted
> attributes (properties) to their respective entities and losing the
> "parent-child" (document-entities) structure.
>
> I think it can be a great component for leveraging all the information
> extracted by Stanbol in searches.
>
> Please, feel free to comment or add whatever information you think useful
> for this.
>
> Regards
>
> [1] http://sirendb.com/
>
>
> On Mon, Jul 21, 2014 at 3:41 PM, Antonio David Perez Morales <
> aperez@zaizi.com> wrote:
>
>> Hi all
>>
>> As anticipated in the previous mail, I have developed a first version of
>> the Stanbol Solr component. This component (by default handling the
>> stanbol-solr Camel protocol) extends the Camel Solr component, so all the
>> properties used to configure the latter can be used in this component as well.
>>
>> The component is responsible for extracting fields and values from the
>> entities in the Content-Item and creating a Solr document with the content
>> and metadata to be indexed in Solr. In this first version, no filtering is
>> applied to the entities (for example, taking the field values only from
>> the entity with the highest confidence value).
>>
>> The first version of the component accepts three configuration parameters
>> in a route:
>>  - ldpath: LDPath program to be used to extract the values of the
>> fields. As mentioned in the previous mail, if a different LDPath program
>> is used in the dereference engine, the properties to be extracted may not
>> exist.
>>  - fields: A comma-separated list of the fields to be
>> extracted from the entities and indexed in Solr.
>>  - useDereferenceLdpath: If no ldpath program is defined, this
>> boolean flag allows using the same LDPath program used by the dereference
>> engine (getting it from the information contained in the content-item and
>> passed in the HTTP request to the enhancer or configured in the
>> chain/engine component). Default value is true.
>>
>> A sample route using this component could be the following:
>> <routes xmlns="http://camel.apache.org/schema/spring">
>>     <route id="stanbolsolr">
>>           <from uri="direct://stanbolsolr" />
>>           <to
>> uri="chain://default?enhancer.engines.dereference.ldpath=%40prefix%20test%20%3A%20%3Chttp%3A%2F%2Ftest.org%2F%3E%3B%20test%3Aname%3Drdfs%3Alabel%20%3A%3A%20xsd%3Astring%3B"
>> />
>>           <to uri="stanbol-solr://localhost:8983/solr" />
>>      </route>
>>  </routes>
>>
>> As a future extension of this component, a new property specifying a
>> configured dereference engine to use for the LDPath, and filtering of the
>> entities to keep only the one with the highest confidence value, will be
>> developed.
>> With this component, we can have some features similar to the old
>> Stanbol content hub. So I think that by improving this component we could
>> bring the content hub back to Stanbol (but using an external Solr
>> instance, which I think is good in order not to overload the Stanbol
>> application).
>>
>> Moreover, as part of the "use cases" project part and as discussed in the
>> Stanbol IRC Channel, I'm also evaluating Siren [1], an extension of Solr
>> bringing new and improved capabilities to it. It's very useful for
>> structured document search.
>> So my idea is to try to create a Siren component for Camel integrated in
>> Stanbol, to bring the possibility to store (in an easy way) the content
>> along with the extracted metadata in a structured way, instead of simply
>> creating new fields for a document.
>>
>> Stay tuned for new advances.
>> As always, comments are more than welcome.
>>
>> Regards
>>
>> [1] http://sirendb.com/
>>
>>
>> On Wed, Jul 16, 2014 at 12:53 PM, Antonio David Perez Morales <
>> aperez@zaizi.com> wrote:
>>
>>> Hi people
>>>
>>> Continuing with the project work, I have implemented some improvements
>>> to the chain and engine components to allow defining enhancer properties (like
>>> enhancer.engines.dereference.ldpath) in the route component definition.
>>> Example:
>>> from(direct://test).to(engine://dereference-engine?enhancer.engines.dereference.ldpath=EXPRESSION).
>>> As said in previous mails, the engines and chains have to be configured
>>> through the Felix console.
>>>
>>> Regarding the last discussion about bringing a new kind of ContentHub back
>>> to Stanbol as a use case for the workflow integration, I have successfully
>>> created a custom Camel processor that builds a document with the content
>>> and enhancement metadata to be sent to Solr. It takes the LDPath
>>> expression (configured in the dereference engine component via the
>>> enhancer.engines.dereference.ldpath query parameter or Camel component
>>> parameter) to extract the metadata to be indexed. So using a route like
>>> from().to(chain://Default).process(ContentItemProcessor).to(solr://localhost:8983/solr),
>>> we get new indexed documents in Solr containing the text and the
>>> extracted enhancement metadata, to be used in semantic searches in
>>> the external Solr. Of course, the Solr schema needs to be created in the
>>> remote Solr beforehand. It is only a brief proof of concept of such
>>> functionality.
>>>
>>> My idea is to use an external Solr to store the content and semantic
>>> metadata for semantic search purposes, as opposed to the old ContentHub,
>>> which used an internal SolrYard, creating the schema from the
>>> configured LDPath expression.
>>>
>>> The next step in this task will be to create a custom StanbolSolr
>>> component that performs the functionality of the previous processor and
>>> Solr, but allows configuring the LDPath, fields and properties to be
>>> extracted and put as metadata in the new Solr document. These properties
>>> will be applied to the ContentItem metadata, so if an entity dereference
>>> engine is configured with a different LDPath expression or fields, the
>>> properties to be extracted may not exist.
>>> As a future improvement of this component, we could add a new configuration
>>> parameter specifying a configured dereference engine to be used before
>>> applying the configuration.
>>>
>>> Stay tuned for further advances.
>>>
>>> As always if you have any questions or comments, please drop some lines
>>> here.
>>>
>>> Regards
>>>
>>> PS: The example routes used are very simple and linear, but for some
>>> scenarios, parallel execution of engines, multicast, aggregator, etc.
>>> (supported by Camel) could be used to speed up the enhancement process.
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Jul 8, 2014 at 9:46 AM, Antonio David Perez Morales <
>>> aperez@zaizi.com> wrote:
>>>
>>>> Hi Rafa and all
>>>>
>>>> In my opinion, bringing the Content Hub back to Stanbol for Semantic Search
>>>> capabilities is a great use case to implement.
>>>> While waiting for Florent's opinion, I could start with Solr only
>>>> (whose component already exists in Camel but needs to be adapted, like
>>>> the ActiveMQ component), creating a custom transformer bean for Camel to
>>>> get the original Content Hub functionality. After that, we could think
>>>> about creating the SIREn component and a new transformer for it, giving
>>>> users the option of using either of them.
>>>>
>>>> What do you think? Is it an interesting use case for the Camel
>>>> integration application?
>>>>
>>>> Regards
>>>>
>>>>
>>>> On Mon, Jul 7, 2014 at 4:27 PM, Rafa Haro <rh...@apache.org> wrote:
>>>>
>>>>> Hi guys,
>>>>>
>>>>> On 01/07/14 10:20, Antonio David Perez Morales wrote:
>>>>>
>>>>>  Hi all
>>>>>>
>>>>>> Continuing with the project, I have successfully integrated the
>>>>>> ActiveMQ Camel component (and also JMS) into the Stanbol Camel
>>>>>> integration. This has been a hard task due to the dependencies
>>>>>> needed by the component and also because we had to provide an
>>>>>> ActiveMQ component configurable through the Felix web console.
>>>>>>
>>>>>> With this addition, we are in a position to integrate business
>>>>>> logic into Stanbol routes through a message service provided by
>>>>>> ActiveMQ (JMS).
>>>>>>
>>>>> Nice Antonio, let's see if someone has an interesting use case to
>>>>> implement in this context.
>>>>>
>>>>>
>>>>>> As a first test, I have deployed a route which consumes messages
>>>>>> (content) from an ActiveMQ queue, enhances them using the default
>>>>>> chain and then writes the result into a file. It's a simple test but
>>>>>> it works quite well. In this case, Stanbol is working in a standalone
>>>>>> mode, that is to say, we don't have to explicitly call Stanbol to
>>>>>> enhance content; instead Stanbol is triggered by external events
>>>>>> (a new queue message).
>>>>>>
>>>>>> As indicated in the previous mail, I still have some pending things
>>>>>> to do (because I couldn't do them last week), but in order to move
>>>>>> forward with the project I ask you for some interesting use cases to
>>>>>> which the new workflow component could be applied, both to give added
>>>>>> value to it and to develop and provide more workflow (Camel)
>>>>>> components useful for those and other use cases.
>>>>>>
>>>>> While awaiting the community's feedback and also Florent's opinion
>>>>> regarding the rest of the project: as I have expressed in recent emails,
>>>>> I'm eager to see the Content Hub back in Stanbol. This is because,
>>>>> from the point of view of using Stanbol in the enterprise, Semantic
>>>>> Search is one of the most common use cases. So having an enterprise
>>>>> search backend as the last component of a processing route in any
>>>>> architecture where Stanbol could be plugged in sounds key to me. In recent
>>>>> discussions on the Stanbol IRC channel, we have been analysing Siren (
>>>>> https://github.com/rdelbru/SIREn), a Lucene/Solr extension whose
>>>>> major advantage is the ability to index tree structures, which makes it
>>>>> possible to index structured data without losing full-text search
>>>>> capabilities. Refactoring the old ContentHub component to use Siren is out
>>>>> of scope for this project but, in my opinion, an interesting use case
>>>>> could be to develop a Siren Camel component and a transformer from
>>>>> ContentItem to a Siren object or whatever, and integrate both in Stanbol.
>>>>>
>>>>> What do you guys think?
>>>>>
>>>>> Cheers,
>>>>> Rafa
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Regards
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 23, 2014 at 6:16 PM, Antonio David Perez Morales <
>>>>>> aperez@zaizi.com> wrote:
>>>>>>
>>>>>>  Hi Stanbolers
>>>>>>>
>>>>>>> The GSoC 2014 midterm is here and I want to give you a summary of
>>>>>>> the work done so far:
>>>>>>>
>>>>>>> - Adapted the previous Camel integration PoC done by Florent to
>>>>>>> Stanbol version 1.0.
>>>>>>> - Improved the EngineComponent used by Camel to execute Enhancement
>>>>>>> Engines (configured through the Stanbol web console as usual) using
>>>>>>> the engine:// URI scheme in routes.
>>>>>>> - Created a ChainComponent used by Camel to execute Enhancement Chains
>>>>>>> using the chain:// URI scheme in routes (both Camel components are
>>>>>>> provided as OSGi components, so the URI scheme can be changed through
>>>>>>> the Stanbol web console).
>>>>>>> - Created a custom artifact for Apache Felix Fileinstall in order to
>>>>>>> be able to install routes defined in the Camel Spring XML DSL by
>>>>>>> placing a route file (with 'route' extension) in the
>>>>>>> stanbol/fileinstall directory.
>>>>>>> - Created a custom archetype to ease the development of bundles
>>>>>>> containing route definitions in the Java DSL. The archetype generates
>>>>>>> a class extending 'RouteBuilder' which creates a default Camel direct
>>>>>>> endpoint used by other Stanbol Workflow components to execute the
>>>>>>> route.
>>>>>>> - Created a first version of the Workflow API, which contains
>>>>>>> different OSGi components that allow registering Camel
>>>>>>> components/routes, starting/stopping/executing routes,
>>>>>>> adding/removing components used in routes, etc.
>>>>>>> - Provided a REST endpoint to test the execution of routes using
>>>>>>> REST requests (/flow/{routeId}).
>>>>>>> - Modified the PoC full launcher to use all the new bundles to
>>>>>>> support the workflow feature.
>>>>>>> - Installed JBoss Developer Studio, which comes with Camel support,
>>>>>>> in order to create routes visually, with the possibility of
>>>>>>> exporting them in Spring XML DSL format.
>>>>>>>
>>>>>>> Some pending things I will try to do during this week:
>>>>>>> - Improve the web package to create the needed endpoints to query the
>>>>>>> registered routes, registered Camel components, etc.
>>>>>>> - Improve the web package to remove classes copied from the Stanbol
>>>>>>> jersey module for testing
>>>>>>> - Update the README.md files in the repository with all the new
>>>>>>> information
>>>>>>> - Document the installation and configuration of JBoss Developer
>>>>>>> Studio for creating Camel routes
>>>>>>> - Create all the JIRA issues related to the work already done
>>>>>>>
>>>>>>>
>>>>>>> For the second part of the project, I would like to read some
>>>>>>> comments about interesting use cases in order to develop the Stanbol
>>>>>>> and Camel components needed to support them.
>>>>>>>
>>>>>>> If you have any comments, please drop some lines so we can discuss
>>>>>>> the new things to be done.
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Jun 14, 2014 at 3:39 PM, Antonio David Perez Morales <
>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>
>>>>>>>  Hi guys
>>>>>>>>
>>>>>>>> Continuing with the project, and as part of the refactoring/new
>>>>>>>> architecture, I have started to modify some workflow components in
>>>>>>>> order to create a better API and architecture based on OSGi
>>>>>>>> components. As a first step, and in order to have the same behavior
>>>>>>>> as the current one (regarding the enhancement process), a chain
>>>>>>>> component has been created to simulate the chain behaviour. This
>>>>>>>> new component internally uses the ChainManager and
>>>>>>>> EnhancementJobManager components to perform the business logic.
>>>>>>>> This way, a new protocol 'chain' can be used in the routes deployed
>>>>>>>> in Stanbol. The chains are configured in the same way as before,
>>>>>>>> using the Stanbol admin console.
>>>>>>>>
>>>>>>>> Now we can combine single engine executions with chain executions
>>>>>>>> in routes deployed in Stanbol using the alternatives described in
>>>>>>>> previous mails and in the issue [1]. Both engines and chains are
>>>>>>>> configured through the Stanbol admin console. You can see the
>>>>>>>> refactoring progress in [2] (a branch used for refactoring the
>>>>>>>> current PoC of Workflow in Stanbol 1.0). Of course, the Camel EIPs
>>>>>>>> and other Camel components can be used in the deployed routes as
>>>>>>>> well.
>>>>>>>>
>>>>>>>> With the new Camel routes support, we can have a Stanbol instance
>>>>>>>> running and enhancing content without receiving any HTTP request to
>>>>>>>> start the enhancement process, because the routes can be triggered
>>>>>>>> by external events occurring in a queue, database, etc. Moreover,
>>>>>>>> the semantic lifting process can be split and merged with some
>>>>>>>> application steps, so the issue [3] requesting asynchronous call
>>>>>>>> support for enhancement could be solved.
>>>>>>>>
>>>>>>>> Anyway, if any of you have suggestions for new components to be
>>>>>>>> deployed for the second part of the project, or any other kind of
>>>>>>>> suggestion, please drop some lines here to continue the discussion.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>>> [2]
>>>>>>>> https://github.com/adperezmorales/stanbol-camel-workflow/tree/refactoring
>>>>>>>> [3] https://issues.apache.org/jira/browse/STANBOL-263
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jun 11, 2014 at 10:01 AM, Antonio David Perez Morales <
>>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>>
>>>>>>>>  Hi people
>>>>>>>>>
>>>>>>>>> As part of the GSoC project for the midterm and according to the
>>>>>>>>> issue [1], a custom Apache Felix Fileinstall artifact has been
>>>>>>>>> created in order to deploy Camel routes defined in XML (Spring DSL)
>>>>>>>>> by placing a file with the .route extension in a configured
>>>>>>>>> directory (like the stanbol/fileinstall directory). Moreover, since
>>>>>>>>> this artifact depends on the Fileinstall bundle, the created
>>>>>>>>> launcher has been modified to have that bundle in the OSGi context
>>>>>>>>> by default.
>>>>>>>>>
>>>>>>>>> So, now that the current Camel integration PoC has been integrated
>>>>>>>>> in Stanbol 1.0 and extended to support the deployment of routes
>>>>>>>>> defined in the Java DSL (through bundles) and in XML (route files),
>>>>>>>>> the next step will be to rethink and redesign the current
>>>>>>>>> architecture, trying to avoid duplicated code and to provide a more
>>>>>>>>> extensible and easy-to-use Workflow API. With the current
>>>>>>>>> integration, only direct routes can be triggered using the REST
>>>>>>>>> API, which means that the defined routes must be configured
>>>>>>>>> properly with a direct endpoint consumer. Routes starting in some
>>>>>>>>> other way, like timers, are triggered directly on deployment, so
>>>>>>>>> this has to be taken into account for the new API (and REST API).
>>>>>>>>>
>>>>>>>>> In parallel, and for the second part, new Stanbol Camel components
>>>>>>>>> will be developed for use in new routes. So if any of you have use
>>>>>>>>> cases for this involving Stanbol components, please drop some
>>>>>>>>> lines here in order to prioritize the Stanbol Camel components to
>>>>>>>>> be developed.
>>>>>>>>>
>>>>>>>>> Comments and suggestions are more than welcome
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>>
>>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jun 2, 2014 at 7:00 PM, Antonio David Perez Morales <
>>>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>>>
>>>>>>>>>  Hi stanbolers
>>>>>>>>>>
>>>>>>>>>> As part of the issue [1], I have created a Maven archetype
>>>>>>>>>> for generating Camel routes in the Java DSL.
>>>>>>>>>> The archetype generates a Java project with all the dependencies
>>>>>>>>>> and one Java class with a method which has to be filled in. In
>>>>>>>>>> this method, Camel Java DSL syntax is used to create the route.
>>>>>>>>>> By default, and as a first approach, the class will use the route
>>>>>>>>>> name given during project creation to enable a Camel direct
>>>>>>>>>> endpoint with that name.
>>>>>>>>>> The code of the first archetype version can be found at [2].
>>>>>>>>>>
>>>>>>>>>> The next task will be to provide a custom Felix artifact to be
>>>>>>>>>> able to deploy XML-based routes in Stanbol by placing a custom
>>>>>>>>>> file in the Stanbol datafiles directory.
>>>>>>>>>> After that, it will be time to rethink and redesign the
>>>>>>>>>> architecture to integrate Camel workflows inside Stanbol in a
>>>>>>>>>> better, more configurable and extensible way.
>>>>>>>>>>
>>>>>>>>>> Comments and suggestions are more than welcome
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>>
>>>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, May 30, 2014 at 8:03 PM, Antonio David Perez Morales <
>>>>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>>>>
>>>>>>>>>>  Hi all
>>>>>>>>>>>
>>>>>>>>>>> After a hard fight this week, I managed to get Florent's
>>>>>>>>>>> proof-of-concept code working in the Stanbol 1.0 branch [1].
>>>>>>>>>>> The code is uploaded to my GitHub account [3]. As I said in a
>>>>>>>>>>> previous mail, I prefer to work on it separately and upload the
>>>>>>>>>>> developed code into a Stanbol branch after the project.
>>>>>>>>>>>
>>>>>>>>>>> The 1.0.0 version has some changes in how the Jersey endpoints
>>>>>>>>>>> are registered, and also new classes and packages, so it was not
>>>>>>>>>>> a trivial task to make the current proof of concept work.
>>>>>>>>>>> Moreover, I don't like to simply copy and paste code and make the
>>>>>>>>>>> needed changes. I always want to understand how things work and
>>>>>>>>>>> how they are developed in order to be able to change them or
>>>>>>>>>>> develop new code around them.
>>>>>>>>>>>
>>>>>>>>>>> The steps taken to achieve it were the following:
>>>>>>>>>>> - Updated pom files to the Stanbol 1.0.0-SNAPSHOT version
>>>>>>>>>>> - Updated bundle levels in the bundlelist package to fit the
>>>>>>>>>>> Stanbol 1.0 version levels
>>>>>>>>>>> - Adapted the cameljobmanager package code to the Stanbol
>>>>>>>>>>> 1.0.0-SNAPSHOT classes, using Java OSGi annotations instead of
>>>>>>>>>>> SCR annotations in Javadoc
>>>>>>>>>>> - Updated the flow web package to the Stanbol 1.0.0-SNAPSHOT
>>>>>>>>>>> classes and modified the needed resources
>>>>>>>>>>> - Added Java OSGi annotations to the route (WeightedChain)
>>>>>>>>>>> instead of SCR annotations in Javadoc
>>>>>>>>>>> - Updated the launcher to use the 1.0.0-SNAPSHOT packages and
>>>>>>>>>>> needed bundles
>>>>>>>>>>>
>>>>>>>>>>> So now, the http://localhost:8080/flow endpoint will use the
>>>>>>>>>>> only Camel route (defined by WeightedChain) to call all the
>>>>>>>>>>> registered Enhancement Engines (ordered by the EnhancementEngine
>>>>>>>>>>> order property).
>>>>>>>>>>> For testing purposes, the /flow/{flowName} endpoint has been
>>>>>>>>>>> removed: all this code needs to be redesigned and reimplemented,
>>>>>>>>>>> so I only wanted to make it work to have a first (simple)
>>>>>>>>>>> integration in Stanbol 1.0. This functionality will be added
>>>>>>>>>>> again to trigger custom routes once the next step (defined below)
>>>>>>>>>>> is developed.
>>>>>>>>>>>
>>>>>>>>>>> The next step [2] will be to support writing and configuring
>>>>>>>>>>> routes in XML format, putting the file in datafiles in order to
>>>>>>>>>>> be loaded by a custom Felix artifact (as Rupert pointed out in a
>>>>>>>>>>> previous mail), and to create a Maven archetype for creating
>>>>>>>>>>> bundles defining routes, which will be loaded using the Felix
>>>>>>>>>>> bundle tab. If necessary, as we discussed in previous messages, a
>>>>>>>>>>> REST endpoint receiving routes in XML can be developed as an
>>>>>>>>>>> alternative to the first approach. This is my objective for the
>>>>>>>>>>> midterm.
>>>>>>>>>>>
>>>>>>>>>>> After the midterm, the new Stanbol components for Apache Camel
>>>>>>>>>>> will be developed, and also the new architecture for Camel in
>>>>>>>>>>> Stanbol.
>>>>>>>>>>>
>>>>>>>>>>> Comments on this and on use cases for Stanbol Camel components
>>>>>>>>>>> are more than welcome.
>>>>>>>>>>>
>>>>>>>>>>> Regards
>>>>>>>>>>>
>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>>>>>>>> [2] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>>>>>> [3] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, May 27, 2014 at 6:18 PM, Antonio David Perez Morales <
>>>>>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>  Hi people
>>>>>>>>>>>>
>>>>>>>>>>>> I have already started to work on [1] to integrate current
>>>>>>>>>>>> Florent's
>>>>>>>>>>>> code into Stanbol 1.0.
>>>>>>>>>>>> As a first approach, only changing the dependency versions to
>>>>>>>>>>>> new
>>>>>>>>>>>> Stanbol 1.0, many issues have arisen:
>>>>>>>>>>>>   - Deprecated use of classes
>>>>>>>>>>>>   - Classes which have moved to a different package
>>>>>>>>>>>>   - Some classes not necessary now
>>>>>>>>>>>>   - Classes not used which were causing conflicts
>>>>>>>>>>>>   - ...
>>>>>>>>>>>>
>>>>>>>>>>>> So now I'm trying to resolve all these problems to replicate
>>>>>>>>>>>> the same
>>>>>>>>>>>> behavior from 0.9 into 1.0. I will upload the code to a Github
>>>>>>>>>>>> repository
>>>>>>>>>>>> in my account (which will be pushed later into a Stanbol branch
>>>>>>>>>>>> after the
>>>>>>>>>>>> project) in order to track the advances.
>>>>>>>>>>>> Once I can resolve all these problems, I will take a look at the
>>>>>>>>>>>> Felix Custom Artifacts pointed out by Rupert in a previous
>>>>>>>>>>>> message to find
>>>>>>>>>>>> out the best way to deploy (and manage) route configurations
>>>>>>>>>>>> (felix
>>>>>>>>>>>> artifacts, watchservice java, rest endpoint to receive xml
>>>>>>>>>>>> routes, etc).
>>>>>>>>>>>>
>>>>>>>>>>>> Comments on this and future tasks are more than welcome.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>   On Tue, May 27, 2014 at 9:53 AM, Rafa Haro <rh...@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>  Hi Rupert, Florent and Antonio
>>>>>>>>>>>>>
>>>>>>>>>>>>> El 27/05/14 08:51, Rupert Westenthaler escribió:
>>>>>>>>>>>>>
>>>>>>>>>>>>>   As the result of Enhancement Routes is content + metadata I
>>>>>>>>>>>>> can not
>>>>>>>>>>>>>
>>>>>>>>>>>>>> see what you want to "store" in the Entityhub that is about
>>>>>>>>>>>>>> managing
>>>>>>>>>>>>>> Entities.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   >  - entityhub: To query/update the entityhub component
>>>>>>>>>>>>>> Maybe. If you can come up with a good use case ^^
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   >  - contenthub: To develop a new content-hub using
>>>>>>>>>>>>>> chain/engine
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> components
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> and solr/elasticsearch/whatever component (solr and
>>>>>>>>>>>>>>>> elasticsearch
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> component
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> already exist in Camel)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> IMO implementing a new Contenthub like component is outside
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> scope
>>>>>>>>>>>>>> of this GSoC project. However If there is already
>>>>>>>>>>>>>> Solr/Elasticsearch
>>>>>>>>>>>>>> component it would be a really useful thing
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  Regarding this, in my opinion, the use case of an eventual
>>>>>>>>>>>>> integration with a Content hub is probably one of the most
>>>>>>>>>>>>> clear for this
>>>>>>>>>>>>> project. I'm not sure if that is what Antonio was trying to
>>>>>>>>>>>>> explain but,
>>>>>>>>>>>>> with a single route using as last endpoint Solr or any other
>>>>>>>>>>>>> backend
>>>>>>>>>>>>> system, we would be almost cloning the same functionality as
>>>>>>>>>>>>> the previous
>>>>>>>>>>>>> ContentHub implementation (Stanbol 0.12). Entities could be
>>>>>>>>>>>>> dereferenced
>>>>>>>>>>>>> using the EntityHub before storing the content along with the
>>>>>>>>>>>>> metadata,
>>>>>>>>>>>>> which is the point of integration of the EntityHub in such use
>>>>>>>>>>>>> case. And
>>>>>>>>>>>>> even more interesting, now with the integration of Marmotta
>>>>>>>>>>>>> contributed by
>>>>>>>>>>>>> Rupert, it would be possible to use a whole graph for
>>>>>>>>>>>>> dereferencing, so
>>>>>>>>>>>>> "simply" routing components like Enhancer->Marmotta->Solr
>>>>>>>>>>>>> sounds to me like
>>>>>>>>>>>>> an interesting use case.
>>>>>>>>>>>>>
>>>>>>>>>>>>> wdyt?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Rafa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>
>>>>
>>>
>>
>

-- 

------------------------------
This message should be regarded as confidential. If you have received this 
email in error please notify the sender and destroy it immediately. 
Statements of intent shall only become binding when confirmed in hard copy 
by an authorised signatory.

Zaizi Ltd is registered in England and Wales with the registration number 
6440931. The Registered Office is Brook House, 229 Shepherds Bush Road, 
London W6 7AN. 

Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi people

Following up on the last mail and continuing the work on the semantic search
use case (where a Stanbol Solr component was already implemented to provide
functionality similar to the old ContentHub component), I have decided to
try implementing a Siren [1] component for the Stanbol workflow component.
Siren is an extension of Solr that allows storing semi-structured data, which
fits well with the idea of storing documents along with their related
entities in order to allow subsequent semantic searches.

The problem with the old ContentHub component (and also with the new Stanbol
Solr component) is that all the semantic information for a document is
stored in flat form in the same Solr document. That is useful for some kinds
of searches, but it makes it impossible to relate the extracted attributes
(properties) to their respective entities, so the "parent-child"
(document-entities) structure is lost.
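
To make the point concrete, here is a minimal sketch in plain Java (no Solr
or Siren involved). The class name, the entity shape (uri/label/type) and the
example values are assumptions made only for illustration; they are not taken
from the Stanbol code. It shows that once entity properties are flattened
into multi-valued fields, a query can match values that belong to different
entities, while a nested structure can not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FlatVsNested {

    /** One extracted entity with its own properties (hypothetical shape). */
    static Map<String, String> entity(String uri, String label, String type) {
        Map<String, String> e = new LinkedHashMap<>();
        e.put("uri", uri);
        e.put("label", label);
        e.put("type", type);
        return e;
    }

    /** Flat form: every property becomes a multi-valued field of the document. */
    static Map<String, List<String>> flatten(List<Map<String, String>> entities) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        for (Map<String, String> e : entities) {
            for (Map.Entry<String, String> p : e.entrySet()) {
                doc.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
            }
        }
        return doc;
    }

    /** Flat query: true if both values occur anywhere in the document. */
    static boolean flatMatch(Map<String, List<String>> doc, String label, String type) {
        return doc.getOrDefault("label", new ArrayList<>()).contains(label)
            && doc.getOrDefault("type", new ArrayList<>()).contains(type);
    }

    /** Nested query: true only if one single entity carries both values. */
    static boolean nestedMatch(List<Map<String, String>> entities, String label, String type) {
        for (Map<String, String> e : entities) {
            if (label.equals(e.get("label")) && type.equals(e.get("type"))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Map<String, String>> entities = Arrays.asList(
            entity("http://example.org/Paris", "Paris", "Place"),
            entity("http://example.org/Bob", "Bob", "Person"));
        Map<String, List<String>> flat = flatten(entities);
        // No entity is both labelled "Paris" and typed "Person".
        System.out.println(flatMatch(flat, "Paris", "Person"));       // flat fields still match
        System.out.println(nestedMatch(entities, "Paris", "Person")); // nested structure does not
    }
}
```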

I think it can be a great component for leveraging all the information
extracted by Stanbol in searches.

Please, feel free to comment or add whatever information you think useful
for this.

Regards

[1] http://sirendb.com/


On Mon, Jul 21, 2014 at 3:41 PM, Antonio David Perez Morales <
aperez@zaizi.com> wrote:

> Hi all
>
> As anticipated in the previous mail, I have developed a first version of the
> Stanbol Solr component. This component (by default managing the
> stanbol-solr camel protocol) extends the Camel Solr component, so all the
> properties used to configure it can be used in this component as well.
>
> The component is responsible for extracting fields and values from the
> entities in the ContentItem and creating a Solr document with the content
> and metadata to be indexed in Solr. In this first version, no filtering is
> applied to the entities (for example, taking the field values only from
> the entity with the highest confidence value).
>
> The first version of the component allows three configuration parameters in
> a route:
>  - ldpath: LDPath program used to extract the values of the fields.
> As mentioned in the previous mail, if the dereference engine uses a
> different LDPath program, the properties to be extracted may not exist.
>  - fields: A comma-separated list of the fields to be extracted from the
> entities and indexed in Solr.
>  - useDereferenceLdpath: If no ldpath program is defined, this boolean flag
> makes the component use the same LDPath program as the dereference engine
> (obtained from the information contained in the ContentItem, passed in the
> HTTP request to the enhancer or configured in the chain/engine component).
> Default value is true.
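
The precedence between these parameters, as described above, can be sketched
in plain Java. This is only an illustration of the description, not the
component's code: the class and method names are hypothetical, and returning
null for "no program" is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class LdpathSelection {

    /**
     * Picks the LDPath program to apply: an explicit 'ldpath' wins; otherwise
     * the dereference engine's program is used when 'useDereferenceLdpath' is
     * true; otherwise there is no program (null, by assumption).
     */
    static String effectiveLdpath(String ldpath, String dereferenceLdpath,
                                  boolean useDereferenceLdpath) {
        if (ldpath != null && !ldpath.trim().isEmpty()) {
            return ldpath;
        }
        return useDereferenceLdpath ? dereferenceLdpath : null;
    }

    /** Splits the comma-separated 'fields' value into trimmed, non-empty names. */
    static List<String> parseFields(String fields) {
        List<String> names = new ArrayList<>();
        if (fields == null) {
            return names;
        }
        for (String part : fields.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // No explicit ldpath and the flag left at its default (true):
        // the dereference engine's program is the one that applies.
        System.out.println(effectiveLdpath(null, "name = rdfs:label;", true));
        System.out.println(parseFields("name, type"));
    }
}
```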
>
> A sample route using this component could be the following:
> <routes xmlns="http://camel.apache.org/schema/spring">
>     <route id="stanbolsolr">
>           <from uri="direct://stanbolsolr" />
>           <to
> uri="chain://default?enhancer.engines.dereference.ldpath=%40prefix%20test%20%3A%20%3Chttp%3A%2F%2Ftest.org%2F%3E%3B%20test%3Aname%3Drdfs%3Alabel%20%3A%3A%20xsd%3Astring%3B"
> />
>           <to uri="stanbol-solr://localhost:8983/solr" />
>      </route>
>  </routes>
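
The uri attribute in a route like this carries the LDPath program
percent-encoded, which is easy to get wrong by hand. A minimal sketch of
producing such a value with the Java standard library follows. The class and
method names are hypothetical, and the example program is an assumption
reconstructed from the encoded value in the sample route; only the
chain:// scheme and the enhancer.engines.dereference.ldpath parameter name
come from this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class LdpathUri {

    /** Percent-encodes a value for a URI query string (spaces as %20). */
    static String encode(String value) {
        try {
            // URLEncoder produces form encoding, where a space becomes '+';
            // replace it with %20 (a literal '+' is already emitted as %2B).
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Reverses encode(), useful to check what a route actually contains. */
    static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Builds the chain endpoint URI carrying the LDPath program as a parameter. */
    static String chainUri(String chain, String ldpath) {
        return "chain://" + chain + "?enhancer.engines.dereference.ldpath=" + encode(ldpath);
    }

    public static void main(String[] args) {
        String ldpath =
            "@prefix test : <http://test.org/>; test:name=rdfs:label :: xsd:string;";
        String uri = chainUri("default", ldpath);
        System.out.println(uri);
        // Round trip: decoding the parameter value gives back the program.
        String encoded = uri.substring(uri.indexOf('=') + 1);
        System.out.println(decode(encoded).equals(ldpath));
    }
}
```

Decoding the value of an existing route with decode() is a quick way to spot
an invalid escape sequence before deploying the route file.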
>
> As a future extension of this component, a new property specifying a
> configured dereference engine to use for the LDPath program, and filtering
> of the entities to keep only the one with the highest confidence value,
> will be developed.
> With this component, we can have some features similar to the old
> Stanbol ContentHub. So I think that by improving this component we could
> bring the ContentHub back to Stanbol (but using an external Solr
> instance, which I think is good to avoid overloading the Stanbol application).
>
> Moreover, as part of the "use cases" project part and as discussed in the
> Stanbol IRC Channel, I'm also evaluating Siren [1], an extension of Solr
> bringing new and improved capabilities to it. It's very useful for
> structured document search.
> So my idea is to try to create a Siren component for Camel integrated in
> Stanbol, to bring the possibility to store (in an easy way) the content
> along with the extracted metadata in a structured way, instead of simply
> creating new fields for a document.
>
> Stay tuned for new advances.
> As always, comments are more than welcome.
>
> Regards
>
> [1] http://sirendb.com/
>
>
> On Wed, Jul 16, 2014 at 12:53 PM, Antonio David Perez Morales <
> aperez@zaizi.com> wrote:
>
>> Hi people
>>
>> Continuing with the project work, I have implemented some improvements
>> to the chain and engine components to allow defining enhancer properties (like
>> enhancer.engines.dereference.ldpath) in the route component definition.
>> Example:
>> from("direct://test").to("engine://dereference-engine?enhancer.engines.dereference.ldpath=EXPRESSION");
>> As said in previous mails, the engines and chains have to be configured
>> through the Felix console.
>>
>> Regarding the last discussion about bringing a new kind of ContentHub back
>> to Stanbol as a use case for the workflow integration, I have successfully
>> created a custom Camel processor that creates the document with the content
>> and enhancement metadata to be sent to Solr. It takes the LDPath
>> expression (configured in the dereference engine component via the
>> enhancer.engines.dereference.ldpath query parameter or camel component
>> parameter) to extract the metadata to be indexed. So using a route like
>> from().to(chain://Default).process(ContentItemProcessor).to(solr://localhost:8983/solr),
>> we can have new indexed documents in Solr containing the text and the
>> extracted enhancement metadata, to be used in semantic searches in
>> the external Solr. Of course, the Solr schema needs to be created in the
>> remote Solr beforehand. It is only a brief proof of concept of such
>> functionality.
>>
>> My idea is to use an external Solr to store the content and semantic
>> metadata for semantic search purposes, as opposed to the old ContentHub,
>> which used an internal SolrYard and created the schema from the
>> configured LDPath expression.
>>
>> The next step in this task will be to create a custom StanbolSolr component,
>> able to perform the functionality of the previous processor and Solr, but
>> allowing configuration of the LDPath, fields and properties to be extracted
>> and put as metadata in the new Solr document. These properties will be
>> applied to the ContentItem metadata, so if an entity dereference engine is
>> configured with a different LDPath expression or fields, the
>> properties to be extracted may not exist.
>> As a future improvement of this component, we could add a new configuration
>> parameter specifying a configured dereference engine to be used before
>> applying the configuration.
>>
>> Stay tuned for further advances.
>>
>> As always if you have any questions or comments, please drop some lines
>> here.
>>
>> Regards
>>
>> PS: The example routes used are very simple and linear, but for some
>> scenarios, parallel execution of engines, multicast, aggregators, etc.
>> (supported by Camel) could be used to speed up the enhancement process.
>>
>>
>>
>>
>>
>>
>> On Tue, Jul 8, 2014 at 9:46 AM, Antonio David Perez Morales <
>> aperez@zaizi.com> wrote:
>>
>>> Hi Rafa and all
>>>
>>> In my opinion, the Content Hub back in Stanbol for Semantic Search
>>> capabilities is a great use case to be implemented.
>>> Waiting for Florent's opinion, I could start first only with Solr (whose
>>> component already exists in Camel but needs to be adapted like the
>>> ActiveMQ component) and create a custom transformer bean for Camel to
>>> get the original Content Hub functionality. After that, we could think
>>> about creating the SIREn component and the new transformer for it, giving
>>> users the option of using either of them.
>>>
>>> What do you think? Is it an interesting use case for the Camel
>>> integration application?
>>>
>>> Regards
>>>
>>>
>>> On Mon, Jul 7, 2014 at 4:27 PM, Rafa Haro <rh...@apache.org> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> El 01/07/14 10:20, Antonio David Perez Morales escribió:
>>>>
>>>>  Hi all
>>>>>
>>>>> Continuing with the project, I have successfully integrated the
>>>>> activemq camel component (and also jms) into the Stanbol Camel
>>>>> integration.
>>>>> This has been a hard task due to the dependencies needed by the
>>>>> component
>>>>> and also due to the fact that we had to provide an activemq component
>>>>> configurable through Felix web console.
>>>>>
>>>>> With this addition, we are in the position to integrate business logic
>>>>> into
>>>>> Stanbol routes through a message service provided by activemq (jms).
>>>>>
>>>> Nice Antonio, let's see if someone has an interesting use case to
>>>> implement in this context.
>>>>
>>>>
>>>>> As a first test, I have deployed a route which consumes messages
>>>>> (content)
>>>>> from an activemq queue, enhances them using the default chain and then
>>>>> writes
>>>>> the result into a file. It's a simple test but it works quite well. In
>>>>> this
>>>>> case, Stanbol is working in a standalone mode, that is to say, we don't
>>>>> have to explicitly call Stanbol to enhance content but Stanbol is
>>>>> triggered
>>>>> based on some external events (a new queue message)
>>>>>
>>>>> As indicated in the previous mail, I still have some pending things to
>>>>> be
>>>>> done (because I couldn't do them last week) but in order to go forward
>>>>> with
>>>>> the project I ask you for some interesting use cases where to apply
>>>>> the new
>>>>> workflow component in order to give added value to it and also in
>>>>> order to
>>>>> develop and provide more workflow (camel) components useful for those
>>>>> and
>>>>> other use cases.
>>>>>
>>>> While awaiting community feedback and also Florent's opinion
>>>> regarding the rest of the project, as I have expressed in recent emails,
>>>> I'm eager to see the Content Hub back in Stanbol. This is because,
>>>> from the point of view of using Stanbol in the enterprise, Semantic
>>>> Search is one of the most common use cases. So having an enterprise
>>>> search backend as the last component of a processing route in any
>>>> architecture where Stanbol could be plugged in sounds key to me. In recent
>>>> discussions on the Stanbol IRC channel, we have been analysing Siren (
>>>> https://github.com/rdelbru/SIREn), a Lucene/Solr extension whose major
>>>> advantage is the possibility to index tree structures, which allows
>>>> indexing structured data without losing full text search capabilities.
>>>> Refactoring the old ContentHub component to use Siren is out of scope of
>>>> this project but, in my opinion, an interesting use case could be to develop a
>>>> Siren Camel Component and a transformer from ContentItem to Siren Object or
>>>> whatever and integrate both in Stanbol.
>>>>
>>>> What do you guys think?
>>>>
>>>> Cheers,
>>>> Rafa
>>>>
>>>>
>>>>
>>>>
>>>>> Regards
>>>>>
>>>>>
>>>>> On Mon, Jun 23, 2014 at 6:16 PM, Antonio David Perez Morales <
>>>>> aperez@zaizi.com> wrote:
>>>>>
>>>>>  Hi Stanbolers
>>>>>>
>>>>>> The GSoC 2014 midterm is here and I want to give you a summary of the
>>>>>> work
>>>>>> already done so far:
>>>>>>
>>>>>> - Adapted previous Camel integration PoC done by Florent into Stanbol
>>>>>> 1.0
>>>>>> version.
>>>>>> - Improved EngineComponent used by Camel to execute Enhancement
>>>>>> Engines
>>>>>> (configured through Stanbol web console as usual) using the engine://
>>>>>> uri
>>>>>> scheme in routes.
>>>>>> - Created ChainComponent used by Camel to execute Enhancement Chains
>>>>>> using
>>>>>> the chain:// uri scheme in routes (both Camel components are provided
>>>>>> as
>>>>>> OSGI components, so the uri scheme can be changed through the Stanbol
>>>>>> web
>>>>>> console)
>>>>>> - Created a custom artifact for Apache Felix Fileinstall in order to
>>>>>> be
>>>>>> able to install routes defined in Camel Spring XML DSL placing a
>>>>>> route file
>>>>>> (with 'route' extension) in the stanbol/fileinstall directory
>>>>>> - Created a custom archetype to ease the development of bundles
>>>>>> containing
>>>>>> route definitions in Java DSL. The archetype generates a class
>>>>>> extending
>>>>>> 'RouteBuilder' which creates a default Camel direct endpoint used by
>>>>>> other
>>>>>> Stanbol Workflow components to execute the route.
>>>>>> - Created a first version of Workflow API, which contains different
>>>>>> OSGI
>>>>>> components which allow registering Camel components/routes,
>>>>>> start/stop/execute routes, add/remove components used in routes, etc.
>>>>>> - REST endpoint is provided to test the execution of routes using REST
>>>>>> requests (/flow/{routeId} )
>>>>>> - Modified the PoC full launcher to use all the new bundles to
>>>>>> support the
>>>>>> workflow feature.
>>>>>> - Installed JBoss developer studio which comes with Camel support in
>>>>>> order
>>>>>> to create routes in a visual way with the possibility to be exported
>>>>>> as
>>>>>> Spring XML DSL format
>>>>>>
>>>>>> Some pending things I will try to do during this week:
>>>>>> - Improve the web package to create the needed endpoints to query the
>>>>>> registered routes, registered camel components, etc
>>>>>> - Improve the web package to remove classes copied from Stanbol jersey
>>>>>> module used for testing
>>>>>> - Update README.md files in the repository with all the new
>>>>>> information
>>>>>> - Document the installation and configuration of JBoss developer
>>>>>> studio
>>>>>> for Camel routes creation
>>>>>> - Create all the JIRA issues related to the work already done
>>>>>>
>>>>>>
>>>>>> For the second part of the project, I would like to read some comments
>>>>>> about interesting use cases in order to develop the needed Stanbol and
>>>>>> Camel components to support them.
>>>>>>
>>>>>> If you have any comment, please drop some lines in order to discuss
>>>>>> the
>>>>>> new things to be done.
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, Jun 14, 2014 at 3:39 PM, Antonio David Perez Morales <
>>>>>> aperez@zaizi.com> wrote:
>>>>>>
>>>>>>  Hi guys
>>>>>>>
>>>>>>> Continuing with the project, and as part of the refactoring/new
>>>>>>> architecture I have started to modify some workflow components in
>>>>>>> order to
>>>>>>> create a better API and architecture based on OSGI components. As a
>>>>>>> first
>>>>>>> step and in order to have the same behavior as the current one
>>>>>>> (regarding
>>>>>>> enhancement process), a chain component has been created to simulate
>>>>>>> the
>>>>>>> chain behaviour. This new component uses internally the ChainManager
>>>>>>> and
>>>>>>> EnhancementJobManager component to perform the business logic. This
>>>>>>> way, a
>>>>>>> new protocol 'chain' can be used in the routes deployed in Stanbol.
>>>>>>> The
>>>>>>> chains are configured in the same way, using Stanbol admin console.
>>>>>>>
>>>>>>> Now, we can combine single engine executions with chains executions
>>>>>>> in
>>>>>>> routes deployed in Stanbol using the alternatives described in
>>>>>>> previous
>>>>>>> mails and in the issue [1]. Both engines and chains are configured
>>>>>>> through
>>>>>>> Stanbol admin console. You can see the refactoring advances in [2] (a
>>>>>>> branch used for refactoring the current PoC of Workflow in Stanbol
>>>>>>> 1.0). Of
>>>>>>> course, the Camel EIP and other Camel components can be used in the
>>>>>>> deployed routes as well.
>>>>>>>
>>>>>>> With the new Camel routes support, we can have Stanbol running and
>>>>>>> enhancing content without receiving any HTTP request to start the
>>>>>>> enhancement process, because the routes can be triggered by external
>>>>>>> events
>>>>>>> occurring in a queue, database, etc. Moreover the semantic lifting
>>>>>>> process
>>>>>>> can be split and merged with some application steps, so the issue
>>>>>>> [3]
>>>>>>> requesting asynchronous call support for enhancement could be solved.
>>>>>>>
>>>>>>> Anyway, if some of you have any suggestions for new components to be
>>>>>>> deployed for the second part of the project, or another kind of
>>>>>>> suggestion,
>>>>>>> please drop here some lines to continue with the discussion.
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>> [2]
>>>>>>> https://github.com/adperezmorales/stanbol-camel-workflow/tree/refactoring
>>>>>>> [3] https://issues.apache.org/jira/browse/STANBOL-263
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 11, 2014 at 10:01 AM, Antonio David Perez Morales <
>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>
>>>>>>>  Hi people
>>>>>>>>
>>>>>>>> As part of the GSoC project for the midterm and according to the
>>>>>>>> issue
>>>>>>>> [1], a custom Apache Felix Fileinstall artifact has been created in
>>>>>>>> order
>>>>>>>> to deploy Camel routes defined in XML (Spring DSL) placing a file
>>>>>>>> with
>>>>>>>> .route extension in a configured directory (like stanbol/fileinstall
>>>>>>>> directory). Moreover since this artifact depends on Fileinstall
>>>>>>>> bundle, the
>>>>>>>> created launcher has been modified to have that bundle in the OSGI
>>>>>>>> context
>>>>>>>> by default.
>>>>>>>>
>>>>>>>> So, once the current Camel integration POC has been integrated in
>>>>>>>> Stanbol 1.0 and extended to support the deployment of routes
>>>>>>>> defined by
>>>>>>>> Java DSL (through bundles) and XML (route files), the next step
>>>>>>>> will be
>>>>>>>> thinking and redesigning the current architecture trying to avoid
>>>>>>>> the
>>>>>>>> duplicated code and providing a more extendable and easy to use
>>>>>>>> Workflow
>>>>>>>> API, because with the current integration only direct routes can be
>>>>>>>> triggered using REST API which means that the defined routes must be
>>>>>>>> configured properly using a direct endpoint consumer. Anyway, routes
>>>>>>>> starting in some other way like timers are triggered directly in the
>>>>>>>> deployment, so this has to be taken into account for the new API
>>>>>>>> (and REST
>>>>>>>> API).
>>>>>>>>
>>>>>>>> In parallel and for the second part, new Stanbol Camel components
>>>>>>>> will
>>>>>>>> be developed in order to be used in new routes. So if any of you
>>>>>>>> have use
>>>>>>>> cases for this involving Stanbol components, please drop some lines
>>>>>>>> here in
>>>>>>>> order to prioritize the Stanbol Camel components to be developed.
>>>>>>>>
>>>>>>>> Comments and suggestions are more than welcome
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jun 2, 2014 at 7:00 PM, Antonio David Perez Morales <
>>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>>
>>>>>>>>  Hi stanbolers
>>>>>>>>>
>>>>>>>>> As part of the issue [1] , I have created a maven archetype useful
>>>>>>>>> to
>>>>>>>>> generate Camel routes in Java DSL.
>>>>>>>>> The archetype generates a Java project with all the dependencies
>>>>>>>>> and
>>>>>>>>> one Java class with a method which has to be filled. In this
>>>>>>>>> method, Camel
>>>>>>>>> Java DSL syntax is used to create the route.
>>>>>>>>> By default and as a first approach, the class will use the route
>>>>>>>>> name
>>>>>>>>> given during the project creation to enable a Camel direct
>>>>>>>>> endpoint with
>>>>>>>>> such name.
>>>>>>>>> The code of the first archetype version can be found at [2].
>>>>>>>>>
>>>>>>>>> The next task will be providing a Felix custom artifact to be able
>>>>>>>>> to
>>>>>>>>> deploy XML-based routes in Stanbol, placing a custom file in the
>>>>>>>>> Stanbol
>>>>>>>>> datafiles directory.
>>>>>>>>> After that, it will be time to think and redesign the architecture
>>>>>>>>> to
>>>>>>>>> integrate Camel workflows inside Stanbol in a better way, more
>>>>>>>>> configurable
>>>>>>>>> and extendable.
>>>>>>>>>
>>>>>>>>> Comments and suggestions are more than welcome
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>>
>>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, May 30, 2014 at 8:03 PM, Antonio David Perez Morales <
>>>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>>>
>>>>>>>>>  Hi all
>>>>>>>>>>
>>>>>>>>>> After a hard fight this week, I managed to get Florent's
>>>>>>>>>> proof of concept code working in the Stanbol 1.0 branch [1].
>>>>>>>>>> The code is uploaded in my github account [3]. As I said in a
>>>>>>>>>> previous
>>>>>>>>>> mail, I prefer to do it separately and after the project,
>>>>>>>>>> uploading the
>>>>>>>>>> developed code into a Stanbol branch.
>>>>>>>>>>
>>>>>>>>>> The 1.0.0 version has some changes in how the Jersey endpoints are
>>>>>>>>>> registered and also new classes and packages, so it was not a
>>>>>>>>>> trivial task
>>>>>>>>>> to make the current proof of concept work. Moreover I don't like
>>>>>>>>>> to simply
>>>>>>>>>> copy and paste code and make the needed changes. I always want to
>>>>>>>>>> understand how the things work and how they are developed in
>>>>>>>>>> order to be
>>>>>>>>>> able to change/modify them or develop new code around them.
>>>>>>>>>>
>>>>>>>>>> The steps done to achieve it have been the following:
>>>>>>>>>> - Updated pom files to the Stanbol 1.0.0-SNAPSHOT version
>>>>>>>>>> - Updated bundle levels in bundlelist package to fit the Stanbol
>>>>>>>>>> 1.0
>>>>>>>>>> version levels
>>>>>>>>>> - Adapted cameljobmanager package code to Stanbol 1.0.0-SNAPSHOT
>>>>>>>>>> classes and using Java OSGI annotations instead of SCR
>>>>>>>>>> annotations in
>>>>>>>>>> Javadoc
>>>>>>>>>> - Updated flow web package to Stanbol 1.0.0-SNAPSHOT classes and
>>>>>>>>>> modified needed resources
>>>>>>>>>> - Added Java OSGI annotations to the route (WeightedChain)
>>>>>>>>>> instead of
>>>>>>>>>> SCR annotations in javadoc
>>>>>>>>>> - Updated launcher to use the 1.0.0-SNAPSHOT packages and needed
>>>>>>>>>> bundles
>>>>>>>>>>
>>>>>>>>>> So now, the http://localhost:8080/flow endpoint will use the only
>>>>>>>>>> Camel route (defined by WeightedChain) to call all the registered
>>>>>>>>>> Enhancement Engines (ordered by EnhancementEngine order property).
>>>>>>>>>> For testing purposes, the /flow/{flowName} has been removed,
>>>>>>>>>> because
>>>>>>>>>> all this code needs to be re-designed and re-implemented so I
>>>>>>>>>> only wanted
>>>>>>>>>> to make it work to have a first (simple) integration in Stanbol
>>>>>>>>>> 1.0. This
>>>>>>>>>> functionality will be added again to trigger custom routes once
>>>>>>>>>> the next
>>>>>>>>>> step (defined below) is developed.
>>>>>>>>>>
>>>>>>>>>> The next step [2] will be support to write and configure routes
>>>>>>>>>> in XML
>>>>>>>>>> format, putting the file in datafiles in order to be loaded by a
>>>>>>>>>> Felix
>>>>>>>>>> custom artifact (as Rupert pointed out in a previous mail) and
>>>>>>>>>> create a
>>>>>>>>>> Maven archetype to create bundles defining routes which will be
>>>>>>>>>> loaded
>>>>>>>>>> using the Felix bundle tab. If necessary, as we talked in previous
>>>>>>>>>> messages, a REST endpoint receiving routes in XML can be
>>>>>>>>>> developed as an
>>>>>>>>>> alternative to the first approach. This is my objective for the
>>>>>>>>>> midterm.
>>>>>>>>>>
>>>>>>>>>> After the midterm, the new Stanbol components for Apache Camel
>>>>>>>>>> will be
>>>>>>>>>> developed and also the new architecture for Camel in Stanbol.
>>>>>>>>>>
>>>>>>>>>> Comments on this and for use cases for Stanbol Camel components
>>>>>>>>>> are
>>>>>>>>>> more than welcome.
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>>
>>>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>>>>>>> [2] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>>>>> [3] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, May 27, 2014 at 6:18 PM, Antonio David Perez Morales <
>>>>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>>>>
>>>>>>>>>>  Hi people
>>>>>>>>>>>
>>>>>>>>>>> I have already started to work on [1] to integrate current
>>>>>>>>>>> Florent's
>>>>>>>>>>> code into Stanbol 1.0.
>>>>>>>>>>> As a first approach, only changing the dependency versions to new
>>>>>>>>>>> Stanbol 1.0, many issues have arisen:
>>>>>>>>>>>   - Deprecated use of classes
>>>>>>>>>>>   - Classes which have moved to a different package
>>>>>>>>>>>   - Some classes not necessary now
>>>>>>>>>>>   - Classes not used which were causing conflicts
>>>>>>>>>>>   - ...
>>>>>>>>>>>
>>>>>>>>>>> So now I'm trying to resolve all these problems to replicate the
>>>>>>>>>>> same
>>>>>>>>>>> behavior from 0.9 into 1.0. I will upload the code to a Github
>>>>>>>>>>> repository
>>>>>>>>>>> in my account (which will be pushed later into a Stanbol branch
>>>>>>>>>>> after the
>>>>>>>>>>> project) in order to track the advances.
>>>>>>>>>>> Once I can resolve all these problems, I will take a look at the
>>>>>>>>>>> Felix Custom Artifacts pointed out by Rupert in a previous
>>>>>>>>>>> message to find
>>>>>>>>>>> out the best way to deploy (and manage) route configurations
>>>>>>>>>>> (felix
>>>>>>>>>>> artifacts, watchservice java, rest endpoint to receive xml
>>>>>>>>>>> routes, etc).
>>>>>>>>>>>
>>>>>>>>>>> Comments on this and future tasks are more than welcome.
>>>>>>>>>>>
>>>>>>>>>>> Regards
>>>>>>>>>>>
>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>   On Tue, May 27, 2014 at 9:53 AM, Rafa Haro <rh...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>  Hi Rupert, Florent and Antonio
>>>>>>>>>>>>
>>>>>>>>>>>> El 27/05/14 08:51, Rupert Westenthaler escribió:
>>>>>>>>>>>>
>>>>>>>>>>>>   As the result of Enhancement Routes is content + metadata I
>>>>>>>>>>>> can not
>>>>>>>>>>>>
>>>>>>>>>>>>> see what you want to "store" in the Entityhub that is about
>>>>>>>>>>>>> managing
>>>>>>>>>>>>> Entities.
>>>>>>>>>>>>>
>>>>>>>>>>>>>   >  - entityhub: To query/update the entityhub component
>>>>>>>>>>>>> Maybe. If you can come up with a good use case ^^
>>>>>>>>>>>>>
>>>>>>>>>>>>>   >  - contenthub: To develop a new content-hub using
>>>>>>>>>>>>> chain/engine
>>>>>>>>>>>>>
>>>>>>>>>>>>>> components
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> and solr/elasticsearch/whatever component (solr and
>>>>>>>>>>>>>>> elasticsearch
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> component
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> already exist in Camel)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> IMO implementing a new Contenthub like component is outside
>>>>>>>>>>>>> the
>>>>>>>>>>>>> scope
>>>>>>>>>>>>> of this GSoC project. However If there is already
>>>>>>>>>>>>> Solr/Elasticsearch
>>>>>>>>>>>>> component it would be a really useful thing
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>  Regarding this, in my opinion, the use case of an eventual
>>>>>>>>>>>> integration with a Content hub is probably one of the most
>>>>>>>>>>>> clear for this
>>>>>>>>>>>> project. I'm not sure if that is what Antonio was trying to
>>>>>>>>>>>> explain but,
>>>>>>>>>>>> with a single route using as last endpoint Solr or any other
>>>>>>>>>>>> backend
>>>>>>>>>>>> system, we would be almost cloning the same functionality than
>>>>>>>>>>>> the previous
>>>>>>>>>>>> ContentHub implementation (Stanbol 0.12). Entities could be
>>>>>>>>>>>> dereferenced
>>>>>>>>>>>> using the EntityHub before storing the content along with the
>>>>>>>>>>>> metadata,
>>>>>>>>>>>> which is the point of integration of the EntityHub in such use
>>>>>>>>>>>> case. And
>>>>>>>>>>>> even most interesting, now with the integration of Marmotta
>>>>>>>>>>>> contributed by
>>>>>>>>>>>> Rupert, it would be possible to use a whole graph for
>>>>>>>>>>>> dereferencing, so
>>>>>>>>>>>> "simply" routing components like Enhancer->Marmotta->Solr
>>>>>>>>>>>> sounds to me like
>>>>>>>>>>>> an interesting use case.
>>>>>>>>>>>>
>>>>>>>>>>>> wdyt?
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Rafa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>
>>>
>>
>

-- 

------------------------------
This message should be regarded as confidential. If you have received this 
email in error please notify the sender and destroy it immediately. 
Statements of intent shall only become binding when confirmed in hard copy 
by an authorised signatory.

Zaizi Ltd is registered in England and Wales with the registration number 
6440931. The Registered Office is Brook House, 229 Shepherds Bush Road, 
London W6 7AN. 

Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi all

As anticipated in the previous mail, I have developed a first version of the
Stanbol Solr component. This component (which by default handles the
stanbol-solr Camel protocol) extends the Camel Solr component, so all the
properties used to configure that component can be used in this one as well.

The component is responsible for extracting fields and values from the
entities in the ContentItem and creating a Solr document with the content
and metadata to be indexed in Solr. In this first version, no filtering is
applied to the entities (for example, taking the field values only from
the entity with the highest confidence value).

The first version of the component allows three configuration parameters
in a route:
 - ldpath: LDPath program used to extract the values of the fields.
As mentioned in the previous mail, if a different LDPath program is used in
the dereference engine, the properties to be extracted may not exist.
 - fields: A comma-separated list of the fields to be extracted from the
entities and indexed in Solr.
 - useDereferenceLdpath: If no LDPath program is defined, this boolean
flag makes the component use the same LDPath program as the dereference
engine (taken from the information contained in the ContentItem and passed
in the HTTP request to the enhancer, or configured in the chain/engine
component). The default value is true.
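
To make the interplay of the three parameters concrete, here is a minimal
sketch in plain Java. It is not the component's code: the class name
StanbolSolrParams and the method effectiveLdpath are hypothetical, and the
precedence (explicit ldpath first, then the dereference engine's program
when useDereferenceLdpath is true) is only my reading of the description
above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

/** Hypothetical holder for the three route parameters (illustration only). */
final class StanbolSolrParams {
    final String ldpath;                 // explicit LDPath program, or null
    final List<String> fields;           // parsed from the comma-separated list
    final boolean useDereferenceLdpath;  // defaults to true when not given

    StanbolSolrParams(String ldpath, String fields, Boolean useDereferenceLdpath) {
        this.ldpath = (ldpath == null || ldpath.trim().isEmpty()) ? null : ldpath;
        this.fields = parseFields(fields);
        // The mail states that the default value is true.
        this.useDereferenceLdpath = useDereferenceLdpath == null || useDereferenceLdpath;
    }

    /** Splits "a, b,c" into [a, b, c], ignoring blanks. */
    static List<String> parseFields(String csv) {
        if (csv == null) {
            return List.of();
        }
        return Arrays.stream(csv.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    /**
     * Picks the LDPath program to apply: the explicit one if present,
     * otherwise the dereference engine's program when the flag allows it.
     */
    Optional<String> effectiveLdpath(String dereferenceLdpath) {
        if (ldpath != null) {
            return Optional.of(ldpath);
        }
        if (useDereferenceLdpath) {
            return Optional.ofNullable(dereferenceLdpath);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        StanbolSolrParams p = new StanbolSolrParams(null, "name, label", null);
        System.out.println(p.fields + " -> " + p.effectiveLdpath("name = rdfs:label;"));
    }
}
```

If neither an explicit program nor the dereference one is available, the
sketch returns an empty Optional, leaving it to the caller to decide
whether to index the raw content only.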

A sample route using this component could be the following:
<routes xmlns="http://camel.apache.org/schema/spring">
    <route id="stanbolsolr">
        <from uri="direct://stanbolsolr" />
        <to uri="chain://default?enhancer.engines.dereference.ldpath=%40prefix%20test%20%3A%20%3Chttp%3A%2F%2Ftest.org%2F%3E%3B%20test%3Aname%3Drdfs%3Alabel%20%3A%3A%20xsd%3Astring%3B" />
        <to uri="stanbol-solr://localhost:8983/solr" />
    </route>
</routes>
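
Because the LDPath program travels inside the endpoint URI, it has to be
percent-encoded first. The following is a small sketch of one way to do
that with the JDK only; LdpathUriBuilder and its methods are hypothetical
names, and the chain name "default" is just taken from the sample route.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper that builds a chain:// URI carrying an LDPath program. */
final class LdpathUriBuilder {

    /** Percent-encodes a value so it can be used as a URI query parameter. */
    static String encode(String value) {
        try {
            // URLEncoder targets HTML forms and writes spaces as '+';
            // rewrite them as %20 (a literal '+' is already emitted as %2B).
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Builds the endpoint URI for the given chain and LDPath program. */
    static String chainUri(String chain, String ldpath) {
        return "chain://" + chain
                + "?enhancer.engines.dereference.ldpath=" + encode(ldpath);
    }

    public static void main(String[] args) {
        String ldpath = "@prefix test : <http://test.org/>; "
                + "test:name=rdfs:label :: xsd:string;";
        System.out.println(chainUri("default", ldpath));
    }
}
```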

As a future extension of this component, a new property specifying a
configured dereference engine to use for the LDPath program, and filtering
of the entities to keep only the one with the highest confidence value,
will be developed.
With this component, we can have some features similar to the old Stanbol
ContentHub. So I think that by improving this component we could bring
the ContentHub back to Stanbol (but using an external Solr instance, which
I think is good to avoid overloading the Stanbol application).
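
The highest-confidence filtering mentioned above could look roughly like
the sketch below. It is plain Java and does not use the Stanbol API: the
Suggestion class is a hypothetical stand-in for whatever the component
reads from the ContentItem metadata.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Illustration of keeping only the suggestion with the highest confidence. */
final class ConfidenceFilter {

    /** Hypothetical minimal view of an entity suggestion. */
    static final class Suggestion {
        final String entityUri;
        final double confidence;

        Suggestion(String entityUri, double confidence) {
            this.entityUri = entityUri;
            this.confidence = confidence;
        }
    }

    /** Returns the suggestion with the highest confidence, if there is any. */
    static Optional<Suggestion> best(List<Suggestion> suggestions) {
        return suggestions.stream()
                .max(Comparator.comparingDouble((Suggestion s) -> s.confidence));
    }

    public static void main(String[] args) {
        List<Suggestion> found = Arrays.asList(
                new Suggestion("http://example.org/Paris_Texas", 0.41),
                new Suggestion("http://example.org/Paris", 0.93));
        best(found).ifPresent(s -> System.out.println(s.entityUri));
    }
}
```

Only the fields of the selected entity would then be copied into the Solr
document, which keeps lower-ranked candidates out of the index.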

Moreover, as part of the "use cases" part of the project and as discussed
in the Stanbol IRC channel, I'm also evaluating SIREn [1], an extension of
Solr that brings new and improved capabilities to it. It's very useful for
structured document search.
So my idea is to try to create a SIREn component for Camel integrated in
Stanbol, making it possible to store (in an easy way) the content
along with the extracted metadata in a structured way, instead of simply
creating new fields for a document.

Stay tuned for new advances.
As always, comments are more than welcome.

Regards

[1] http://sirendb.com/


On Wed, Jul 16, 2014 at 12:53 PM, Antonio David Perez Morales <
aperez@zaizi.com> wrote:

> Hi people
>
> Continuing with the project work , I have implemented some improvements to
> chain and engine components to allow defining enhancer properties (like
> enhancer.engines.dereference.ldpath) in the route component definition.
> Example :
> from(direct://test).to(engine://dereference-engine?enhancer.engines.dereference.ldpath=EXPRESSION).
> As said in previous mails, the engines and chains have to be configured
> through Felix console.
>
> Regarding the last discussion about a new kind of ContentHub back to
> Stanbol as an use case for the workflow integration, I have successfully
> created a custom Camel processor to create the document with the content
> and enhancement metadata in order to be sent to Solr. It takes the LDPath
> expression (configured in the dereference engine component via
> enhancer.engines.dereference.ldpath query parameter or camel component
> parameter) to extract the metadata to be indexed. So using a route like
> from().to(chain://Default).process(ContentItemProcessor).to(solr://localhost:8983/solr),
> we can have new indexed documents in Solr containing the text and the
> extracted enhancement metadata in order to be use in semantic searchs in
> the external Solr. Of course, the Solr schema needs to be created in the
> remote Solr beforehand. It is only a brief proof of concept of such
> functionality.
>
> My idea is to use an external Solr to store the content and semantic
> metadata for semantic search purposes, as opposite of the old ContentHub
> which was using an internal SolrYard, creating the schema from the
> configured LDPath expression.
>
> The next step in this task will be create a custom StanbolSolr component,
> able to perform the functionality of the previous processor and Solr, but
> allowing configuring the LDPath, fields and properties to be extracted and
> put as metadata in the new Solr document. These properties will be applied
> to the ContentItem metadata, so if an entity dereference engine is
> configured with a different LDPath expression or fields, maybe the
> properties to be extracted will not exist.
> As future improvement of this component, we could add a new conf parameter
> specifying a configured dereference engine to be used before applying the
> configuration.
>
> Stay tuned for further advances.
>
> As always if you have any questions or comments, please drop some lines
> here.
>
> Regards
>
> PS: The example routes used are very simple and lineals, but for some
> scenarios, parallel executions of engines, multicast, aggregator, etc
> (supported by camel) could be used to speed up the enhancement process.
>
>
>
>
>
>
> On Tue, Jul 8, 2014 at 9:46 AM, Antonio David Perez Morales <
> aperez@zaizi.com> wrote:
>
>> Hi Rafa and all
>>
>> In my opinion, the Content Hub back in Stanbol for Semantic Search
>> capabilities is a great use case to be implemented.
>> Waiting for Florent's opinion, I could start first only with Solr (whose
>> component already exists in Camel but it needs to be adapted like the
>> ActiveMQ component) and creating a custom transformer bean for Camel to
>> have the original Content Hub. After that, we could think to create the
>> SIREn component and the new transformer for it, giving the users the
>> possibility of use one of them.
>>
>> What do you think? Is It an interesting use case for the Camel
>> integration application?
>>
>> Regards
>>
>>
>> On Mon, Jul 7, 2014 at 4:27 PM, Rafa Haro <rh...@apache.org> wrote:
>>
>>> Hi guys,
>>>
>>> El 01/07/14 10:20, Antonio David Perez Morales escribió:
>>>
>>>  Hi all
>>>>
>>>> Continuing with the project, I have managed successfully the
>>>> integration of
>>>> activemq camel component (and also jms) into the Stanbol Camel
>>>> integration.
>>>> This has been a hard task due to the dependencies needed by the
>>>> component
>>>> and also due to the fact that we had to provide an activemq component
>>>> configurable through Felix web console.
>>>>
>>>> With this addition, we are in the position to integrate business logic
>>>> into
>>>> Stanbol routes through a message service provided by activemq (jms).
>>>>
>>> Nice Antonio, let's see is someone has an interesting use case to
>>> implement in this context.
>>>
>>>
>>>> As a first test, I have deployed a route which consumes messages
>>>> (content)
>>>> from an activemq queue, enhance them using the default chain and then
>>>> write
>>>> the result into a file. It's a simple test but it works quite well. In
>>>> this
>>>> case, Stanbol is working in a standalone mode, that is to say, we don't
>>>> have to explicitly call Stanbol to enhance content but Stanbol is
>>>> triggered
>>>> based on some external events (a new queue message)
>>>>
>>>> As indicated in the previous mail, I still have some pending things to
>>>> be
>>>> done (because I couldn't do them last week) but in order to go forward
>>>> with
>>>> the project I ask you for some interesting use cases where to apply the
>>>> new
>>>> workflow component in order to give added value to it and also in order
>>>> to
>>>> develop and provide more workflow (camel) components useful for those
>>>> and
>>>> other use cases.
>>>>
>>> Awaiting for the community feedback and also for Florent's opinion
>>> regarding the rest of the project, as I have expressed in recent emails,
>>> I'm eager to see the Content Hub back in Stanbol. And this is because of,
>>> from the point of view of the use of Stanbol in the enterprise, Semantic
>>> Search is one of the most common use cases. So, to have an enterprise
>>> search backend as the last component of a processing route in any
>>> architecture where stanbol could be plugged sounds key for me. In recent
>>> discussions at the Stanbol IRC channel, we have been analysing Siren (
>>> https://github.com/rdelbru/SIREn), a Lucene/Solr extension which major
>>> advantage is the possibility to index tree structures, allowing then to
>>> index structured data without losing full text search capabilities. To
>>> refactor old ContentHub component to use Siren is out of scope of this
>>> project but, in my opinion, an interesting use case could be to develop a
>>> Siren Camel Component and a transformer from ContentItem to Siren Object or
>>> whatever and integrate both in Stanbol.
>>>
>>> What do you guys think?
>>>
>>> Cheers,
>>> Rafa
>>>
>>>
>>>
>>>
>>>> Regards
>>>>
>>>>
>>>> On Mon, Jun 23, 2014 at 6:16 PM, Antonio David Perez Morales <
>>>> aperez@zaizi.com> wrote:
>>>>
>>>>  Hi Stanbolers
>>>>>
>>>>> The GSoC 2014 midterm is here and I want to give you a summary of the
>>>>> work
>>>>> already done so far:
>>>>>
>>>>> - Adapted previous Camel integration PoC done by Florent into Stanbol
>>>>> 1.0
>>>>> version.
>>>>> - Improved EngineComponent used by Camel to execute Enhancement Engines
>>>>> (configured through Stanbol web console as usual) using the engine://
>>>>> uri
>>>>> scheme in routes.
>>>>> - Created ChainComponent used by Camel to execute Enhancement Chains
>>>>> using
>>>>> the chain:// uri scheme in routes (both Camel components are provided
>>>>> as
>>>>> OSGI components, so the uri scheme can be changed through the Stanbol
>>>>> web
>>>>> console)
>>>>> - Created a custom artifact for Apache Felix Fileinstall in order to be
>>>>> able to install routes defined in Camel Spring XML DSL placing a route
>>>>> file
>>>>> (with 'route' extension) in the stanbol/fileinstall directory
>>>>> - Created a custom archetype to ease the development of bundles
>>>>> containing
>>>>> route definitions in Java DSL. The archetype generates a class
>>>>> extending
>>>>> 'RouteBuilder' which creates a default Camel direct endpoint used by
>>>>> other
>>>>> Stanbol Workflow components to execute the route.
>>>>> - Created a first version of Workflow API, which contains different
>>>>> OSGI
>>>>> components which allow registering Camel components/routes,
>>>>> start/stop/execute routes, add/remove components used in routes, etc.
>>>>> - REST endpoint is provided to test the execution of routes using REST
>>>>> requests (/flow/{routeId} )
>>>>> - Modified the PoC full launcher to use all the new bundles to support
>>>>> the
>>>>> workflow feature.
>>>>> - Installed JBoss developer studio which comes with Camel support in
>>>>> order
>>>>> to create routes in a visual way with the possibility to be exported as
>>>>> Spring XML DSL format
>>>>>
>>>>> Some pending things I will try to do during this week:
>>>>> - Improve the web package to create the needed endpoints to query the
>>>>> registered routes, registered camel components, etc
>>>>> - Improve the web package to remove classes copied from Stanbol jersey
>>>>> module used for testing
>>>>> - Update README.md files in the repository with all the new information
>>>>> - Document the installation and configuration of JBoss developer studio
>>>>> for Camel routes creation
>>>>> - Create all the JIRA issued related to the work already done
>>>>>
>>>>>
>>>>> For the second part of the project, I would like to read some comments
>>>>> about interesting use cases in order to develop the needed Stanbol and
>>>>> Camel components to support them.
>>>>>
>>>>> If you have any comment, please drop some lines in order to discuss the
>>>>> new things to be done.
>>>>>
>>>>> Regards
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Jun 14, 2014 at 3:39 PM, Antonio David Perez Morales <
>>>>> aperez@zaizi.com> wrote:
>>>>>
>>>>>  Hi guys
>>>>>>
>>>>>> Continuing with the project, and as part of the refactoring/new
>>>>>> architecture I have started to modify some workflow components in
>>>>>> order to
>>>>>> create a better API and architecture based on OSGI components. As a
>>>>>> first
>>>>>> step and in order to have the same behavior than the current one
>>>>>> (regarding
>>>>>> enhancement process), a chain component has been created to simulate
>>>>>> the
>>>>>> chain behaviour. This new component uses internally the ChainManager
>>>>>> and
>>>>>> EnhancementJobManager component to perform the business logic. This
>>>>>> way, a
>>>>>> new protocol 'chain' can be used in the routes deployed in Stanbol.
>>>>>> The
>>>>>> chains are configured in the same way, using Stanbol admin console.
>>>>>>
>>>>>> Now, we can combine single engine executions with chains executions in
>>>>>> routes deployed in Stanbol using the alternatives described in
>>>>>> previous
>>>>>> mails and in the issue [1]. Both engines and chains are configured
>>>>>> through
>>>>>> Stanbol admin console. You can see the refactoring advances in [2] (a
>>>>>> branch used for refactoring the current PoC of Workflow in Stanbol
>>>>>> 1.0). Of
>>>>>> course, the Camel EIP and other Camel components can be used in the
>>>>>> deployed routes as well.
>>>>>>
>>>>>> With the new Camel routes support, we can have a Stanbol running and
>>>>>> enhancing content without receiving any HTTP request to start the
>>>>>> enhancement process, because the routes can be triggered by external
>>>>>> events
>>>>>> occurring in a queue, database, etc. Moreover the semantic lifting
>>>>>> process
>>>>>> can be split and merged with some application steps, so the issue
>>>>>> [3]
>>>>>> requesting asynchronous call support for enhancement could be solved.
>>>>>>
>>>>>> Anyway, if some of you have any suggestions for new components to be
>>>>>> deployed for the second part of the project, or another kind of
>>>>>> suggestion,
>>>>>> please drop here some lines to continue with the discussion.
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>> [2]
>>>>>> https://github.com/adperezmorales/stanbol-camel-
>>>>>> workflow/tree/refactoring
>>>>>> [3] https://issues.apache.org/jira/browse/STANBOL-263
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 11, 2014 at 10:01 AM, Antonio David Perez Morales <
>>>>>> aperez@zaizi.com> wrote:
>>>>>>
>>>>>>  Hi people
>>>>>>>
>>>>>>> As part of the GSoC project for the midterm and according to the
>>>>>>> issue
>>>>>>> [1], a custom Apache Felix Fileinstall artifact has been created in
>>>>>>> order
>>>>>>> to deploy Camel routes defined in XML (Spring DSL) placing a file
>>>>>>> with
>>>>>>> .route extension in a configured directory (like stanbol/fileinstall
>>>>>>> directory). Moreover since this artifact depends on Fileinstall
>>>>>>> bundle, the
>>>>>>> created launcher has been modified to have that bundle in the OSGI
>>>>>>> context
>>>>>>> by default.
>>>>>>>
>>>>>>> So, once the current Camel integration POC has been integrated in
>>>>>>> Stanbol 1.0 and extended to support the deployment of routes defined
>>>>>>> by
>>>>>>> Java DSL (through bundles) and XML (route files), the next step will
>>>>>>> be
>>>>>>> thinking and redesigning the current architecture trying to avoid the
>>>>>>> duplicated code and providing a more extendable and easy to use
>>>>>>> Workflow
>>>>>>> API, because with the current integration only direct routes can be
>>>>>>> triggered using REST API which means that the defined routes must be
>>>>>>> configured properly using a direct endpoint consumer. Anyway, routes
>>>>>>> starting in some other way like timers are triggered directly in the
>>>>>>> deployment, so this has to be taken into account for the new API
>>>>>>> (and REST
>>>>>>> API).
>>>>>>>
>>>>>>> In parallel and for the second part, new Stanbol Camel components
>>>>>>> will
>>>>>>> be developed in order to be used in new routes. So if any of you
>>>>>>> have use
>>>>>>> cases for this involving Stanbol components, please drop some lines
>>>>>>> here in
>>>>>>> order to prioritize the Stanbol Camel components to be developed.
>>>>>>>
>>>>>>> Comments and suggestions are more than welcome
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jun 2, 2014 at 7:00 PM, Antonio David Perez Morales <
>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>
>>>>>>>  Hi stanbolers
>>>>>>>>
>>>>>>>> As part of the issue [1] , I have created a maven archetype useful
>>>>>>>> to
>>>>>>>> generate Camel routes in Java DSL.
>>>>>>>> The archetype generates a Java project with all the dependencies and
>>>>>>>> one Java class with a method which has to be filled. In this
>>>>>>>> method, Camel
>>>>>>>> Java DSL syntax is used to create the route.
>>>>>>>> By default and as a first approach, the class will use the route
>>>>>>>> name
>>>>>>>> given during the project creation to enable a Camel direct endpoint
>>>>>>>> with
>>>>>>>> such name.
>>>>>>>> The code of the first archetype version can be found at [2].
>>>>>>>>
>>>>>>>> The next task will be providing a Felix custom artifact to be able
>>>>>>>> to
>>>>>>>> deploy XML-based routes in Stanbol, placing a custom file in the
>>>>>>>> Stanbol
>>>>>>>> datafiles directory.
>>>>>>>> After that, it will be time to think and redesign the architecture
>>>>>>>> to
>>>>>>>> integrate Camel workflows inside Stanbol in a better way, more
>>>>>>>> configurable
>>>>>>>> and extendable.
>>>>>>>>
>>>>>>>> Comments and suggestions are more than welcome
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, May 30, 2014 at 8:03 PM, Antonio David Perez Morales <
>>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>>
>>>>>>>>  Hi all
>>>>>>>>>
>>>>>>>>> After a hard fight this week, I managed to get it work the
>>>>>>>>> Florent's
>>>>>>>>> proof of concept code in the Stanbol 1.0 branch [1]
>>>>>>>>> The code is uploaded in my github account [3]. As I said in a
>>>>>>>>> previous
>>>>>>>>> mail, I prefer to do it separately and after the project,
>>>>>>>>> uploading the
>>>>>>>>> developed code into a Stanbol branch.
>>>>>>>>>
>>>>>>>>> The 1.0.0 version has some changes in how the Jersey endpoints are
>>>>>>>>> registered and also new classes and packages, so it was not a
>>>>>>>>> trivial task
>>>>>>>>> to make work the current proof of concept. Moreover I don't like
>>>>>>>>> to simply
>>>>>>>>> copy and paste code and make the needed changes. I always want to
>>>>>>>>> understand how the things work and how they are developed in order
>>>>>>>>> to be
>>>>>>>>> able to change/modify them or develop new code around them.
>>>>>>>>>
>>>>>>>>> The steps done to achieve it have been the following:
>>>>>>>>> - Updated pom files to the Stanbol 1.0.0-SNAPSHOT version
>>>>>>>>> - Updated bundle levels in bundlelist package to fit the Stanbol
>>>>>>>>> 1.0
>>>>>>>>> version levels
>>>>>>>>> - Adapted cameljobmanager package code to Stanbol 1.0.0-SNAPSHOT
>>>>>>>>> classes and using Java OSGI annotations instead of SCR annotations
>>>>>>>>> in
>>>>>>>>> Javadoc
>>>>>>>>> - Updated flow web package to Stanbol 1.0.0-SNAPSHOT classes and
>>>>>>>>> modified needed resources
>>>>>>>>> - Added Java OSGI annotations to the route (WeightedChain) instead
>>>>>>>>> of
>>>>>>>>> SCR annotations in javadoc
>>>>>>>>> - Updated launcher to use the 1.0.0-SNAPSHOT packages and needed
>>>>>>>>> bundles
>>>>>>>>>
>>>>>>>>> So now, the http://localhost:8080/flow endpoint will use the only
>>>>>>>>> Camel route (defined by WeightedChain) to call all the registered
>>>>>>>>> Enhancement Engines (ordered by EnhancementEngine order property).
>>>>>>>>> For testing purposes, the /flow/{flowName} has been removed,
>>>>>>>>> because
>>>>>>>>> all this code needs to be re-designed and re-implemented so I only
>>>>>>>>> wanted
>>>>>>>>> to make it work to have a first (simple) integration in Stanbol
>>>>>>>>> 1.0. This
>>>>>>>>> functionality will be added again to trigger custom routes once
>>>>>>>>> the next
>>>>>>>>> step (defined below) is developed.
>>>>>>>>>
>>>>>>>>> The next step [2] will be support to write and configure routes in
>>>>>>>>> XML
>>>>>>>>> format, putting the file in datafiles in order to be loaded by a
>>>>>>>>> Felix
>>>>>>>>> custom artifact (as Rupert pointed out in a previous mail) and
>>>>>>>>> create a
>>>>>>>>> Maven archetype to create bundles defining routes which will be
>>>>>>>>> loaded
>>>>>>>>> using the Felix bundle tab. If necessary, as we talked in previous
>>>>>>>>> messages, a REST endpoint receiving routes in XML can be developed
>>>>>>>>> as an
>>>>>>>>> alternative to the first approach. This is my objective for the
>>>>>>>>> midterm.
>>>>>>>>>
>>>>>>>>> After the midterm, the new Stanbol components for Apache Camel
>>>>>>>>> will be
>>>>>>>>> developed and also the new architecture for Camel in Stanbol.
>>>>>>>>>
>>>>>>>>> Comments on this and for use cases for Stanbol Camel components are
>>>>>>>>> more than welcome.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>>
>>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>>>>>> [2] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>>>> [3] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, May 27, 2014 at 6:18 PM, Antonio David Perez Morales <
>>>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>>>
>>>>>>>>>  Hi people
>>>>>>>>>>
>>>>>>>>>> I have already started to work on [1] to integrate current
>>>>>>>>>> Florent's
>>>>>>>>>> code into Stanbol 1.0.
>>>>>>>>>> As a first approach, only changing the dependency versions to new
>>>>>>>>>> Stanbol 1.0, many issues have arisen:
>>>>>>>>>>   - Deprecated use of classes
>>>>>>>>>>   - Classes which have changed from package
>>>>>>>>>>   - Some classes not necessary now
>>>>>>>>>>   - Classes not used which were causing conflicts
>>>>>>>>>>   - ...
>>>>>>>>>>
>>>>>>>>>> So now I'm trying to resolve all these problems to replicate the
>>>>>>>>>> same
>>>>>>>>>> behavior from 0.9 into 1.0. I will upload the code to a Github
>>>>>>>>>> repository
>>>>>>>>>> in my account (which will be pushed later into a Stanbol branch
>>>>>>>>>> after the
>>>>>>>>>> project) in order to track the advances.
>>>>>>>>>> Once I can resolve all these problems, I will take a look at the
>>>>>>>>>> Felix Custom Artifacts pointed out by Rupert in a previous
>>>>>>>>>> message to find
>>>>>>>>>> out the best way to deploy (and manage) route configurations
>>>>>>>>>> (Felix
>>>>>>>>>> artifacts, Java WatchService, REST endpoint to receive XML
>>>>>>>>>> routes, etc).
>>>>>>>>>>
>>>>>>>>>> Comments on this and future tasks are more than welcome.
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>>
>>>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>   On Tue, May 27, 2014 at 9:53 AM, Rafa Haro <rh...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>  Hi Rupert, Florent and Antonio
>>>>>>>>>>>
>>>>>>>>>>> El 27/05/14 08:51, Rupert Westenthaler escribió:
>>>>>>>>>>>
>>>>>>>>>>>   As the result of Enhancement Routes is content + metadata I
>>>>>>>>>>> can not
>>>>>>>>>>>
>>>>>>>>>>>> see what you want to "store" in the Entityhub that is about
>>>>>>>>>>>> managing
>>>>>>>>>>>> Entities.
>>>>>>>>>>>>
>>>>>>>>>>>>   >  - entityhub: To query/update the entityhub component
>>>>>>>>>>>> Maybe. If you can come up with a good use case ^^
>>>>>>>>>>>>
>>>>>>>>>>>>   >  - contenthub: To develop a new content-hub using
>>>>>>>>>>>> chain/engine
>>>>>>>>>>>>
>>>>>>>>>>>>> components
>>>>>>>>>>>>>
>>>>>>>>>>>>>> and solr/elasticsearch/whatever component (solr and
>>>>>>>>>>>>>> elasticsearch
>>>>>>>>>>>>>>
>>>>>>>>>>>>> component
>>>>>>>>>>>>>
>>>>>>>>>>>>>> already exist in Camel)
>>>>>>>>>>>>>>
>>>>>>>>>>>>> IMO implementing a new Contenthub like component is outside the
>>>>>>>>>>>> scope
>>>>>>>>>>>> of this GSoC project. However If there is already
>>>>>>>>>>>> Solr/Elasticsearch
>>>>>>>>>>>> component it would be a really useful thing
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  Regarding this, in my opinion, the use case of an eventual
>>>>>>>>>>> integration with a Content hub is probably one of the most clear
>>>>>>>>>>> for this
>>>>>>>>>>> project. I'm not sure if that is what Antonio was trying to
>>>>>>>>>>> explain but,
>>>>>>>>>>> with a single route using as last endpoint Solr or any other
>>>>>>>>>>> backend
>>>>>>>>>>> system, we would be almost cloning the same functionality than
>>>>>>>>>>> the previous
>>>>>>>>>>> ContentHub implementation (Stanbol 0.12). Entities could be
>>>>>>>>>>> dereferenced
>>>>>>>>>>> using the EntityHub before storing the content along with the
>>>>>>>>>>> metadata,
>>>>>>>>>>> which is the point of integration of the EntityHub in such use
>>>>>>>>>>> case. And
>>>>>>>>>>> even most interesting, now with the integration of Marmotta
>>>>>>>>>>> contributed by
>>>>>>>>>>> Rupert, it would be possible to use a whole graph for
>>>>>>>>>>> dereferencing, so
>>>>>>>>>>> "simply" routing components like Enhancer->Marmotta->Solr sounds
>>>>>>>>>>> to me like
>>>>>>>>>>> an interesting use case.
>>>>>>>>>>>
>>>>>>>>>>> wdyt?
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Rafa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>
>>
>


Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi people

Continuing with the project work, I have implemented some improvements to
the chain and engine components to allow defining enhancer properties (like
enhancer.engines.dereference.ldpath) in the route component definition.
Example:
from("direct://test").to("engine://dereference-engine?enhancer.engines.dereference.ldpath=EXPRESSION").
As said in previous mails, the engines and chains have to be configured
through the Felix console.
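
As an illustration of the idea (not the actual component code), the sketch
below pulls the enhancer.* properties out of an endpoint URI using only the
JDK. EnhancerProperties and fromUri are hypothetical names; in a real Camel
component the query parameters would normally arrive already parsed, so
this only shows which parameters are meant to be forwarded to the enhancer.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration: collect the enhancer.* query parameters of an endpoint URI. */
final class EnhancerProperties {

    /** Returns the decoded enhancer.* parameters, in the order they appear. */
    static Map<String, String> fromUri(String uri) {
        Map<String, String> props = new LinkedHashMap<>();
        int q = uri.indexOf('?');
        if (q < 0) {
            return props; // no query part, nothing to forward
        }
        for (String pair : uri.substring(q + 1).split("&")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip malformed or value-less entries
            }
            String key = decode(pair.substring(0, eq));
            if (key.startsWith("enhancer.")) {
                props.put(key, decode(pair.substring(eq + 1)));
            }
        }
        return props;
    }

    /** Percent-decodes a value (note: URLDecoder also turns '+' into a space). */
    static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String uri = "engine://dereference-engine"
                + "?enhancer.engines.dereference.ldpath=name%20%3D%20rdfs%3Alabel%3B";
        System.out.println(fromUri(uri));
    }
}
```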

Regarding the last discussion about bringing a new kind of ContentHub back
to Stanbol as a use case for the workflow integration, I have successfully
created a custom Camel processor that creates the document with the content
and enhancement metadata to be sent to Solr. It takes the LDPath
expression (configured in the dereference engine component via the
enhancer.engines.dereference.ldpath query parameter or Camel component
parameter) to extract the metadata to be indexed. So using a route like
from().to(chain://Default).process(ContentItemProcessor).to(solr://localhost:8983/solr),
we can have new indexed documents in Solr containing the text and the
extracted enhancement metadata, to be used in semantic searches in
the external Solr. Of course, the Solr schema needs to be created in the
remote Solr beforehand. This is only a brief proof of concept of such
functionality.

My idea is to use an external Solr to store the content and semantic
metadata for semantic search purposes, as opposed to the old ContentHub,
which used an internal SolrYard and created the schema from the
configured LDPath expression.

The next step in this task will be to create a custom StanbolSolr component,
able to perform the functionality of the previous processor and the Solr
component, while allowing configuration of the LDPath, fields and
properties to be extracted and put as metadata in the new Solr document.
These properties will be applied to the ContentItem metadata, so if an
entity dereference engine is configured with a different LDPath expression
or fields, the properties to be extracted may not exist.
As a future improvement of this component, we could add a new configuration
parameter specifying a configured dereference engine to be used before
applying the configuration.

Stay tuned for further advances.

As always, if you have any questions or comments, please drop a few lines
here.

Regards

PS: The example routes used are very simple and linear, but for some
scenarios, parallel execution of engines, multicast, aggregators, etc.
(supported by Camel) could be used to speed up the enhancement process.
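
To illustrate the fan-out/aggregate idea without depending on Camel, here
is a plain-Java sketch using CompletableFuture. It is only a stand-in for
what Camel's multicast and aggregator patterns would do in a route: the
"engines" are hypothetical functions from content to an annotation string,
not Stanbol EnhancementEngine instances.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Illustration: run independent engines in parallel, then gather the results. */
final class ParallelEnhancement {

    /** Applies every engine to the same content concurrently. */
    static List<String> enhance(String content, List<Function<String, String>> engines) {
        // Fan out: start one asynchronous task per engine.
        List<CompletableFuture<String>> futures = engines.stream()
                .map(engine -> CompletableFuture.supplyAsync(() -> engine.apply(content)))
                .collect(Collectors.toList());
        // Aggregate: wait for all tasks, keeping the original engine order.
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Function<String, String>> engines = Arrays.asList(
                text -> "length=" + text.length(),
                text -> "upper=" + text.toUpperCase());
        System.out.println(enhance("stanbol", engines));
    }
}
```

This only pays off for engines that do not depend on each other's output;
engines that need earlier annotations (for example dereferencing after
entity linking) still have to run in sequence.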






On Tue, Jul 8, 2014 at 9:46 AM, Antonio David Perez Morales <
aperez@zaizi.com> wrote:

> Hi Rafa and all
>
> In my opinion, bringing the Content Hub back to Stanbol for Semantic
> Search capabilities is a great use case to implement.
> While waiting for Florent's opinion, I could start with Solr only (whose
> component already exists in Camel but needs to be adapted like the
> ActiveMQ component), creating a custom transformer bean for Camel to
> get the original Content Hub functionality. After that, we could think
> about creating the SIREn component and the new transformer for it, giving
> users the possibility of using either of them.
>
> What do you think? Is it an interesting use case for the Camel integration
> application?
>
> Regards
>
>
> On Mon, Jul 7, 2014 at 4:27 PM, Rafa Haro <rh...@apache.org> wrote:
>
>> Hi guys,
>>
>> El 01/07/14 10:20, Antonio David Perez Morales escribió:
>>
>>  Hi all
>>>
>>> Continuing with the project, I have managed successfully the integration
>>> of
>>> activemq camel component (and also jms) into the Stanbol Camel
>>> integration.
>>> This has been a hard task due to the dependencies needed by the component
>>> and also due to the fact that we had to provide an activemq component
>>> configurable through Felix web console.
>>>
>>> With this addition, we are in the position to integrate business logic
>>> into
>>> Stanbol routes through a message service provided by activemq (jms).
>>>
>> Nice Antonio, let's see if someone has an interesting use case to
>> implement in this context.
>>
>>
>>> As a first test, I have deployed a route which consumes messages
>>> (content) from an ActiveMQ queue, enhances them using the default chain
>>> and then writes the result into a file. It's a simple test but it works
>>> quite well. In this case, Stanbol is working in a standalone mode, that
>>> is to say, we don't have to explicitly call Stanbol to enhance content;
>>> instead, Stanbol is triggered by external events (a new queue message).
>>>
>>> As indicated in the previous mail, I still have some pending things to be
>>> done (because I couldn't do them last week) but in order to go forward
>>> with
>>> the project I ask you for some interesting use cases where to apply the
>>> new
>>> workflow component in order to give added value to it and also in order
>>> to
>>> develop and provide more workflow (camel) components useful for those and
>>> other use cases.
>>>
>> While awaiting community feedback and Florent's opinion regarding the
>> rest of the project, as I have expressed in recent emails, I'm eager to
>> see the Content Hub back in Stanbol. This is because, from the point of
>> view of the use of Stanbol in the enterprise, Semantic Search is one of
>> the most common use cases. So, having an enterprise search backend as the
>> last component of a processing route in any architecture where Stanbol
>> could be plugged in sounds key to me. In recent discussions on the Stanbol
>> IRC channel, we have been analysing Siren (
>> https://github.com/rdelbru/SIREn), a Lucene/Solr extension whose major
>> advantage is the possibility of indexing tree structures, which allows
>> indexing structured data without losing full-text search capabilities.
>> Refactoring the old ContentHub component to use Siren is out of scope for
>> this project but, in my opinion, an interesting use case could be to
>> develop a Siren Camel component and a transformer from ContentItem to a
>> Siren object (or similar) and integrate both in Stanbol.
>>
>> What do you guys think?
>>
>> Cheers,
>> Rafa
>>
>>
>>
>>
>>> Regards
>>>
>>>
>>> On Mon, Jun 23, 2014 at 6:16 PM, Antonio David Perez Morales <
>>> aperez@zaizi.com> wrote:
>>>
>>>  Hi Stanbolers
>>>>
>>>> The GSoC 2014 midterm is here and I want to give you a summary of the
>>>> work
>>>> already done so far:
>>>>
>>>> - Adapted previous Camel integration PoC done by Florent into Stanbol
>>>> 1.0
>>>> version.
>>>> - Improved EngineComponent used by Camel to execute Enhancement Engines
>>>> (configured through Stanbol web console as usual) using the engine://
>>>> uri
>>>> scheme in routes.
>>>> - Created ChainComponent used by Camel to execute Enhancement Chains
>>>> using
>>>> the chain:// uri scheme in routes (both Camel components are provided as
>>>> OSGI components, so the uri scheme can be changed through the Stanbol
>>>> web
>>>> console)
>>>> - Created a custom artifact for Apache Felix Fileinstall in order to be
>>>> able to install routes defined in Camel Spring XML DSL placing a route
>>>> file
>>>> (with 'route' extension) in the stanbol/fileinstall directory
>>>> - Created a custom archetype to ease the development of bundles
>>>> containing
>>>> route definitions in Java DSL. The archetype generates a class extending
>>>> 'RouteBuilder' which creates a default Camel direct endpoint used by
>>>> other
>>>> Stanbol Workflow components to execute the route.
>>>> - Created a first version of Workflow API, which contains different OSGI
>>>> components which allow registering Camel components/routes,
>>>> start/stop/execute routes, add/remove components used in routes, etc.
>>>> - REST endpoint is provided to test the execution of routes using REST
>>>> requests (/flow/{routeId} )
>>>> - Modified the PoC full launcher to use all the new bundles to support
>>>> the
>>>> workflow feature.
>>>> - Installed JBoss developer studio which comes with Camel support in
>>>> order
>>>> to create routes in a visual way with the possibility to be exported as
>>>> Spring XML DSL format
>>>>
>>>> Some pending things I will try to do during this week:
>>>> - Improve the web package to create the needed endpoints to query the
>>>> registered routes, registered camel components, etc
>>>> - Improve the web package to remove classes copied from Stanbol jersey
>>>> module used for testing
>>>> - Update README.md files in the repository with all the new information
>>>> - Document the installation and configuration of JBoss developer studio
>>>> for Camel routes creation
>>>> - Create all the JIRA issues related to the work already done
>>>>
>>>>
>>>> For the second part of the project, I would like to read some comments
>>>> about interesting use cases in order to develop the needed Stanbol and
>>>> Camel components to support them.
>>>>
>>>> If you have any comment, please drop some lines in order to discuss the
>>>> new things to be done.
>>>>
>>>> Regards
>>>>
>>>>
>>>>
>>>> On Sat, Jun 14, 2014 at 3:39 PM, Antonio David Perez Morales <
>>>> aperez@zaizi.com> wrote:
>>>>
>>>>  Hi guys
>>>>>
>>>>> Continuing with the project, and as part of the refactoring/new
>>>>> architecture, I have started to modify some workflow components in
>>>>> order to create a better API and architecture based on OSGI components.
>>>>> As a first step, and in order to keep the same behavior as the current
>>>>> one (regarding the enhancement process), a chain component has been
>>>>> created to simulate the chain behaviour. This new component internally
>>>>> uses the ChainManager and EnhancementJobManager components to perform
>>>>> the business logic. This way, a new protocol 'chain' can be used in the
>>>>> routes deployed in Stanbol. The chains are configured in the same way,
>>>>> using the Stanbol admin console.
>>>>>
>>>>> Now, we can combine single engine executions with chains executions in
>>>>> routes deployed in Stanbol using the alternatives described in previous
>>>>> mails and in the issue [1]. Both engines and chains are configured
>>>>> through
>>>>> Stanbol admin console. You can see the refactoring advances in [2] (a
>>>>> branch used for refactoring the current PoC of Workflow in Stanbol
>>>>> 1.0). Of
>>>>> course, the Camel EIP and other Camel components can be used in the
>>>>> deployed routes as well.
>>>>>
>>>>> With the new Camel routes support, we can have a Stanbol instance
>>>>> running and enhancing content without receiving any HTTP request to
>>>>> start the enhancement process, because the routes can be triggered by
>>>>> external events occurring in a queue, database, etc. Moreover, the
>>>>> semantic lifting process can be split and merged with some application
>>>>> steps, so the issue [3] requesting asynchronous call support for
>>>>> enhancement could be solved.
>>>>>
>>>>> Anyway, if some of you have any suggestions for new components to be
>>>>> deployed for the second part of the project, or another kind of
>>>>> suggestion,
>>>>> please drop here some lines to continue with the discussion.
>>>>>
>>>>> Regards
>>>>>
>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>> [2]
>>>>> https://github.com/adperezmorales/stanbol-camel-
>>>>> workflow/tree/refactoring
>>>>> [3] https://issues.apache.org/jira/browse/STANBOL-263
>>>>>
>>>>>
>>>>> On Wed, Jun 11, 2014 at 10:01 AM, Antonio David Perez Morales <
>>>>> aperez@zaizi.com> wrote:
>>>>>
>>>>>  Hi people
>>>>>>
>>>>>> As part of the GSoC project for the midterm and according to the issue
>>>>>> [1], a custom Apache Felix Fileinstall artifact has been created in
>>>>>> order
>>>>>> to deploy Camel routes defined in XML (Spring DSL) placing a file with
>>>>>> .route extension in a configured directory (like stanbol/fileinstall
>>>>>> directory). Moreover since this artifact depends on Fileinstall
>>>>>> bundle, the
>>>>>> created launcher has been modified to have that bundle in the OSGI
>>>>>> context
>>>>>> by default.
>>>>>>
>>>>>> So, once the current Camel integration POC has been integrated in
>>>>>> Stanbol 1.0 and extended to support the deployment of routes defined
>>>>>> by
>>>>>> Java DSL (through bundles) and XML (route files), the next step will
>>>>>> be
>>>>>> thinking and redesigning the current architecture trying to avoid the
>>>>>> duplicated code and providing a more extendable and easy to use
>>>>>> Workflow
>>>>>> API, because with the current integration only direct routes can be
>>>>>> triggered using REST API which means that the defined routes must be
>>>>>> configured properly using a direct endpoint consumer. Anyway, routes
>>>>>> starting in some other way like timers are triggered directly in the
>>>>>> deployment, so this has to be taken into account for the new API (and
>>>>>> REST
>>>>>> API).
>>>>>>
>>>>>> In parallel and for the second part, new Stanbol Camel components will
>>>>>> be developed in order to be used in new routes. So if any of you have
>>>>>> use
>>>>>> cases for this involving Stanbol components, please drop some lines
>>>>>> here in
>>>>>> order to prioritize the Stanbol Camel components to be developed.
>>>>>>
>>>>>> Comments and suggestions are more than welcome
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 2, 2014 at 7:00 PM, Antonio David Perez Morales <
>>>>>> aperez@zaizi.com> wrote:
>>>>>>
>>>>>>  Hi stanbolers
>>>>>>>
>>>>>>> As part of the issue [1] , I have created a maven archetype useful to
>>>>>>> generate Camel routes in Java DSL.
>>>>>>> The archetype generates a Java project with all the dependencies and
>>>>>>> one Java class with a method which has to be filled. In this method,
>>>>>>> Camel
>>>>>>> Java DSL syntax is used to create the route.
>>>>>>> By default and as a first approach, the class will use the route name
>>>>>>> given during the project creation to enable a Camel direct endpoint
>>>>>>> with
>>>>>>> such name.
>>>>>>> The code of the first archetype version can be found at [2].
>>>>>>>
>>>>>>> The next task will be providing a Felix custom artifact to be able to
>>>>>>> deploy XML-based routes in Stanbol, placing a custom file in the
>>>>>>> Stanbol
>>>>>>> datafiles directory.
>>>>>>> After that, it will be time to think and redesign the architecture to
>>>>>>> integrate Camel workflows inside Stanbol in a better way, more
>>>>>>> configurable
>>>>>>> and extendable.
>>>>>>>
>>>>>>> Comments and suggestions are more than welcome
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>>
>>>>>>>
>>>>>>> On Fri, May 30, 2014 at 8:03 PM, Antonio David Perez Morales <
>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>
>>>>>>>  Hi all
>>>>>>>>
>>>>>>>> After a hard fight this week, I managed to get Florent's proof of
>>>>>>>> concept code working in the Stanbol 1.0 branch [1].
>>>>>>>> The code is uploaded to my GitHub account [3]. As I said in a previous
>>>>>>>> mail, I prefer to work on it separately and, after the project, upload
>>>>>>>> the developed code into a Stanbol branch.
>>>>>>>>
>>>>>>>> The 1.0.0 version has some changes in how the Jersey endpoints are
>>>>>>>> registered and also new classes and packages, so it was not a trivial
>>>>>>>> task to make the current proof of concept work. Moreover, I don't like
>>>>>>>> to simply copy and paste code and make the needed changes. I always
>>>>>>>> want to understand how things work and how they are developed in order
>>>>>>>> to be able to change/modify them or develop new code around them.
>>>>>>>>
>>>>>>>> The steps done to achieve it have been the following:
>>>>>>>> - Updated pom files to the Stanbol 1.0.0-SNAPSHOT version
>>>>>>>> - Updated bundle levels in bundlelist package to fit the Stanbol 1.0
>>>>>>>> version levels
>>>>>>>> - Adapted cameljobmanager package code to Stanbol 1.0.0-SNAPSHOT
>>>>>>>> classes and using Java OSGI annotations instead of SCR annotations
>>>>>>>> in
>>>>>>>> Javadoc
>>>>>>>> - Updated flow web package to Stanbol 1.0.0-SNAPSHOT classes and
>>>>>>>> modified needed resources
>>>>>>>> - Added Java OSGI annotations to the route (WeightedChain) instead
>>>>>>>> of
>>>>>>>> SCR annotations in javadoc
>>>>>>>> - Updated launcher to use the 1.0.0-SNAPSHOT packages and needed
>>>>>>>> bundles
>>>>>>>>
>>>>>>>> So now, the http://localhost:8080/flow endpoint will use the only
>>>>>>>> Camel route (defined by WeightedChain) to call all the registered
>>>>>>>> Enhancement Engines (ordered by EnhancementEngine order property).
>>>>>>>> For testing purposes, the /flow/{flowName} has been removed, because
>>>>>>>> all this code needs to be re-designed and re-implemented so I only
>>>>>>>> wanted
>>>>>>>> to make it work to have a first (simple) integration in Stanbol
>>>>>>>> 1.0. This
>>>>>>>> functionality will be added again to trigger custom routes once the
>>>>>>>> next
>>>>>>>> step (defined below) is developed.
>>>>>>>>
>>>>>>>> The next step [2] will be support to write and configure routes in
>>>>>>>> XML
>>>>>>>> format, putting the file in datafiles in order to be loaded by a
>>>>>>>> Felix
>>>>>>>> custom artifact (as Rupert pointed out in a previous mail) and
>>>>>>>> create a
>>>>>>>> Maven archetype to create bundles defining routes which will be
>>>>>>>> loaded
>>>>>>>> using the Felix bundle tab. If necessary, as we talked in previous
>>>>>>>> messages, a REST endpoint receiving routes in XML can be developed
>>>>>>>> as an
>>>>>>>> alternative to the first approach. This is my objective for the
>>>>>>>> midterm.
>>>>>>>>
>>>>>>>> After the midterm, the new Stanbol components for Apache Camel will
>>>>>>>> be
>>>>>>>> developed and also the new architecture for Camel in Stanbol.
>>>>>>>>
>>>>>>>> Comments on this and for use cases for Stanbol Camel components are
>>>>>>>> more than welcome.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>>>>> [2] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>>> [3] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, May 27, 2014 at 6:18 PM, Antonio David Perez Morales <
>>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>>
>>>>>>>>  Hi people
>>>>>>>>>
>>>>>>>>> I have already started to work on [1] to integrate current
>>>>>>>>> Florent's
>>>>>>>>> code into Stanbol 1.0.
>>>>>>>>> As a first approach, only changing the dependency versions to new
>>>>>>>>> Stanbol 1.0, many issues have arisen:
>>>>>>>>>   - Deprecated use of classes
>>>>>>>>>   - Classes which have moved to a different package
>>>>>>>>>   - Some classes not necessary now
>>>>>>>>>   - Classes not used which were causing conflicts
>>>>>>>>>   - ...
>>>>>>>>>
>>>>>>>>> So now I'm trying to resolve all these problems to replicate the
>>>>>>>>> same
>>>>>>>>> behavior from 0.9 into 1.0. I will upload the code to a Github
>>>>>>>>> repository
>>>>>>>>> in my account (which will be pushed later into a Stanbol branch
>>>>>>>>> after the
>>>>>>>>> project) in order to track the advances.
>>>>>>>>> Once I can resolve all these problems, I will take a look at the
>>>>>>>>> Felix Custom Artifacts pointed out by Rupert in a previous message to
>>>>>>>>> find out the best way to deploy (and manage) route configurations
>>>>>>>>> (Felix artifacts, Java WatchService, REST endpoint to receive XML
>>>>>>>>> routes, etc).
>>>>>>>>>
>>>>>>>>> Comments on this and future tasks are more than welcome.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>>
>>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>   On Tue, May 27, 2014 at 9:53 AM, Rafa Haro <rh...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>  Hi Rupert, Florent and Antonio
>>>>>>>>>>
>>>>>>>>>> El 27/05/14 08:51, Rupert Westenthaler escribió:
>>>>>>>>>>
>>>>>>>>>>   As the result of Enhancement Routes is content + metadata I can
>>>>>>>>>> not
>>>>>>>>>>
>>>>>>>>>>> see what you want to "store" in the Entityhub that is about
>>>>>>>>>>> managing
>>>>>>>>>>> Entities.
>>>>>>>>>>>
>>>>>>>>>>>   >  - entityhub: To query/update the entityhub component
>>>>>>>>>>> Maybe. If you can come up with a good use case ^^
>>>>>>>>>>>
>>>>>>>>>>>   >  - contenthub: To develop a new content-hub using
>>>>>>>>>>> chain/engine
>>>>>>>>>>>
>>>>>>>>>>>> components
>>>>>>>>>>>>
>>>>>>>>>>>>> and solr/elasticsearch/whatever component (solr and
>>>>>>>>>>>>> elasticsearch
>>>>>>>>>>>>>
>>>>>>>>>>>> component
>>>>>>>>>>>>
>>>>>>>>>>>>> already exist in Camel)
>>>>>>>>>>>>>
>>>>>>>>>>>> IMO implementing a new Contenthub like component is outside the
>>>>>>>>>>> scope
>>>>>>>>>>> of this GSoC project. However If there is already
>>>>>>>>>>> Solr/Elasticsearch
>>>>>>>>>>> component it would be a really useful thing
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>  Regarding this, in my opinion, the use case of an eventual
>>>>>>>>>> integration with a Content hub is probably one of the most clear
>>>>>>>>>> for this
>>>>>>>>>> project. I'm not sure if that is what Antonio was trying to
>>>>>>>>>> explain but,
>>>>>>>>>> with a single route using as last endpoint Solr or any other
>>>>>>>>>> backend
>>>>>>>>>> system, we would be almost cloning the same functionality than
>>>>>>>>>> the previous
>>>>>>>>>> ContentHub implementation (Stanbol 0.12). Entities could be
>>>>>>>>>> dereferenced
>>>>>>>>>> using the EntityHub before storing the content along with the
>>>>>>>>>> metadata,
>>>>>>>>>> which is the point of integration of the EntityHub in such use
>>>>>>>>>> case. And
>>>>>>>>>> even most interesting, now with the integration of Marmotta
>>>>>>>>>> contributed by
>>>>>>>>>> Rupert, it would be possible to use a whole graph for
>>>>>>>>>> dereferencing, so
>>>>>>>>>> "simply" routing components like Enhancer->Marmotta->Solr sounds
>>>>>>>>>> to me like
>>>>>>>>>> an interesting use case.
>>>>>>>>>>
>>>>>>>>>> wdyt?
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Rafa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>
>

Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi Rafa and all

In my opinion, bringing the Content Hub back to Stanbol for Semantic
Search capabilities is a great use case to implement.
While waiting for Florent's opinion, I could start with Solr only (its
component already exists in Camel, but it needs to be adapted like the
ActiveMQ component) and create a custom transformer bean for Camel to
reproduce the original Content Hub. After that, we could create the
SIREn component and a new transformer for it, giving users the option of
using either one.
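
To make the idea more concrete, such a route could look roughly like the
sketch below. This is only an assumption about how the pieces would fit
together: the chain:// scheme comes from the earlier mails, while the bean
name contentItemToSolrTransformer, the route id and the Solr URL are
hypothetical, and the Solr endpoint shown is the stock Camel one rather than
the adapted Stanbol version.

```xml
<!-- Sketch only: bean name, route id and Solr URL are hypothetical. -->
<routes xmlns="http://camel.apache.org/schema/spring">
  <route id="contenthub-solr">
    <from uri="direct:contenthub-solr"/>
    <!-- Enhance the incoming content with the default chain -->
    <to uri="chain://default"/>
    <!-- Hypothetical bean turning the ContentItem into a Solr document -->
    <bean ref="contentItemToSolrTransformer"/>
    <!-- The stock Camel Solr component reads the operation from a header -->
    <setHeader headerName="SolrOperation">
      <constant>INSERT</constant>
    </setHeader>
    <to uri="solr://localhost:8983/solr"/>
  </route>
</routes>
```

A SIREn variant would presumably keep the same shape and only swap the
transformer bean and the final endpoint.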

What do you think? Is it an interesting use case for the Camel integration
application?

Regards


On Mon, Jul 7, 2014 at 4:27 PM, Rafa Haro <rh...@apache.org> wrote:

> Hi guys,
>
> El 01/07/14 10:20, Antonio David Perez Morales escribió:
>
>  Hi all
>>
>> Continuing with the project, I have successfully integrated the
>> ActiveMQ Camel component (and also JMS) into the Stanbol Camel
>> integration.
>> This has been a hard task due to the dependencies needed by the component
>> and also due to the fact that we had to provide an ActiveMQ component
>> configurable through the Felix web console.
>>
>> With this addition, we are in the position to integrate business logic
>> into
>> Stanbol routes through a message service provided by activemq (jms).
>>
> Nice Antonio, let's see if someone has an interesting use case to
> implement in this context.
>
>
>> As a first test, I have deployed a route which consumes messages
>> (content) from an ActiveMQ queue, enhances them using the default chain
>> and then writes the result into a file. It's a simple test but it works
>> quite well. In this case, Stanbol is working in a standalone mode, that
>> is to say, we don't have to explicitly call Stanbol to enhance content;
>> instead, Stanbol is triggered by external events (a new queue message).
>>
>> As indicated in the previous mail, I still have some pending things to be
>> done (because I couldn't do them last week) but in order to go forward
>> with
>> the project I ask you for some interesting use cases where to apply the
>> new
>> workflow component in order to give added value to it and also in order to
>> develop and provide more workflow (camel) components useful for those and
>> other use cases.
>>
> While awaiting community feedback and Florent's opinion regarding the
> rest of the project, as I have expressed in recent emails, I'm eager to
> see the Content Hub back in Stanbol. This is because, from the point of
> view of the use of Stanbol in the enterprise, Semantic Search is one of
> the most common use cases. So, having an enterprise search backend as the
> last component of a processing route in any architecture where Stanbol
> could be plugged in sounds key to me. In recent discussions on the Stanbol
> IRC channel, we have been analysing Siren (
> https://github.com/rdelbru/SIREn), a Lucene/Solr extension whose major
> advantage is the possibility of indexing tree structures, which allows
> indexing structured data without losing full-text search capabilities.
> Refactoring the old ContentHub component to use Siren is out of scope for
> this project but, in my opinion, an interesting use case could be to
> develop a Siren Camel component and a transformer from ContentItem to a
> Siren object (or similar) and integrate both in Stanbol.
>
> What do you guys think?
>
> Cheers,
> Rafa
>
>
>
>
>> Regards
>>
>>
>> On Mon, Jun 23, 2014 at 6:16 PM, Antonio David Perez Morales <
>> aperez@zaizi.com> wrote:
>>
>>  Hi Stanbolers
>>>
>>> The GSoC 2014 midterm is here and I want to give you a summary of the
>>> work
>>> already done so far:
>>>
>>> - Adapted previous Camel integration PoC done by Florent into Stanbol 1.0
>>> version.
>>> - Improved EngineComponent used by Camel to execute Enhancement Engines
>>> (configured through Stanbol web console as usual) using the engine:// uri
>>> scheme in routes.
>>> - Created ChainComponent used by Camel to execute Enhancement Chains
>>> using
>>> the chain:// uri scheme in routes (both Camel components are provided as
>>> OSGI components, so the uri scheme can be changed through the Stanbol web
>>> console)
>>> - Created a custom artifact for Apache Felix Fileinstall in order to be
>>> able to install routes defined in Camel Spring XML DSL placing a route
>>> file
>>> (with 'route' extension) in the stanbol/fileinstall directory
>>> - Created a custom archetype to ease the development of bundles
>>> containing
>>> route definitions in Java DSL. The archetype generates a class extending
>>> 'RouteBuilder' which creates a default Camel direct endpoint used by
>>> other
>>> Stanbol Workflow components to execute the route.
>>> - Created a first version of Workflow API, which contains different OSGI
>>> components which allow registering Camel components/routes,
>>> start/stop/execute routes, add/remove components used in routes, etc.
>>> - REST endpoint is provided to test the execution of routes using REST
>>> requests (/flow/{routeId} )
>>> - Modified the PoC full launcher to use all the new bundles to support
>>> the
>>> workflow feature.
>>> - Installed JBoss developer studio which comes with Camel support in
>>> order
>>> to create routes in a visual way with the possibility to be exported as
>>> Spring XML DSL format
>>>
>>> Some pending things I will try to do during this week:
>>> - Improve the web package to create the needed endpoints to query the
>>> registered routes, registered camel components, etc
>>> - Improve the web package to remove classes copied from Stanbol jersey
>>> module used for testing
>>> - Update README.md files in the repository with all the new information
>>> - Document the installation and configuration of JBoss developer studio
>>> for Camel routes creation
>>> - Create all the JIRA issues related to the work already done
>>>
>>>
>>> For the second part of the project, I would like to read some comments
>>> about interesting use cases in order to develop the needed Stanbol and
>>> Camel components to support them.
>>>
>>> If you have any comment, please drop some lines in order to discuss the
>>> new things to be done.
>>>
>>> Regards
>>>
>>>
>>>
>>> On Sat, Jun 14, 2014 at 3:39 PM, Antonio David Perez Morales <
>>> aperez@zaizi.com> wrote:
>>>
>>>  Hi guys
>>>>
>>>> Continuing with the project, and as part of the refactoring/new
>>>> architecture, I have started to modify some workflow components in
>>>> order to create a better API and architecture based on OSGI components.
>>>> As a first step, and in order to keep the same behavior as the current
>>>> one (regarding the enhancement process), a chain component has been
>>>> created to simulate the chain behaviour. This new component internally
>>>> uses the ChainManager and EnhancementJobManager components to perform
>>>> the business logic. This way, a new protocol 'chain' can be used in the
>>>> routes deployed in Stanbol. The chains are configured in the same way,
>>>> using the Stanbol admin console.
>>>>
>>>> Now, we can combine single engine executions with chains executions in
>>>> routes deployed in Stanbol using the alternatives described in previous
>>>> mails and in the issue [1]. Both engines and chains are configured
>>>> through
>>>> Stanbol admin console. You can see the refactoring advances in [2] (a
>>>> branch used for refactoring the current PoC of Workflow in Stanbol
>>>> 1.0). Of
>>>> course, the Camel EIP and other Camel components can be used in the
>>>> deployed routes as well.
>>>>
>>>> With the new Camel routes support, we can have a Stanbol instance
>>>> running and enhancing content without receiving any HTTP request to
>>>> start the enhancement process, because the routes can be triggered by
>>>> external events occurring in a queue, database, etc. Moreover, the
>>>> semantic lifting process can be split and merged with some application
>>>> steps, so the issue [3] requesting asynchronous call support for
>>>> enhancement could be solved.
>>>>
>>>> Anyway, if some of you have any suggestions for new components to be
>>>> deployed for the second part of the project, or another kind of
>>>> suggestion,
>>>> please drop here some lines to continue with the discussion.
>>>>
>>>> Regards
>>>>
>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>> [2]
>>>> https://github.com/adperezmorales/stanbol-camel-
>>>> workflow/tree/refactoring
>>>> [3] https://issues.apache.org/jira/browse/STANBOL-263
>>>>
>>>>
>>>> On Wed, Jun 11, 2014 at 10:01 AM, Antonio David Perez Morales <
>>>> aperez@zaizi.com> wrote:
>>>>
>>>>  Hi people
>>>>>
>>>>> As part of the GSoC project for the midterm and according to the issue
>>>>> [1], a custom Apache Felix Fileinstall artifact has been created in
>>>>> order
>>>>> to deploy Camel routes defined in XML (Spring DSL) placing a file with
>>>>> .route extension in a configured directory (like stanbol/fileinstall
>>>>> directory). Moreover since this artifact depends on Fileinstall
>>>>> bundle, the
>>>>> created launcher has been modified to have that bundle in the OSGI
>>>>> context
>>>>> by default.
>>>>>
>>>>> So, once the current Camel integration POC has been integrated in
>>>>> Stanbol 1.0 and extended to support the deployment of routes defined by
>>>>> Java DSL (through bundles) and XML (route files), the next step will be
>>>>> thinking and redesigning the current architecture trying to avoid the
>>>>> duplicated code and providing a more extendable and easy to use
>>>>> Workflow
>>>>> API, because with the current integration only direct routes can be
>>>>> triggered using REST API which means that the defined routes must be
>>>>> configured properly using a direct endpoint consumer. Anyway, routes
>>>>> starting in some other way like timers are triggered directly in the
>>>>> deployment, so this has to be taken into account for the new API (and
>>>>> REST
>>>>> API).
>>>>>
>>>>> In parallel and for the second part, new Stanbol Camel components will
>>>>> be developed in order to be used in new routes. So if any of you have
>>>>> use
>>>>> cases for this involving Stanbol components, please drop some lines
>>>>> here in
>>>>> order to prioritize the Stanbol Camel components to be developed.
>>>>>
>>>>> Comments and suggestions are more than welcome
>>>>>
>>>>> Regards
>>>>>
>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jun 2, 2014 at 7:00 PM, Antonio David Perez Morales <
>>>>> aperez@zaizi.com> wrote:
>>>>>
>>>>>  Hi stanbolers
>>>>>>
>>>>>> As part of the issue [1] , I have created a maven archetype useful to
>>>>>> generate Camel routes in Java DSL.
>>>>>> The archetype generates a Java project with all the dependencies and
>>>>>> one Java class with a method which has to be filled. In this method,
>>>>>> Camel
>>>>>> Java DSL syntax is used to create the route.
>>>>>> By default and as a first approach, the class will use the route name
>>>>>> given during the project creation to enable a Camel direct endpoint
>>>>>> with
>>>>>> such name.
>>>>>> The code of the first archetype version can be found at [2].
>>>>>>
>>>>>> The next task will be providing a Felix custom artifact to be able to
>>>>>> deploy XML-based routes in Stanbol, placing a custom file in the
>>>>>> Stanbol
>>>>>> datafiles directory.
>>>>>> After that, it will be time to think and redesign the architecture to
>>>>>> integrate Camel workflows inside Stanbol in a better way, more
>>>>>> configurable
>>>>>> and extendable.
>>>>>>
>>>>>> Comments and suggestions are more than welcome
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>
>>>>>>
>>>>>> On Fri, May 30, 2014 at 8:03 PM, Antonio David Perez Morales <
>>>>>> aperez@zaizi.com> wrote:
>>>>>>
>>>>>>  Hi all
>>>>>>>
>>>>>>> After a hard fight this week, I managed to get Florent's proof of
>>>>>>> concept code working in the Stanbol 1.0 branch [1].
>>>>>>> The code is uploaded to my GitHub account [3]. As I said in a previous
>>>>>>> mail, I prefer to work on it separately and, after the project, upload
>>>>>>> the developed code into a Stanbol branch.
>>>>>>>
>>>>>>> The 1.0.0 version has some changes in how the Jersey endpoints are
>>>>>>> registered and also new classes and packages, so it was not a trivial
>>>>>>> task to make the current proof of concept work. Moreover, I don't like
>>>>>>> to simply copy and paste code and make the needed changes. I always
>>>>>>> want to understand how things work and how they are developed in order
>>>>>>> to be able to change/modify them or develop new code around them.
>>>>>>>
>>>>>>> The steps done to achieve it have been the following:
>>>>>>> - Updated pom files to the Stanbol 1.0.0-SNAPSHOT version
>>>>>>> - Updated bundle levels in bundlelist package to fit the Stanbol 1.0
>>>>>>> version levels
>>>>>>> - Adapted cameljobmanager package code to Stanbol 1.0.0-SNAPSHOT
>>>>>>> classes and using Java OSGI annotations instead of SCR annotations in
>>>>>>> Javadoc
>>>>>>> - Updated flow web package to Stanbol 1.0.0-SNAPSHOT classes and
>>>>>>> modified needed resources
>>>>>>> - Added Java OSGI annotations to the route (WeightedChain) instead of
>>>>>>> SCR annotations in javadoc
>>>>>>> - Updated launcher to use the 1.0.0-SNAPSHOT packages and needed
>>>>>>> bundles
>>>>>>>
>>>>>>> So now, the http://localhost:8080/flow endpoint will use the only
>>>>>>> Camel route (defined by WeightedChain) to call all the registered
>>>>>>> Enhancement Engines (ordered by EnhancementEngine order property).
>>>>>>> For testing purposes, the /flow/{flowName} has been removed, because
>>>>>>> all this code needs to be re-designed and re-implemented so I only
>>>>>>> wanted
>>>>>>> to make it work to have a first (simple) integration in Stanbol 1.0.
>>>>>>> This
>>>>>>> functionality will be added again to trigger custom routes once the
>>>>>>> next
>>>>>>> step (defined below) is developed.
>>>>>>>
>>>>>>> The next step [2] will be support to write and configure routes in
>>>>>>> XML
>>>>>>> format, putting the file in datafiles in order to be loaded by a
>>>>>>> Felix
>>>>>>> custom artifact (as Rupert pointed out in a previous mail) and
>>>>>>> create a
>>>>>>> Maven archetype to create bundles defining routes which will be
>>>>>>> loaded
>>>>>>> using the Felix bundle tab. If necessary, as we talked in previous
>>>>>>> messages, a REST endpoint receiving routes in XML can be developed
>>>>>>> as an
>>>>>>> alternative to the first approach. This is my objective for the
>>>>>>> midterm.
>>>>>>>
>>>>>>> After the midterm, the new Stanbol components for Apache Camel will
>>>>>>> be
>>>>>>> developed and also the new architecture for Camel in Stanbol.
>>>>>>>
>>>>>>> Comments on this and for use cases for Stanbol Camel components are
>>>>>>> more than welcome.
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>>>> [2] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>>> [3] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>>
>>>>>>>
>>>>>>> On Tue, May 27, 2014 at 6:18 PM, Antonio David Perez Morales <
>>>>>>> aperez@zaizi.com> wrote:
>>>>>>>
>>>>>>>  Hi people
>>>>>>>>
>>>>>>>> I have already started to work on [1] to integrate current Florent's
>>>>>>>> code into Stanbol 1.0.
>>>>>>>> As a first approach, only changing the dependency versions to new
>>>>>>>> Stanbol 1.0, many issues have arisen:
>>>>>>>>   - Deprecated use of classes
>>>>>>>>   - Classes which have changed from package
>>>>>>>>   - Some classes not necessary now
>>>>>>>>   - Classes not used which were causing conflicts
>>>>>>>>   - ...
>>>>>>>>
>>>>>>>> So now I'm trying to resolve all these problems to replicate the
>>>>>>>> same
>>>>>>>> behavior from 0.9 into 1.0. I will upload the code to a Github
>>>>>>>> repository
>>>>>>>> in my account (which will be pushed later into a Stanbol branch
>>>>>>>> after the
>>>>>>>> project) in order to track the advances.
>>>>>>>> Once I can resolve all these problems, I will take a look to the
>>>>>>>> Felix Custom Artifacts poiinted out by Rupert in a previous message
>>>>>>>> to find
>>>>>>>> out the best way to deploy (and manage) route configurations (felix
>>>>>>>> artifacts, watchservice java, rest endpoint to receive xml routes,
>>>>>>>> etc).
>>>>>>>>
>>>>>>>> Comments on this and future tasks are more than welcome.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>   On Tue, May 27, 2014 at 9:53 AM, Rafa Haro <rh...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>  Hi Rupert, Florent and Antonio
>>>>>>>>>
>>>>>>>>> El 27/05/14 08:51, Rupert Westenthaler escribió:
>>>>>>>>>
>>>>>>>>>   As the result of Enhancement Routes is content + metadata I can
>>>>>>>>> not
>>>>>>>>>
>>>>>>>>>> see what you want to "store" in the Entityhub that is about
>>>>>>>>>> managing
>>>>>>>>>> Entities.
>>>>>>>>>>
>>>>>>>>>>   >  - entityhub: To query/update the entityhub component
>>>>>>>>>> Maybe. If you can come up with a good use case ^^
>>>>>>>>>>
>>>>>>>>>>   >  - contenthub: To develop a new content-hub using chain/engine
>>>>>>>>>>
>>>>>>>>>>> components
>>>>>>>>>>>
>>>>>>>>>>>> and solr/elasticsearch/whatever component (solr and
>>>>>>>>>>>> elasticsearch
>>>>>>>>>>>>
>>>>>>>>>>> component
>>>>>>>>>>>
>>>>>>>>>>>> already exist in Camel)
>>>>>>>>>>>>
>>>>>>>>>>> IMO implementing a new Contenthub like component is outside the
>>>>>>>>>> scope
>>>>>>>>>> of this GSoC project. However If there is already
>>>>>>>>>> Solr/Elasticsearch
>>>>>>>>>> component it would be a really useful thing
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>  Regarding this, in my opinion, the use case of an eventual
>>>>>>>>> integration with a Content hub is probably one of the most clear
>>>>>>>>> for this
>>>>>>>>> project. I'm not sure if that is what Antonio was trying to
>>>>>>>>> explain but,
>>>>>>>>> with a single route using as last endpoint Solr or any other
>>>>>>>>> backend
>>>>>>>>> system, we would be almost cloning the same functionality than the
>>>>>>>>> previous
>>>>>>>>> ContentHub implementation (Stanbol 0.12). Entities could be
>>>>>>>>> dereferenced
>>>>>>>>> using the EntityHub before storing the content along with the
>>>>>>>>> metadata,
>>>>>>>>> which is the point of integration of the EntityHub in such use
>>>>>>>>> case. And
>>>>>>>>> even most interesting, now with the integration of Marmotta
>>>>>>>>> contributed by
>>>>>>>>> Rupert, it would be possible to use a whole graph for
>>>>>>>>> dereferencing, so
>>>>>>>>> "simply" routing components like Enhancer->Marmotta->Solr sounds
>>>>>>>>> to me like
>>>>>>>>> an interesting use case.
>>>>>>>>>
>>>>>>>>> wdyt?
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Rafa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>


Re: Camel integration (was : Re: Community bonding period started)

Posted by Rafa Haro <rh...@apache.org>.
Hi guys,

On 01/07/14 10:20, Antonio David Perez Morales wrote:
> Hi all
>
> Continuing with the project, I have successfully integrated the
> activemq Camel component (and also jms) into the Stanbol Camel integration.
> This was a hard task because of the dependencies the component needs
> and because we had to provide an activemq component that is
> configurable through the Felix web console.
>
> With this addition, we are in a position to integrate business logic into
> Stanbol routes through a messaging service provided by activemq (jms).
Nice, Antonio. Let's see if someone has an interesting use case to 
implement in this context.
>
> As a first test, I have deployed a route which consumes messages (content)
> from an activemq queue, enhances them using the default chain and then writes
> the result to a file. It's a simple test but it works quite well. In this
> case, Stanbol is working in a standalone mode, that is to say, we don't
> have to explicitly call Stanbol to enhance content; instead, Stanbol is
> triggered by external events (a new queue message).
>
> As indicated in the previous mail, I still have some pending things to
> do (I couldn't get to them last week). To move the project forward, though,
> I'd like to ask you for interesting use cases where the new
> workflow component could be applied, both to add value to it and to
> guide the development of more workflow (Camel) components useful for those
> and other use cases.
While we wait for community feedback and for Florent's opinion on the 
rest of the project: as I have said in recent emails, I'm eager to see 
the Content Hub back in Stanbol. From the point of view of using Stanbol 
in the enterprise, Semantic Search is one of the most common use cases, 
so having an enterprise search backend as the last component of a 
processing route, in any architecture where Stanbol can be plugged in, 
sounds key to me. In recent discussions on the Stanbol IRC channel we 
have been analysing Siren (https://github.com/rdelbru/SIREn), a 
Lucene/Solr extension whose major advantage is that it can index tree 
structures, which makes it possible to index structured data without 
losing full-text search capabilities. Refactoring the old ContentHub 
component to use Siren is out of scope for this project but, in my 
opinion, an interesting use case would be to develop a Siren Camel 
component and a transformer from ContentItem to a Siren object (or 
whatever fits) and integrate both in Stanbol.

What do you guys think?

Cheers,
Rafa


>
> Regards
>
>
> On Mon, Jun 23, 2014 at 6:16 PM, Antonio David Perez Morales <
> aperez@zaizi.com> wrote:
>
>> Hi Stanbolers
>>
>> The GSoC 2014 midterm is here and I want to give you a summary of the work
>> already done so far:
>>
>> - Adapted previous Camel integration PoC done by Florent into Stanbol 1.0
>> version.
>> - Improved EngineComponent used by Camel to execute Enhancement Engines
>> (configured through Stanbol web console as usual) using the engine:// uri
>> scheme in routes.
>> - Created ChainComponent used by Camel to execute Enhancement Chains using
>> the chain:// uri scheme in routes (both Camel components are provided as
>> OSGI components, so the uri scheme can be changed through the Stanbol web
>> console)
>> - Created a custom artifact for Apache Felix Fileinstall in order to be
>> able to install routes defined in Camel Spring XML DSL placing a route file
>> (with 'route' extension) in the stanbol/fileinstall directory
>> - Created a custom archetype to ease the development of bundles containing
>> route definitions in Java DSL. The archetype generates a class extending
>> 'RouteBuilder' which creates a default Camel direct endpoint used by other
>> Stanbol Workflow components to execute the route.
>> - Created a first version of Workflow API, which contains different OSGI
>> components which allow registering Camel components/routes,
>> start/stop/execute routes, add/remove components used in routes, etc.
>> - REST endpoint is provided to test the execution of routes using REST
>> requests (/flow/{routeId} )
>> - Modified the PoC full launcher to use all the new bundles to support the
>> workflow feature.
>> - Installed JBoss developer studio which comes with Camel support in order
>> to create routes in a visual way with the possibility to be exported as
>> Spring XML DSL format
>>
>> Some pending things I will try to do during this week:
>> - Improve the web package to create the needed endpoints to query the
>> registered routes, registered camel components, etc
>> - Improve the web package to remove classes copied from Stanbol jersey
>> module used for testing
>> - Update README.md files in the repository with all the new information
>> - Document the installation and configuration of JBoss developer studio
>> for Camel routes creation
>> - Create all the JIRA issues related to the work already done
>>
>>
>> For the second part of the project, I would like to read some comments
>> about interesting use cases in order to develop the needed Stanbol and
>> Camel components to support them.
>>
>> If you have any comment, please drop some lines in order to discuss the
>> new things to be done.
>>
>> Regards
>>
>>
>>
>> On Sat, Jun 14, 2014 at 3:39 PM, Antonio David Perez Morales <
>> aperez@zaizi.com> wrote:
>>
>>> Hi guys
>>>
>>> Continuing with the project, and as part of the refactoring/new
>>> architecture, I have started to modify some workflow components in order to
>>> create a better API and architecture based on OSGI components. As a first
>>> step, and in order to keep the same behavior as the current one (regarding
>>> the enhancement process), a chain component has been created to simulate the
>>> chain behaviour. This new component internally uses the ChainManager and
>>> EnhancementJobManager components to perform the business logic. This way, a
>>> new protocol 'chain' can be used in the routes deployed in Stanbol. The
>>> chains are configured in the same way, using the Stanbol admin console.
>>>
>>> Now, we can combine single engine executions with chains executions in
>>> routes deployed in Stanbol using the alternatives described in previous
>>> mails and in the issue [1]. Both engines and chains are configured through
>>> Stanbol admin console. You can see the refactoring advances in [2] (a
>>> branch used for refactoring the current PoC of Workflow in Stanbol 1.0). Of
>>> course, the Camel EIP and other Camel components can be used in the
>>> deployed routes as well.
>>>
>>> With the new Camel routes support, we can have Stanbol running and
>>> enhancing content without receiving any HTTP request to start the
>>> enhancement process, because the routes can be triggered by external events
>>> occurring in a queue, database, etc. Moreover, the semantic lifting process
>>> can be split and merged with some application steps, so the issue [3]
>>> requesting asynchronous call support for enhancement could be solved.
>>>
>>> Anyway, if some of you have any suggestions for new components to be
>>> deployed for the second part of the project, or another kind of suggestion,
>>> please drop here some lines to continue with the discussion.
>>>
>>> Regards
>>>
>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>> [2]
>>> https://github.com/adperezmorales/stanbol-camel-workflow/tree/refactoring
>>> [3] https://issues.apache.org/jira/browse/STANBOL-263
>>>
>>>
>>> On Wed, Jun 11, 2014 at 10:01 AM, Antonio David Perez Morales <
>>> aperez@zaizi.com> wrote:
>>>
>>>> Hi people
>>>>
>>>> As part of the GSoC project for the midterm and according to the issue
>>>> [1], a custom Apache Felix Fileinstall artifact has been created in order
>>>> to deploy Camel routes defined in XML (Spring DSL) placing a file with
>>>> .route extension in a configured directory (like stanbol/fileinstall
>>>> directory). Moreover since this artifact depends on Fileinstall bundle, the
>>>> created launcher has been modified to have that bundle in the OSGI context
>>>> by default.
>>>>
>>>> So, once the current Camel integration POC has been integrated in
>>>> Stanbol 1.0 and extended to support the deployment of routes defined by
>>>> Java DSL (through bundles) and XML (route files), the next step will be
>>>> thinking and redesigning the current architecture trying to avoid the
>>>> duplicated code and providing a more extendable and easy to use Workflow
>>>> API, because with the current integration only direct routes can be
>>>> triggered using REST API which means that the defined routes must be
>>>> configured properly using a direct endpoint consumer. Anyway, routes
>>>> starting in some other way like timers are triggered directly in the
>>>> deployment, so this has to be taken into account for the new API (and REST
>>>> API).
>>>>
>>>> In parallel and for the second part, new Stanbol Camel components will
>>>> be developed in order to be used in new routes. So if any of you have use
>>>> cases for this involving Stanbol components, please drop some lines here in
>>>> order to prioritize the Stanbol Camel components to be developed.
>>>>
>>>> Comments and suggestions are more than welcome
>>>>
>>>> Regards
>>>>
>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>
>>>>
>>>>
>>>> On Mon, Jun 2, 2014 at 7:00 PM, Antonio David Perez Morales <
>>>> aperez@zaizi.com> wrote:
>>>>
>>>>> Hi stanbolers
>>>>>
>>>>> As part of the issue [1] , I have created a maven archetype useful to
>>>>> generate Camel routes in Java DSL.
>>>>> The archetype generates a Java project with all the dependencies and
>>>>> one Java class with a method which has to be filled. In this method, Camel
>>>>> Java DSL syntax is used to create the route.
>>>>> By default and as a first approach, the class will use the route name
>>>>> given during the project creation to enable a Camel direct endpoint with
>>>>> such name.
>>>>> The code of the first archetype version can be found at [2].
>>>>>
>>>>> The next task will be providing a Felix custom artifact to be able to
>>>>> deploy XML-based routes in Stanbol, placing a custom file in the Stanbol
>>>>> datafiles directory.
>>>>> After that, it will be time to think and redesign the architecture to
>>>>> integrate Camel workflows inside Stanbol in a better way, more configurable
>>>>> and extendable.
>>>>>
>>>>> Comments and suggestions are more than welcome
>>>>>
>>>>> Regards
>>>>>
>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>
>>>>>
>>>>> On Fri, May 30, 2014 at 8:03 PM, Antonio David Perez Morales <
>>>>> aperez@zaizi.com> wrote:
>>>>>
>>>>>> Hi all
>>>>>>
>>>>>> After a hard fight this week, I managed to get Florent's
>>>>>> proof of concept code working in the Stanbol 1.0 branch [1].
>>>>>> The code is uploaded to my github account [3]. As I said in a previous
>>>>>> mail, I prefer to work separately and upload the
>>>>>> developed code into a Stanbol branch after the project.
>>>>>>
>>>>>> The 1.0.0 version has some changes in how the Jersey endpoints are
>>>>>> registered and also new classes and packages, so it was not a trivial task
>>>>>> to get the current proof of concept working. Moreover, I don't like to simply
>>>>>> copy and paste code and make the needed changes. I always want to
>>>>>> understand how things work and how they are developed in order to be
>>>>>> able to change/modify them or develop new code around them.
>>>>>>
>>>>>> The steps done to achieve it have been the following:
>>>>>> - Updated pom files to the Stanbol 1.0.0-SNAPSHOT version
>>>>>> - Updated bundle levels in bundlelist package to fit the Stanbol 1.0
>>>>>> version levels
>>>>>> - Adapted cameljobmanager package code to Stanbol 1.0.0-SNAPSHOT
>>>>>> classes and using Java OSGI annotations instead of SCR annotations in
>>>>>> Javadoc
>>>>>> - Updated flow web package to Stanbol 1.0.0-SNAPSHOT classes and
>>>>>> modified needed resources
>>>>>> - Added Java OSGI annotations to the route (WeightedChain) instead of
>>>>>> SCR annotations in javadoc
>>>>>> - Updated launcher to use the 1.0.0-SNAPSHOT packages and needed
>>>>>> bundles
>>>>>>
>>>>>> So now, the http://localhost:8080/flow endpoint will use the only
>>>>>> Camel route (defined by WeightedChain) to call all the registered
>>>>>> Enhancement Engines (ordered by EnhancementEngine order property).
>>>>>> For testing purposes, the /flow/{flowName} has been removed, because
>>>>>> all this code needs to be re-designed and re-implemented so I only wanted
>>>>>> to make it work to have a first (simple) integration in Stanbol 1.0. This
>>>>>> functionality will be added again to trigger custom routes once the next
>>>>>> step (defined below) is developed.
>>>>>>
>>>>>> The next step [2] will be support to write and configure routes in XML
>>>>>> format, putting the file in datafiles in order to be loaded by a Felix
>>>>>> custom artifact (as Rupert pointed out in a previous mail) and create a
>>>>>> Maven archetype to create bundles defining routes which will be loaded
>>>>>> using the Felix bundle tab. If necessary, as we talked in previous
>>>>>> messages, a REST endpoint receiving routes in XML can be developed as an
>>>>>> alternative to the first approach. This is my objective for the midterm.
>>>>>>
>>>>>> After the midterm, the new Stanbol components for Apache Camel will be
>>>>>> developed and also the new architecture for Camel in Stanbol.
>>>>>>
>>>>>> Comments on this and for use cases for Stanbol Camel components are
>>>>>> more than welcome.
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>>> [2] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>>> [3] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>>
>>>>>>
>>>>>> On Tue, May 27, 2014 at 6:18 PM, Antonio David Perez Morales <
>>>>>> aperez@zaizi.com> wrote:
>>>>>>
>>>>>>> Hi people
>>>>>>>
>>>>>>> I have already started to work on [1] to integrate current Florent's
>>>>>>> code into Stanbol 1.0.
>>>>>>> As a first approach, only changing the dependency versions to new
>>>>>>> Stanbol 1.0, many issues have arisen:
>>>>>>>   - Deprecated use of classes
>>>>>>>   - Classes which have moved to a different package
>>>>>>>   - Some classes not necessary now
>>>>>>>   - Classes not used which were causing conflicts
>>>>>>>   - ...
>>>>>>>
>>>>>>> So now I'm trying to resolve all these problems to replicate the same
>>>>>>> behavior from 0.9 into 1.0. I will upload the code to a Github repository
>>>>>>> in my account (which will be pushed later into a Stanbol branch after the
>>>>>>> project) in order to track the advances.
>>>>>>> Once I can resolve all these problems, I will take a look at the
>>>>>>> Felix Custom Artifacts pointed out by Rupert in a previous message to find
>>>>>>> out the best way to deploy (and manage) route configurations (Felix
>>>>>>> artifacts, Java WatchService, REST endpoint to receive XML routes, etc).
>>>>>>>
>>>>>>> Comments on this and future tasks are more than welcome.
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>   On Tue, May 27, 2014 at 9:53 AM, Rafa Haro <rh...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi Rupert, Florent and Antonio
>>>>>>>>
>>>>>>>> On 27/05/14 08:51, Rupert Westenthaler wrote:
>>>>>>>>
>>>>>>>>> As the result of Enhancement Routes is content + metadata I can not
>>>>>>>>> see what you want to "store" in the Entityhub that is about managing
>>>>>>>>> Entities.
>>>>>>>>>
>>>>>>>>>> - entityhub: To query/update the entityhub component
>>>>>>>>> Maybe. If you can come up with a good use case ^^
>>>>>>>>>
>>>>>>>>>> - contenthub: To develop a new content-hub using chain/engine
>>>>>>>>>> components and solr/elasticsearch/whatever component (solr and
>>>>>>>>>> elasticsearch component already exist in Camel)
>>>>>>>>> IMO implementing a new Contenthub like component is outside the scope
>>>>>>>>> of this GSoC project. However If there is already Solr/Elasticsearch
>>>>>>>>> component it would be a really useful thing
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>> Regarding this, in my opinion, the use case of an eventual
>>>>>>>> integration with a Content Hub is probably one of the clearest for this
>>>>>>>> project. I'm not sure if that is what Antonio was trying to explain but,
>>>>>>>> with a single route using Solr or any other backend system as its last
>>>>>>>> endpoint, we would be almost cloning the functionality of the previous
>>>>>>>> ContentHub implementation (Stanbol 0.12). Entities could be dereferenced
>>>>>>>> using the EntityHub before storing the content along with the metadata,
>>>>>>>> which is the point of integration of the EntityHub in such a use case. And
>>>>>>>> even more interesting, now with the integration of Marmotta contributed by
>>>>>>>> Rupert, it would be possible to use a whole graph for dereferencing, so
>>>>>>>> "simply" routing components like Enhancer->Marmotta->Solr sounds to me like
>>>>>>>> an interesting use case.
>>>>>>>>
>>>>>>>> wdyt?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Rafa
>>>>>>>>
>>>>>>>


Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi all

Continuing with the project, I have successfully integrated the
activemq Camel component (and also jms) into the Stanbol Camel integration.
This was a hard task because of the dependencies the component needs
and because we had to provide an activemq component that is
configurable through the Felix web console.

With this addition, we are in a position to integrate business logic into
Stanbol routes through a messaging service provided by activemq (jms).

As a first test, I have deployed a route which consumes messages (content)
from an activemq queue, enhances them using the default chain and then writes
the result to a file. It's a simple test but it works quite well. In this
case, Stanbol is working in a standalone mode, that is to say, we don't
have to explicitly call Stanbol to enhance content; instead, Stanbol is
triggered by external events (a new queue message).
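
To make the shape of that test route concrete, here is a minimal plain-Java
sketch of the same flow (queue -> enhance -> file). It deliberately uses no
Camel, ActiveMQ or Stanbol classes: a BlockingQueue stands in for the activemq
queue and a caller-supplied function stands in for the default chain. The
class name, the STOP sentinel and the result-N.txt naming are all assumptions
made for illustration and are not taken from the project code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

// Hypothetical stand-in for the route "queue -> default chain -> file".
// Nothing here uses Camel, ActiveMQ or Stanbol classes.
final class QueueTriggeredEnhancer {

    // Sentinel message that ends the consumer loop (illustrative only).
    static final String STOP = "__STOP__";

    // Blocks on the queue, enhances each message and writes it to
    // outDir/result-N.txt. Returns the files written, in order.
    static List<Path> drain(BlockingQueue<String> queue,
                            Function<String, String> enhance,
                            Path outDir) {
        List<Path> written = new ArrayList<>();
        try {
            Files.createDirectories(outDir);
            while (true) {
                String content = queue.take(); // waits for the next message
                if (STOP.equals(content)) {
                    return written;
                }
                Path target = outDir.resolve("result-" + written.size() + ".txt");
                Files.write(target, enhance.apply(content).getBytes(StandardCharsets.UTF_8));
                written.add(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return written;
        }
    }
}
```

The point of the sketch is only that nothing calls the enhancer explicitly:
work starts when a message shows up on the queue, which is the "standalone
mode" described above.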

As indicated in the previous mail, I still have some pending things to
do (I couldn't get to them last week). To move the project forward, though,
I'd like to ask you for interesting use cases where the new
workflow component could be applied, both to add value to it and to
guide the development of more workflow (Camel) components useful for those
and other use cases.

Regards


On Mon, Jun 23, 2014 at 6:16 PM, Antonio David Perez Morales <
aperez@zaizi.com> wrote:

> Hi Stanbolers
>
> The GSoC 2014 midterm is here and I want to give you a summary of the work
> already done so far:
>
> - Adapted previous Camel integration PoC done by Florent into Stanbol 1.0
> version.
> - Improved EngineComponent used by Camel to execute Enhancement Engines
> (configured through Stanbol web console as usual) using the engine:// uri
> scheme in routes.
> - Created ChainComponent used by Camel to execute Enhancement Chains using
> the chain:// uri scheme in routes (both Camel components are provided as
> OSGI components, so the uri scheme can be changed through the Stanbol web
> console)
> - Created a custom artifact for Apache Felix Fileinstall in order to be
> able to install routes defined in Camel Spring XML DSL placing a route file
> (with 'route' extension) in the stanbol/fileinstall directory
> - Created a custom archetype to ease the development of bundles containing
> route definitions in Java DSL. The archetype generates a class extending
> 'RouteBuilder' which creates a default Camel direct endpoint used by other
> Stanbol Workflow components to execute the route.
> - Created a first version of Workflow API, which contains different OSGI
> components which allow registering Camel components/routes,
> start/stop/execute routes, add/remove components used in routes, etc.
> - REST endpoint is provided to test the execution of routes using REST
> requests (/flow/{routeId} )
> - Modified the PoC full launcher to use all the new bundles to support the
> workflow feature.
> - Installed JBoss developer studio which comes with Camel support in order
> to create routes in a visual way with the possibility to be exported as
> Spring XML DSL format
>
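
As a rough illustration of how the engine:// and chain:// schemes listed
above can be told apart, the following plain-Java sketch classifies an
endpoint URI with java.net.URI. The example URIs ("engine://langdetect",
"chain://default"), the class name and the hard-coded scheme names are
assumptions for illustration only: the thread does not spell out the exact
URI layout, and the summary notes that the schemes are configurable through
the web console.

```java
import java.net.URI;

// Hypothetical helper: classifies a route endpoint URI by its scheme.
final class StanbolEndpointUri {

    enum Kind { ENGINE, CHAIN }

    final Kind kind;
    final String name; // engine or chain name, read from the URI authority

    private StanbolEndpointUri(Kind kind, String name) {
        this.kind = kind;
        this.name = name;
    }

    static StanbolEndpointUri parse(String endpoint) {
        URI uri = URI.create(endpoint); // IllegalArgumentException if malformed
        String scheme = uri.getScheme();
        String name = uri.getAuthority();
        if (scheme == null || name == null || name.isEmpty()) {
            throw new IllegalArgumentException("Expected <scheme>://<name>: " + endpoint);
        }
        switch (scheme) {
            case "engine":
                return new StanbolEndpointUri(Kind.ENGINE, name);
            case "chain":
                return new StanbolEndpointUri(Kind.CHAIN, name);
            default:
                throw new IllegalArgumentException("Unsupported scheme: " + scheme);
        }
    }
}
```

A route step written as "chain://default" would then resolve to Kind.CHAIN
with the name "default", while anything outside the two schemes is rejected.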
> Some pending things I will try to do during this week:
> - Improve the web package to create the needed endpoints to query the
> registered routes, registered camel components, etc
> - Improve the web package to remove classes copied from Stanbol jersey
> module used for testing
> - Update README.md files in the repository with all the new information
> - Document the installation and configuration of JBoss developer studio
> for Camel routes creation
> - Create all the JIRA issues related to the work already done
>
>
> For the second part of the project, I would like to read some comments
> about interesting use cases in order to develop the needed Stanbol and
> Camel components to support them.
>
> If you have any comment, please drop some lines in order to discuss the
> new things to be done.
>
> Regards
>
>
>
> On Sat, Jun 14, 2014 at 3:39 PM, Antonio David Perez Morales <
> aperez@zaizi.com> wrote:
>
>> Hi guys
>>
>> Continuing with the project, and as part of the refactoring/new
>> architecture, I have started to modify some workflow components in order to
>> create a better API and architecture based on OSGI components. As a first
>> step, and in order to keep the same behavior as the current one (regarding
>> the enhancement process), a chain component has been created to simulate the
>> chain behaviour. This new component internally uses the ChainManager and
>> EnhancementJobManager components to perform the business logic. This way, a
>> new protocol 'chain' can be used in the routes deployed in Stanbol. The
>> chains are configured in the same way, using the Stanbol admin console.
>>
>> Now, we can combine single engine executions with chains executions in
>> routes deployed in Stanbol using the alternatives described in previous
>> mails and in the issue [1]. Both engines and chains are configured through
>> Stanbol admin console. You can see the refactoring advances in [2] (a
>> branch used for refactoring the current PoC of Workflow in Stanbol 1.0). Of
>> course, the Camel EIP and other Camel components can be used in the
>> deployed routes as well.
>>
>> With the new Camel routes support, we can have Stanbol running and
>> enhancing content without receiving any HTTP request to start the
>> enhancement process, because the routes can be triggered by external events
>> occurring in a queue, database, etc. Moreover, the semantic lifting process
>> can be split and merged with some application steps, so the issue [3]
>> requesting asynchronous call support for enhancement could be solved.
>>
>> Anyway, if some of you have any suggestions for new components to be
>> deployed for the second part of the project, or another kind of suggestion,
>> please drop here some lines to continue with the discussion.
>>
>> Regards
>>
>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>> [2]
>> https://github.com/adperezmorales/stanbol-camel-workflow/tree/refactoring
>> [3] https://issues.apache.org/jira/browse/STANBOL-263
>>
>>
>> On Wed, Jun 11, 2014 at 10:01 AM, Antonio David Perez Morales <
>> aperez@zaizi.com> wrote:
>>
>>> Hi people
>>>
>>> As part of the GSoC project for the midterm and according to the issue
>>> [1], a custom Apache Felix Fileinstall artifact has been created in order
>>> to deploy Camel routes defined in XML (Spring DSL) placing a file with
>>> .route extension in a configured directory (like stanbol/fileinstall
>>> directory). Moreover since this artifact depends on Fileinstall bundle, the
>>> created launcher has been modified to have that bundle in the OSGI context
>>> by default.
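
The file-matching step of such an artifact could look roughly like the sketch
below. Felix Fileinstall asks its registered listeners whether they can handle
a given file through a canHandle(File) check; this plain-Java class only
mirrors that one decision for the .route extension and does not implement the
Fileinstall interfaces, so it runs without OSGi. The class name and the
baseName helper are hypothetical and not taken from the project code.

```java
import java.io.File;

// Hypothetical matcher for route definition files dropped into a watched
// directory such as stanbol/fileinstall.
final class RouteFileMatcher {

    static final String EXTENSION = ".route";

    // True for names like "enhance.route"; a bare ".route" is rejected
    // because it would leave no name for the route file.
    static boolean canHandle(File artifact) {
        String fileName = artifact.getName();
        return fileName.endsWith(EXTENSION) && fileName.length() > EXTENSION.length();
    }

    // File name without the extension, e.g. "enhance" for "enhance.route".
    static String baseName(File artifact) {
        if (!canHandle(artifact)) {
            throw new IllegalArgumentException("Not a route file: " + artifact);
        }
        String fileName = artifact.getName();
        return fileName.substring(0, fileName.length() - EXTENSION.length());
    }
}
```

Only the file name is inspected here; parsing the XML and registering the
routes would be the job of the real installer.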
>>>
>>> So, once the current Camel integration POC has been integrated in
>>> Stanbol 1.0 and extended to support the deployment of routes defined by
>>> Java DSL (through bundles) and XML (route files), the next step will be
>>> thinking and redesigning the current architecture trying to avoid the
>>> duplicated code and providing a more extendable and easy to use Workflow
>>> API, because with the current integration only direct routes can be
>>> triggered using REST API which means that the defined routes must be
>>> configured properly using a direct endpoint consumer. Anyway, routes
>>> starting in some other way like timers are triggered directly in the
>>> deployment, so this has to be taken into account for the new API (and REST
>>> API).
>>>
>>> In parallel and for the second part, new Stanbol Camel components will
>>> be developed in order to be used in new routes. So if any of you have use
>>> cases for this involving Stanbol components, please drop some lines here in
>>> order to prioritize the Stanbol Camel components to be developed.
>>>
>>> Comments and suggestions are more than welcome
>>>
>>> Regards
>>>
>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>
>>>
>>>
>>> On Mon, Jun 2, 2014 at 7:00 PM, Antonio David Perez Morales <
>>> aperez@zaizi.com> wrote:
>>>
>>>> Hi stanbolers
>>>>
>>>> As part of the issue [1] , I have created a maven archetype useful to
>>>> generate Camel routes in Java DSL.
>>>> The archetype generates a Java project with all the dependencies and
>>>> one Java class with a method which has to be filled. In this method, Camel
>>>> Java DSL syntax is used to create the route.
>>>> By default and as a first approach, the class will use the route name
>>>> given during the project creation to enable a Camel direct endpoint with
>>>> such name.
>>>> The code of the first archetype version can be found at [2].
>>>>
>>>> The next task will be providing a Felix custom artifact to be able to
>>>> deploy XML-based routes in Stanbol, placing a custom file in the Stanbol
>>>> datafiles directory.
>>>> After that, it will be time to think and redesign the architecture to
>>>> integrate Camel workflows inside Stanbol in a better way, more configurable
>>>> and extendable.
>>>>
>>>> Comments and suggestions are more than welcome
>>>>
>>>> Regards
>>>>
>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>>>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>
>>>>
>>>> On Fri, May 30, 2014 at 8:03 PM, Antonio David Perez Morales <
>>>> aperez@zaizi.com> wrote:
>>>>
>>>>> Hi all
>>>>>
>>>>> After a hard fight this week, I managed to get it work the Florent's
>>>>> proof of concept code in the Stanbol 1.0 branch [1]
>>>>> The code is uploaded in my github account [3]. As I said in a previous
>>>>> mail, I prefer to do it separately and after the project, uploading the
>>>>> developed code into a Stanbol branch.
>>>>>
>>>>> The 1.0.0 version has some changes in how the Jersey endpoints are
>>>>> registered and also new classes and packages, so it was not a trivial task
>>>>> to make work the current proof of concept. Moreover I don't like to simply
>>>>> copy and paste code and make the needed changes. I always want to
>>>>> understand how the things work and how they are developed in order to be
>>>>> able to change/modify them or develop new code around them.
>>>>>
>>>>> The steps done to achieve it have been the following:
>>>>> - Updated pom files to the Stanbol 1.0.0-SNAPSHOT version
>>>>> - Updated bundle levels in bundlelist package to fit the Stanbol 1.0
>>>>> version levels
>>>>> - Adapted cameljobmanager package code to Stanbol 1.0.0-SNAPSHOT
>>>>> classes and using Java OSGI annotations instead of SCR annotations in
>>>>> Javadoc
>>>>> - Updated flow web package to Stanbol 1.0.0-SNAPSHOT classes and
>>>>> modified needed resources
>>>>> - Added Java OSGI annotations to the route (WeightedChain) instead of
>>>>> SCR annotations in javadoc
>>>>> - Updated launcher to use the 1.0.0-SNAPSHOT packages and needed
>>>>> bundles
>>>>>
>>>>> So now, the http://localhost:8080/flow endpoint will use the only
>>>>> Camel route (defined by WeightedChain) to call all the registered
>>>>> Enhancement Engines (ordered by EnhancementEngine order property).
>>>>> For testing purposes, the /flow/{flowName} has been removed, because
>>>>> all this code needs to be re-designed and re-implemented so I only wanted
>>>>> to make it work to have a first (simple) integration in Stanbol 1.0. This
>>>>> functionality will be added again to trigger custom routes once the next
>>>>> step (defined below) is developed.
>>>>>
>>>>> The next step [2] will be support to write and configure routes in XML
>>>>> format, putting the file in datafiles in order to be loaded by a Felix
>>>>> custom artifact (as Rupert pointed out in a previous mail) and create a
>>>>> Maven archetype to create bundles defining routes which will be loaded
>>>>> using the Felix bundle tab. If necessary, as we talked in previous
>>>>> messages, a REST endpoint receiving routes in XML can be developed as an
>>>>> alternative to the first approach. This is my objective for the midterm.
>>>>>
>>>>> After the midterm, the new Stanbol components for Apache Camel will be
>>>>> developed and also the new architecture for Camel in Stanbol.
>>>>>
>>>>> Comments on this and for use cases for Stanbol Camel components are
>>>>> more than welcome.
>>>>>
>>>>> Regards
>>>>>
>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>> [2] https://issues.apache.org/jira/browse/STANBOL-1348
>>>>> [3] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>>>
>>>>>
>>>>> On Tue, May 27, 2014 at 6:18 PM, Antonio David Perez Morales <
>>>>> aperez@zaizi.com> wrote:
>>>>>
>>>>>> Hi people
>>>>>>
>>>>>> I have already started to work on [1] to integrate current Florent's
>>>>>> code into Stanbol 1.0.
>>>>>> As a first approach, only changing the dependency versions to new
>>>>>> Stanbol 1.0, many issues have arisen:
>>>>>>  - Deprecated use of classes
>>>>>>  - Classes which have changed from package
>>>>>>  - Some classes not necessary now
>>>>>>  - Classes not used which were causing conflicts
>>>>>>  - ...
>>>>>>
>>>>>> So now I'm trying to resolve all these problems to replicate the same
>>>>>> behavior from 0.9 into 1.0. I will upload the code to a Github repository
>>>>>> in my account (which will be pushed later into a Stanbol branch after the
>>>>>> project) in order to track the advances.
>>>>>> Once I can resolve all these problems, I will take a look to the
>>>>>> Felix Custom Artifacts poiinted out by Rupert in a previous message to find
>>>>>> out the best way to deploy (and manage) route configurations (felix
>>>>>> artifacts, watchservice java, rest endpoint to receive xml routes, etc).
>>>>>>
>>>>>> Comments on this and future tasks are more than welcome.
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>>>
>>>>>>
>>>>>>
>>>>>>  On Tue, May 27, 2014 at 9:53 AM, Rafa Haro <rh...@apache.org> wrote:
>>>>>>
>>>>>>> Hi Rupert, Florent and Antonio
>>>>>>>
>>>>>>> El 27/05/14 08:51, Rupert Westenthaler escribió:
>>>>>>>
>>>>>>>  As the result of Enhancement Routes is content + metadata I can not
>>>>>>>> see what you want to "store" in the Entityhub that is about managing
>>>>>>>> Entities.
>>>>>>>>
>>>>>>>>  >  - entityhub: To query/update the entityhub component
>>>>>>>>>
>>>>>>>> Maybe. If you can come up with a good use case ^^
>>>>>>>>
>>>>>>>>  >  - contenthub: To develop a new content-hub using chain/engine
>>>>>>>>> components
>>>>>>>>> >and solr/elasticsearch/whatever component (solr and elasticsearch
>>>>>>>>> component
>>>>>>>>> >already exist in Camel)
>>>>>>>>>
>>>>>>>> IMO implementing a new Contenthub like component is outside the
>>>>>>>> scope
>>>>>>>> of this GSoC project. However If there is already Solr/Elasticsearch
>>>>>>>> component it would be a really useful thing
>>>>>>>>
>>>>>>>>
>>>>>>> Regarding this, in my opinion, the use case of an eventual
>>>>>>> integration with a Content hub is probably one of the most clear for this
>>>>>>> project. I'm not sure if that is what Antonio was trying to explain but,
>>>>>>> with a single route using as last endpoint Solr or any other backend
>>>>>>> system, we would be almost cloning the same functionality than the previous
>>>>>>> ContentHub implementation (Stanbol 0.12). Entities could be dereferenced
>>>>>>> using the EntityHub before storing the content along with the metadata,
>>>>>>> which is the point of integration of the EntityHub in such use case. And
>>>>>>> even most interesting, now with the integration of Marmotta contributed by
>>>>>>> Rupert, it would be possible to use a whole graph for dereferencing, so
>>>>>>> "simply" routing components like Enhancer->Marmotta->Solr sounds to me like
>>>>>>> an interesting use case.
>>>>>>>
>>>>>>> wdyt?
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Rafa
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

-- 

------------------------------
This message should be regarded as confidential. If you have received this 
email in error please notify the sender and destroy it immediately. 
Statements of intent shall only become binding when confirmed in hard copy 
by an authorised signatory.

Zaizi Ltd is registered in England and Wales with the registration number 
6440931. The Registered Office is Brook House, 229 Shepherds Bush Road, 
London W6 7AN. 

Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi Stanbolers

The GSoC 2014 midterm is here, and I want to give you a summary of the work
done so far:

- Adapted Florent's previous Camel integration PoC to Stanbol 1.0.
- Improved the EngineComponent that Camel uses to execute Enhancement
Engines (configured through the Stanbol web console as usual) via the
engine:// URI scheme in routes.
- Created a ChainComponent that Camel uses to execute Enhancement Chains via
the chain:// URI scheme in routes (both Camel components are provided as
OSGi components, so the URI scheme can be changed through the Stanbol web
console).
- Created a custom artifact for Apache Felix Fileinstall so that routes
defined in the Camel Spring XML DSL can be installed by placing a route file
(with the 'route' extension) in the stanbol/fileinstall directory.
- Created a custom archetype to ease the development of bundles containing
route definitions in the Java DSL. The archetype generates a class extending
'RouteBuilder' which creates a default Camel direct endpoint that other
Stanbol Workflow components use to execute the route.
- Created a first version of the Workflow API, which contains several OSGi
components for registering Camel components and routes, starting, stopping
and executing routes, adding and removing the components used in routes, etc.
- Provided a REST endpoint to test the execution of routes using REST
requests (/flow/{routeId}).
- Modified the PoC full launcher to include all the new bundles that support
the workflow feature.
- Installed JBoss Developer Studio, which comes with Camel support, in order
to create routes visually and export them in Spring XML DSL format.
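
To make the route-file item above more concrete, here is a minimal sketch
(not code from the repository) of a small helper that writes such a file.
Several things are assumptions made only for illustration: the <routes>
root element in the Camel Spring namespace, the route id "myroute", the
engine name "langdetect", the chain name "default", the idea that the
direct endpoint carries the same name as the route id, and the helper class
RouteFileWriter itself. The custom artifact may expect a different XML
layout.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only: it is not part of Stanbol or Camel.
public class RouteFileWriter {

    /** Builds a route definition in Camel's XML DSL (assumed layout). */
    public static String routeXml(String routeId, String engineName, String chainName) {
        return String.join("\n",
                "<routes xmlns=\"http://camel.apache.org/schema/spring\">",
                "  <route id=\"" + routeId + "\">",
                "    <!-- direct endpoint: assumed to be what /flow/{routeId} triggers -->",
                "    <from uri=\"direct:" + routeId + "\"/>",
                "    <!-- single engine first, then a whole chain (names are assumptions) -->",
                "    <to uri=\"engine://" + engineName + "\"/>",
                "    <to uri=\"chain://" + chainName + "\"/>",
                "  </route>",
                "</routes>",
                "");
    }

    /** Writes the definition as <routeId>.route into the watched directory. */
    public static Path writeRouteFile(Path watchedDir, String routeId, String xml)
            throws IOException {
        Files.createDirectories(watchedDir);
        Path target = watchedDir.resolve(routeId + ".route");
        return Files.write(target, xml.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // stanbol/fileinstall is the directory named in the summary above.
        Path dir = Paths.get("stanbol", "fileinstall");
        String xml = routeXml("myroute", "langdetect", "default");
        System.out.println("Wrote " + writeRouteFile(dir, "myroute", xml));
    }
}
```

With a layout like this, the file would be picked up because of its 'route'
extension, and the route could then be tried out through /flow/myroute,
assuming the route id is what the REST endpoint resolves.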

Some pending things I will try to do this week:
- Improve the web package by adding the endpoints needed to query the
registered routes, registered Camel components, etc.
- Clean up the web package by removing the classes copied from the Stanbol
Jersey module for testing
- Update the README.md files in the repository with all the new information
- Document the installation and configuration of JBoss Developer Studio for
creating Camel routes
- Create all the JIRA issues related to the work already done


For the second part of the project, I would like to hear about interesting
use cases so that I can develop the Stanbol and Camel components needed to
support them.

If you have any comments, please drop a few lines here so we can discuss
what to do next.

Regards





Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi guys

Continuing with the project, and as part of the refactoring towards the new
architecture, I have started to modify some workflow components in order to
create a better API and architecture based on OSGi components. As a first
step, and in order to keep the same behavior as the current one (regarding
the enhancement process), a chain component has been created to simulate the
chain behavior. Internally, this new component uses the ChainManager and
EnhancementJobManager components to perform the business logic. This way, a
new protocol 'chain' can be used in the routes deployed in Stanbol. The
chains are configured in the same way as before, using the Stanbol admin
console.
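
As a small illustration of how such endpoint URIs read, the sketch below
splits a URI into the scheme (which selects the Camel component) and the
name of the chain or engine to run. This is plain Java written for this
mail only: the EndpointUri class is hypothetical, the names "default" and
"langdetect" are assumptions, and Camel does its own URI handling, so this
is not how the components are implemented.

```java
import java.net.URI;

// Hypothetical helper, for illustration only: it is not part of Stanbol or Camel.
public class EndpointUri {

    private final String scheme;
    private final String name;

    private EndpointUri(String scheme, String name) {
        this.scheme = scheme;
        this.name = name;
    }

    /** Parses URIs of the assumed form scheme://name, e.g. chain://default. */
    public static EndpointUri parse(String uri) {
        URI parsed = URI.create(uri);
        String scheme = parsed.getScheme();
        // For scheme://name the part after "//" is the URI authority.
        String name = parsed.getAuthority();
        if (scheme == null || name == null || name.isEmpty()) {
            throw new IllegalArgumentException("Expected <scheme>://<name> but got: " + uri);
        }
        return new EndpointUri(scheme, name);
    }

    public String scheme() {
        return scheme;
    }

    public String name() {
        return name;
    }

    /** True when the endpoint would run a whole Enhancement Chain. */
    public boolean isChain() {
        return "chain".equals(scheme);
    }

    /** True when the endpoint would run a single Enhancement Engine. */
    public boolean isEngine() {
        return "engine".equals(scheme);
    }

    public static void main(String[] args) {
        for (String uri : new String[] {"chain://default", "engine://langdetect"}) {
            EndpointUri endpoint = parse(uri);
            System.out.println(uri + " -> " + endpoint.scheme() + " / " + endpoint.name());
        }
    }
}
```

Since the scheme of both components can be changed through the web console,
the literal "chain" and "engine" strings in this sketch stand for the
default configuration only.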

Now we can combine single-engine executions with chain executions in
routes deployed in Stanbol using the alternatives described in previous
mails and in the issue [1]. Both engines and chains are configured through
the Stanbol admin console. You can follow the refactoring progress in [2] (a
branch used for refactoring the current PoC of Workflow in Stanbol 1.0). Of
course, the Camel EIPs and other Camel components can be used in the
deployed routes as well.

With the new Camel routes support, a Stanbol instance can run and enhance
content without receiving any HTTP request to start the enhancement
process, because the routes can be triggered by external events occurring
in a queue, a database, etc. Moreover, the semantic lifting process can be
split and merged with some application steps, so the issue [3] requesting
asynchronous call support for enhancement could be solved.

Anyway, if any of you have suggestions for new components to be deployed
for the second part of the project, or any other kind of suggestion, please
drop a few lines here to continue the discussion.

Regards

[1] https://issues.apache.org/jira/browse/STANBOL-1348
[2] https://github.com/adperezmorales/stanbol-camel-workflow/tree/refactoring
[3] https://issues.apache.org/jira/browse/STANBOL-263


On Wed, Jun 11, 2014 at 10:01 AM, Antonio David Perez Morales <
aperez@zaizi.com> wrote:

> Hi people
>
> As part of the GSoC project for the midterm, and according to the issue
> [1], a custom Apache Felix Fileinstall artifact has been created to deploy
> Camel routes defined in XML (Spring DSL) by placing a file with the
> .route extension in a configured directory (like the stanbol/fileinstall
> directory). Moreover, since this artifact depends on the Fileinstall
> bundle, the created launcher has been modified to include that bundle in
> the OSGi context by default.
>
> So, now that the current Camel integration PoC has been integrated into
> Stanbol 1.0 and extended to support the deployment of routes defined in
> the Java DSL (through bundles) and in XML (route files), the next step
> will be to rethink and redesign the current architecture, trying to avoid
> duplicated code and to provide a more extensible and easier-to-use
> Workflow API. The reason is that with the current integration only direct
> routes can be triggered using the REST API, which means that the defined
> routes must be configured properly using a direct endpoint consumer.
> Routes that start in some other way, such as timers, are triggered
> directly on deployment, so this has to be taken into account for the new
> API (and REST API).
>
> In parallel and for the second part, new Stanbol Camel components will be
> developed in order to be used in new routes. So if any of you have use
> cases for this involving Stanbol components, please drop some lines here in
> order to prioritize the Stanbol Camel components to be developed.
>
> Comments and suggestions are more than welcome
>
> Regards
>
> [1] https://issues.apache.org/jira/browse/STANBOL-1348
> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>
>
>
> On Mon, Jun 2, 2014 at 7:00 PM, Antonio David Perez Morales <
> aperez@zaizi.com> wrote:
>
>> Hi stanbolers
>>
>> As part of the issue [1], I have created a Maven archetype to generate
>> Camel routes in the Java DSL.
>> The archetype generates a Java project with all the dependencies and one
>> Java class with a method which has to be filled in. In this method, Camel
>> Java DSL syntax is used to create the route.
>> By default, and as a first approach, the class will use the route name
>> given during project creation to enable a Camel direct endpoint with
>> that name.
>> The code of the first archetype version can be found at [2].
>>
>> The next task will be to provide a Felix custom artifact to be able to
>> deploy XML-based routes in Stanbol by placing a custom file in the Stanbol
>> datafiles directory.
>> After that, it will be time to think about and redesign the architecture
>> to integrate Camel workflows inside Stanbol in a better way, more
>> configurable and extensible.
>>
>> Comments and suggestions are more than welcome
>>
>> Regards
>>
>> [1] https://issues.apache.org/jira/browse/STANBOL-1348
>> [2] https://github.com/adperezmorales/stanbol-camel-workflow/
>>
>>
>> On Fri, May 30, 2014 at 8:03 PM, Antonio David Perez Morales <
>> aperez@zaizi.com> wrote:
>>
>>> Hi all
>>>
>>> After a hard fight this week, I managed to get Florent's
>>> proof-of-concept code working in the Stanbol 1.0 branch [1].
>>> The code is uploaded to my GitHub account [3]. As I said in a previous
>>> mail, I prefer to work on it separately and upload the developed code
>>> into a Stanbol branch after the project.
>>>
>>> The 1.0.0 version has some changes in how the Jersey endpoints are
>>> registered and also new classes and packages, so it was not a trivial
>>> task to make the current proof of concept work. Moreover, I don't like to
>>> simply copy and paste code and make the needed changes. I always want to
>>> understand how things work and how they are developed in order to be
>>> able to change/modify them or develop new code around them.
>>>
>>> The steps taken to achieve it were the following:
>>> - Updated pom files to the Stanbol 1.0.0-SNAPSHOT version
>>> - Updated bundle levels in the bundlelist package to fit the Stanbol 1.0
>>> version levels
>>> - Adapted the cameljobmanager package code to Stanbol 1.0.0-SNAPSHOT
>>> classes, using Java OSGi annotations instead of SCR annotations in Javadoc
>>> - Updated the flow web package to Stanbol 1.0.0-SNAPSHOT classes and
>>> modified the needed resources
>>> - Added Java OSGi annotations to the route (WeightedChain) instead of
>>> SCR annotations in Javadoc
>>> - Updated the launcher to use the 1.0.0-SNAPSHOT packages and needed bundles
>>>
>>> So now, the http://localhost:8080/flow endpoint will use the only Camel
>>> route (defined by WeightedChain) to call all the registered Enhancement
>>> Engines (ordered by the EnhancementEngine order property).
>>> For testing purposes, the /flow/{flowName} endpoint has been removed,
>>> because all this code needs to be redesigned and reimplemented, so I only
>>> wanted to make it work to have a first (simple) integration in Stanbol
>>> 1.0. This functionality will be added again to trigger custom routes once
>>> the next step (defined below) is developed.
>>>
>>> The next step [2] will be to support writing and configuring routes in
>>> XML format, putting the file in datafiles so that it is loaded by a Felix
>>> custom artifact (as Rupert pointed out in a previous mail), and to create
>>> a Maven archetype for creating bundles that define routes, which will be
>>> loaded using the Felix bundle tab. If necessary, as we discussed in
>>> previous messages, a REST endpoint receiving routes in XML can be
>>> developed as an alternative to the first approach. This is my objective
>>> for the midterm.
>>>
>>> After the midterm, the new Stanbol components for Apache Camel will be
>>> developed and also the new architecture for Camel in Stanbol.
>>>
>>> Comments on this and for use cases for Stanbol Camel components are more
>>> than welcome.
>>>
>>> Regards
>>>
>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>> [2] https://issues.apache.org/jira/browse/STANBOL-1348
>>> [3] https://github.com/adperezmorales/stanbol-camel-workflow/
>>>
>>>
>>> On Tue, May 27, 2014 at 6:18 PM, Antonio David Perez Morales <
>>> aperez@zaizi.com> wrote:
>>>
>>>> Hi people
>>>>
>>>> I have already started to work on [1] to integrate Florent's current
>>>> code into Stanbol 1.0.
>>>> As a first approach I only changed the dependency versions to the new
>>>> Stanbol 1.0, and many issues have arisen:
>>>>  - Use of deprecated classes
>>>>  - Classes which have moved to a different package
>>>>  - Some classes that are no longer necessary
>>>>  - Unused classes which were causing conflicts
>>>>  - ...
>>>>
>>>> So now I'm trying to resolve all these problems to replicate the 0.9
>>>> behavior in 1.0. I will upload the code to a GitHub repository
>>>> in my account (which will be pushed later into a Stanbol branch after the
>>>> project) in order to track the progress.
>>>> Once I have resolved all these problems, I will take a look at the Felix
>>>> Custom Artifacts pointed out by Rupert in a previous message to find out
>>>> the best way to deploy (and manage) route configurations (Felix artifacts,
>>>> Java WatchService, REST endpoint to receive XML routes, etc.).
>>>>
>>>> Comments on this and future tasks are more than welcome.
>>>>
>>>> Regards
>>>>
>>>> [1] https://issues.apache.org/jira/browse/STANBOL-1347
>>>>
>>>>
>>>>
>>>>  On Tue, May 27, 2014 at 9:53 AM, Rafa Haro <rh...@apache.org> wrote:
>>>>
>>>>> Hi Rupert, Florent and Antonio
>>>>>
>>>>> El 27/05/14 08:51, Rupert Westenthaler escribió:
>>>>>
>>>>>  As the result of Enhancement Routes is content + metadata I can not
>>>>>> see what you want to "store" in the Entityhub that is about managing
>>>>>> Entities.
>>>>>>
>>>>>>  >  - entityhub: To query/update the entityhub component
>>>>>>>
>>>>>> Maybe. If you can come up with a good use case ^^
>>>>>>
>>>>>>  >  - contenthub: To develop a new content-hub using chain/engine
>>>>>>> components
>>>>>>> >and solr/elasticsearch/whatever component (solr and elasticsearch
>>>>>>> component
>>>>>>> >already exist in Camel)
>>>>>>>
>>>>>> IMO implementing a new Contenthub like component is outside the scope
>>>>>> of this GSoC project. However If there is already Solr/Elasticsearch
>>>>>> component it would be a really useful thing
>>>>>>
>>>>>>
>>>>> Regarding this, in my opinion, the use case of an eventual integration
>>>>> with a Content hub is probably one of the most clear for this project. I'm
>>>>> not sure if that is what Antonio was trying to explain but, with a single
>>>>> route using as last endpoint Solr or any other backend system, we would be
>>>>> almost cloning the same functionality than the previous ContentHub
>>>>> implementation (Stanbol 0.12). Entities could be dereferenced using the
>>>>> EntityHub before storing the content along with the metadata, which is the
>>>>> point of integration of the EntityHub in such use case. And even most
>>>>> interesting, now with the integration of Marmotta contributed by Rupert, it
>>>>> would be possible to use a whole graph for dereferencing, so "simply"
>>>>> routing components like Enhancer->Marmotta->Solr sounds to me like an
>>>>> interesting use case.
>>>>>
>>>>> wdyt?
>>>>>
>>>>> Cheers,
>>>>> Rafa
>>>>>
>>>>
>>>>
>>>
>>
>

-- 

------------------------------
This message should be regarded as confidential. If you have received this 
email in error please notify the sender and destroy it immediately. 
Statements of intent shall only become binding when confirmed in hard copy 
by an authorised signatory.

Zaizi Ltd is registered in England and Wales with the registration number 
6440931. The Registered Office is Brook House, 229 Shepherds Bush Road, 
London W6 7AN. 

Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi people

As part of the GSoC midterm work, and following issue [1], I have created
a custom Apache Felix Fileinstall artifact to deploy Camel routes defined
in XML (Spring DSL): placing a file with the .route extension in a
configured directory (such as the stanbol/fileinstall directory) deploys
the routes it contains. Since this artifact depends on the Fileinstall
bundle, the launcher has been modified to include that bundle in the OSGi
context by default.
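
To make the file-based deployment rule concrete, here is a minimal
plain-Java sketch of the file-selection step only. It does not use the
Fileinstall API: the class RouteFileSelector and its method names are
hypothetical, and the only detail taken from this mail is the .route
extension.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: decides which files in a watched directory are route definitions. */
final class RouteFileSelector {

    /** Extension used for XML route files, as described in this mail. */
    static final String ROUTE_EXTENSION = ".route";

    private RouteFileSelector() {}

    /** Returns true when the name ends with ".route" and has a non-empty base name. */
    static boolean canHandle(String fileName) {
        return fileName != null
                && fileName.length() > ROUTE_EXTENSION.length()
                && fileName.endsWith(ROUTE_EXTENSION);
    }

    /** Derives a route identifier from the file name by dropping the extension. */
    static String routeId(String fileName) {
        if (!canHandle(fileName)) {
            throw new IllegalArgumentException("Not a route file: " + fileName);
        }
        return fileName.substring(0, fileName.length() - ROUTE_EXTENSION.length());
    }

    /** Lists the route files currently present in a directory, sorted by name. */
    static List<String> listRouteFiles(File directory) {
        List<String> result = new ArrayList<>();
        String[] names = directory.list(); // null when the path is not a readable directory
        if (names != null) {
            Arrays.sort(names);
            for (String name : names) {
                if (canHandle(name)) {
                    result.add(name);
                }
            }
        }
        return result;
    }
}
```

For example, RouteFileSelector.routeId("enhance.route") yields "enhance",
while a file such as "enhance.xml" is ignored.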

So, now that the current Camel integration POC has been integrated into
Stanbol 1.0 and extended to support deploying routes defined in the Java
DSL (through bundles) and in XML (route files), the next step is to rethink
and redesign the current architecture, avoiding duplicated code and
providing a more extensible and easier-to-use Workflow API. With the
current integration, only direct routes can be triggered through the REST
API, which means that such routes must be configured with a direct endpoint
as their consumer. Routes that start in some other way, such as timers, are
triggered as soon as they are deployed, so the new API (and REST API) has
to take this into account.
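
The distinction between direct routes and self-starting routes can be
sketched in plain Java by looking at the scheme of the consumer URI. This
is only an illustration: the class RouteTriggers and its methods are
hypothetical and not part of the current code, and the rule it encodes
(only "direct:" consumers can be triggered over REST) is simply the
limitation described above.

```java
/** Hypothetical helper: classifies a route by the scheme of its consumer (from) URI. */
final class RouteTriggers {

    private RouteTriggers() {}

    /** Extracts the component scheme, e.g. "direct" from "direct:myRoute". */
    static String scheme(String consumerUri) {
        int colon = consumerUri.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("Not an endpoint URI: " + consumerUri);
        }
        return consumerUri.substring(0, colon);
    }

    /** With the current integration, only routes consuming from a direct endpoint qualify. */
    static boolean isRestTriggerable(String consumerUri) {
        return "direct".equals(scheme(consumerUri));
    }
}
```

A route starting with "direct:myRoute" would be reported as triggerable,
whereas one starting with "timer:poll?period=5000" would not, since it
fires on its own once deployed.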

In parallel, and for the second part of the project, new Stanbol Camel
components will be developed for use in new routes. If any of you have use
cases involving Stanbol components, please drop a few lines here so that we
can prioritize the components to be developed.

Comments and suggestions are more than welcome.

Regards

[1] https://issues.apache.org/jira/browse/STANBOL-1348
[2] https://github.com/adperezmorales/stanbol-camel-workflow/





Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi stanbolers

As part of issue [1], I have created a Maven archetype for generating
Camel routes in the Java DSL.
The archetype generates a Java project with all the dependencies and one
Java class containing a method to be filled in. Inside this method, Camel
Java DSL syntax is used to create the route.
By default, and as a first approach, the class uses the route name given
during project creation to expose a Camel direct endpoint with that name.
The code of the first archetype version can be found at [2].
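
To give an idea of the shape of the generated class, here is a
self-contained sketch. Note the assumptions: SketchRouteBuilder is a
stand-in written for this example and is NOT Camel's RouteBuilder, the
class name MyRoute and the route name "myRoute" are hypothetical, and the
"log:" step is only a placeholder for whatever the route author fills in.
The actual archetype output is in the repository at [2].

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a route builder base class (an assumption, not Camel's RouteBuilder). */
abstract class SketchRouteBuilder {
    private final List<String> steps = new ArrayList<>();

    /** Subclasses describe the route here, like the method the archetype leaves to fill in. */
    abstract void configure();

    /** Records the consumer endpoint of the route. */
    SketchRouteBuilder from(String uri) {
        steps.add("from " + uri);
        return this;
    }

    /** Records a producer endpoint the message is sent to. */
    SketchRouteBuilder to(String uri) {
        steps.add("to " + uri);
        return this;
    }

    /** Runs configure() on a clean slate and returns the recorded steps in order. */
    List<String> describe() {
        steps.clear();
        configure();
        return new ArrayList<>(steps);
    }
}

/** What a generated class could look like for a project created with route name "myRoute". */
final class MyRoute extends SketchRouteBuilder {
    /** Hypothetical route name chosen at project creation. */
    static final String ROUTE_NAME = "myRoute";

    @Override
    void configure() {
        from("direct:" + ROUTE_NAME)  // direct endpoint named after the route
            .to("log:" + ROUTE_NAME); // placeholder step to be replaced by the real route
    }
}
```

Calling new MyRoute().describe() lists the consumer "direct:myRoute"
followed by the placeholder step, which is the part a developer would
replace with real Camel DSL.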

The next task is to provide a Felix custom artifact for deploying
XML-based routes in Stanbol by placing a custom file in the Stanbol
datafiles directory.
After that, it will be time to rethink and redesign the architecture so
that Camel workflows are integrated inside Stanbol in a better, more
configurable and extensible way.

Comments and suggestions are more than welcome.

Regards

[1] https://issues.apache.org/jira/browse/STANBOL-1348
[2] https://github.com/adperezmorales/stanbol-camel-workflow/




Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi all

After a hard fight this week, I managed to get Florent's proof-of-concept
code working on the Stanbol 1.0 branch [1].
The code is uploaded to my GitHub account [3]. As I said in a previous
mail, I prefer to develop it separately and, after the project, upload the
resulting code into a Stanbol branch.

The 1.0.0 version changes how the Jersey endpoints are registered and also
introduces new classes and packages, so getting the current proof of
concept to work was not a trivial task. Moreover, I don't like to simply
copy and paste code and make the needed changes; I always want to
understand how things work and how they are built, so that I can modify
them or develop new code around them.

The steps taken to achieve this were the following:
- Updated the pom files to the Stanbol 1.0.0-SNAPSHOT version
- Updated the bundle levels in the bundlelist package to match the Stanbol
1.0 levels
- Adapted the cameljobmanager package code to the Stanbol 1.0.0-SNAPSHOT
classes, using Java OSGi annotations instead of SCR annotations in Javadoc
- Updated the flow web package to the Stanbol 1.0.0-SNAPSHOT classes and
modified the needed resources
- Added Java OSGi annotations to the route (WeightedChain) instead of SCR
annotations in Javadoc
- Updated the launcher to use the 1.0.0-SNAPSHOT packages and the needed
bundles

So now the http://localhost:8080/flow endpoint uses the only Camel route
(defined by WeightedChain) to call all the registered Enhancement Engines
(ordered by the EnhancementEngine order property).
For testing purposes, the /flow/{flowName} endpoint has been removed: all
this code needs to be redesigned and reimplemented, and I only wanted to
get a first (simple) integration working in Stanbol 1.0. This
functionality will be added back to trigger custom routes once the next
step (described below) is developed.
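
The ordering idea behind the single route can be sketched as follows. This
is plain Java and not the WeightedChain code: the Engine class, the engine
names and the order values are made up for illustration, and the sketch
assumes that engines with a higher order value run first (ties are broken
by name here just to keep the result deterministic).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: derives an execution sequence from per-engine order values. */
final class EngineOrdering {

    /** Minimal stand-in for a registered engine: a name plus its order property. */
    static final class Engine {
        final String name;
        final int order;

        Engine(String name, int order) {
            this.name = name;
            this.order = order;
        }
    }

    private EngineOrdering() {}

    /** Returns engine names sorted by descending order value, then by name. */
    static List<String> executionOrder(List<Engine> engines) {
        List<Engine> sorted = new ArrayList<>(engines);
        sorted.sort((a, b) -> {
            int byOrder = Integer.compare(b.order, a.order); // assumption: higher value runs first
            return byOrder != 0 ? byOrder : a.name.compareTo(b.name);
        });
        List<String> names = new ArrayList<>();
        for (Engine engine : sorted) {
            names.add(engine.name);
        }
        return names;
    }
}
```

With the illustrative engines ner (0), tika (100) and langdetect (50), the
resulting sequence is tika, langdetect, ner, which is the order in which a
single route would call them.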

The next step [2] is to support writing and configuring routes in XML
format, with the file placed in datafiles and loaded by a Felix custom
artifact (as Rupert pointed out in a previous mail), and to create a Maven
archetype for building bundles that define routes, which will be loaded
using the Felix bundle tab. If necessary, as we discussed in previous
messages, a REST endpoint receiving routes in XML can be developed as an
alternative to the first approach. This is my objective for the midterm.

After the midterm, the new Stanbol components for Apache Camel will be
developed, along with the new architecture for Camel in Stanbol.

Comments on this, and use cases for Stanbol Camel components, are more
than welcome.

Regards

[1] https://issues.apache.org/jira/browse/STANBOL-1347
[2] https://issues.apache.org/jira/browse/STANBOL-1348
[3] https://github.com/adperezmorales/stanbol-camel-workflow/




Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi people

I have already started to work on [1] to integrate Florent's current code
into Stanbol 1.0.
As a first approach I only changed the dependency versions to the new
Stanbol 1.0, and many issues have arisen:
 - Use of deprecated classes
 - Classes that have moved to a different package
 - Some classes that are no longer necessary
 - Unused classes that were causing conflicts
 - ...

So now I'm trying to resolve all these problems to replicate the 0.9
behavior in 1.0. I will upload the code to a GitHub repository in my
account (it will be pushed into a Stanbol branch after the project) in
order to track progress.
Once I have resolved all these problems, I will take a look at the Felix
Custom Artifacts pointed out by Rupert in a previous message to find out
the best way to deploy (and manage) route configurations (Felix artifacts,
Java WatchService, REST endpoint to receive XML routes, etc.).

Comments on this and future tasks are more than welcome.

Regards

[1] https://issues.apache.org/jira/browse/STANBOL-1347




Re: Camel integration (was : Re: Community bonding period started)

Posted by Rafa Haro <rh...@apache.org>.
Hi Rupert, Florent and Antonio

On 27/05/14 08:51, Rupert Westenthaler wrote:
> As the result of Enhancement Routes is content + metadata I can not
> see what you want to "store" in the Entityhub that is about managing
> Entities.
>
>>  - entityhub: To query/update the entityhub component
> Maybe. If you can come up with a good use case ^^
>
>>  - contenthub: To develop a new content-hub using chain/engine components
>> and solr/elasticsearch/whatever component (solr and elasticsearch component
>> already exist in Camel)
> IMO implementing a new Contenthub like component is outside the scope
> of this GSoC project. However If there is already Solr/Elasticsearch
> component it would be a really useful thing
>

Regarding this, in my opinion, the use case of an eventual integration
with a ContentHub is probably one of the clearest for this project.
I'm not sure if that is what Antonio was trying to explain but, with a
single route using Solr or any other backend system as its last endpoint,
we would be almost cloning the functionality of the previous ContentHub
implementation (Stanbol 0.12). Entities could be dereferenced using the
EntityHub before storing the content along with the metadata, which is
the point of integration of the EntityHub in such a use case. Even more
interesting, now with the Marmotta integration contributed by Rupert, it
would be possible to use a whole graph for dereferencing, so "simply"
routing components like Enhancer->Marmotta->Solr sounds to me like an
interesting use case.
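
A rough sketch of what such a three-step route means in terms of data
flow, in plain Java rather than Camel: each step takes the item produced
by the previous one. Everything here is a stand-in chosen for
illustration: the Item class, the "urn:example:Paris" identifier, the
"label-of:" prefix, and the in-memory map used in place of Solr are all
hypothetical and do not come from Stanbol or Marmotta.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of an enhance -> dereference -> index pipeline. */
final class PipelineSketch {

    /** Message passed between steps: the content plus accumulated metadata. */
    static final class Item {
        final String content;
        final Map<String, String> metadata = new LinkedHashMap<>();

        Item(String content) {
            this.content = content;
        }
    }

    private PipelineSketch() {}

    /** Runs the steps in order, each one receiving the output of the previous one. */
    static Item run(Item input, List<UnaryOperator<Item>> steps) {
        Item current = input;
        for (UnaryOperator<Item> step : steps) {
            current = step.apply(current);
        }
        return current;
    }

    /** Builds the three example steps; the given map plays the role of the backend store. */
    static List<UnaryOperator<Item>> exampleSteps(Map<String, Item> index) {
        List<UnaryOperator<Item>> steps = new ArrayList<>();
        // Stand-in for the Enhancer: attaches a (made-up) entity reference.
        steps.add(item -> {
            item.metadata.put("entity", "urn:example:Paris");
            return item;
        });
        // Stand-in for dereferencing: adds information about the referenced entity.
        steps.add(item -> {
            item.metadata.put("label", "label-of:" + item.metadata.get("entity"));
            return item;
        });
        // Stand-in for the last endpoint: stores content together with its metadata.
        steps.add(item -> {
            index.put(item.content, item);
            return item;
        });
        return steps;
    }
}
```

The point of the sketch is only that the store at the end receives content
and metadata together, after dereferencing has already enriched it.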

wdyt?

Cheers,
Rafa

Re: Camel integration (was : Re: Community bonding period started)

Posted by Rupert Westenthaler <ru...@gmail.com>.
On Mon, May 26, 2014 at 6:58 PM, Antonio David Perez Morales
<ap...@zaizi.com> wrote:
> Hi Florent, Rupert and all
>
> I have already created two new issues as sub-tasks of Stanbol-1008 ([1] and
> [2])
> The first one intends to integrate the current Florent's approach into
> Stanbol 1.0 to see if it works well.
> The second one is about to add support to new routes deployed either as
> bundles, either XML files put in a specific folder (containing routes and
> loaded dynamically) or (if necessary) via a new REST endpoint  receiving
> XML route files to be loaded (or removed).
>

Make sure to evaluate the Apache Sling Installer [3] and/or the Felix
File Installer [4]; both can be extended to support custom artifacts
(such as XML route files).

> I think this can be a good advance for the midterm. This way (and
> leveraging the engine camel component created by Florent) we would have
> covered the current Enhancement Chain execution process using Camel routes.
> Well, more powerful because all the existing Camel components could be used
> in the routes to perform advanced (or parallel) processing.
>
> How do you see it guys?
>
> Taking into account that the second part of the GSoC is longer than the
> first one, I would like to open a discussion about the new Camel components
> to be created in the second part in order to be used in routes (apart from
> improve the current engine component already developed). As discussed in
> previous messages some interesting components could be:
>  - chain: In order to create routes based on existing chains

+1

>  - store: To store the result in EntityHub or another store

As the result of Enhancement Routes is content + metadata, I cannot
see what you would want to "store" in the Entityhub, which is about
managing Entities.

>  - entityhub: To query/update the entityhub component

Maybe. If you can come up with a good use case ^^

>  - contenthub: To develop a new content-hub using chain/engine components
> and solr/elasticsearch/whatever component (solr and elasticsearch component
> already exist in Camel)

IMO implementing a new Contenthub-like component is outside the scope
of this GSoC project. However, if Solr/Elasticsearch components already
exist, that would be really useful.

>  - other components
>
> But I would like to sort them by importance of need based on interesting
> use cases like:
>  - Fetch from : local folder / rss / mail / ...
>  - Enhance with engine 1
>  - Depending on the result of this engine go to :
>    -  Chain 1
>   -  or to Chain 2 and 3 and merge results
>  - Output the result to : email / ftp / ... and contenthub
>
> Do you have any interesting use cases in mind which would be good to have
> in Stanbol? In this way, we can decide which components should be developed
> first.

A strong use case, possibly Enterprise Integration-like, would be cool.

best
Rupert

> [1] https://issues.apache.org/jira/browse/STANBOL-1347
> [2] https://issues.apache.org/jira/browse/STANBOL-1348
[3] http://sling.apache.org/documentation/bundles/osgi-installer.html
[4] http://felix.apache.org/site/apache-felix-file-install.html


-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/

Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi Florent, Rupert and all

I have already created two new issues as sub-tasks of STANBOL-1008 ([1] and
[2]).
The first one aims to integrate Florent's current approach into
Stanbol 1.0 to see if it works well.
The second one is about adding support for new routes deployed either as
bundles, as XML files put in a specific folder (containing routes and
loaded dynamically) or (if necessary) via a new REST endpoint receiving
XML route files to be loaded (or removed).
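
As a rough illustration of the folder-based option, the following sketch polls a directory and keeps track of which XML route files are currently present. It is plain Java and deliberately avoids the Camel, Felix File Install and Sling Installer APIs; the class and method names (RouteFolderScanner, canHandle, scan) are hypothetical, and the places where routes would actually be loaded or removed are only marked by comments.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: tracks XML route files found in a folder. */
class RouteFolderScanner {

    private final Path folder;
    private final Set<String> installed = new TreeSet<>();

    RouteFolderScanner(Path folder) {
        this.folder = folder;
    }

    /** Only files ending in .xml are treated as route definitions. */
    static boolean canHandle(Path file) {
        return file.getFileName().toString().endsWith(".xml");
    }

    /** Compares the folder content with the known state and updates it. */
    void scan() throws IOException {
        Set<String> present = new HashSet<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(folder)) {
            for (Path file : files) {
                if (Files.isRegularFile(file) && canHandle(file)) {
                    present.add(file.getFileName().toString());
                }
            }
        }
        // New files: a real implementation would parse the XML and add the routes here.
        installed.addAll(present);
        // Vanished files: a real implementation would stop and remove the routes here.
        installed.retainAll(present);
    }

    /** Returns a sorted copy of the file names currently considered installed. */
    Set<String> installedRoutes() {
        return new TreeSet<>(installed);
    }
}
```

Calling scan() periodically (or from a file watcher) would give the "loaded dynamically" behaviour; an existing installer framework would replace most of this code.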

I think this would be good progress for the midterm. This way (and
leveraging the engine Camel component created by Florent) we would cover
the current Enhancement Chain execution process using Camel routes, and
in a more powerful way, because all the existing Camel components could be
used in the routes to perform advanced (or parallel) processing.

What do you think, guys?

Taking into account that the second part of the GSoC is longer than the
first one, I would like to open a discussion about the new Camel components
to be created in the second part for use in routes (apart from
improving the engine component already developed). As discussed in
previous messages, some interesting components could be:
 - chain: to create routes based on existing chains
 - store: to store the result in the EntityHub or another store
 - entityhub: to query/update the EntityHub component
 - contenthub: to develop a new content hub using chain/engine components
and a solr/elasticsearch/whatever component (solr and elasticsearch
components already exist in Camel)
 - other components

But I would like to sort them by importance, based on interesting
use cases like:
 - Fetch from: local folder / rss / mail / ...
 - Enhance with engine 1
 - Depending on the result of this engine, go to:
   - Chain 1
   - or to Chain 2 and 3 and merge results
 - Output the result to: email / ftp / ... and contenthub
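
The branching step of that use case could look roughly like the sketch below. It is plain Java, not Camel's DSL; the Chain interface, the toy detectLanguage check standing in for "engine 1" and the string annotations are all hypothetical and only show the control flow (choose one chain, or fan out to two chains and merge).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of "choose a chain, or fan out and merge". */
class ChainRoutingSketch {

    /** A chain maps content to a list of annotations (plain strings here). */
    interface Chain extends Function<String, List<String>> { }

    // Stand-in for "engine 1": a toy language guess used as the routing criterion.
    static String detectLanguage(String content) {
        return content.contains(" the ") ? "en" : "other";
    }

    /** Routes to chain 1, or to chains 2 and 3 with merged results. */
    static List<String> route(String content, Chain chain1, Chain chain2, Chain chain3) {
        if ("en".equals(detectLanguage(content))) {
            // Branch 1: a single chain handles the content.
            return chain1.apply(content);
        }
        // Branch 2: run two chains and merge their results in order.
        List<String> merged = new ArrayList<>(chain2.apply(content));
        merged.addAll(chain3.apply(content));
        return merged;
    }
}
```

The fetch and output steps are left out on purpose, since they would be covered by existing components rather than new code.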

Do you have any interesting use cases in mind which would be good to have
in Stanbol? In this way, we can decide which components should be developed
first.

Regards

[1] https://issues.apache.org/jira/browse/STANBOL-1347
[2] https://issues.apache.org/jira/browse/STANBOL-1348


On Mon, May 26, 2014 at 2:55 PM, Antonio David Perez Morales <
aperez@zaizi.com> wrote:

> Hi all
>
> Ok perfect, so I will use 1.0 branch for the developments and I'll try to
> integrate the current Florent's code into Stanbol 1.0.
>
> Regards
>
>
> On Mon, May 26, 2014 at 1:12 PM, Rupert Westenthaler <
> rupert.westenthaler@gmail.com> wrote:
>
>> Hi all
>>
>> IMHO all GSoC stuff should use the trunk (1.0.0-SNAPSHOT) as starting
>> point for the branches.
>>
>> @Antonio, Florent: I do not think that we will back port the Camel
>> integration to 0.12.*
>>
>> best
>> Rupert
>>
>> On Mon, May 26, 2014 at 10:18 AM, Antonio David Perez Morales
>> <ap...@zaizi.com> wrote:
>> > Hi all
>> >
>> > OK for the version 1
>> >
>> > About the code, I prefer to create a repository in my github account and
>> > push the code later to Stanbol branch. This way we keep separately the
>> GSoC
>> > code from issue branching.
>> >
>> > What do you think?
>> >
>> > Regards
>> >
>> >
>> > On Mon, May 26, 2014 at 10:11 AM, Florent André <fl...@apache.org>
>> wrote:
>> >
>> >> Antonio,
>> >>
>> >> About the version, Rupert can fix my words, but It's seems that 0.12
>> and
>> >> 1.0 have few differences and up-coming 1.0 release will not break this.
>> >>
>> >> So I thinks it's better so start on 1.0 and port it to 0.12 afterward
>> if
>> >> needed.
>> >>
>> >> Side question :
>> >> As Antonio is commiter, do he commit his code directly on a branch or
>> in a
>> >> side github repository ?
>> >>
>> >> ++
>> >>
>> >>
>> >> On 23/05/2014 14:49, Antonio David Perez Morales wrote:
>> >>
>> >>> Hi Rupert and Florent
>> >>>
>> >>> Of course Florent, I will create the needed issues for the tasks. This
>> >>> week I have been studying in depth the code of the Cameltrial PoC,
>> >>> reading and playing a lot with Camel.
>> >>>
>> >>> Please find my response in lines.
>> >>>
>> >>>
>> >>>     Such routes would have some restrictions: (a)
>> >>>
>> >>>         start with a request,
>> >>>
>> >>>
>> >>>     They not directly answer to a "direct request" but when something
>> is
>> >>>     send to the email address (or put in a directory), the full
>> >>>     Enhancement Route is launched.
>> >>>
>> >>>
>> >>> Camel supports triggering a route based on an endpoint (like direct,
>> >>> http or whatever) or when some event occurs in other component, like a
>> >>> document added to an ActiveMQ queue, a mail sent to a server, etc.
>> >>> So we can support both, the request-triggered method and another
>> >>> combination (leveraging the power of Camel components).
>> >>>
>> >>> For the midterm, I had thought to improve the Florent's code to
>> support
>> >>> configuring route endpoints, and the engines used in each route. This
>> >>> task would act like the current Enhancement Chains but using Camel
>> >>> framework.
>> >>> For the second part of the project (which has more time than the first
>> >>> one) we could add new things like apply real integration patterns
>> inside
>> >>> routes to do parallel processing of engines, etc.
>> >>>
>> >>>
>> >>>     (b) end with a response,
>> >>>
>> >>>     Depending on you camel output, "end with a response" is not
>> exactly
>> >>>     true in an "classical resquest/reponse http thought"...
>> >>>
>> >>>     I mean that the response of a "route" can be a mail sended or an
>> rdf
>> >>>     serialization write to an ftp...
>> >>>
>> >>>
>> >>> For the time being, we can not consider this feature, but we could add
>> >>> it later if necessary to support something more than the classic
>> >>> request/response flow.
>> >>>
>> >>>             === 5) defining and implementing easy routing definition
>> ===
>> >>>
>> >>>             In my first version of code, adding a new route require to
>> >>>             build a bundle
>> >>>             and add it to Stanbol.
>> >>>             The structure, and the code of this bundle is pretty
>> simple
>> >>>             and allow to
>> >>>             code you route with java DSL (with one I pretty like), but
>> >>>             maybe lack a
>> >>>             little bit of flexibility and user friendliness.
>> >>>
>> >>>
>> >>> Here, we could support several alternatives:
>> >>>   - create bundles with classes extending RouteBuilder (to build route
>> >>> definitions and declared as Osgi component) to deploy new routes
>> >>> declared in Java DSL
>> >>> -  deploy routes in XML format, putting a file in an specific
>> directory
>> >>> (Camel Spring XML format)
>> >>> - deploy routes in XML format enabling a REST endpoint receiving XML
>> >>> route definitions.
>> >>>
>> >>>
>> >>>         I would suggest to provide such a RESTful service as part of
>> the
>> >>> the
>> >>>         Felix Webconsole. This would also allow to provide a simple UI
>> >>>         as tab
>> >>>         of the Felix WebConsole (similar to the tab of the
>> >>>         DataFileProvider).
>> >>>
>> >>>
>> >>> This option could be a good to have, but I should do some researches
>> on
>> >>> how to extend Felix WebConsole, so I think this is not a priority
>> right
>> >>> now.
>> >>>
>> >>>
>> >>> By the way, which version do you recommend me to use in order to
>> >>> implement the project, Stanbol 0.12 or 1.0 version?
>> >>>
>> >>> Best regards
>> >>>
>> >>>
>> >>>
>> ------------------------------------------------------------------------
>> >>> This message should be regarded as confidential. If you have received
>> >>> this email in error please notify the sender and destroy it
>> immediately.
>> >>> Statements of intent shall only become binding when confirmed in hard
>> >>> copy by an authorised signatory.
>> >>>
>> >>> Zaizi Ltd is registered in England and Wales with the registration
>> >>> number 6440931. The Registered Office is Brook House, 229 Shepherds
>> Bush
>> >>> Road, London W6 7AN.
>> >>>
>> >>
>> >
>> > --
>> >
>> > ------------------------------
>> > This message should be regarded as confidential. If you have received
>> this
>> > email in error please notify the sender and destroy it immediately.
>> > Statements of intent shall only become binding when confirmed in hard
>> copy
>> > by an authorised signatory.
>> >
>> > Zaizi Ltd is registered in England and Wales with the registration
>> number
>> > 6440931. The Registered Office is Brook House, 229 Shepherds Bush Road,
>> > London W6 7AN.
>>
>>
>>
>> --
>> | Rupert Westenthaler             rupert.westenthaler@gmail.com
>> | Bodenlehenstraße 11                              ++43-699-11108907
>> | A-5500 Bischofshofen
>> | REDLINK.CO..........................................................................
>> | http://redlink.co/
>>
>
>


Re: Camel integration (was : Re: Community bonding period started)

Posted by Rafa Haro <rh...@apache.org>.
Hi all,

How about creating an Epic issue for all GSoC projects, not only a 
label? That way we can easily track and follow the work of all students.

Cheers,
Rafa

On 26/05/14 14:55, Antonio David Perez Morales wrote:
> Hi all
>
> Ok perfect, so I will use 1.0 branch for the developments and I'll try to
> integrate the current Florent's code into Stanbol 1.0.
>
> Regards
>
>
> On Mon, May 26, 2014 at 1:12 PM, Rupert Westenthaler <
> rupert.westenthaler@gmail.com> wrote:
>
>> Hi all
>>
>> IMHO all GSoC stuff should use the trunk (1.0.0-SNAPSHOT) as starting
>> point for the branches.
>>
>> @Antonio, Florent: I do not think that we will back port the Camel
>> integration to 0.12.*
>>
>> best
>> Rupert
>>
>> On Mon, May 26, 2014 at 10:18 AM, Antonio David Perez Morales
>> <ap...@zaizi.com> wrote:
>>> Hi all
>>>
>>> OK for the version 1
>>>
>>> About the code, I prefer to create a repository in my github account and
>>> push the code later to Stanbol branch. This way we keep separately the
>> GSoC
>>> code from issue branching.
>>>
>>> What do you think?
>>>
>>> Regards
>>>
>>>
>>> On Mon, May 26, 2014 at 10:11 AM, Florent André <fl...@apache.org>
>> wrote:
>>>> Antonio,
>>>>
>>>> About the version, Rupert can fix my words, but It's seems that 0.12 and
>>>> 1.0 have few differences and up-coming 1.0 release will not break this.
>>>>
>>>> So I thinks it's better so start on 1.0 and port it to 0.12 afterward if
>>>> needed.
>>>>
>>>> Side question :
>>>> As Antonio is commiter, do he commit his code directly on a branch or
>> in a
>>>> side github repository ?
>>>>
>>>> ++
>>>>
>>>>
>>>> On 23/05/2014 14:49, Antonio David Perez Morales wrote:
>>>>
>>>>> Hi Rupert and Florent
>>>>>
>>>>> Of course Florent, I will create the needed issues for the tasks. This
>>>>> week I have been studying in depth the code of the Cameltrial PoC,
>>>>> reading and playing a lot with Camel.
>>>>>
>>>>> Please find my response in lines.
>>>>>
>>>>>
>>>>>      Such routes would have some restrictions: (a)
>>>>>
>>>>>          start with a request,
>>>>>
>>>>>
>>>>>      They not directly answer to a "direct request" but when something
>> is
>>>>>      send to the email address (or put in a directory), the full
>>>>>      Enhancement Route is launched.
>>>>>
>>>>>
>>>>> Camel supports triggering a route based on an endpoint (like direct,
>>>>> http or whatever) or when some event occurs in other component, like a
>>>>> document added to an ActiveMQ queue, a mail sent to a server, etc.
>>>>> So we can support both, the request-triggered method and another
>>>>> combination (leveraging the power of Camel components).
>>>>>
>>>>> For the midterm, I had thought to improve the Florent's code to support
>>>>> configuring route endpoints, and the engines used in each route. This
>>>>> task would act like the current Enhancement Chains but using Camel
>>>>> framework.
>>>>> For the second part of the project (which has more time than the first
>>>>> one) we could add new things like apply real integration patterns
>> inside
>>>>> routes to do parallel processing of engines, etc.
>>>>>
>>>>>
>>>>>      (b) end with a response,
>>>>>
>>>>>      Depending on you camel output, "end with a response" is not exactly
>>>>>      true in an "classical resquest/reponse http thought"...
>>>>>
>>>>>      I mean that the response of a "route" can be a mail sended or an
>> rdf
>>>>>      serialization write to an ftp...
>>>>>
>>>>>
>>>>> For the time being, we can not consider this feature, but we could add
>>>>> it later if necessary to support something more than the classic
>>>>> request/response flow.
>>>>>
>>>>>              === 5) defining and implementing easy routing definition
>> ===
>>>>>              In my first version of code, adding a new route require to
>>>>>              build a bundle
>>>>>              and add it to Stanbol.
>>>>>              The structure, and the code of this bundle is pretty simple
>>>>>              and allow to
>>>>>              code you route with java DSL (with one I pretty like), but
>>>>>              maybe lack a
>>>>>              little bit of flexibility and user friendliness.
>>>>>
>>>>>
>>>>> Here, we could support several alternatives:
>>>>>    - create bundles with classes extending RouteBuilder (to build route
>>>>> definitions and declared as Osgi component) to deploy new routes
>>>>> declared in Java DSL
>>>>> -  deploy routes in XML format, putting a file in an specific directory
>>>>> (Camel Spring XML format)
>>>>> - deploy routes in XML format enabling a REST endpoint receiving XML
>>>>> route definitions.
>>>>>
>>>>>
>>>>>          I would suggest to provide such a RESTful service as part of
>> the
>>>>> the
>>>>>          Felix Webconsole. This would also allow to provide a simple UI
>>>>>          as tab
>>>>>          of the Felix WebConsole (similar to the tab of the
>>>>>          DataFileProvider).
>>>>>
>>>>>
>>>>> This option could be a good to have, but I should do some researches on
>>>>> how to extend Felix WebConsole, so I think this is not a priority right
>>>>> now.
>>>>>
>>>>>
>>>>> By the way, which version do you recommend me to use in order to
>>>>> implement the project, Stanbol 0.12 or 1.0 version?
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>>
>> ------------------------------------------------------------------------
>>>>> This message should be regarded as confidential. If you have received
>>>>> this email in error please notify the sender and destroy it
>> immediately.
>>>>> Statements of intent shall only become binding when confirmed in hard
>>>>> copy by an authorised signatory.
>>>>>
>>>>> Zaizi Ltd is registered in England and Wales with the registration
>>>>> number 6440931. The Registered Office is Brook House, 229 Shepherds
>> Bush
>>>>> Road, London W6 7AN.
>>>>>
>>> --
>>>
>>> ------------------------------
>>> This message should be regarded as confidential. If you have received
>> this
>>> email in error please notify the sender and destroy it immediately.
>>> Statements of intent shall only become binding when confirmed in hard
>> copy
>>> by an authorised signatory.
>>>
>>> Zaizi Ltd is registered in England and Wales with the registration number
>>> 6440931. The Registered Office is Brook House, 229 Shepherds Bush Road,
>>> London W6 7AN.
>>
>>
>> --
>> | Rupert Westenthaler             rupert.westenthaler@gmail.com
>> | Bodenlehenstraße 11                              ++43-699-11108907
>> | A-5500 Bischofshofen
>> | REDLINK.CO..........................................................................
>> | http://redlink.co/
>>


Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi all

OK, perfect. I will use the 1.0 branch for development and I'll try to
integrate Florent's current code into Stanbol 1.0.

Regards


On Mon, May 26, 2014 at 1:12 PM, Rupert Westenthaler <
rupert.westenthaler@gmail.com> wrote:

> Hi all
>
> IMHO all GSoC stuff should use the trunk (1.0.0-SNAPSHOT) as starting
> point for the branches.
>
> @Antonio, Florent: I do not think that we will back port the Camel
> integration to 0.12.*
>
> best
> Rupert
>
> On Mon, May 26, 2014 at 10:18 AM, Antonio David Perez Morales
> <ap...@zaizi.com> wrote:
> > Hi all
> >
> > OK for the version 1
> >
> > About the code, I prefer to create a repository in my github account and
> > push the code later to Stanbol branch. This way we keep separately the
> GSoC
> > code from issue branching.
> >
> > What do you think?
> >
> > Regards
> >
> >
> > On Mon, May 26, 2014 at 10:11 AM, Florent André <fl...@apache.org>
> wrote:
> >
> >> Antonio,
> >>
> >> About the version, Rupert can fix my words, but It's seems that 0.12 and
> >> 1.0 have few differences and up-coming 1.0 release will not break this.
> >>
> >> So I thinks it's better so start on 1.0 and port it to 0.12 afterward if
> >> needed.
> >>
> >> Side question :
> >> As Antonio is commiter, do he commit his code directly on a branch or
> in a
> >> side github repository ?
> >>
> >> ++
> >>
> >>
> >> On 23/05/2014 14:49, Antonio David Perez Morales wrote:
> >>
> >>> Hi Rupert and Florent
> >>>
> >>> Of course Florent, I will create the needed issues for the tasks. This
> >>> week I have been studying in depth the code of the Cameltrial PoC,
> >>> reading and playing a lot with Camel.
> >>>
> >>> Please find my response in lines.
> >>>
> >>>
> >>>     Such routes would have some restrictions: (a)
> >>>
> >>>         start with a request,
> >>>
> >>>
> >>>     They not directly answer to a "direct request" but when something
> is
> >>>     send to the email address (or put in a directory), the full
> >>>     Enhancement Route is launched.
> >>>
> >>>
> >>> Camel supports triggering a route based on an endpoint (like direct,
> >>> http or whatever) or when some event occurs in other component, like a
> >>> document added to an ActiveMQ queue, a mail sent to a server, etc.
> >>> So we can support both, the request-triggered method and another
> >>> combination (leveraging the power of Camel components).
> >>>
> >>> For the midterm, I had thought to improve the Florent's code to support
> >>> configuring route endpoints, and the engines used in each route. This
> >>> task would act like the current Enhancement Chains but using Camel
> >>> framework.
> >>> For the second part of the project (which has more time than the first
> >>> one) we could add new things like apply real integration patterns
> inside
> >>> routes to do parallel processing of engines, etc.
> >>>
> >>>
> >>>     (b) end with a response,
> >>>
> >>>     Depending on you camel output, "end with a response" is not exactly
> >>>     true in an "classical resquest/reponse http thought"...
> >>>
> >>>     I mean that the response of a "route" can be a mail sended or an
> rdf
> >>>     serialization write to an ftp...
> >>>
> >>>
> >>> For the time being, we can not consider this feature, but we could add
> >>> it later if necessary to support something more than the classic
> >>> request/response flow.
> >>>
> >>>             === 5) defining and implementing easy routing definition
> ===
> >>>
> >>>             In my first version of code, adding a new route require to
> >>>             build a bundle
> >>>             and add it to Stanbol.
> >>>             The structure, and the code of this bundle is pretty simple
> >>>             and allow to
> >>>             code you route with java DSL (with one I pretty like), but
> >>>             maybe lack a
> >>>             little bit of flexibility and user friendliness.
> >>>
> >>>
> >>> Here, we could support several alternatives:
> >>>   - create bundles with classes extending RouteBuilder (to build route
> >>> definitions and declared as Osgi component) to deploy new routes
> >>> declared in Java DSL
> >>> -  deploy routes in XML format, putting a file in an specific directory
> >>> (Camel Spring XML format)
> >>> - deploy routes in XML format enabling a REST endpoint receiving XML
> >>> route definitions.
> >>>
> >>>
> >>>         I would suggest to provide such a RESTful service as part of
> the
> >>> the
> >>>         Felix Webconsole. This would also allow to provide a simple UI
> >>>         as tab
> >>>         of the Felix WebConsole (similar to the tab of the
> >>>         DataFileProvider).
> >>>
> >>>
> >>> This option could be a good to have, but I should do some researches on
> >>> how to extend Felix WebConsole, so I think this is not a priority right
> >>> now.
> >>>
> >>>
> >>> By the way, which version do you recommend me to use in order to
> >>> implement the project, Stanbol 0.12 or 1.0 version?
> >>>
> >>> Best regards
> >>>
> >>>
> >>>
> ------------------------------------------------------------------------
> >>> This message should be regarded as confidential. If you have received
> >>> this email in error please notify the sender and destroy it
> immediately.
> >>> Statements of intent shall only become binding when confirmed in hard
> >>> copy by an authorised signatory.
> >>>
> >>> Zaizi Ltd is registered in England and Wales with the registration
> >>> number 6440931. The Registered Office is Brook House, 229 Shepherds
> Bush
> >>> Road, London W6 7AN.
> >>>
> >>
> >
> > --
> >
> > ------------------------------
> > This message should be regarded as confidential. If you have received
> this
> > email in error please notify the sender and destroy it immediately.
> > Statements of intent shall only become binding when confirmed in hard
> copy
> > by an authorised signatory.
> >
> > Zaizi Ltd is registered in England and Wales with the registration number
> > 6440931. The Registered Office is Brook House, 229 Shepherds Bush Road,
> > London W6 7AN.
>
>
>
> --
> | Rupert Westenthaler             rupert.westenthaler@gmail.com
> | Bodenlehenstraße 11                              ++43-699-11108907
> | A-5500 Bischofshofen
> | REDLINK.CO..........................................................................
> | http://redlink.co/
>


Re: Camel integration (was : Re: Community bonding period started)

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi all

IMHO all GSoC stuff should use the trunk (1.0.0-SNAPSHOT) as the starting
point for the branches.

@Antonio, Florent: I do not think that we will backport the Camel
integration to 0.12.*

best
Rupert

On Mon, May 26, 2014 at 10:18 AM, Antonio David Perez Morales
<ap...@zaizi.com> wrote:
> Hi all
>
> OK for the version 1
>
> About the code, I prefer to create a repository in my github account and
> push the code later to Stanbol branch. This way we keep separately the GSoC
> code from issue branching.
>
> What do you think?
>
> Regards
>
>
> On Mon, May 26, 2014 at 10:11 AM, Florent André <fl...@apache.org> wrote:
>
>> Antonio,
>>
>> About the version, Rupert can fix my words, but It's seems that 0.12 and
>> 1.0 have few differences and up-coming 1.0 release will not break this.
>>
>> So I thinks it's better so start on 1.0 and port it to 0.12 afterward if
>> needed.
>>
>> Side question :
>> As Antonio is commiter, do he commit his code directly on a branch or in a
>> side github repository ?
>>
>> ++
>>
>>
>> On 23/05/2014 14:49, Antonio David Perez Morales wrote:
>>
>>> Hi Rupert and Florent
>>>
>>> Of course Florent, I will create the needed issues for the tasks. This
>>> week I have been studying in depth the code of the Cameltrial PoC,
>>> reading and playing a lot with Camel.
>>>
>>> Please find my response in lines.
>>>
>>>
>>>     Such routes would have some restrictions: (a)
>>>
>>>         start with a request,
>>>
>>>
>>>     They not directly answer to a "direct request" but when something is
>>>     send to the email address (or put in a directory), the full
>>>     Enhancement Route is launched.
>>>
>>>
>>> Camel supports triggering a route based on an endpoint (like direct,
>>> http or whatever) or when some event occurs in other component, like a
>>> document added to an ActiveMQ queue, a mail sent to a server, etc.
>>> So we can support both, the request-triggered method and another
>>> combination (leveraging the power of Camel components).
>>>
>>> For the midterm, I had thought to improve the Florent's code to support
>>> configuring route endpoints, and the engines used in each route. This
>>> task would act like the current Enhancement Chains but using Camel
>>> framework.
>>> For the second part of the project (which has more time than the first
>>> one) we could add new things like apply real integration patterns inside
>>> routes to do parallel processing of engines, etc.
>>>
>>>
>>>     (b) end with a response,
>>>
>>>     Depending on you camel output, "end with a response" is not exactly
>>>     true in an "classical resquest/reponse http thought"...
>>>
>>>     I mean that the response of a "route" can be a mail sended or an rdf
>>>     serialization write to an ftp...
>>>
>>>
>>> For the time being, we can not consider this feature, but we could add
>>> it later if necessary to support something more than the classic
>>> request/response flow.
>>>
>>>             === 5) defining and implementing easy routing definition ===
>>>
>>>             In my first version of code, adding a new route require to
>>>             build a bundle
>>>             and add it to Stanbol.
>>>             The structure, and the code of this bundle is pretty simple
>>>             and allow to
>>>             code you route with java DSL (with one I pretty like), but
>>>             maybe lack a
>>>             little bit of flexibility and user friendliness.
>>>
>>>
>>> Here, we could support several alternatives:
>>>   - create bundles with classes extending RouteBuilder (to build route
>>> definitions and declared as Osgi component) to deploy new routes
>>> declared in Java DSL
>>> -  deploy routes in XML format, putting a file in an specific directory
>>> (Camel Spring XML format)
>>> - deploy routes in XML format enabling a REST endpoint receiving XML
>>> route definitions.
>>>
>>>
>>>         I would suggest to provide such a RESTful service as part of the
>>> the
>>>         Felix Webconsole. This would also allow to provide a simple UI
>>>         as tab
>>>         of the Felix WebConsole (similar to the tab of the
>>>         DataFileProvider).
>>>
>>>
>>> This option could be a good to have, but I should do some researches on
>>> how to extend Felix WebConsole, so I think this is not a priority right
>>> now.
>>>
>>>
>>> By the way, which version do you recommend me to use in order to
>>> implement the project, Stanbol 0.12 or 1.0 version?
>>>
>>> Best regards
>>>
>>>
>>> ------------------------------------------------------------------------
>>> This message should be regarded as confidential. If you have received
>>> this email in error please notify the sender and destroy it immediately.
>>> Statements of intent shall only become binding when confirmed in hard
>>> copy by an authorised signatory.
>>>
>>> Zaizi Ltd is registered in England and Wales with the registration
>>> number 6440931. The Registered Office is Brook House, 229 Shepherds Bush
>>> Road, London W6 7AN.
>>>
>>
>
> --
>
> ------------------------------
> This message should be regarded as confidential. If you have received this
> email in error please notify the sender and destroy it immediately.
> Statements of intent shall only become binding when confirmed in hard copy
> by an authorised signatory.
>
> Zaizi Ltd is registered in England and Wales with the registration number
> 6440931. The Registered Office is Brook House, 229 Shepherds Bush Road,
> London W6 7AN.



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/

Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi all

OK for version 1.0.

About the code, I would prefer to create a repository in my GitHub account
and push the code to a Stanbol branch later. This way we keep the GSoC
code separate from issue branching.

What do you think?

Regards


On Mon, May 26, 2014 at 10:11 AM, Florent André <fl...@apache.org> wrote:

> Antonio,
>
> About the version, Rupert can fix my words, but It's seems that 0.12 and
> 1.0 have few differences and up-coming 1.0 release will not break this.
>
> So I thinks it's better so start on 1.0 and port it to 0.12 afterward if
> needed.
>
> Side question :
> As Antonio is commiter, do he commit his code directly on a branch or in a
> side github repository ?
>
> ++
>
>
> On 23/05/2014 14:49, Antonio David Perez Morales wrote:
>
>> Hi Rupert and Florent
>>
>> Of course Florent, I will create the needed issues for the tasks. This
>> week I have been studying in depth the code of the Cameltrial PoC,
>> reading and playing a lot with Camel.
>>
>> Please find my response in lines.
>>
>>
>>     Such routes would have some restrictions: (a)
>>
>>         start with a request,
>>
>>
>>     They not directly answer to a "direct request" but when something is
>>     send to the email address (or put in a directory), the full
>>     Enhancement Route is launched.
>>
>>
>> Camel supports triggering a route based on an endpoint (like direct,
>> http or whatever) or when some event occurs in other component, like a
>> document added to an ActiveMQ queue, a mail sent to a server, etc.
>> So we can support both, the request-triggered method and another
>> combination (leveraging the power of Camel components).
>>
>> For the midterm, I had thought to improve the Florent's code to support
>> configuring route endpoints, and the engines used in each route. This
>> task would act like the current Enhancement Chains but using Camel
>> framework.
>> For the second part of the project (which has more time than the first
>> one) we could add new things like apply real integration patterns inside
>> routes to do parallel processing of engines, etc.
>>
>>
>>     (b) end with a response,
>>
>>     Depending on you camel output, "end with a response" is not exactly
>>     true in an "classical resquest/reponse http thought"...
>>
>>     I mean that the response of a "route" can be a mail sended or an rdf
>>     serialization write to an ftp...
>>
>>
>> For the time being, we can not consider this feature, but we could add
>> it later if necessary to support something more than the classic
>> request/response flow.
>>
>>             === 5) defining and implementing easy routing definition ===
>>
>>             In my first version of code, adding a new route require to
>>             build a bundle
>>             and add it to Stanbol.
>>             The structure, and the code of this bundle is pretty simple
>>             and allow to
>>             code you route with java DSL (with one I pretty like), but
>>             maybe lack a
>>             little bit of flexibility and user friendliness.
>>
>>
>> Here, we could support several alternatives:
>>   - create bundles with classes extending RouteBuilder (to build route
>> definitions and declared as Osgi component) to deploy new routes
>> declared in Java DSL
>> -  deploy routes in XML format, putting a file in an specific directory
>> (Camel Spring XML format)
>> - deploy routes in XML format enabling a REST endpoint receiving XML
>> route definitions.
>>
>>
>>         I would suggest to provide such a RESTful service as part of the
>> the
>>         Felix Webconsole. This would also allow to provide a simple UI
>>         as tab
>>         of the Felix WebConsole (similar to the tab of the
>>         DataFileProvider).
>>
>>
>> This option could be a good to have, but I should do some researches on
>> how to extend Felix WebConsole, so I think this is not a priority right
>> now.
>>
>>
>> By the way, which version do you recommend me to use in order to
>> implement the project, Stanbol 0.12 or 1.0 version?
>>
>> Best regards
>>
>>
>>
>


Re: Camel integration (was : Re: Community bonding period started)

Posted by Florent André <fl...@apache.org>.
Antonio,

About the version: Rupert can correct me, but it seems that 0.12 and 
1.0 have few differences and the upcoming 1.0 release will not change that.

So I think it's better to start on 1.0 and port it to 0.12 afterwards if 
needed.

Side question:
As Antonio is a committer, does he commit his code directly on a branch or in 
a side GitHub repository?

++

On 23/05/2014 14:49, Antonio David Perez Morales wrote:
> Hi Rupert and Florent
>
> Of course Florent, I will create the needed issues for the tasks. This
> week I have been studying in depth the code of the Cameltrial PoC,
> reading and playing a lot with Camel.
>
> Please find my response in lines.
>
>
>     Such routes would have some restrictions: (a)
>
>         start with a request,
>
>
>     They not directly answer to a "direct request" but when something is
>     send to the email address (or put in a directory), the full
>     Enhancement Route is launched.
>
>
> Camel supports triggering a route based on an endpoint (like direct,
> http or whatever) or when some event occurs in other component, like a
> document added to an ActiveMQ queue, a mail sent to a server, etc.
> So we can support both, the request-triggered method and another
> combination (leveraging the power of Camel components).
>
> For the midterm, I had thought to improve the Florent's code to support
> configuring route endpoints, and the engines used in each route. This
> task would act like the current Enhancement Chains but using Camel
> framework.
> For the second part of the project (which has more time than the first
> one) we could add new things like apply real integration patterns inside
> routes to do parallel processing of engines, etc.
>
>
>     (b) end with a response,
>
>     Depending on you camel output, "end with a response" is not exactly
>     true in an "classical resquest/reponse http thought"...
>
>     I mean that the response of a "route" can be a mail sended or an rdf
>     serialization write to an ftp...
>
>
> For the time being, we can not consider this feature, but we could add
> it later if necessary to support something more than the classic
> request/response flow.
>
>             === 5) defining and implementing easy routing definition ===
>
>             In my first version of code, adding a new route require to
>             build a bundle
>             and add it to Stanbol.
>             The structure, and the code of this bundle is pretty simple
>             and allow to
>             code you route with java DSL (with one I pretty like), but
>             maybe lack a
>             little bit of flexibility and user friendliness.
>
>
> Here, we could support several alternatives:
>   - create bundles with classes extending RouteBuilder (to build route
> definitions and declared as Osgi component) to deploy new routes
> declared in Java DSL
> -  deploy routes in XML format, putting a file in an specific directory
> (Camel Spring XML format)
> - deploy routes in XML format enabling a REST endpoint receiving XML
> route definitions.
>
>
>         I would suggest to provide such a RESTful service as part of the the
>         Felix Webconsole. This would also allow to provide a simple UI
>         as tab
>         of the Felix WebConsole (similar to the tab of the
>         DataFileProvider).
>
>
> This option could be a good to have, but I should do some researches on
> how to extend Felix WebConsole, so I think this is not a priority right
> now.
>
>
> By the way, which version do you recommend me to use in order to
> implement the project, Stanbol 0.12 or 1.0 version?
>
> Best regards
>
>

Re: Camel integration (was : Re: Community bonding period started)

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi Rupert and Florent

Of course, Florent, I will create the needed issues for the tasks. This week
I have been studying the code of the Cameltrial PoC in depth, and reading
about and playing with Camel a lot.

Please find my responses inline.


>> Such routes would have some restrictions: (a)
>> start with a request,
>
> They not directly answer to a "direct request" but when something is send
> to the email address (or put in a directory), the full Enhancement Route is
> launched.
>
>

Camel supports triggering a route from a request endpoint (such as direct or
http) or when some event occurs in another component, such as a document
added to an ActiveMQ queue or a mail sent to a server.
So we can support both the request-triggered method and other kinds of
triggers (leveraging the power of Camel components).

For the midterm, I had thought of improving Florent's code to support
configuring route endpoints and the engines used in each route. This task
would act like the current Enhancement Chains but using the Camel framework.
For the second part of the project (which has more time than the first one)
we could add new things, such as applying real integration patterns inside
routes to run engines in parallel.
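
To make the "parallel processing of engines" idea a bit more concrete, here
is a minimal plain-Java sketch. It deliberately uses neither the Camel nor
the Stanbol API: the class name ParallelEnginesSketch and the modelling of an
"engine" as a Function from text to a result string are assumptions made only
for illustration. It fans the same content out to every engine concurrently
and gathers the results in engine order, which is roughly what a
multicast-then-aggregate pattern would do inside a route.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical sketch: an "engine" is modelled as a plain function, not a Stanbol EnhancementEngine. */
final class ParallelEnginesSketch {

    /** Runs every engine on the same content concurrently and returns the results in engine order. */
    static List<String> enhance(String content, List<Function<String, String>> engines) {
        // Start all engines first so that they really run side by side.
        List<CompletableFuture<String>> pending = engines.stream()
                .map(engine -> CompletableFuture.supplyAsync(() -> engine.apply(content)))
                .collect(Collectors.toList());
        // Then wait for each result; the order of the engine list is preserved.
        return pending.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Two toy engines, invented for this example.
        Function<String, String> language = text -> "language=" + (text.contains(" the ") ? "en" : "unknown");
        Function<String, String> length = text -> "length=" + text.length();
        System.out.println(enhance("Paris is the capital of France", List.of(language, length)));
    }
}
```

In a real route the aggregation step would have to merge enhancement
metadata rather than strings, so this only shows the control flow.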

>
> (b) end with a response,
>
> Depending on you camel output, "end with a response" is not exactly true
> in an "classical resquest/reponse http thought"...
>
> I mean that the response of a "route" can be a mail sended or an rdf
> serialization write to an ftp...
>
>
For the time being, we can leave this feature out, but we could add it
later if necessary to support something more than the classic
request/response flow.

>>> === 5) defining and implementing easy routing definition ===
>>>
>>> In my first version of code, adding a new route require to build a bundle
>>> and add it to Stanbol.
>>> The structure, and the code of this bundle is pretty simple and allow to
>>> code you route with java DSL (with one I pretty like), but maybe lack a
>>> little bit of flexibility and user friendliness.
>>>
>>
Here, we could support several alternatives:
- create bundles with classes extending RouteBuilder (which build the route
  definitions and are declared as OSGi components) to deploy new routes
  declared in the Java DSL
- deploy routes in XML format by putting a file in a specific directory
  (Camel Spring XML format)
- deploy routes in XML format through a REST endpoint that receives XML route
  definitions.
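
As a rough illustration of the second alternative (XML files dropped into a
folder), the sketch below scans a directory for *.xml files and lists the
routes it finds, using only the JDK. The class name RouteFolderScanSketch,
the sample route ids and the sample endpoint URIs are made up for the
example. The namespace is the Camel Spring one. A real implementation would
hand the parsed definitions to a CamelContext instead of just collecting ids,
and would watch the folder for changes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical sketch: only discovers route definitions, it does not start them. */
final class RouteFolderScanSketch {

    static final String CAMEL_SPRING_NS = "http://camel.apache.org/schema/spring";

    /** Returns route id -> source endpoint URI for every route element in the *.xml files of dir. */
    static Map<String, String> scan(Path dir) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Map<String, String> routes = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.xml")) {
            for (Path file : files) {
                Document doc = factory.newDocumentBuilder().parse(file.toFile());
                NodeList found = doc.getElementsByTagNameNS(CAMEL_SPRING_NS, "route");
                for (int i = 0; i < found.getLength(); i++) {
                    Element route = (Element) found.item(i);
                    NodeList from = route.getElementsByTagNameNS(CAMEL_SPRING_NS, "from");
                    if (from.getLength() > 0) {
                        // The first "from" element is the source endpoint of the route.
                        routes.put(route.getAttribute("id"), ((Element) from.item(0)).getAttribute("uri"));
                    }
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not read route files from " + dir, e);
        }
        return routes;
    }

    /** Creates a temporary folder holding one sample route file (sample content is invented). */
    static Path sampleFolder() {
        String xml = "<routes xmlns=\"" + CAMEL_SPRING_NS + "\">"
                + "<route id=\"on-request\"><from uri=\"direct:enhance\"/><to uri=\"log:result\"/></route>"
                + "<route id=\"on-file\"><from uri=\"file:inbox\"/><to uri=\"log:result\"/></route>"
                + "</routes>";
        try {
            Path dir = Files.createTempDirectory("routes");
            Files.write(dir.resolve("sample-routes.xml"), xml.getBytes(StandardCharsets.UTF_8));
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(scan(sampleFolder()));
    }
}
```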

>
>  I would suggest to provide such a RESTful service as part of the the
>> Felix Webconsole. This would also allow to provide a simple UI as tab
>> of the Felix WebConsole (similar to the tab of the DataFileProvider).
>>
>
This option would be nice to have, but I would need to do some research on how
to extend the Felix WebConsole, so I think this is not a priority right now.


By the way, which version do you recommend I use to implement the project,
Stanbol 0.12 or 1.0?

Best regards


Re: Camel integration (was : Re: Community bonding period started)

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Florent, Antonio

Just one clarification ...

On Fri, May 23, 2014 at 11:59 AM, florent andré
<fl...@4sengines.com> wrote:
>
> comments on your restrictions :
>
>
> Such routes would have some restrictions: (a)
>>
>> start with a request,
>
>
> dou you mean a "user" (= curl like) request ?
> I don't think it as to be a "restriction" as some Camel inputs components
> are sort of "pool".
> For example "mail endpoint", "file endpoint", "ftp endpoint", etc...
>
> They not directly answer to a "direct request" but when something is send to
> the email address (or put in a directory), the full Enhancement Route is
> launched.
>
>
> (b) end with a response,
>
> Depending on you camel output, "end with a response" is not exactly true in
> an "classical resquest/reponse http thought"...
>
> I mean that the response of a "route" can be a mail sended or an rdf
> serialization write to an ftp...
>
>
> (c) run synchronously.
>
> Camel provide some tool for asynch.. but surely restrict to a synchronous
> processing is better on a first step (as maybe the previous restrictions)...
>
>

I was thinking about the possibility of mapping "routes", like chains,
to the Enhancer RESTful interface

    http://localhost:8080/enhancer/route/{route-name}

To do that, routes would need to use the request as source and provide
the results as response.

In contrast, routes that are triggered by copying a file into a
directory or sending a mail to a special e-mail address would not be
mapped to the RESTful interface.
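
A minimal plain-Java sketch of that mapping rule, assuming a route is known
by its name and the URI of its source endpoint. The class
RouteRestMappingSketch and the choice of "direct" and "http" as the
request-style schemes are assumptions for illustration, not existing Stanbol
or Camel API:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch: decides which routes get a path under /enhancer/route/. */
final class RouteRestMappingSketch {

    /** Source endpoint schemes treated as request/response style (an assumption of this sketch). */
    private static final Set<String> REQUEST_SCHEMES = Set.of("direct", "http");

    /** Route name -> URI of the endpoint the route starts from. */
    private final Map<String, String> sourceByRoute = new LinkedHashMap<>();

    void register(String routeName, String sourceUri) {
        sourceByRoute.put(routeName, sourceUri);
    }

    /** Returns the REST path of a route, or empty if it is unknown or triggered by a file, mail, etc. */
    Optional<String> restPath(String routeName) {
        String sourceUri = sourceByRoute.get(routeName);
        if (sourceUri == null) {
            return Optional.empty();
        }
        int colon = sourceUri.indexOf(':');
        String scheme = colon < 0 ? sourceUri : sourceUri.substring(0, colon);
        return REQUEST_SCHEMES.contains(scheme)
                ? Optional.of("/enhancer/route/" + routeName)
                : Optional.empty();
    }

    public static void main(String[] args) {
        RouteRestMappingSketch mapping = new RouteRestMappingSketch();
        mapping.register("default", "direct:default"); // request-triggered, gets a REST path
        mapping.register("inbox", "file:inbox");       // file-triggered, stays unmapped
        System.out.println(mapping.restPath("default"));
        System.out.println(mapping.restPath("inbox"));
    }
}
```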

best
Rupert


-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/

Re: Camel integration (was : Re: Community bonding period started)

Posted by florent andré <fl...@4sengines.com>.
Hi,

Thanks for your comments, Rupert.

@Antonio, can you add yours and turn this into issues?

Mine are below:

On 14/05/2014 10:23, Rupert Westenthaler wrote:
> Hi Florent, Antonio, all
>
> Here are my throughs
>
> On Wed, May 7, 2014 at 2:07 PM, florent andré
> <fl...@4sengines.com> wrote:
>> Hi Antonio !
>>
>> Thanks for this first description.
>>
>> As first big overview of a work plan, I will say :
>>
>> === 1) find a name for this feature ===
>>
>> As a first step I think we have to find a clear and stable name for this new
>> feature (as we used many until now) :
>> 1) enhancements route
>> 2) Enhancement Workflows
>> 3) flow graphs :
>> 4) ...others...
>>
>> To my opinion :
>> 1) that use terms of both project Stanbol and Camel, and that's good
>> 2) Is there enough clear difference with chains that already exists ? And
>> this name don't show the ability to put and publish from other source IMO
>> 3) show well the configuration opportunities but not the input/output
>> possibilities
>> 4)
>> * Entreprise Semantic Integration (referring to Entreprise Integration
>> Patern)
>> * Semantic Bus ("Bus" frequently used in camel)
>>
>
> I like "(1) Enhancements Route" as name for the feature.
>
> I would like to have the ability to use Enhancements Routes similar to
> Enhancement Chains.

Comments on your restrictions:

> Such routes would have some restrictions: (a)
> start with a request,

Do you mean a "user" (= curl-like) request?
I don't think it has to be a "restriction", as some Camel input 
components are of the polling kind.
For example "mail endpoint", "file endpoint", "ftp endpoint", etc.

They do not directly answer a "direct request", but when something is 
sent to the email address (or put in a directory), the full Enhancement 
Route is launched.

> (b) end with a response,

Depending on your Camel output, "end with a response" is not exactly true 
in the classical HTTP request/response sense...

I mean that the response of a "route" can be a mail sent or an RDF 
serialization written to an FTP server...


> (c) run synchronously.

Camel provides some tools for asynchronous processing, but restricting 
to synchronous processing is surely better as a first step (as, maybe, 
are the previous restrictions)...


> If we want we could call such routes "Enhancement Workflows". But
> before deciding this I would like to see some working demos.
>

+1 for a working demo!
I'm curious about the difference you make between "route" and 
"workflow"! :)


> [..]
>

[...]

>
> [..]
>> === 5) defining and implementing easy routing definition ===
>>
>> In my first version of code, adding a new route require to build a bundle
>> and add it to Stanbol.
>> The structure, and the code of this bundle is pretty simple and allow to
>> code you route with java DSL (with one I pretty like), but maybe lack a
>> little bit of flexibility and user friendliness.
>>
>
> Creating/Installing bundles is not a big hurdle. If we provide a good
> Maven Archetype so that users just need to code the route it should
> also quite user friendly.
>

+1 for maven archetype.

> Having a Maven project also has the advantage that its more simple to
> manage additional dependencies (not knowing how likely such would be).
>

Sure, we could provide a "bundle list" with all the Camel OSGi components, 
but that could be heavy...

On the other hand, I remember the Camel components as bundles that are 
easy to add...

>> Have to keep this opportunity to use java DSL imo, but adding a more
>> "dynamic" way could be really cool throw a REST endpoint at a first step.
>>
>
> If it is possible to dynamically load/compile and install routes (e.g.
> from a file or even a configuration or request parameter) we should
> definitely provide such a possibility.
>

Yes, doable.
In the form of an XML DSL file or a Groovy script (at least).

> I would suggest to provide such a RESTful service as part of the the
> Felix Webconsole. This would also allow to provide a simple UI as tab
> of the Felix WebConsole (similar to the tab of the DataFileProvider).
>

+1
as the rest of this thread.

++



> [..]
>> === 7) Thinking of improvements ===
>>
>> Camel offer great opportunities in terms of asynchronous processing, message
>> splitting and merging, parallel processing and dynamic (on conditionals)
>> processing.
>>
>
> The commons.job module already provide means for asynchronous RESTful
> services. The Enhancement Task API extension as suggested by David [1]
> is also related to this
>
>> How can this features can be implemented in ? Needs stanbol's api
>> modifications ?
>>
>> For another example of idea, why not have a route:// (or graph://) protocols
>> that allow to use already previously defined routes into a new route ?
>>
>> This task is more a "daemon" task where you will add ideas and solutions
>> during the implementations of others.
>>
>
> For API modification you should have a look at the suggested API
> changes for the Enhancer 2.0 API (STANBOL-1326 [2]). This is only a
> first proposal and the ideal place to collect ideas/suggestions/...
>
>
> best
> Rupert
>
> [1] http://markmail.org/message/u5meqclsdrq6nx6e
> [2] https://issues.apache.org/jira/browse/
>
>
>> ++
>>
>>
>> On 05/05/2014 10:24, Antonio David Perez Morales wrote:
>>>
>>> Hi Rupert, Florent and all
>>>
>>> My accepted project is "Enhancement Workflows. Enterprise Integration
>>> Patterns in Apache Stanbol", based on the Jira issue [1]. Stanbol provides
>>> a set of components for Semantic Content Management. One of the components
>>> is the Enhancer, which can be used to extract features from content. The
>>> Enhancer is organized using Enhancements Chains, which defines how the
>>> content will be processed but they don't allow to integrate the current
>>> process with the business layer. The goal of the project is to bring EIP
>>> to
>>> Stanbol for easing the integration of the Enhancement workflows within the
>>> business layer of enterprise systems. In order to achieve this, Apache
>>> Camel framework is intended to be used as EIP pprovider.
>>>
>>> About my person, I hold a graduate degree in Computer Science Engineering
>>> from the University of Seville and I am currently finishing a Master in
>>> Software Engineering and Technology at that institution. I consider myself
>>> hardworking, problem-solving, quick-learning an open source lover mainly
>>> interested in all related with new technologies either web, mobile or
>>> desktop. I love learning new things and facing new challenges every day. I
>>> have coded for a long time with Java, PHP and Javascript. I use them on my
>>> daily work. I can write clean and structured code following code rules and
>>> applying well-known design patterns to improve the quality and maintenance
>>> of the code. Last year, I have been working as Senior Software Engineer at
>>> the R&D division of Zaizi, an open source consultant specialized in
>>> Content
>>> and Enterprise Content Management Systems. Apache Stanbol is one of the
>>> main components in our current technical stack; therefore, I have been
>>> widely working with it in the last months, both making integrations with
>>> different enterprise systems like ECMs and directly contributing to the
>>> project. As a result of this effort, I have been confirmed as committer of
>>> the project since January 2014.
>>>
>>>
>>> Regarding the project, I have been taking a look at Florent code about the
>>> first approach to integrate Camel into Stanbol. Moreover I have already
>>> started to read more and play with Camel (and Camel Spring) to refresh and
>>> familiarize with it (because I worked with Camel several years ago). As a
>>> first example (which is one of the tasks I want to do in the integration)
>>> I
>>> have been able to deploy in a local folder some files with example Camel
>>> routes defined in XML (camel-spring) and these routes are automatically
>>> loaded by the example application I have deployed. This way, we can
>>> achieve
>>> something similar to the indexing tool, where the indexing result files
>>> are
>>> put in a directory inside Stanbol and automatically the new Entityhub is
>>> generated from those files.
>>>
>>> I have also read the mail Florent pointed out in a previous mail about the
>>> potential Camel protocols (components) which can be developed to map
>>> Chains, Engines and Stores but I would prefer to talk with Florent first
>>> to
>>> decide the tasks to be done and the order of them, because I know the
>>> proposal is very ambitious but achievable.
>>>
>>> So, as first steps (and while waiting to talk with Florent through IRC
>>> channel or whatever) I will continue playing with Camel and I will review
>>> again the current Florent code to have a clearer idea on how to improve
>>> this code in order to be integrated as a first version of the Enhancement
>>> Workflows.
>>>
>>> Please, comments are more than welcome.
>>>
>>> Regards
>>>
>>> -------------------------
>>>
>>> [1] https://issues.apache.org/jira/browse/STANBOL-1008
>>>
>>>
>>> On Mon, May 5, 2014 at 10:01 AM, Rupert Westenthaler <
>>> rupert.westenthaler@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Thx florent for the reminder. I would like to ask all 4 Students to
>>>>
>>>> 1. write a mail on this list with a short summary of the GSoC project
>>>> (project summary + link to the stanbol issue, some info about the
>>>> student, first steps). IMO this is important as the Proposals itself
>>>> are not fully public available.
>>>> 2. to join the #stanbol IRC list on freenode.org (also mentors are
>>>> welcome to join ^^). Having the people around on IRC really helps to
>>>> answer simple questions fast.
>>>>
>>>> and welcome to GSoC 2014!
>>>>
>>>> best
>>>> Rupert
>>>>
>>>> On Thu, May 1, 2014 at 1:05 PM, florent andré
>>>> <fl...@4sengines.com> wrote:
>>>>>
>>>>> Hi there !
>>>>>
>>>>> As you may notice Gsoc community bonding period has begin for some time
>>>>
>>>> now.
>>>>>
>>>>>
>>>>> Speaking for Camel/Stanbol integration [1], the good proposal from
>>>>
>>>> Antonio
>>>>>
>>>>> was accepted ! Congrats !
>>>>> So Antonio, now bonding have to start! :)
>>>>>
>>>>>   From my point of view, a good way to bond the community to this
>>>>
>>>> integration
>>>>>
>>>>> could be to create sub-issues to the "can be considered as the main one"
>>>>> STANBOL-1008. So we can see more specific actions you will take and
>>>>
>>>> discuss
>>>>>
>>>>> specific parts in the related issue, and get a global overview when
>>>>
>>>> looking
>>>>>
>>>>> at the parent issue.
>>>>>
>>>>> Antonio what do you think ? Can you do that ?
>>>>>
>>>>> As a side point, I remembered this morning this mail [2] exchange that
>>>>
>>>> can
>>>>>
>>>>> give you pointer or idea for an "easy to set up throw REST" Camel's
>>>>
>>>> routes /
>>>>>
>>>>> flowchart.
>>>>>
>>>>> Happy bonding !
>>>>> ++
>>>>>
>>>>>
>>>>> [1] be warned, don't know if any-one can access it :
>>>>>
>>>>
>>>> https://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/adperezmorales3/5629499534213120
>>>>>
>>>>>
>>>>> [2]
>>>>>
>>>>
>>>> http://mail-archives.apache.org/mod_mbox/incubator-stanbol-dev/201206.mbox/%3C4FDFC494.3090309@4sengines.com%3E
>>>>
>>>>
>>>>
>>>> --
>>>> | Rupert Westenthaler             rupert.westenthaler@gmail.com
>>>> | Bodenlehenstraße 11                              ++43-699-11108907
>>>> | A-5500 Bischofshofen
>>>> |
>>>> REDLINK.CO..........................................................................
>>>> | http://redlink.co/
>>>>
>>>
>>
>
>
>

Re: Camel integration (was : Re: Community bonding period started)

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Florent, Antonio, all

Here are my thoughts

On Wed, May 7, 2014 at 2:07 PM, florent andré
<fl...@4sengines.com> wrote:
> Hi Antonio !
>
> Thanks for this first description.
>
> As first big overview of a work plan, I will say :
>
> === 1) find a name for this feature ===
>
> As a first step I think we have to find a clear and stable name for this new
> feature (as we used many until now) :
> 1) enhancements route
> 2) Enhancement Workflows
> 3) flow graphs :
> 4) ...others...
>
> To my opinion :
> 1) that use terms of both project Stanbol and Camel, and that's good
> 2) Is there enough clear difference with chains that already exists ? And
> this name don't show the ability to put and publish from other source IMO
> 3) show well the configuration opportunities but not the input/output
> possibilities
> 4)
> * Entreprise Semantic Integration (referring to Entreprise Integration
> Patern)
> * Semantic Bus ("Bus" frequently used in camel)
>

I like "(1) Enhancements Route" as name for the feature.

I would like to have the ability to use Enhancements Routes similar to
Enhancement Chains. Such routes would have some restrictions: (a)
start with a request, (b) end with a response, (c) run synchronously.
If we want we could call such routes "Enhancement Workflows". But
before deciding this I would like to see some working demos.

[..]

> === 3) Defining and Implementing protocols ===
>
> One of the big strength of Camel is the ability to define in house protocol.
> When working on the first version of the code, I really like the idea of
> getting a :
> engine://
> chain://
> store://
>
> protocols.
> With that we can use the Stanbol's capabilities throw camel without touch
> Stanbol api.

definitely!

[..]
> === 5) defining and implementing easy routing definition ===
>
> In my first version of code, adding a new route require to build a bundle
> and add it to Stanbol.
> The structure, and the code of this bundle is pretty simple and allow to
> code you route with java DSL (with one I pretty like), but maybe lack a
> little bit of flexibility and user friendliness.
>

Creating/installing bundles is not a big hurdle. If we provide a good
Maven Archetype so that users just need to code the route, it should
also be quite user friendly.

Having a Maven project also has the advantage that it's simpler to
manage additional dependencies (not knowing how likely those would be).

> Have to keep this opportunity to use java DSL imo, but adding a more
> "dynamic" way could be really cool throw a REST endpoint at a first step.
>

If it is possible to dynamically load/compile and install routes (e.g.
from a file or even a configuration or request parameter) we should
definitely provide such a possibility.

I would suggest providing such a RESTful service as part of the
Felix WebConsole. This would also allow us to provide a simple UI as a tab
of the Felix WebConsole (similar to the tab of the DataFileProvider).

[..]
> === 7) Thinking of improvements ===
>
> Camel offer great opportunities in terms of asynchronous processing, message
> splitting and merging, parallel processing and dynamic (on conditionals)
> processing.
>

The commons.job module already provides means for asynchronous RESTful
services. The Enhancement Task API extension as suggested by David [1]
is also related to this.

> How can this features can be implemented in ? Needs stanbol's api
> modifications ?
>
> For another example of idea, why not have a route:// (or graph://) protocols
> that allow to use already previously defined routes into a new route ?
>
> This task is more a "daemon" task where you will add ideas and solutions
> during the implementations of others.
>

For API modification you should have a look at the suggested API
changes for the Enhancer 2.0 API (STANBOL-1326 [2]). This is only a
first proposal and the ideal place to collect ideas/suggestions/...


best
Rupert

[1] http://markmail.org/message/u5meqclsdrq6nx6e
[2] https://issues.apache.org/jira/browse/


> ++
>
>
> On 05/05/2014 10:24, Antonio David Perez Morales wrote:
>>
>> Hi Rupert, Florent and all
>>
>> My accepted project is "Enhancement Workflows. Enterprise Integration
>> Patterns in Apache Stanbol", based on the Jira issue [1]. Stanbol provides
>> a set of components for Semantic Content Management. One of the components
>> is the Enhancer, which can be used to extract features from content. The
>> Enhancer is organized using Enhancements Chains, which defines how the
>> content will be processed but they don't allow to integrate the current
>> process with the business layer. The goal of the project is to bring EIP
>> to
>> Stanbol for easing the integration of the Enhancement workflows within the
>> business layer of enterprise systems. In order to achieve this, Apache
>> Camel framework is intended to be used as EIP pprovider.
>>
>> About my person, I hold a graduate degree in Computer Science Engineering
>> from the University of Seville and I am currently finishing a Master in
>> Software Engineering and Technology at that institution. I consider myself
>> hardworking, problem-solving, quick-learning an open source lover mainly
>> interested in all related with new technologies either web, mobile or
>> desktop. I love learning new things and facing new challenges every day. I
>> have coded for a long time with Java, PHP and Javascript. I use them on my
>> daily work. I can write clean and structured code following code rules and
>> applying well-known design patterns to improve the quality and maintenance
>> of the code. Last year, I have been working as Senior Software Engineer at
>> the R&D division of Zaizi, an open source consultant specialized in
>> Content
>> and Enterprise Content Management Systems. Apache Stanbol is one of the
>> main components in our current technical stack; therefore, I have been
>> widely working with it in the last months, both making integrations with
>> different enterprise systems like ECMs and directly contributing to the
>> project. As a result of this effort, I have been confirmed as committer of
>> the project since January 2014.
>>
>>
>> Regarding the project, I have been taking a look at Florent code about the
>> first approach to integrate Camel into Stanbol. Moreover I have already
>> started to read more and play with Camel (and Camel Spring) to refresh and
>> familiarize with it (because I worked with Camel several years ago). As a
>> first example (which is one of the tasks I want to do in the integration)
>> I
>> have been able to deploy in a local folder some files with example Camel
>> routes defined in XML (camel-spring) and these routes are automatically
>> loaded by the example application I have deployed. This way, we can
>> achieve
>> something similar to the indexing tool, where the indexing result files
>> are
>> put in a directory inside Stanbol and automatically the new Entityhub is
>> generated from those files.
>>
>> I have also read the mail Florent pointed out in a previous mail about the
>> potential Camel protocols (components) which can be developed to map
>> Chains, Engines and Stores but I would prefer to talk with Florent first
>> to
>> decide the tasks to be done and the order of them, because I know the
>> proposal is very ambitious but achievable.
>>
>> So, as first steps (and while waiting to talk with Florent through IRC
>> channel or whatever) I will continue playing with Camel and I will review
>> again the current Florent code to have a clearer idea on how to improve
>> this code in order to be integrated as a first version of the Enhancement
>> Workflows.
>>
>> Please, comments are more than welcome.
>>
>> Regards
>>
>> -------------------------
>>
>> [1] https://issues.apache.org/jira/browse/STANBOL-1008
>>
>>
>> On Mon, May 5, 2014 at 10:01 AM, Rupert Westenthaler <
>> rupert.westenthaler@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> Thx florent for the reminder. I would like to ask all 4 Students to
>>>
>>> 1. write a mail on this list with a short summary of the GSoC project
>>> (project summary + link to the stanbol issue, some info about the
>>> student, first steps). IMO this is important as the Proposals itself
>>> are not fully public available.
>>> 2. to join the #stanbol IRC list on freenode.org (also mentors are
>>> welcome to join ^^). Having the people around on IRC really helps to
>>> answer simple questions fast.
>>>
>>> and welcome to GSoC 2014!
>>>
>>> best
>>> Rupert
>>>
>>> On Thu, May 1, 2014 at 1:05 PM, florent andré
>>> <fl...@4sengines.com> wrote:
>>>>
>>>> Hi there !
>>>>
>>>> As you may notice Gsoc community bonding period has begin for some time
>>>> now.
>>>>
>>>> Speaking for Camel/Stanbol integration [1], the good proposal from
>>>> Antonio was accepted ! Congrats !
>>>> So Antonio, now bonding have to start! :)
>>>>
>>>> From my point of view, a good way to bond the community to this
>>>> integration could be to create sub-issues to the "can be considered as
>>>> the main one" STANBOL-1008. So we can see more specific actions you will
>>>> take and discuss specific parts in the related issue, and get a global
>>>> overview when looking at the parent issue.
>>>>
>>>> Antonio what do you think ? Can you do that ?
>>>>
>>>> As a side point, I remembered this morning this mail [2] exchange that
>>>> can give you pointer or idea for an "easy to set up throw REST" Camel's
>>>> routes / flowchart.
>>>>
>>>> Happy bonding !
>>>> ++
>>>>
>>>>
>>>> [1] be warned, don't know if any-one can access it :
>>>> https://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/adperezmorales3/5629499534213120
>>>>
>>>> [2]
>>>> http://mail-archives.apache.org/mod_mbox/incubator-stanbol-dev/201206.mbox/%3C4FDFC494.3090309@4sengines.com%3E
>>>
>>>
>>>
>>> --
>>> | Rupert Westenthaler             rupert.westenthaler@gmail.com
>>> | Bodenlehenstraße 11                              ++43-699-11108907
>>> | A-5500 Bischofshofen
>>> | REDLINK.CO ..........................................................................
>>> | http://redlink.co/
>>>
>>
>



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/

Camel integration (was : Re: Community bonding period started)

Posted by florent andré <fl...@4sengines.com>.
Hi Antonio !

Thanks for this first description.

As a first big-picture overview of a work plan, I would say:

=== 1) find a name for this feature ===

As a first step, I think we have to find a clear and stable name for this
new feature (we have used many so far):
1) enhancements route
2) Enhancement Workflows
3) flow graphs
4) ...others...

In my opinion:
1) uses terms from both projects, Stanbol and Camel, and that's good
2) Is the difference from the chains that already exist clear enough?
Also, IMO this name doesn't show the ability to fetch from and publish
to other sources
3) shows the configuration opportunities well, but not the input/output
possibilities
4) other ideas:
* Enterprise Semantic Integration (referring to Enterprise Integration
Patterns)
* Semantic Bus ("Bus" is frequently used in Camel)

=== 2) defining a set of concrete use-cases ===

2 or 3 concrete use cases that exercise the input/output possibilities
and the enhancement flow configurations.
A rough template for these use case definitions:
Fetch from : local folder / rss / mail / ...
Enhance with engine 1
Depending on the result of this engine go to :
* Chain 1
* or to Chain 2 and 3 and merge results
Output the result to : email / ftp / ...
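
To make the middle of this template concrete, here is a minimal sketch in
plain Java of the "engine 1, then chain 1 or chains 2 and 3 merged" step.
It is not Camel's API: the class name, the method signature and the
string-based stand-ins for engines and chains are all assumptions made for
illustration, and the fetch and output steps are left out.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

final class UseCaseSketch {
    // Runs engine 1 first, then either chain 1 alone,
    // or chains 2 and 3 with their results merged (one result per line).
    static String route(String content,
                        Function<String, String> engine1,
                        Predicate<String> goToChain1,
                        Function<String, String> chain1,
                        Function<String, String> chain2,
                        Function<String, String> chain3) {
        String enhanced = engine1.apply(content);
        if (goToChain1.test(enhanced)) {
            return chain1.apply(enhanced);
        }
        return String.join("\n",
                List.of(chain2.apply(enhanced), chain3.apply(enhanced)));
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins: real engines and chains would work on
        // ContentItems and produce enhancements, not tagged strings.
        String out = route("Bonjour",
                c -> c + " [lang=fr]",
                c -> c.contains("[lang=en]"),
                c -> c + " [chain1]",
                c -> c + " [chain2]",
                c -> c + " [chain3]");
        System.out.println(out);
    }
}
```

In a real route the decision and the merge would presumably be expressed
with Camel's own routing constructs rather than hand-written code.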

=== 3) Defining and Implementing protocols ===

One of the big strengths of Camel is the ability to define in-house
protocols.
When working on the first version of the code, I really liked the idea of
getting:
engine://
chain://
store://

protocols.
With these we can use Stanbol's capabilities through Camel without
touching the Stanbol API.
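
As a rough illustration of what such endpoint URIs could look like, here
is a minimal sketch in plain Java that only parses the scheme and the
target name and dispatches on them. It does not use Camel or Stanbol
classes; the handler registry, the messages and the names "langdetect"
and "default" are hypothetical.

```java
import java.net.URI;
import java.util.Map;
import java.util.function.Function;

final class StanbolUriSketch {
    // Hypothetical registry: URI scheme -> handler receiving the target name.
    static final Map<String, Function<String, String>> HANDLERS = Map.of(
            "engine", name -> "run enhancement engine '" + name + "'",
            "chain", name -> "run enhancement chain '" + name + "'",
            "store", name -> "write to store '" + name + "'");

    // Resolves e.g. "engine://langdetect" to the action it stands for.
    static String dispatch(String endpoint) {
        URI uri = URI.create(endpoint);
        String scheme = uri.getScheme();
        Function<String, String> handler =
                scheme == null ? null : HANDLERS.get(scheme);
        if (handler == null) {
            throw new IllegalArgumentException("Unsupported endpoint: " + endpoint);
        }
        // The part after "//" is used as the engine, chain or store name.
        return handler.apply(uri.getAuthority());
    }

    public static void main(String[] args) {
        System.out.println(dispatch("engine://langdetect"));
        System.out.println(dispatch("chain://default"));
    }
}
```

In an actual integration each scheme would presumably be backed by a Camel
component that calls the corresponding Stanbol service.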

=== 4) Defining and implementing some DataFormat ===

With the pluggable DataFormat principle
(http://camel.apache.org/data-format.html), Camel allows automatic
transformation, transparent to the user, of the major data types to a
Java structure or to another data type, depending on the input requested
by the transformer.

For example, an input file can be used as an input stream in
transformer 1 (T1) and processed as a List<String> in T2...

Setting up some basic DataFormat transformers could be a great gain.
We can think of, for example:
text <-> content item
rdf <-> content item metadata
content item <-> html
...

I seem to remember setting up a really basic DataFormat during my first
integration experiment.

=== 5) defining and implementing easy routing definition ===

In my first version of the code, adding a new route requires building a
bundle and adding it to Stanbol.
The structure and the code of this bundle are pretty simple and let you
code your route with the Java DSL (which I quite like), but it maybe lacks
a little flexibility and user friendliness.

We have to keep the option of using the Java DSL imo, but adding a more
"dynamic" way would be really cool, through a REST endpoint as a first
step.

=== 6) Easy testing framework ===

Camel and Stanbol are both big, and even though both offer integration
testing facilities, using them together can be hard to learn.
It would be good to have a set of helper classes and/or good
documentation to set this up easily.

=== 7) Thinking of improvements ===

Camel offers great opportunities in terms of asynchronous processing,
message splitting and merging, parallel processing and dynamic
(conditional) processing.

How can these features be implemented in Stanbol? Do they need
modifications of Stanbol's API?

As another example of an idea, why not have a route:// (or graph://)
protocol that allows reusing previously defined routes in a new route?

This task is more of a "daemon" task, where you will add ideas and
solutions while implementing the others.


What do you think of this 10,000-foot plan?

++


On 05/05/2014 10:24, Antonio David Perez Morales wrote:
> Hi Rupert, Florent and all
>
> My accepted project is "Enhancement Workflows. Enterprise Integration
> Patterns in Apache Stanbol", based on the Jira issue [1]. Stanbol provides
> a set of components for Semantic Content Management. One of the components
> is the Enhancer, which can be used to extract features from content. The
> Enhancer is organized using Enhancements Chains, which defines how the
> content will be processed but they don't allow to integrate the current
> process with the business layer. The goal of the project is to bring EIP to
> Stanbol for easing the integration of the Enhancement workflows within the
> business layer of enterprise systems. In order to achieve this, Apache
> Camel framework is intended to be used as EIP provider.
>
> About my person, I hold a graduate degree in Computer Science Engineering
> from the University of Seville and I am currently finishing a Master in
> Software Engineering and Technology at that institution. I consider myself
> hardworking, problem-solving, quick-learning an open source lover mainly
> interested in all related with new technologies either web, mobile or
> desktop. I love learning new things and facing new challenges every day. I
> have coded for a long time with Java, PHP and Javascript. I use them on my
> daily work. I can write clean and structured code following code rules and
> applying well-known design patterns to improve the quality and maintenance
> of the code. Last year, I have been working as Senior Software Engineer at
> the R&D division of Zaizi, an open source consultant specialized in Content
> and Enterprise Content Management Systems. Apache Stanbol is one of the
> main components in our current technical stack; therefore, I have been
> widely working with it in the last months, both making integrations with
> different enterprise systems like ECMs and directly contributing to the
> project. As a result of this effort, I have been confirmed as committer of
> the project since January 2014.
>
>
> Regarding the project, I have been taking a look at Florent code about the
> first approach to integrate Camel into Stanbol. Moreover I have already
> started to read more and play with Camel (and Camel Spring) to refresh and
> familiarize with it (because I worked with Camel several years ago). As a
> first example (which is one of the tasks I want to do in the integration) I
> have been able to deploy in a local folder some files with example Camel
> routes defined in XML (camel-spring) and these routes are automatically
> loaded by the example application I have deployed. This way, we can achieve
> something similar to the indexing tool, where the indexing result files are
> put in a directory inside Stanbol and automatically the new Entityhub is
> generated from those files.
>
> I have also read the mail Florent pointed out in a previous mail about the
> potential Camel protocols (components) which can be developed to map
> Chains, Engines and Stores but I would prefer to talk with Florent first to
> decide the tasks to be done and the order of them, because I know the
> proposal is very ambitious but achievable.
>
> So, as first steps (and while waiting to talk with Florent through IRC
> channel or whatever) I will continue playing with Camel and I will review
> again the current Florent code to have a clearer idea on how to improve
> this code in order to be integrated as a first version of the Enhancement
> Workflows.
>
> Please, comments are more than welcome.
>
> Regards
>
> -------------------------
>
> [1] https://issues.apache.org/jira/browse/STANBOL-1008
>
>
> On Mon, May 5, 2014 at 10:01 AM, Rupert Westenthaler <
> rupert.westenthaler@gmail.com> wrote:
>
>> Hi all,
>>
>> Thx florent for the reminder. I would like to ask all 4 Students to
>>
>> 1. write a mail on this list with a short summary of the GSoC project
>> (project summary + link to the stanbol issue, some info about the
>> student, first steps). IMO this is important as the Proposals itself
>> are not fully public available.
>> 2. to join the #stanbol IRC list on freenode.org (also mentors are
>> welcome to join ^^). Having the people around on IRC really helps to
>> answer simple questions fast.
>>
>> and welcome to GSoC 2014!
>>
>> best
>> Rupert
>>
>> On Thu, May 1, 2014 at 1:05 PM, florent andré
>> <fl...@4sengines.com> wrote:
>>> Hi there !
>>>
>>> As you may notice Gsoc community bonding period has begin for some time
>>> now.
>>>
>>> Speaking for Camel/Stanbol integration [1], the good proposal from
>>> Antonio was accepted ! Congrats !
>>> So Antonio, now bonding have to start! :)
>>>
>>> From my point of view, a good way to bond the community to this
>>> integration could be to create sub-issues to the "can be considered as
>>> the main one" STANBOL-1008. So we can see more specific actions you will
>>> take and discuss specific parts in the related issue, and get a global
>>> overview when looking at the parent issue.
>>>
>>> Antonio what do you think ? Can you do that ?
>>>
>>> As a side point, I remembered this morning this mail [2] exchange that
>>> can give you pointer or idea for an "easy to set up throw REST" Camel's
>>> routes / flowchart.
>>>
>>> Happy bonding !
>>> ++
>>>
>>>
>>> [1] be warned, don't know if any-one can access it :
>>> https://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/adperezmorales3/5629499534213120
>>>
>>> [2]
>>> http://mail-archives.apache.org/mod_mbox/incubator-stanbol-dev/201206.mbox/%3C4FDFC494.3090309@4sengines.com%3E
>>
>>
>>
>> --
>> | Rupert Westenthaler             rupert.westenthaler@gmail.com
>> | Bodenlehenstraße 11                              ++43-699-11108907
>> | A-5500 Bischofshofen
>> | REDLINK.CO..........................................................................
>> | http://redlink.co/
>>
>

Re: Community bonding period started

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Suman,

IMO it is no problem that those libs do have an incompatible license.
This GSoC project is about Speech2Text and not about transcoding some
audio files. Having an EnhancementEngine that transcodes audio files
(and Audio Tracks of Video Files) to a format that can be processed by
Sphinx is nice to have but not a requirement.

You can still implement two engines

1. Audio Transcoding Engine
2. Speech2Text Engine

(1) would stay in your repository. You would make a release and publish
it on Maven Central.
(2) would be contributed to Apache Stanbol and provide the Speech2Text
functionality.

LGPL is fine for nearly every commercial use case. So users just need
to add the bundles required for (1) to their launcher to
use the functionality of (2) with any audio/video format. The only
thing we cannot do is provide a launcher that already includes
(1) in Stanbol.

best
Rupert

On Thu, May 15, 2014 at 11:13 AM, Suman Saurabh
<ss...@gmail.com> wrote:
> Hi Rupert,
>
> On Wed, May 14, 2014 at 5:11 PM, Rupert Westenthaler <
> rupert.westenthaler@gmail.com> wrote:
>
>> Please Note that Xuggler is LGPL. That means that both the bundle and
>> the converter engine you will build on top will not be able to be
>> contributed to Apache Stanbol (as LGPL is not compatible with the
>> Apache License). This will not affect the Speech to text Enhancement
>> Engine based on Sphinx.
>>
>> In praxis this will mean that users will need to download and install
>> the Xuggler based engine themselves for being able to use other audio
>> formats than the one directly supported by Sphinx
>
>
> I want to know if Xuggler has different license, hence cannot be
> contributed to Apache Stanbol why not
> to use FFmpeg [1]. It is easy to use and is available for all the three
> favourite platform just by the simple
> command : ffmpeg -i input_file -acodec pcm_s16le -ar 16000 -ac 1 output.wav
>
> I have provided the detailed description in the link.
>
> Regards,
> Suman Saurabh
>
> [1] https://sites.google.com/site/gsoc2014stanbol/updates/update1-xuggler



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/

Re: Community bonding period started

Posted by Suman Saurabh <ss...@gmail.com>.
Hi Rupert,

On Wed, May 14, 2014 at 5:11 PM, Rupert Westenthaler <
rupert.westenthaler@gmail.com> wrote:

> Please Note that Xuggler is LGPL. That means that both the bundle and
> the converter engine you will build on top will not be able to be
> contributed to Apache Stanbol (as LGPL is not compatible with the
> Apache License). This will not affect the Speech to text Enhancement
> Engine based on Sphinx.
>
> In praxis this will mean that users will need to download and install
> the Xuggler based engine themselves for being able to use other audio
> formats than the one directly supported by Sphinx


If Xuggler has a different license and hence cannot be contributed to
Apache Stanbol, why not use FFmpeg [1]? It is easy to use and is
available on all three popular platforms; the conversion needs just
one simple command:
ffmpeg -i input_file -acodec pcm_s16le -ar 16000 -ac 1 output.wav

I have provided the detailed description in the link.
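
For calling this from Java, a minimal sketch could look like the
following. It only assembles the same arguments as the command above and
hands them to ProcessBuilder; the class and method names are hypothetical,
and it assumes an ffmpeg executable is available on the PATH.

```java
import java.util.List;

final class FfmpegCommand {
    // Same arguments as the command above:
    // 16-bit little-endian PCM, 16 kHz sample rate, one channel (mono).
    static List<String> build(String input, String output) {
        return List.of("ffmpeg", "-i", input,
                "-acodec", "pcm_s16le", "-ar", "16000", "-ac", "1", output);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            // Without arguments, only print the command that would be run.
            System.out.println(String.join(" ", build("input_file", "output.wav")));
            return;
        }
        Process process = new ProcessBuilder(build(args[0], args[1]))
                .inheritIO()
                .start();
        System.exit(process.waitFor());
    }
}
```

Passing the arguments as a list avoids shell quoting problems with file
names that contain spaces.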

Regards,
Suman Saurabh

[1] https://sites.google.com/site/gsoc2014stanbol/updates/update1-xuggler

Re: Community bonding period started

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Saurabh,

I created a pull request [1] including the bundle.

Please note that Xuggler is LGPL. That means that neither the bundle nor
the converter engine you will build on top of it can be
contributed to Apache Stanbol (as LGPL is not compatible with the
Apache License). This will not affect the Speech to text Enhancement
Engine based on Sphinx.

In practice this means that users will need to download and install
the Xuggler-based engine themselves to be able to use audio
formats other than the ones directly supported by Sphinx.

best
Rupert

[1] https://github.com/sumansaurabh/stanbol-1007/pull/1


On Tue, May 13, 2014 at 7:27 PM, Suman Saurabh
<ss...@gmail.com> wrote:
> Hi Rupert,
>
> On Wed, May 7, 2014 at 12:33 PM, Rupert Westenthaler <
> rupert.westenthaler@gmail.com> wrote:
>
>> 2 Month ago I creates an OSGI bundle for for Xuggler 5.4.  As Xuggler
>> uses native libraries (via JNI) those need to be correctly registered
>> in the Bundle meta data. Up to now I have just tested this on mac. So
>> the bundle might still have issues on other systems. However It might
>> still save you a lot of time especially if you are not so experience
>>
>> As soon as you do have a code repository for your GSoC projects I can
>> contribute this.
>>
>
> I have made the repository[1] , please contribute the OSGI bundle for
> Xuggler.
> Thank you for your support.
>
> I was out of the town, so I could not reply early.
>
>
> Regards,
> Suman Saurabh
>
> [1] https://github.com/sumansaurabh/stanbol-1007



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/

Re: Community bonding period started

Posted by Suman Saurabh <ss...@gmail.com>.
Hi Rupert,

On Wed, May 7, 2014 at 12:33 PM, Rupert Westenthaler <
rupert.westenthaler@gmail.com> wrote:

> 2 Month ago I creates an OSGI bundle for for Xuggler 5.4.  As Xuggler
> uses native libraries (via JNI) those need to be correctly registered
> in the Bundle meta data. Up to now I have just tested this on mac. So
> the bundle might still have issues on other systems. However It might
> still save you a lot of time especially if you are not so experience
>
> As soon as you do have a code repository for your GSoC projects I can
> contribute this.
>

I have made the repository[1] , please contribute the OSGI bundle for
Xuggler.
Thank you for your support.

I was out of the town, so I could not reply early.


Regards,
Suman Saurabh

[1] https://github.com/sumansaurabh/stanbol-1007

Re: Community bonding period started

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Suman

On Tue, May 6, 2014 at 4:15 AM, Suman Saurabh
<ss...@gmail.com> wrote:
> As mentioned in my proposal, my project mainly involves 3 parts:
> 1) Developing module to extract sound from audio/video data in following
> format: 16 kHz, 16 bit, mono, little-endian using Xuggler libraries.

Two months ago I created an OSGi bundle for Xuggler 5.4. As Xuggler
uses native libraries (via JNI), those need to be correctly registered
in the bundle metadata. Up to now I have only tested this on Mac, so
the bundle might still have issues on other systems. However, it might
still save you a lot of time, especially if you are not so experienced
with OSGi.

As soon as you do have a code repository for your GSoC projects I can
contribute this.

best
Rupert

-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/

Re: Community bonding period started

Posted by chalitha udara Perera <ch...@gmail.com>.
Hi All,
My accepted proposal is "Integrate YAGO and AIDA NED with Apache Stanbol",
which addresses the issue STANBOL-1295 [1]. YAGO is a semantic knowledge
base similar to DBpedia and Freebase, but it provides much cleaner thematic
domains and a useful representation of spatial, temporal and context
information about entities. As the initial part of this project, YAGO will
be integrated as a referenced site in Stanbol.

AIDA is a framework developed for entity disambiguation which uses YAGO as
the knowledge base for disambiguation. Even though my proposal aimed at
integrating AIDA, its conflicting licence makes it impossible to
integrate it. Therefore, after the initial discussion with my assigned
mentor Rafa, this project will instead address a pending task related to
disambiguation, Jira issue STANBOL-1183 [2], that is, developing a
disambiguation API for Stanbol.

I have started looking into the Stanbol indexing tool and its
configuration. I am also getting familiar with entity disambiguation in
Stanbol [3].

About myself: I am a final-year undergraduate student at the Department of
Computer Science and Engineering, University of Moratuwa. I recently
completed my internship as an R&D Engineering intern at Zaizi, an open
source consulting company in enterprise content management, during which I
had the opportunity to work with open source projects such as Stanbol and
Mahout.

Looking forward to a great summer of coding!

[1] https://issues.apache.org/jira/browse/STANBOL-1295
[2] https://issues.apache.org/jira/browse/STANBOL-1183
[3] https://issues.apache.org/jira/browse/STANBOL-1037

cheers,
Chalitha


On Tue, May 6, 2014 at 7:45 AM, Suman Saurabh
<ss...@gmail.com> wrote:

> Hi all,
>
> Thank you all for allowing me to introduce myself and my proposal.
>
> My proposal is building " Speech to text Enhancement Engine ", it was
> originally a jira issue [1] - Stanbol-1007 . TBD enhancement engine uses
> Sphinx library to convert the captured audio. Media (audio/video) data file
> is parsed with the ContentItem and formatted to proper audio format by
> Xuggler libraries. Audio speech is than extracted by Sphinx to 'plain/text'
> with the annotation of temporal position of the extracted text. Sphinx uses
> acoustic model and language model to map the utterances with the text, so
> the engine will also provide support of uploading acoustic model and
> language model.
>
> As mentioned in my proposal, my project mainly involves 3 parts:
> 1) Developing module to extract sound from audio/video data in following
> format: 16 kHz, 16 bit, mono, little-endian using Xuggler libraries.
> 2) Sphinx module to process the sound file to text with proper annotations
> using CMU Sphinx4 libraries.
> 3) Developing Enhancement engine and implement all these modules.
>
> Currently I am working on the 1st module in discussion with my mentor
> Andreas Kuckartz.
>
> Now about myself, I am pre-final year B. Tech student in Computer Science
> and Engineering at Laxmi Niwas Mittal Institute of Information Technology.
> I am into innovation, some of the projects which I was involved in the past
> is : My friend and I developed a very efficient method of head tracking
> using goggle to corresponding mouse pointer movement. It helps physically
> underprivileged person by hand to interact with computer efficiently [2].
> More of my details can be found in my linkedin profile[3].
>
> I have never been involved with an open source organization before, this
> will be very nice experience working with you all people.
>
> Happy Bonding.
>
> [1] https://issues.apache.org/jira/browse/STANBOL-1007
> [2]
>
> http://gstanwar.blogspot.in/2014/01/avatar-goggle-by-which-disabledhand-and.html
> [3] http://in.linkedin.com/in/ssumansaurabh
>
> Regards,
> Suman Saurabh
>
>
>
> On Mon, May 5, 2014 at 1:54 PM, Antonio David Perez Morales <
> aperez@zaizi.com> wrote:
>
> > Hi Rupert, Florent and all
> >
> > My accepted project is "Enhancement Workflows. Enterprise Integration
> > Patterns in Apache Stanbol", based on the Jira issue [1]. Stanbol
> > provides a set of components for Semantic Content Management. One of the
> > components is the Enhancer, which can be used to extract features from
> > content. The Enhancer is organized using Enhancements Chains, which
> > defines how the content will be processed but they don't allow to
> > integrate the current process with the business layer. The goal of the
> > project is to bring EIP to Stanbol for easing the integration of the
> > Enhancement workflows within the business layer of enterprise systems.
> > In order to achieve this, Apache Camel framework is intended to be used
> > as EIP provider.
> >
> > About my person, I hold a graduate degree in Computer Science Engineering
> > from the University of Seville and I am currently finishing a Master in
> > Software Engineering and Technology at that institution. I consider
> myself
> > hardworking, problem-solving, quick-learning an open source lover mainly
> > interested in all related with new technologies either web, mobile or
> > desktop. I love learning new things and facing new challenges every day.
> I
> > have coded for a long time with Java, PHP and Javascript. I use them on
> my
> > daily work. I can write clean and structured code following code rules
> and
> > applying well-known design patterns to improve the quality and
> maintenance
> > of the code. Last year, I have been working as Senior Software Engineer
> at
> > the R&D division of Zaizi, an open source consultant specialized in
> Content
> > and Enterprise Content Management Systems. Apache Stanbol is one of the
> > main components in our current technical stack; therefore, I have been
> > widely working with it in the last months, both making integrations with
> > different enterprise systems like ECMs and directly contributing to the
> > project. As a result of this effort, I have been confirmed as committer
> of
> > the project since January 2014.
> >
> >
> > Regarding the project, I have been taking a look at Florent code about
> the
> > first approach to integrate Camel into Stanbol. Moreover I have already
> > started to read more and play with Camel (and Camel Spring) to refresh
> and
> > familiarize with it (because I worked with Camel several years ago). As a
> > first example (which is one of the tasks I want to do in the
> integration) I
> > have been able to deploy in a local folder some files with example Camel
> > routes defined in XML (camel-spring) and these routes are automatically
> > loaded by the example application I have deployed. This way, we can
> achieve
> > something similar to the indexing tool, where the indexing result files
> are
> > put in a directory inside Stanbol and automatically the new Entityhub is
> > generated from those files.
> >
> > I have also read the mail Florent pointed out in a previous mail about
> the
> > potential Camel protocols (components) which can be developed to map
> > Chains, Engines and Stores but I would prefer to talk with Florent first
> to
> > decide the tasks to be done and the order of them, because I know the
> > proposal is very ambitious but achievable.
> >
> > So, as first steps (and while waiting to talk with Florent through IRC
> > channel or whatever) I will continue playing with Camel and I will review
> > again the current Florent code to have a clearer idea on how to improve
> > this code in order to be integrated as a first version of the Enhancement
> > Workflows.
> >
> > Please, comments are more than welcome.
> >
> > Regards
> >
> > -------------------------
> >
> > [1] https://issues.apache.org/jira/browse/STANBOL-1008
> >
> >
> > On Mon, May 5, 2014 at 10:01 AM, Rupert Westenthaler <
> > rupert.westenthaler@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > Thx florent for the reminder. I would like to ask all 4 Students to
> > >
> > > 1. write a mail on this list with a short summary of the GSoC project
> > > (project summary + link to the stanbol issue, some info about the
> > > student, first steps). IMO this is important as the Proposals itself
> > > are not fully public available.
> > > 2. to join the #stanbol IRC list on freenode.org (also mentors are
> > > welcome to join ^^). Having the people around on IRC really helps to
> > > answer simple questions fast.
> > >
> > > and welcome to GSoC 2014!
> > >
> > > best
> > > Rupert
> > >
> > > On Thu, May 1, 2014 at 1:05 PM, florent andré
> > > <fl...@4sengines.com> wrote:
> > > > > Hi there !
> > > > >
> > > > > As you may notice Gsoc community bonding period has begin for some
> > > > > time now.
> > > > >
> > > > > Speaking for Camel/Stanbol integration [1], the good proposal from
> > > > > Antonio was accepted ! Congrats !
> > > > > So Antonio, now bonding have to start! :)
> > > > >
> > > > > From my point of view, a good way to bond the community to this
> > > > > integration could be to create sub-issues to the "can be considered
> > > > > as the main one" STANBOL-1008. So we can see more specific actions
> > > > > you will take and discuss specific parts in the related issue, and
> > > > > get a global overview when looking at the parent issue.
> > > > >
> > > > > Antonio what do you think ? Can you do that ?
> > > > >
> > > > > As a side point, I remembered this morning this mail [2] exchange
> > > > > that can give you pointer or idea for an "easy to set up throw REST"
> > > > > Camel's routes / flowchart.
> > > > >
> > > > > Happy bonding !
> > > > > ++
> > > > >
> > > > >
> > > > > [1] be warned, don't know if any-one can access it :
> > > > > https://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/adperezmorales3/5629499534213120
> > > > >
> > > > > [2]
> > > > > http://mail-archives.apache.org/mod_mbox/incubator-stanbol-dev/201206.mbox/%3C4FDFC494.3090309@4sengines.com%3E
> > >
> > >
> > >
> > > --
> > > | Rupert Westenthaler             rupert.westenthaler@gmail.com
> > > | Bodenlehenstraße 11                              ++43-699-11108907
> > > | A-5500 Bischofshofen
> > > | REDLINK.CO ..........................................................................
> > > | http://redlink.co/
> > >
> >
> > --
> >
> > ------------------------------
> > This message should be regarded as confidential. If you have received
> this
> > email in error please notify the sender and destroy it immediately.
> > Statements of intent shall only become binding when confirmed in hard
> copy
> > by an authorised signatory.
> >
> > Zaizi Ltd is registered in England and Wales with the registration number
> > 6440931. The Registered Office is Brook House, 229 Shepherds Bush Road,
> > London W6 7AN.
> >
>



-- 
J.M Chalitha Udara Perera

*Department of Computer Science and Engineering,*
*University of Moratuwa,*
*Sri Lanka*

Re: Community bonding period started

Posted by Suman Saurabh <ss...@gmail.com>.
Hi all,

Thank you all for allowing me to introduce myself and my proposal.

My proposal is to build a "Speech to Text Enhancement Engine"; it was
originally a Jira issue, STANBOL-1007 [1]. The enhancement engine to be
developed uses the Sphinx library to convert the captured audio. The media
(audio/video) data file is parsed from the ContentItem and converted to the
proper audio format by the Xuggler libraries. The speech is then extracted
by Sphinx as 'text/plain', annotated with the temporal position of the
extracted text. Sphinx uses an acoustic model and a language model to map
the utterances to text, so the engine will also support uploading acoustic
and language models.

As mentioned in my proposal, my project mainly involves 3 parts:
1) Developing a module that extracts the sound from audio/video data in the
following format: 16 kHz, 16 bit, mono, little-endian, using the Xuggler
libraries.
2) Developing a Sphinx module that converts the sound file to text with
proper annotations, using the CMU Sphinx4 libraries.
3) Developing the Enhancement Engine that integrates these modules.
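
To make the target format in part 1 concrete, here is a minimal sketch
using only the JDK's javax.sound.sampled.AudioFormat. It does not use
Xuggler or Sphinx. The class and method names are hypothetical, and signed
samples are an assumption, since the mail only names the rate, sample size,
channel count and byte order. The second method shows how a byte offset in
such a stream maps to a temporal position.

```java
import javax.sound.sampled.AudioFormat;

/** Sketch of the PCM format named in part 1; names here are hypothetical. */
final class TargetAudioFormat {

    /** 16 kHz, 16 bit, mono, little-endian PCM. */
    static AudioFormat sphinxInput() {
        float sampleRate = 16000f;
        int sampleSizeInBits = 16;
        int channels = 1;
        boolean signed = true;      // assumption: not stated in the mail
        boolean bigEndian = false;  // little-endian
        return new AudioFormat(sampleRate, sampleSizeInBits, channels, signed, bigEndian);
    }

    /** Converts a byte offset in the raw PCM stream to milliseconds. */
    static long offsetToMillis(AudioFormat format, long byteOffset) {
        long frames = byteOffset / format.getFrameSize();
        return Math.round(frames * 1000.0 / format.getFrameRate());
    }
}
```

With this format one frame is 2 bytes, so one second of audio is 32000
bytes of raw PCM data.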

Currently I am working on the 1st module in discussion with my mentor
Andreas Kuckartz.

Now about myself: I am a pre-final year B. Tech student in Computer Science
and Engineering at Laxmi Niwas Mittal Institute of Information Technology.
I am into innovation. One of the projects I was involved in in the past:
a friend and I developed an efficient method that maps head movement,
tracked with a goggle, to the corresponding mouse pointer movement. It helps
people with hand disabilities interact with a computer efficiently [2].
More details about me can be found in my LinkedIn profile [3].

I have never been involved with an open source organization before, so
working with all of you will be a very nice experience.

Happy Bonding.

[1] https://issues.apache.org/jira/browse/STANBOL-1007
[2]
http://gstanwar.blogspot.in/2014/01/avatar-goggle-by-which-disabledhand-and.html
[3] http://in.linkedin.com/in/ssumansaurabh

Regards,
Suman Saurabh



On Mon, May 5, 2014 at 1:54 PM, Antonio David Perez Morales <
aperez@zaizi.com> wrote:

> Hi Rupert, Florent and all
>
> My accepted project is "Enhancement Workflows. Enterprise Integration
> Patterns in Apache Stanbol", based on the Jira issue [1]. Stanbol provides
> a set of components for Semantic Content Management. One of the components
> is the Enhancer, which can be used to extract features from content. The
> Enhancer is organized using Enhancement Chains, which define how the
> content will be processed, but they do not allow integrating the current
> process with the business layer. The goal of the project is to bring EIP
> to Stanbol to ease the integration of the Enhancement Workflows within the
> business layer of enterprise systems. To achieve this, the Apache Camel
> framework is intended to be used as the EIP provider.
>
> About me: I hold a graduate degree in Computer Science Engineering from
> the University of Seville and I am currently finishing a Master's in
> Software Engineering and Technology at that institution. I consider myself
> a hardworking, problem-solving, quick-learning open source lover, mainly
> interested in everything related to new technologies, whether web, mobile
> or desktop. I love learning new things and facing new challenges every
> day. I have coded for a long time with Java, PHP and JavaScript, and I use
> them in my daily work. I can write clean and structured code, following
> coding rules and applying well-known design patterns to improve the
> quality and maintainability of the code. For the last year, I have been
> working as a Senior Software Engineer in the R&D division of Zaizi, an
> open source consultancy specialized in Content and Enterprise Content
> Management Systems. Apache Stanbol is one of the main components in our
> current technical stack; therefore, I have been working extensively with
> it in the last months, both integrating it with different enterprise
> systems like ECMs and directly contributing to the project. As a result of
> this effort, I have been a committer of the project since January 2014.
>
>
> Regarding the project, I have been taking a look at Florent's code for the
> first approach to integrating Camel into Stanbol. Moreover, I have already
> started to read more about and play with Camel (and Camel Spring) to
> refresh my knowledge and familiarize myself with it (because I worked with
> Camel several years ago). As a first example (which is one of the tasks I
> want to do in the integration), I have been able to deploy in a local
> folder some files with example Camel routes defined in XML (camel-spring),
> and these routes are automatically loaded by the example application I
> have deployed. This way, we can achieve something similar to the indexing
> tool, where the indexing result files are put in a directory inside
> Stanbol and the new Entityhub is automatically generated from those files.
>
> I have also read the mail Florent pointed out in a previous mail about the
> potential Camel protocols (components) which can be developed to map
> Chains, Engines and Stores, but I would prefer to talk with Florent first
> to decide on the tasks to be done and their order, because I know the
> proposal is very ambitious but achievable.
>
> So, as first steps (and while waiting to talk with Florent through the IRC
> channel or whatever), I will continue playing with Camel and I will review
> Florent's current code again to get a clearer idea of how to improve it so
> that it can be integrated as a first version of the Enhancement Workflows.
>
> Comments are more than welcome.
>
> Regards
>
> -------------------------
>
> [1] https://issues.apache.org/jira/browse/STANBOL-1008
>

Re: Community bonding period started

Posted by florent andré <fl...@4sengines.com>.
(Re-sending yesterday's mail, which did not show up on the list... so you
may finally get it twice.)

Hi Antonio !

Thanks for this first description.

As a first big-picture overview of a work plan, I would say:

=== 1) find a name for this feature ===

As a first step I think we have to find a clear and stable name for this 
new feature (as we have used many until now):
1) enhancements route
2) Enhancement Workflows
3) flow graphs
4) ...others...

In my opinion:
1) uses terms from both projects, Stanbol and Camel, and that's good
2) Is the difference with the chains that already exist clear enough? 
And this name doesn't show the ability to fetch from and publish to other 
sources, IMO
3) shows the configuration possibilities well, but not the input/output 
possibilities
4)
* Enterprise Semantic Integration (referring to Enterprise Integration 
Patterns)
* Semantic Bus ("Bus" is frequently used in Camel)

=== 2) defining a set of concrete use-cases ===

2 or 3 concrete use cases that use the input/output possibilities and 
enhancement flow configurations.
A rough template for these use case definitions:
Fetch from : local folder / rss / mail / ...
Enhance with engine 1
Depending on the result of this engine go to :
* Chain 1
* or to Chain 2 and 3 and merge results
Output the result to : email / ftp / ...
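
To illustrate the template above, here is a minimal plain-Java model of the
"enhance, then branch on the result, then merge" flow. It is only a sketch
of the pattern: nothing in it is Camel or Stanbol API, and the language
check, the chain names and the annotation strings are hypothetical
stand-ins. Fetching and output are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Plain-Java model of the use-case template; not Camel or Stanbol API. */
final class UseCaseSketch {

    /** Hypothetical stand-in for "engine 1": a naive language check. */
    static String detectLanguage(String content) {
        return content.contains(" the ") ? "en" : "other";
    }

    /** Hypothetical stand-in for a chain: returns one fake annotation. */
    static Function<String, List<String>> chain(String name) {
        return content -> Collections.singletonList(name + ":" + content.length());
    }

    /** Branches on the engine result, then merges the chain results. */
    static List<String> enhance(String content) {
        List<String> result = new ArrayList<>();
        if ("en".equals(detectLanguage(content))) {
            // Chain 1 only
            result.addAll(chain("chain1").apply(content));
        } else {
            // Chain 2 and 3, merged into one result
            result.addAll(chain("chain2").apply(content));
            result.addAll(chain("chain3").apply(content));
        }
        return result;
    }
}
```

In a Camel route the branching would typically be expressed with a
content-based router, and the merge with an aggregation step.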

=== 3) Defining and Implementing protocols ===

One of the big strengths of Camel is the ability to define in-house 
protocols.
When working on the first version of the code, I really liked the idea of 
getting
engine://
chain://
store://

protocols.
With that we can use Stanbol's capabilities through Camel without 
touching the Stanbol API.
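
As a rough illustration of how such endpoint URIs could be resolved, the
sketch below parses the scheme and the name with java.net.URI and
dispatches on the scheme. It is not the Camel component API (in Camel a
custom scheme would normally be backed by a custom component); the class
name, the handler strings and the example names "default" and "example"
are hypothetical.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Scheme-based dispatch for engine://, chain:// and store:// URIs. */
final class StanbolUriSketch {

    private final Map<String, Function<String, String>> handlers = new HashMap<>();

    StanbolUriSketch() {
        // Each handler only describes what a real endpoint would do.
        handlers.put("engine", name -> "run engine '" + name + "'");
        handlers.put("chain", name -> "run chain '" + name + "'");
        handlers.put("store", name -> "write to store '" + name + "'");
    }

    /** Resolves an endpoint URI such as chain://default. */
    String describe(String endpoint) {
        URI uri = URI.create(endpoint);
        Function<String, String> handler = handlers.get(uri.getScheme());
        if (handler == null) {
            throw new IllegalArgumentException("Unknown scheme: " + uri.getScheme());
        }
        // The part after "//" names the engine, chain or store.
        return handler.apply(uri.getAuthority());
    }
}
```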

=== 4) Defining and implementing some DataFormat ===

With the pluggable DataFormat (http://camel.apache.org/data-format.html) 
principle, Camel allows automatic transformation, transparent for the 
user, of major data types to Java structures or to another data type, 
depending on the input requested by the transformer.

For example, an input file can be used as an input stream in 
transformer 1 (T1), and processed as a List<String> in T2...

Setting up some basic DataFormat transformers can be a great gain.
We can think of, for example:
text <-> content item
rdf <-> content item metadata
content item <-> html
...

I seem to remember setting up a really basic DataFormat during my 
first integration experience.
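
The stream-to-list example above can be sketched as a pair of conversions
in plain Java. This is not an implementation of Camel's DataFormat
interface, only the two directions such a transformer would cover; the
class and method names are hypothetical and UTF-8 is an assumption.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

/** Two-way conversion between a byte stream and a list of lines. */
final class LineFormatSketch {

    /** Stream to structure: reads a UTF-8 stream into a list of lines. */
    static List<String> toLines(InputStream in) {
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Structure to stream: joins the lines and exposes them as a stream. */
    static InputStream toStream(List<String> lines) {
        String joined = String.join("\n", lines);
        return new ByteArrayInputStream(joined.getBytes(StandardCharsets.UTF_8));
    }
}
```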

=== 5) defining and implementing easy routing definition ===

In my first version of the code, adding a new route requires building a 
bundle and adding it to Stanbol.
The structure and the code of this bundle are pretty simple and allow you 
to code your route with the Java DSL (which I pretty much like), but it 
maybe lacks a little bit of flexibility and user friendliness.

We have to keep this possibility of using the Java DSL imo, but adding a 
more "dynamic" way could be really cool, through a REST endpoint as a 
first step.

=== 6) Easy testing framework ===

Camel and Stanbol are big, and even if both have integration testing 
facilities, using both at the same time can be hard to learn.
It could be good to have a set of helper classes and/or good 
documentation to set this up easily.

=== 7) Thinking of improvements ===

Camel offers great possibilities in terms of asynchronous processing, 
message splitting and merging, parallel processing and dynamic 
(conditional) processing.

How can these features be implemented? Do they need Stanbol API 
modifications?

As another example of an idea, why not have a route:// (or graph://) 
protocol that allows reusing previously defined routes in a new 
route?

This task is more of a "daemon" task, where you will add ideas and 
solutions during the implementation of the others.
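
As a small illustration of the parallel processing and merging mentioned
in this point, the sketch below runs two chains concurrently with
CompletableFuture and merges their results. It is plain Java, not Camel's
own parallel or aggregation support, and the chains are passed in as
hypothetical functions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

/** Runs two chains in parallel and merges their results in a fixed order. */
final class ParallelChainsSketch {

    static List<String> runBoth(String content,
                                Function<String, List<String>> first,
                                Function<String, List<String>> second) {
        // Both chains start immediately on the common pool.
        CompletableFuture<List<String>> a =
                CompletableFuture.supplyAsync(() -> first.apply(content));
        CompletableFuture<List<String>> b =
                CompletableFuture.supplyAsync(() -> second.apply(content));
        // The merge runs once both results are available.
        return a.thenCombine(b, (left, right) -> {
            List<String> merged = new ArrayList<>(left);
            merged.addAll(right);
            return merged;
        }).join();
    }
}
```

The merged list always holds the results of the first chain before those of
the second, whichever chain finishes first.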


What do you think of this 10,000-foot plan?

++

On 05/05/2014 10:24, Antonio David Perez Morales wrote:
> Hi Rupert, Florent and all
>
> My accepted project is "Enhancement Workflows. Enterprise Integration
> Patterns in Apache Stanbol", based on the Jira issue [1]. Stanbol provides
> a set of components for Semantic Content Management. One of the components
> is the Enhancer, which can be used to extract features from content. The
> Enhancer is organized using Enhancement Chains, which define how the
> content will be processed, but they do not allow integrating the current
> process with the business layer. The goal of the project is to bring EIP
> to Stanbol to ease the integration of the Enhancement Workflows within the
> business layer of enterprise systems. To achieve this, the Apache Camel
> framework is intended to be used as the EIP provider.
>
> About me: I hold a graduate degree in Computer Science Engineering from
> the University of Seville and I am currently finishing a Master's in
> Software Engineering and Technology at that institution. I consider myself
> a hardworking, problem-solving, quick-learning open source lover, mainly
> interested in everything related to new technologies, whether web, mobile
> or desktop. I love learning new things and facing new challenges every
> day. I have coded for a long time with Java, PHP and JavaScript, and I use
> them in my daily work. I can write clean and structured code, following
> coding rules and applying well-known design patterns to improve the
> quality and maintainability of the code. For the last year, I have been
> working as a Senior Software Engineer in the R&D division of Zaizi, an
> open source consultancy specialized in Content and Enterprise Content
> Management Systems. Apache Stanbol is one of the main components in our
> current technical stack; therefore, I have been working extensively with
> it in the last months, both integrating it with different enterprise
> systems like ECMs and directly contributing to the project. As a result of
> this effort, I have been a committer of the project since January 2014.
>
>
> Regarding the project, I have been taking a look at Florent's code for the
> first approach to integrating Camel into Stanbol. Moreover, I have already
> started to read more about and play with Camel (and Camel Spring) to
> refresh my knowledge and familiarize myself with it (because I worked with
> Camel several years ago). As a first example (which is one of the tasks I
> want to do in the integration), I have been able to deploy in a local
> folder some files with example Camel routes defined in XML (camel-spring),
> and these routes are automatically loaded by the example application I
> have deployed. This way, we can achieve something similar to the indexing
> tool, where the indexing result files are put in a directory inside
> Stanbol and the new Entityhub is automatically generated from those files.
>
> I have also read the mail Florent pointed out in a previous mail about the
> potential Camel protocols (components) which can be developed to map
> Chains, Engines and Stores, but I would prefer to talk with Florent first
> to decide on the tasks to be done and their order, because I know the
> proposal is very ambitious but achievable.
>
> So, as first steps (and while waiting to talk with Florent through the IRC
> channel or whatever), I will continue playing with Camel and I will review
> Florent's current code again to get a clearer idea of how to improve it so
> that it can be integrated as a first version of the Enhancement Workflows.
>
> Comments are more than welcome.
>
> Regards
>
> -------------------------
>
> [1] https://issues.apache.org/jira/browse/STANBOL-1008
>

Re: Community bonding period started

Posted by Antonio David Perez Morales <ap...@zaizi.com>.
Hi Rupert, Florent and all

My accepted project is "Enhancement Workflows. Enterprise Integration
Patterns in Apache Stanbol", based on the Jira issue [1]. Stanbol provides
a set of components for Semantic Content Management. One of the components
is the Enhancer, which can be used to extract features from content. The
Enhancer is organized using Enhancement Chains, which define how the
content will be processed, but they do not allow integrating the current
process with the business layer. The goal of the project is to bring EIP to
Stanbol to ease the integration of the Enhancement Workflows within the
business layer of enterprise systems. To achieve this, the Apache Camel
framework is intended to be used as the EIP provider.

About me: I hold a graduate degree in Computer Science Engineering from
the University of Seville and I am currently finishing a Master's in
Software Engineering and Technology at that institution. I consider myself
a hardworking, problem-solving, quick-learning open source lover, mainly
interested in everything related to new technologies, whether web, mobile
or desktop. I love learning new things and facing new challenges every
day. I have coded for a long time with Java, PHP and JavaScript, and I use
them in my daily work. I can write clean and structured code, following
coding rules and applying well-known design patterns to improve the
quality and maintainability of the code. For the last year, I have been
working as a Senior Software Engineer in the R&D division of Zaizi, an open
source consultancy specialized in Content and Enterprise Content
Management Systems. Apache Stanbol is one of the main components in our
current technical stack; therefore, I have been working extensively with
it in the last months, both integrating it with different enterprise
systems like ECMs and directly contributing to the project. As a result of
this effort, I have been a committer of the project since January 2014.


Regarding the project, I have been taking a look at Florent's code for the
first approach to integrating Camel into Stanbol. Moreover, I have already
started to read more about and play with Camel (and Camel Spring) to
refresh my knowledge and familiarize myself with it (because I worked with
Camel several years ago). As a first example (which is one of the tasks I
want to do in the integration), I have been able to deploy in a local
folder some files with example Camel routes defined in XML (camel-spring),
and these routes are automatically loaded by the example application I
have deployed. This way, we can achieve something similar to the indexing
tool, where the indexing result files are put in a directory inside
Stanbol and the new Entityhub is automatically generated from those files.
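
The folder-based deployment described above starts with discovering the
route files. A minimal sketch of that step, using only java.nio.file, is
shown below. It is not the example application mentioned in the mail: the
class name is hypothetical, the *.xml filter is an assumption, and loading
the discovered files into Camel is not shown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Finds candidate route definition files in a deployment folder. */
final class RouteFolderScanner {

    /** Returns the *.xml files directly inside the folder, sorted by path. */
    static List<Path> findRouteFiles(Path folder) {
        List<Path> found = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(folder, "*.xml")) {
            for (Path file : stream) {
                found.add(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(found);
        return found;
    }
}
```

A long-running version would scan once at startup and then react to
changes, for example with java.nio.file.WatchService.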

I have also read the mail Florent pointed out in a previous mail about the
potential Camel protocols (components) which can be developed to map
Chains, Engines and Stores, but I would prefer to talk with Florent first
to decide on the tasks to be done and their order, because I know the
proposal is very ambitious but achievable.

So, as first steps (and while waiting to talk with Florent through the IRC
channel or whatever), I will continue playing with Camel and I will review
Florent's current code again to get a clearer idea of how to improve it so
that it can be integrated as a first version of the Enhancement Workflows.

Comments are more than welcome.

Regards

-------------------------

[1] https://issues.apache.org/jira/browse/STANBOL-1008


On Mon, May 5, 2014 at 10:01 AM, Rupert Westenthaler <
rupert.westenthaler@gmail.com> wrote:

> Hi all,
>
> Thx Florent for the reminder. I would like to ask all 4 students to
>
> 1. write a mail on this list with a short summary of the GSoC project
> (project summary + link to the Stanbol issue, some info about the
> student, first steps). IMO this is important as the proposals themselves
> are not fully publicly available.
> 2. join the #stanbol IRC channel on freenode.org (mentors are also
> welcome to join ^^). Having people around on IRC really helps to
> answer simple questions fast.
>
> and welcome to GSoC 2014!
>
> best
> Rupert


Re: Community bonding period started

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi all,

Thx Florent for the reminder. I would like to ask all 4 students to

1. write a mail on this list with a short summary of the GSoC project
(project summary + link to the Stanbol issue, some info about the
student, first steps). IMO this is important as the proposals themselves
are not fully publicly available.
2. join the #stanbol IRC channel on freenode.org (mentors are also
welcome to join ^^). Having people around on IRC really helps to
answer simple questions fast.

and welcome to GSoC 2014!

best
Rupert

On Thu, May 1, 2014 at 1:05 PM, florent andré
<fl...@4sengines.com> wrote:
> Hi there!
>
> As you may have noticed, the GSoC community bonding period began some
> time ago.
>
> Speaking for the Camel/Stanbol integration [1], the good proposal from
> Antonio was accepted! Congrats!
> So Antonio, now the bonding has to start! :)
>
> From my point of view, a good way to bond the community to this
> integration could be to create sub-issues of STANBOL-1008, which can be
> considered the main one. That way we can see the more specific actions
> you will take, discuss specific parts in the related issue, and get a
> global overview when looking at the parent issue.
>
> Antonio, what do you think? Can you do that?
>
> As a side point, I remembered this morning this mail exchange [2], which
> can give you pointers or ideas for "easy to set up through REST" Camel
> routes / flowcharts.
>
> Happy bonding !
> ++
>
>
> [1] be warned, don't know if any-one can access it :
> https://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/adperezmorales3/5629499534213120
>
> [2]
> http://mail-archives.apache.org/mod_mbox/incubator-stanbol-dev/201206.mbox/%3C4FDFC494.3090309@4sengines.com%3E



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/