Posted to user@uima.apache.org by Augusto Ribeiro Silva <ar...@unsilo.com> on 2016/06/22 13:29:22 UTC

Non-linear pipelines

Hi,

I couldn’t find any example in the documentation of how to define non-linear pipelines (I'm not sure this is the right name for them).
What I want to do is something like this:

Pipeline: A -> (B or C) -> D

Step A supports two file formats; depending on the file format, normalisation step B or C should be performed. D should then be performed on the result of B or C. How would I go about defining such a pipeline, and is it even possible?

Thanks for the help in advance.

Best regards,
Augusto

Re: Non-linear pipelines

Posted by Augusto Ribeiro Silva <ar...@unsilo.com>.
Hi Richard,

Thanks again for the help. I’ll give it a try.

Cheers,
Augusto

> On 23 Jun 2016, at 20:40, Richard Eckart de Castilho <re...@apache.org> wrote:
> 
> On 23.06.2016, at 14:02, Augusto Ribeiro Silva <ar...@unsilo.com> wrote:
>> 
>> Thanks for the help. Is there any place where I can see how to define capabilities in a non-XML way, i.e., using the Java API?
> 
> If you are using uimaFIT, you can declare capabilities using Java annotations, e.g.:
> 
> @TypeCapability(
>        inputs = { 
>            "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token",
>            "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence" }, 
>        outputs = { 
>            "de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos.POS" })
> public class OpenNlpPosTagger
> 	extends JCasAnnotator_ImplBase
> 
> These will then be present in descriptors created via AnalysisEngineFactory.createEngineDescription(...).
> 
> Otherwise, if you build your descriptors with the plain UIMA API, something like
> 
> Capability capability = new Capability_impl();
> ...
> AnalysisEngineDescription desc = UIMAFramework.getResourceSpecifierFactory()
>            .createAnalysisEngineDescription();
> ...
> desc.getAnalysisEngineMetaData().setCapabilities(new Capability[] { capability });
> 
> Cheers,
> 
> -- Richard


Re: Non-linear pipelines

Posted by Richard Eckart de Castilho <re...@apache.org>.
On 23.06.2016, at 14:02, Augusto Ribeiro Silva <ar...@unsilo.com> wrote:
> 
> Thanks for the help. Is there any place where I can see how to define capabilities in a non-XML way, i.e., using the Java API?

If you are using uimaFIT, you can declare capabilities using Java annotations, e.g.:

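import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.descriptor.TypeCapability;
import org.apache.uima.jcas.JCas;
@TypeCapability(
        inputs = {
            "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token",
            "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence" },
        outputs = {
            "de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos.POS" })
public class OpenNlpPosTagger extends JCasAnnotator_ImplBase {
    @Override
    public void process(JCas aJCas) throws AnalysisEngineProcessException {
        // tagging logic omitted
    }
}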

These will then be present in descriptors created via AnalysisEngineFactory.createEngineDescription(...).

Otherwise, if you build your descriptors with the plain UIMA API, you can do something like this:

Capability capability = new Capability_impl();
...
AnalysisEngineDescription desc = UIMAFramework.getResourceSpecifierFactory()
            .createAnalysisEngineDescription();
...
desc.getAnalysisEngineMetaData().setCapabilities(new Capability[] { capability });

Cheers,

-- Richard

Re: Non-linear pipelines

Posted by Augusto Ribeiro Silva <ar...@unsilo.com>.
Hi Richard,

Thanks for the help. Is there any place where I can see how to define capabilities in a non-XML way, i.e., using the Java API?

Best regards,
Augusto

> On 22 Jun 2016, at 15:57, Richard Eckart de Castilho <re...@apache.org> wrote:
> 
> You could maybe use capabilities to expose which type of information your respective components support:
> 
> https://uima.apache.org/d/uimaj-current/references.html#ugr.ref.xml.component_descriptor.aes.capabilities
> 
> To realize a custom pipeline topology, you would implement a custom flow controller
> 
> https://uima.apache.org/d/uimaj-current/references.html#ugr.ref.xml.component_descriptor.flow_controller
> 
> As inspiration, the CapabilityLanguageFlowController may be useful:
> 
> https://svn.apache.org/repos/asf/uima/uimaj/trunk/uimaj-core/src/main/java/org/apache/uima/flow/impl/CapabilityLanguageFlowController.java
> 
> Cheers,
> 
> -- Richard


Re: Non-linear pipelines

Posted by Richard Eckart de Castilho <re...@apache.org>.
You could use capabilities to declare which types of information each of your components supports:

https://uima.apache.org/d/uimaj-current/references.html#ugr.ref.xml.component_descriptor.aes.capabilities

To realize a custom pipeline topology, you would implement a custom flow controller:

https://uima.apache.org/d/uimaj-current/references.html#ugr.ref.xml.component_descriptor.flow_controller

As inspiration, the CapabilityLanguageFlowController may be useful:

https://svn.apache.org/repos/asf/uima/uimaj/trunk/uimaj-core/src/main/java/org/apache/uima/flow/impl/CapabilityLanguageFlowController.java

Cheers,

-- Richard
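
The A -> (B or C) -> D routing asked about in this thread can be sketched in plain Java, with no UIMA dependency. This is only an illustration of the decision a custom flow controller would have to make after each step, not UIMA's flow controller API; the class name, the step keys and the two format names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the routing decision only; this is NOT the UIMA API.
// All names here (BranchingFlowSketch, Format, the step keys) are hypothetical.
class BranchingFlowSketch {

    // The two input formats that step A is assumed to distinguish.
    enum Format { FORMAT_1, FORMAT_2 }

    // Returns the key of the step to run after lastStep, or null when done.
    // lastStep == null means that nothing has run yet.
    static String next(String lastStep, Format format) {
        if (lastStep == null) {
            return "A";
        }
        switch (lastStep) {
            case "A":
                // Branch: pick the normalisation step that matches the format.
                return format == Format.FORMAT_1 ? "B" : "C";
            case "B":
            case "C":
                // Both branches join again at D.
                return "D";
            case "D":
                return null;
            default:
                throw new IllegalArgumentException("Unknown step: " + lastStep);
        }
    }

    // Walks the flow from start to finish and records the visited steps.
    static List<String> route(Format format) {
        List<String> visited = new ArrayList<>();
        String step = next(null, format);
        while (step != null) {
            visited.add(step);
            step = next(step, format);
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(route(Format.FORMAT_1)); // prints [A, B, D]
        System.out.println(route(Format.FORMAT_2)); // prints [A, C, D]
    }
}
```

In an actual UIMA aggregate, the same decision would presumably sit in the flow object produced by the custom flow controller, with the format read from the CAS (for example from something that step A writes there) rather than passed in as an argument.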
