Posted to users@opennlp.apache.org by William Colen <wi...@gmail.com> on 2017/01/04 20:06:23 UTC

UIMA Fit + Pythonator issue

Hi,

I managed to create a UIMA C++ component that performs POS tagging with
Pythonator. It works very well as a standalone annotator: I created an XMI
with sentence and token annotations, and the Python code could iterate over
them and create the POS tags. I ran it as follows:

runAE.sh PythonAnnotator.xml -xmi xmi_folder


Now I am integrating it into the pipeline using uimaFIT.



...

AggregateBuilder builder = new AggregateBuilder();

builder.add(AnalysisEngineFactory.createEngineDescription(SentDetect.class,
    SentenceModelResource.PARAM_SENTENCE_MODEL_RESOURCE, sentdetectModelRes));

builder.add(AnalysisEngineFactory.createEngineDescription(Tokenizer.class,
    TokenizerModelResource.PARAM_TOKENIZER_MODEL_RESOURCE, tokenizerModelRes));

// Load the Pythonator descriptor from its XML file
builder.add(AnalysisEngineFactory.createEngineDescriptionFromPath(
    "/complete_path/PythonAnnotator.xml"));

AnalysisEngine aggregate = builder.createAggregate();


It runs OK. The log from the Python code shows that the "process" function
is called and that the type system is loaded. Calling getDocumentText also
works as expected.


The issue starts when I try to iterate over the sentence annotations: they
are not there! The same iteration works in the standalone version, where the
annotations are read from XMI.


Any clue what I am missing?


Thank you,

William

Re: UIMA Fit + Pythonator issue

Posted by William Colen <wi...@gmail.com>.

I would like to share my final solution:

1. Created a combo iterator like the one in OpenNLP
https://gist.github.com/wcolen/037a68fca7e8b402b6e0d3e4df4fab49#file-annotationcomboiterator-py

2. Created a sample.py that iterates over UIMA annotations
https://gist.github.com/wcolen/5edbdcb1d2b6588fead45bbc2dd4fb5b#file-sample-py
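
For readers who cannot open the gists, here is a minimal sketch of the
combo-iterator idea in plain Python. It is not the code from the gists above.
It assumes annotations are represented as hypothetical (begin, end) tuples
already sorted by begin offset, rather than real UIMA CAS objects, and the
name combo_iterator is made up for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

# Assumption: an annotation is just (begin, end) character offsets, end exclusive.
Span = Tuple[int, int]


def combo_iterator(sentences: Iterable[Span],
                   tokens: Iterable[Span]) -> Iterator[Tuple[Span, List[Span]]]:
    """Yield each sentence span together with the token spans it covers.

    Both inputs must be sorted by begin offset. Tokens that fall outside
    every sentence are skipped.
    """
    token_iter = iter(tokens)
    token = next(token_iter, None)
    for sentence in sentences:
        sent_begin, sent_end = sentence
        covered = []
        # Drop tokens that start before this sentence begins.
        while token is not None and token[0] < sent_begin:
            token = next(token_iter, None)
        # Collect tokens that lie completely inside the sentence.
        while token is not None and token[1] <= sent_end:
            covered.append(token)
            token = next(token_iter, None)
        yield sentence, covered


if __name__ == "__main__":
    text = "Hello world. Bye now."
    sentences = [(0, 12), (13, 21)]
    tokens = [(0, 5), (6, 11), (11, 12), (13, 16), (17, 20), (20, 21)]
    for sentence, toks in combo_iterator(sentences, tokens):
        print(text[sentence[0]:sentence[1]], [text[b:e] for b, e in toks])
```

Walking both sorted sequences in a single pass keeps the tagger's input
grouped sentence by sentence without rescanning the token list for each
sentence. In a real annotator the tuples would be replaced by the sentence
and token annotations read from the CAS.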


William

2017-01-05 1:43 GMT-02:00 William Colen <wi...@gmail.com>:

> Thank you very much, Richard!
>
> Actually it was an error in my Python iterator.
>
> I am using sentence detector and tokenizer from OpenNLP.
> POS Tagger I am using one created in Python using neural networks (
> https://github.com/erickrf/nlpnet).
>
>
>
> 2017-01-04 21:29 GMT-02:00 Richard Eckart de Castilho <re...@apache.org>:
>
>> Hi William,
>>
>> what component collection are you using? OpenNLP? Maybe the components
>> are not set up completely. If you use OpenNLP with uimaFIT, you might
>> find this example here useful:
>>
>>   https://cwiki.apache.org/confluence/display/UIMA/uimaFIT+and+Groovy
>>
>> Cheers,
>>
>> -- Richard

Re: UIMA Fit + Pythonator issue

Posted by William Colen <wi...@gmail.com>.

Thank you very much, Richard!

Actually it was an error in my Python iterator.

I am using the sentence detector and tokenizer from OpenNLP.
For the POS tagger, I am using one written in Python with neural networks
(https://github.com/erickrf/nlpnet).



2017-01-04 21:29 GMT-02:00 Richard Eckart de Castilho <re...@apache.org>:

> Hi William,
>
> what component collection are you using? OpenNLP? Maybe the components
> are not set up completely. If you use OpenNLP with uimaFIT, you might
> find this example here useful:
>
>   https://cwiki.apache.org/confluence/display/UIMA/uimaFIT+and+Groovy
>
> Cheers,
>
> -- Richard

Re: UIMA Fit + Pythonator issue

Posted by Richard Eckart de Castilho <re...@apache.org>.

Hi William,

What component collection are you using? OpenNLP? Maybe the components
are not set up completely. If you use OpenNLP with uimaFIT, you might
find this example useful:

  https://cwiki.apache.org/confluence/display/UIMA/uimaFIT+and+Groovy

Cheers,

-- Richard
