Posted to users@opennlp.apache.org by William Colen <wi...@gmail.com> on 2017/01/04 20:06:23 UTC
UIMA Fit + Pythonator issue
Hi,
I managed to create a UIMA C++ component that performs POS tagging with
Pythonator. It works very well as a standalone annotator. I created an XMI
with sentence and token annotations, and the Python code could iterate over
them and create the POS tags. I could run it as follows:
runAE.sh PythonAnnotator.xml -xmi xmi_folder
Now I am integrating it into the pipeline using uimaFIT:
...
AggregateBuilder builder = new AggregateBuilder();
builder.add(AnalysisEngineFactory.createEngineDescription(SentDetect.class,
    SentenceModelResource.PARAM_SENTENCE_MODEL_RESOURCE, sentdetectModelRes));
builder.add(AnalysisEngineFactory.createEngineDescription(Tokenizer.class,
    TokenizerModelResource.PARAM_TOKENIZER_MODEL_RESOURCE, tokenizerModelRes));
// The Python annotator is loaded from its XML descriptor.
builder.add(AnalysisEngineFactory.createEngineDescriptionFromPath(
    "/complete_path/PythonAnnotator.xml"));
AnalysisEngine aggregate = builder.createAggregate();
It runs OK. A log message in the Python code shows that the "process" function
was called. It loads the type system. I can also call getDocumentText and it
works as expected.
The issue starts when I try to iterate over the sentence annotations. They
are not there! It works in the standalone version when I read them from XMI.
Any clue what I am missing?
Thank you,
William
Re: UIMA Fit + Pythonator issue
Posted by William Colen <wi...@gmail.com>.
I would like to share my final solution:
1. Created a combo iterator like the one in OpenNLP
https://gist.github.com/wcolen/037a68fca7e8b402b6e0d3e4df4fab49#file-annotationcomboiterator-py
2. Created a sample.py that iterates over UIMA annotations
https://gist.github.com/wcolen/5edbdcb1d2b6588fead45bbc2dd4fb5b#file-sample-py
William
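For readers who cannot open the gists, here is a rough sketch of the general
idea behind a combo iterator: walk the sentences in order and hand back, for
each one, the tokens it covers. This is not the code from the gist. It assumes
the annotations have already been reduced to (begin, end) offset pairs and that
sentences do not overlap. The names combo_iterate and Span are made up for
illustration, and no UIMA or Pythonator API is used.

```python
from typing import Iterator, List, Tuple

# Hypothetical stand-in for an annotation: (begin, end) offsets, end exclusive.
Span = Tuple[int, int]


def combo_iterate(
    sentences: List[Span], tokens: List[Span]
) -> Iterator[Tuple[Span, List[Span]]]:
    """Yield each sentence span together with the token spans it fully covers.

    Assumes sentences do not overlap one another.
    """
    sentences = sorted(sentences)
    tokens = sorted(tokens)
    i = 0  # index of the next token not yet assigned to a sentence
    for s_begin, s_end in sentences:
        # Skip tokens that start before this sentence begins.
        while i < len(tokens) and tokens[i][0] < s_begin:
            i += 1
        covered: List[Span] = []
        # Collect tokens that end at or before the sentence end.
        while i < len(tokens) and tokens[i][1] <= s_end:
            covered.append(tokens[i])
            i += 1
        yield (s_begin, s_end), covered


# Example input for the text "Hello world. Bye now."
example_sentences = [(0, 12), (13, 21)]
example_tokens = [(0, 5), (6, 11), (11, 12), (13, 16), (17, 20), (20, 21)]
example_result = list(combo_iterate(example_sentences, example_tokens))
```

In a real annotator the spans would come from the CAS annotation indexes
instead of hard-coded lists; the single pass over both sorted lists is what
keeps the pairing cheap.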
2017-01-05 1:43 GMT-02:00 William Colen <wi...@gmail.com>:
> Thank you very much, Richard!
>
> Actually it was an error in my Python iterator.
>
> I am using sentence detector and tokenizer from OpenNLP.
> POS Tagger I am using one created in Python using neural networks (
> https://github.com/erickrf/nlpnet).
Re: UIMA Fit + Pythonator issue
Posted by William Colen <wi...@gmail.com>.
Thank you very much, Richard!
Actually, it was an error in my Python iterator.
I am using the sentence detector and tokenizer from OpenNLP.
For the POS tagger, I am using one written in Python with neural networks
(https://github.com/erickrf/nlpnet).
2017-01-04 21:29 GMT-02:00 Richard Eckart de Castilho <re...@apache.org>:
> Hi William,
>
> what component collection are you using? OpenNLP? Maybe the components
> are not set up completely. If you use OpenNLP with uimaFIT, you might
> find this example here useful:
>
> https://cwiki.apache.org/confluence/display/UIMA/uimaFIT+and+Groovy
>
> Cheers,
>
> -- Richard
Re: UIMA Fit + Pythonator issue
Posted by Richard Eckart de Castilho <re...@apache.org>.
Hi William,
What component collection are you using? OpenNLP? Maybe the components
are not set up completely. If you use OpenNLP with uimaFIT, you might
find this example useful:
https://cwiki.apache.org/confluence/display/UIMA/uimaFIT+and+Groovy
Cheers,
-- Richard