Posted to user@uima.apache.org by "Cawley, Tim" <Ti...@dsto.defence.gov.au> on 2008/03/17 08:22:18 UTC

View data

Hi all,

My understanding is that SOFA data is immutable.

What do people think of making it appendable?

Tim

IMPORTANT: This email remains the property of the Australian Defence Organisation and is subject to the jurisdiction of section 70 of the CRIMES ACT 1914.  If you have received this email in error, you are requested to contact the sender and delete the email.



Re: View data

Posted by Burn Lewis <bu...@gmail.com>.
>  I would like each translation annotator to append its
>  translations and annotations to the same results view.

I would assume that you wouldn't want an annotator to run on a partially
completed view ... or else you'd have to re-run annotators after each append
to the view, which could duplicate work and possibly annotations.

As Eddie described, we added the translations to the source-language view and
waited until all were available before creating the target-language view.
It might also be reasonable to add them to a target-language view before the
sofa data is created, and to set the sofa data only once all translators have
run ... that would require careful dead-reckoning of annotation bounds!
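
[Editor's sketch] The deferred approach above can be illustrated in plain
Java. This is only a sketch under stated assumptions: PendingSpan and
DeferredTargetView are hypothetical stand-ins, not UIMA types, and the
newline separator is an arbitrary choice. Each span's offsets are computed
from the length of the text accumulated so far, and the text is fixed once.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an annotation whose offsets are fixed before the
// view's text exists. Not a UIMA type.
final class PendingSpan {
    final int begin;         // offset into the eventual target text
    final int end;           // exclusive end offset
    final String translator; // which translator produced the span

    PendingSpan(int begin, int end, String translator) {
        this.begin = begin;
        this.end = end;
        this.translator = translator;
    }
}

// Collects translations and assigns offsets by dead reckoning. The text is
// fixed exactly once, imitating set-once sofa data.
final class DeferredTargetView {
    private final StringBuilder pending = new StringBuilder();
    private final List<PendingSpan> spans = new ArrayList<>();
    private String text; // null until finish() is called

    // Appends one translation and returns a span with precomputed offsets.
    PendingSpan add(String translator, String translation) {
        if (text != null) {
            throw new IllegalStateException("text already set; cannot append");
        }
        if (pending.length() > 0) {
            pending.append('\n'); // the separator counts toward later offsets
        }
        int begin = pending.length();
        pending.append(translation);
        PendingSpan span = new PendingSpan(begin, pending.length(), translator);
        spans.add(span);
        return span;
    }

    // Freezes the accumulated text; call after all translators have run.
    String finish() {
        if (text != null) {
            throw new IllegalStateException("text already set");
        }
        text = pending.toString();
        return text;
    }

    List<PendingSpan> spans() {
        return spans;
    }
}
```

The offsets stay valid only because nothing is ever inserted before an
existing span; any separator or normalisation step has to be counted at the
moment the span is created.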

- Burn.

Re: View data

Posted by Eddie Epstein <ea...@gmail.com>.
Hi Tim,

>  1. Translation of a document that contains multiple languages.
>  I need to use different translation annotators on different parts of a
>  document. I would like each translation annotator to append its
>  translations and annotations to the same results view.
>
>  At the moment there is a proliferation of views for each translation
>  annotator followed by an annoying process of stitching them back
>  together in a result view.

Yes, proliferation of views is a bad thing. On another
multi-translation project using UIMA, the design put all translation
results as annotations in the source text view. Translation results
included a feature to identify the particular translator, and this
feature was used to iterate over annotations associated with the desired
translation. New target language views were only created when all
translation work was finished.
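
[Editor's sketch] A rough illustration of that design in plain Java. The
names TranslationSpan and SourceViewSketch are hypothetical and none of this
is the UIMA API: in UIMA the span would be an annotation type with a
translator feature, and the final string would become the sofa data of a
newly created target view. Joining spans with a single space is also an
assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a translation result stored as an annotation on
// the source text. Not a UIMA type.
final class TranslationSpan {
    final int begin;         // start offset in the source text
    final int end;           // exclusive end offset in the source text
    final String translator; // feature identifying the translator
    final String text;       // translated text for the covered source span

    TranslationSpan(int begin, int end, String translator, String text) {
        this.begin = begin;
        this.end = end;
        this.translator = translator;
        this.text = text;
    }
}

final class SourceViewSketch {
    // Selects one translator's results, ordered by source offset.
    static List<TranslationSpan> byTranslator(List<TranslationSpan> all,
                                              String translator) {
        List<TranslationSpan> out = new ArrayList<>();
        for (TranslationSpan s : all) {
            if (s.translator.equals(translator)) {
                out.add(s);
            }
        }
        out.sort(Comparator.comparingInt((TranslationSpan s) -> s.begin));
        return out;
    }

    // Builds the target text in one step, after all translators have run.
    static String buildTargetText(List<TranslationSpan> all, String translator) {
        StringBuilder sb = new StringBuilder();
        for (TranslationSpan s : byTranslator(all, translator)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(s.text);
        }
        return sb.toString();
    }
}
```

Because every result lives in the one source view, nothing needs stitching
until the very end, and the order in which translators ran does not matter.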

The same organization is used for transcription results, where ASR
results are all placed in an audio view, and the resultant
transcription view(s) created after all ASR processing is done.

>  2. Keeping a log of processing with the data.
>  I would like every annotator to be able to append what it has done. It
>  would be useful for my exceptions to go in as annotations pointing at
>  the log. I realise there are other logging mechanisms and I do use them,
>  but a logfile is not kept on a per document basis and does not reside
>  with the relevant processes data.

Thilo's suggestion here is good: just create a new log FS (feature
structure) that includes the text plus features to identify the component,
provide ordering, etc.
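
[Editor's sketch] One possible shape for such a per-document log, in plain
Java. LogEntry and DocumentLog are hypothetical names, not UIMA types; in
UIMA each entry would be a feature structure stored in the CAS, so the log
would travel with the document.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a log feature structure. Not a UIMA type.
final class LogEntry {
    final int sequence;     // ordering feature: position in the document's log
    final String component; // feature identifying the annotator
    final String text;      // the log message itself

    LogEntry(int sequence, String component, String text) {
        this.sequence = sequence;
        this.component = component;
        this.text = text;
    }
}

// Per-document log: entries are only ever added, never edited.
final class DocumentLog {
    private final List<LogEntry> entries = new ArrayList<>();

    // Records a message and assigns it the next sequence number.
    LogEntry append(String component, String text) {
        LogEntry entry = new LogEntry(entries.size(), component, text);
        entries.add(entry);
        return entry;
    }

    // Returns the entries written by one component, in logging order.
    List<LogEntry> forComponent(String component) {
        List<LogEntry> out = new ArrayList<>();
        for (LogEntry e : entries) {
            if (e.component.equals(component)) {
                out.add(e);
            }
        }
        return out;
    }

    int size() {
        return entries.size();
    }
}
```

An exception record could then point at a LogEntry instead of at an offset
in appendable text, which gives the per-document log without touching the
sofa data.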

Regards,
Eddie

Re: View data

Posted by Thilo Goetz <tw...@gmx.de>.
Hi Tim,

Cawley, Tim wrote:
> I can think of two use cases.  
> 
> 1. Translation of a document that contains multiple languages.  
> I need to use different translation annotators on different parts of a
> document. I would like each translation annotator to append its
> translations and annotations to the same results view.  
> 
> At the moment there is a proliferation of views for each translation
> annotator followed by an annoying process of stitching them back
> together in a result view.

That's exactly what views are supposed to be good for.
Any way views could be improved in your opinion?

> 
> 2. Keeping a log of processing with the data.
> I would like every annotator to be able to append what it has done. It
> would be useful for my exceptions to go in as annotations pointing at
> the log. I realise there are other logging mechanisms and I do use them,
> but a logfile is not kept on a per document basis and does not reside
> with the relevant processes data.

Is it not sufficient to have a string valued feature
somewhere where you can keep this information?

Anyway, I don't have strong opinions about this.  I
think I wouldn't want to use an annotator that modifies
my data (as opposed to creating new metadata), even
if it's just appending.  You may well have annotators
that work differently on longer documents than shorter
ones (summarization, document categorization etc).
But that's just my opinion.

--Thilo


RE: View data

Posted by "Cawley, Tim" <Ti...@dsto.defence.gov.au>.
I can think of two use cases.  

1. Translation of a document that contains multiple languages.  
I need to use different translation annotators on different parts of a
document. I would like each translation annotator to append its
translations and annotations to the same results view.  

At the moment there is a proliferation of views for each translation
annotator followed by an annoying process of stitching them back
together in a result view.

2. Keeping a log of processing with the data.
I would like every annotator to be able to append what it has done. It
would be useful for my exceptions to go in as annotations pointing at
the log. I realise there are other logging mechanisms and I do use them,
but a logfile is not kept on a per document basis and does not reside
with the relevant processes data.

Tim



Re: View data

Posted by Thilo Goetz <tw...@gmx.de>.
Cawley, Tim wrote:
> Hi all,
> 
> My understanding is that SOFA data is immutable.
> 
> What do people think of making it appendable?

What's your use case?

--Thilo
