Posted to dev@ctakes.apache.org by "Wu, Stephen T., Ph.D." <Wu...@mayo.edu> on 2012/12/05 21:36:19 UTC

Re: type system changes needed to read SHARP data

Sorry for the delayed response, Steve.  The type system was not designed to
house the annotations, but rather the later results of processing.  It makes
sense to do both.  

Takeaways first, then a point-by-point response.
For 3.1.0 the type system should include more than just "LabMention,
ProcedureMention, SignSymptomMention, DiseaseDisorderMention,
AnatomicalSiteMention."  It should also include the exhaustive list of
attributes, which would come as subtypes of Modifier.

Let me hear some +1s and we'll make it happen...

stephen


>> "Clinical_attribute" -- is this what you're looking for:
>> org.apache.ctakes.typesystem.type.refsem.Attribute
>> It inherits from Element.
> But Attribute is a TOP and we need an Annotation here. (An added concern is,
> does it really make sense to have a raw Attribute, and not some specific
> sub-type like BodyLaterality or BodySide?)
To capture the Knowtator annotations, yes, we do need an Annotation --
namely Modifier subtypes, as you've suggested.
Attribute is not really meant to be instantiated; it is just meant to be a
super-type that could feasibly provide easier indexing.

>> Lab should be at org.apache.ctakes.typesystem.type.refsem.Lab
> But Lab is a TOP, and we need an Annotation here.
Again, for the case of reading in Knowtator, yes.  I think the addition of
LabMention, etc., was slated for 3.1.0, right, James?

>> Use the type org.apache.ctakes.typesystem.type.textsem.Modifier with the
>> "category" feature.
> Should there be constants for each of these categories?
There are constants in
/ctakes-type-system/src/main/java/org/apache/ctakes/typesystem/type/constants/CONST.java

>> "Person", --> Entity
> But Entity is a TOP, not an Annotation.
This is an interesting question.  Person was not previously included in a
CEM, so it doesn't have a semantic TOP subtype.  Therefore, it also doesn't
have an Annotation subtype.  For now we'll just leave it be.

>>> After working with this data I think we should consider having separate UIMA
>>> Annotation sub-types for each of the things that are Modifiers now. For
>>> example, if we have a real Severity Annotation for textual mentions of
>>> severity, then the CAS makes it easy to select these.
I think we're lining up with you on this now.
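
To make the trade-off concrete, here is a minimal sketch in plain Java. It
does not use the cTAKES or UIMA classes; Span, GenericModifier,
SeverityModifier, NegationModifier and the "severity" category string are all
hypothetical stand-ins. It only illustrates why a dedicated subtype can be
selected by type alone, while a single Modifier type with a "category" feature
needs an extra filter at every call site.

```java
import java.util.ArrayList;
import java.util.List;

class ModifierSelectionSketch {
    // Hypothetical stand-in for an annotation with character offsets.
    static class Span {
        final int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    // Option A: one generic modifier type, distinguished by a category value.
    static class GenericModifier extends Span {
        final String category; // hypothetical label, not a cTAKES constant
        GenericModifier(int begin, int end, String category) {
            super(begin, end);
            this.category = category;
        }
    }

    // Option B: one subtype per kind of modifier.
    static class SeverityModifier extends Span {
        SeverityModifier(int begin, int end) { super(begin, end); }
    }
    static class NegationModifier extends Span {
        NegationModifier(int begin, int end) { super(begin, end); }
    }

    // Select everything of a given type, loosely analogous to asking an
    // annotation index for all instances of one type.
    static <T extends Span> List<T> select(List<? extends Span> index, Class<T> type) {
        List<T> out = new ArrayList<>();
        for (Span s : index) {
            if (type.isInstance(s)) {
                out.add(type.cast(s));
            }
        }
        return out;
    }

    // With a category feature, selection by type is not enough: the caller
    // must also compare the category value.
    static List<GenericModifier> selectByCategory(List<? extends Span> index, String category) {
        List<GenericModifier> out = new ArrayList<>();
        for (GenericModifier m : select(index, GenericModifier.class)) {
            if (m.category.equals(category)) {
                out.add(m);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Span> index = new ArrayList<>();
        index.add(new SeverityModifier(0, 4));
        index.add(new NegationModifier(10, 12));
        index.add(new GenericModifier(20, 26, "severity"));
        System.out.println(select(index, SeverityModifier.class).size());
        System.out.println(selectByCategory(index, "severity").size());
    }
}
```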

> The types we're talking about are not
> used locally within a single AnalysisEngine. They're read in from the
> SHARPKnowtatorXMLReader AnalysisEngine, and used separately...
> So they can't be local to a
> single AnalysisEngine, and they must be in the CAS.
Agreed, because of the gold standard representation issue.

> That's exactly what I'm talking about with the severity modifiers. We have a
> severity modifier extraction annotator, and we *do* need to evaluate its
> performance by comparing the severity modifiers it extracts to those in the
> annotated data... So we really do want everything that's in the Knowtator XML
> annotations to be loaded and accessible to all our UIMA AnalysisEngines.
Ok.  There is a slight difference in finding modifiers because, for the most
part, annotators wouldn't mark, e.g., a negation term that didn't modify
anything clinically interesting.  But there are enough cases where an
attribute should be searched for and evaluated on its own that I suppose
it's worth it to add all these Modifier subtypes.
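
As a rough illustration of what "evaluated on its own" could look like, the
following sketch scores system-extracted modifier spans against gold spans
using exact offset matches. It is plain Java under stated assumptions: the
class name, the "begin-end" key encoding and the exact-match criterion are
all choices made for this example, not the evaluation code used in cTAKES.

```java
import java.util.HashSet;
import java.util.Set;

class SpanEvaluationSketch {
    // Encode a span as "begin-end" so exact matching becomes a set lookup.
    static String key(int begin, int end) {
        return begin + "-" + end;
    }

    // Returns {precision, recall, f1} for exact span matches.
    // Empty inputs yield 0.0 rather than dividing by zero.
    static double[] score(Set<String> gold, Set<String> system) {
        Set<String> hits = new HashSet<>(system);
        hits.retainAll(gold); // spans found by the system that are also in gold
        double tp = hits.size();
        double precision = system.isEmpty() ? 0.0 : tp / system.size();
        double recall = gold.isEmpty() ? 0.0 : tp / gold.size();
        double f1 = (precision + recall == 0.0)
                ? 0.0
                : 2 * precision * recall / (precision + recall);
        return new double[] { precision, recall, f1 };
    }

    public static void main(String[] args) {
        Set<String> gold = new HashSet<>();
        gold.add(key(10, 16));
        gold.add(key(40, 44));

        Set<String> system = new HashSet<>();
        system.add(key(10, 16)); // matches a gold span
        system.add(key(90, 95)); // spurious span

        double[] s = score(gold, system);
        System.out.println("P=" + s[0] + " R=" + s[1] + " F1=" + s[2]);
    }
}
```

This only works if the gold modifier spans are available to the evaluating
component in the first place, which is the point being made above about
loading everything from the Knowtator XML.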
 
>> 2) Will these modifiers be reusable downstream?
> I'm not sure what you mean here. Are you suggesting that the type system
> should only have types for things that external users of cTAKES might need,
> and that we shouldn't have types for things that must be passed between
> different cTAKES AnalysisEngines?
Sorry for being unclear: "downstream" in this context meant "to other UIMA
components in the NLP pipeline."


RE: type system changes needed to read SHARP data

Posted by "Masanz, James J." <Ma...@mayo.edu>.
CU Boulder (Martha Palmer et al) is working on an annotation tool. I forwarded a link to our thread to them for comments.

-- James


> -----Original Message-----
> From: ctakes-dev-return-958-Masanz.James=mayo.edu@incubator.apache.org
> [mailto:ctakes-dev-return-958-
> Masanz.James=mayo.edu@incubator.apache.org] On Behalf Of Chen, Pei
> Sent: Wednesday, December 12, 2012 10:24 AM
> To: ctakes-dev@incubator.apache.org
> Subject: RE: type system changes needed to read SHARP data
> 
> > Yeah, agreed that web based annotation tools are the way to go. I
> > would love to see a BRAT-like tool that could work directly from a
> > UIMA type system schema. But I'm not going to hold my breath. ;-)
> 
> It sounds like a comprehensive annotation tool (BRAT on top of UIMA)
> that works directly from a UIMA type system schema would be a common
> tool that would benefit the entire UIMA community; not just OpenNLP or
> cTAKES.  Perhaps we can combine our efforts.
> 
> --Pei
> 
> > -----Original Message-----
> > From: Steven Bethard [mailto:steven.bethard@Colorado.EDU]
> > Sent: Saturday, December 08, 2012 10:08 AM
> > To: ctakes-dev@incubator.apache.org
> > Subject: Re: type system changes needed to read SHARP data
> >
> > On Dec 7, 2012, at 6:14 PM, Jörn Kottmann <ko...@gmail.com> wrote:
> > > Anyway to do an annotation project efficiently a web based tool like
> > > brat is better than the Cas Editor, but brat is not easy to
> > > integrate with UIMA currently. For now we are doing it half-half,
> > > the first annotation work is done with the Cas Editor (layout,
> > > sentences, tokens, named entities), and the more advanced tasks are
> > > done with brat (e.g. relations, coref, disambiguation).
> >
> > Yeah, agreed that web based annotation tools are the way to go. I
> > would love to see a BRAT-like tool that could work directly from a
> > UIMA type system schema. But I'm not going to hold my breath. ;-)
> >
> > Steve

Re: type system changes needed to read SHARP data

Posted by Jörn Kottmann <ko...@gmail.com>.
On 12/12/2012 05:24 PM, Chen, Pei wrote:
>> Yeah, agreed that web based annotation tools are the way to go. I would
>> love to see a BRAT-like tool that could work directly from a UIMA type
>> system schema. But I'm not going to hold my breath. ;-)
> It sounds like a comprehensive annotation tool (BRAT on top of UIMA) that works directly from a UIMA type system schema would be a common tool that would benefit the entire UIMA community; not just OpenNLP or cTAKES.  Perhaps we can combine our efforts.
>

Over at OpenNLP we are working on the Corpus Server, which can be used to
host a set of XMI files and share them among a group of annotators.
It would be really nice if we could find a way to attach BRAT to this
server. UIMA-based trainers can be connected to the Corpus Server via
a Collection Reader that fetches the training material from it.

Jörn

RE: type system changes needed to read SHARP data

Posted by "Chen, Pei" <Pe...@childrens.harvard.edu>.
> Yeah, agreed that web based annotation tools are the way to go. I would
> love to see a BRAT-like tool that could work directly from a UIMA type
> system schema. But I'm not going to hold my breath. ;-)

It sounds like a comprehensive annotation tool (BRAT on top of UIMA) that works directly from a UIMA type system schema would be a common tool that would benefit the entire UIMA community; not just OpenNLP or cTAKES.  Perhaps we can combine our efforts.

--Pei

> -----Original Message-----
> From: Steven Bethard [mailto:steven.bethard@Colorado.EDU]
> Sent: Saturday, December 08, 2012 10:08 AM
> To: ctakes-dev@incubator.apache.org
> Subject: Re: type system changes needed to read SHARP data
> 
> On Dec 7, 2012, at 6:14 PM, Jörn Kottmann <ko...@gmail.com> wrote:
> > Anyway to do an annotation project efficiently a web based tool like
> > brat is better than the Cas Editor, but brat is not easy to integrate
> > with UIMA currently. For now we are doing it half-half, the first
> > annotation work is done with the Cas Editor (layout, sentences,
> > tokens, named entities), and the more advanced tasks are done with
> > brat (e.g. relations, coref, disambiguation).
> 
> Yeah, agreed that web based annotation tools are the way to go. I would
> love to see a BRAT-like tool that could work directly from a UIMA type
> system schema. But I'm not going to hold my breath. ;-)
> 
> Steve

Re: type system changes needed to read SHARP data

Posted by Steven Bethard <st...@Colorado.EDU>.
On Dec 7, 2012, at 6:14 PM, Jörn Kottmann <ko...@gmail.com> wrote:
> Anyway to do an annotation project efficiently a web based tool like 
> brat is better than the Cas Editor, but brat is not
> easy to integrate with UIMA currently. For now we are doing it 
> half-half, the first annotation work is done with the Cas Editor 
> (layout, sentences,
> tokens, named entities), and the more advanced tasks are done with brat 
> (e.g. relations, coref, disambiguation).

Yeah, agreed that web based annotation tools are the way to go. I would love to see a BRAT-like tool that could work directly from a UIMA type system schema. But I'm not going to hold my breath… ;-)

Steve

Re: type system changes needed to read SHARP data

Posted by Jörn Kottmann <ko...@gmail.com>.
On 12/07/2012 06:02 PM, Chen, Pei wrote:
>> CAS Editor doesn't count - that's not usable for any real large-scale complex annotation
> Can we extend the existing UIMA CAS Editor? Or update Knowtator to work directly off the UIMA type system file (objects)? Or even use something like BRAT (http://brat.nlplab.org/) that could work directly off a UIMA type system file/objects?

I extended the Cas Editor with some plugins to suit my annotation needs.
Some of them are open source, like the OpenNLP integration or the
connector to the Corpus Server.
Anyway, to do an annotation project efficiently, a web-based tool like
brat is better than the Cas Editor, but brat is currently not easy to
integrate with UIMA. For now we are doing it half and half: the first
annotation work is done with the Cas Editor (layout, sentences, tokens,
named entities), and the more advanced tasks are done with brat
(e.g. relations, coref, disambiguation).

For my next annotation project I will probably try to do the named
entities with brat as well, but currently it's too slow compared to the
Cas Editor with the OpenNLP support.

Jörn



RE: type system changes needed to read SHARP data

Posted by "Chen, Pei" <Pe...@childrens.harvard.edu>.
> What we'd really want is an annotation tool that works directly off of a UIMA type system file
I was thinking the same... I think it's fine to have additional types that are not used, since they can easily be subsetted.  For example, CTS could be a subset of this larger schema.  But I think the key would be sharing the same underlying common types/schema/objects, so that any new types created in the annotation tool/schema could be reused automatically by the system (or subsetted, which is an easier problem to solve than mapping).

> CAS Editor doesn't count - that's not usable for any real large-scale complex annotation
Can we extend the existing UIMA CAS Editor? Or update Knowtator to work directly off the UIMA type system file (objects)? Or even use something like BRAT (http://brat.nlplab.org/) that could work directly off a UIMA type system file/objects?



> -----Original Message-----
> From: Steven Bethard [mailto:steven.bethard@Colorado.EDU]
> Sent: Friday, December 07, 2012 5:33 AM
> To: ctakes-dev@incubator.apache.org
> Subject: Re: type system changes needed to read SHARP data
> 
> On Dec 5, 2012, at 9:36 PM, "Wu, Stephen T., Ph.D."
> <Wu...@mayo.edu> wrote:
> > For 3.1.0 the type system should include more than just "LabMention,
> > ProcedureMention, SignSymptomMention, DiseaseDisorderMention,
> > AnatomicalSiteMention."  It should also include the exhaustive list of
> > attributes, which would come as subtypes of Modifier.
> 
> +1
> 
> On Dec 6, 2012, at 8:20 PM, "Chen, Pei" <Pe...@childrens.harvard.edu>
> wrote:
> > Just taking a step back,  should there always be a 1-1 mapping between
> human annotated data (Knowtator schema) and the System annotated data
> (cTAKES type system)?
> > If this is true, then should they really share the schema then?  i.e.
> > Can the annotation tool(s) be auto generated/based off the type system
> > schema or vice versa then?  Just thinking of ways we may save time
> > with mappings...
> 
> Ideally yes, they should share exactly the same schema. The main problem
> here is annotation tools. What we'd really want is an annotation tool that
> works directly off of a UIMA type system file. But I don't know of any such
> tool. (And no, the CAS Editor doesn't count - that's not usable for any real
> large-scale complex annotation.)
> 
> Steve

Re: type system changes needed to read SHARP data

Posted by Steven Bethard <st...@Colorado.EDU>.
On Dec 5, 2012, at 9:36 PM, "Wu, Stephen T., Ph.D." <Wu...@mayo.edu> wrote:
> For 3.1.0 the type system should include more than just "LabMention,
> ProcedureMention, SignSymptomMention, DiseaseDisorderMention,
> AnatomicalSiteMention."  It should also include the exhaustive list of
> attributes, which would come as subtypes of Modifier.

+1

On Dec 6, 2012, at 8:20 PM, "Chen, Pei" <Pe...@childrens.harvard.edu> wrote:
> Just taking a step back,  should there always be a 1-1 mapping between human annotated data (Knowtator schema) and the System annotated data (cTAKES type system)?
> If this is true, then should they really share the schema then?  i.e. Can the annotation tool(s) be auto generated/based off the type system schema or vice versa then?  Just thinking of ways we may save time with mappings…

Ideally yes, they should share exactly the same schema. The main problem here is annotation tools. What we'd really want is an annotation tool that works directly off of a UIMA type system file. But I don't know of any such tool. (And no, the CAS Editor doesn't count - that's not usable for any real large-scale complex annotation.)
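
For what it's worth, the first step of such a tool (enumerating the types
declared in a type system descriptor) is small. The sketch below is only an
illustration: it parses an inline descriptor with the JDK's DOM parser rather
than the UIMA API, and the type names org.example.Modifier and
org.example.SeverityModifier are hypothetical, not cTAKES types. It assumes
each typeDescription lists its own name element before any feature names, as
in the sample.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class TypeSystemListingSketch {
    // Minimal descriptor with two hypothetical types.
    static final String DESCRIPTOR =
        "<typeSystemDescription xmlns=\"http://uima.apache.org/resourceSpecifier\">"
      + "<types>"
      + "<typeDescription>"
      + "<name>org.example.Modifier</name>"
      + "<supertypeName>uima.tcas.Annotation</supertypeName>"
      + "</typeDescription>"
      + "<typeDescription>"
      + "<name>org.example.SeverityModifier</name>"
      + "<supertypeName>org.example.Modifier</supertypeName>"
      + "</typeDescription>"
      + "</types>"
      + "</typeSystemDescription>";

    // Returns type name -> supertype name, in declaration order.
    static Map<String, String> readTypes(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList descriptions = doc.getElementsByTagNameNS("*", "typeDescription");
            Map<String, String> types = new LinkedHashMap<>();
            for (int i = 0; i < descriptions.getLength(); i++) {
                Element d = (Element) descriptions.item(i);
                // First <name> descendant is taken to be the type's own name.
                String name = d.getElementsByTagNameNS("*", "name")
                               .item(0).getTextContent().trim();
                String superType = d.getElementsByTagNameNS("*", "supertypeName")
                                    .item(0).getTextContent().trim();
                types.put(name, superType);
            }
            return types;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read descriptor", e);
        }
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : readTypes(DESCRIPTOR).entrySet()) {
            System.out.println(e.getKey() + " extends " + e.getValue());
        }
    }
}
```

An annotation tool built this way would still need everything else (features,
relations, a UI), which is presumably where the real effort lies.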

Steve

Re: type system changes needed to read SHARP data

Posted by "Wu, Stephen T., Ph.D." <Wu...@mayo.edu>.
This was our hesitation.  We didn't believe that the annotation schema
should unequivocally set the schema of a "common" type system.

Unfortunately, there are multiple people trying to define the semantics of
what gets stored.  The annotations people went ahead and defined an
annotation schema very early on because it was necessary in order to get
actual annotations out.  But I think other people (i.e., the SHARP CEM and
data norm people) have continued tinkering under the hood.  So we could end
up with more types that were not in the original Knowtator annotations.

Currently, all of our stuff (CEM->type system, annotation schema->type
system, task needs->type system, type system->documentation) requires manual
work because there are so many sources, and all conflicts would have to be
mediated anyways.  That's frustrating, and I don't know how to change it.
Suggestions are welcome.

stephen

On 12/6/12 1:20 PM, "Chen, Pei" <Pe...@childrens.harvard.edu> wrote:

> Hi Steven,
> +1 it seems reasonable.
> 
> Just taking a step back,  should there always be a 1-1 mapping between human
> annotated data (Knowtator schema) and the System annotated data (cTAKES type
> system)?
> If this is true, then should they really share the schema then?  i.e. Can the
> annotation tool(s) be auto generated/based off the type system schema or vice
> versa then?  Just thinking of ways we may save time with mappings...
> 
> --Pei
> 


RE: type system changes needed to read SHARP data

Posted by "Chen, Pei" <Pe...@childrens.harvard.edu>.
Hi Steven,
+1 it seems reasonable.

Just taking a step back,  should there always be a 1-1 mapping between human annotated data (Knowtator schema) and the System annotated data (cTAKES type system)?
If this is true, then should they really share the schema then?  i.e. Can the annotation tool(s) be auto generated/based off the type system schema or vice versa then?  Just thinking of ways we may save time with mappings...

--Pei



Re: type system changes needed to read SHARP data

Posted by "Wu, Stephen T., Ph.D." <Wu...@mayo.edu>.
Maybe I should've added some additional considerations around
SHARPKnowtatorXMLReader to the discussion...

The previous way to handle all these modifiers was to directly map them to
the named entities that they're associated with.  Again taking negation as
an example, we hadn't been creating a Modifier subtype for polarity, but
just set the value of a named entity as negated.

Storing all of these attributes as Modifier subtypes in
SHARPKnowtatorXMLReader does not eliminate the need to map these subtypes to
NEs.  The Knowtator data includes both the spans of modifiers AND the
assignment of values to the NEs.

So there are some modifiers that you'd never be interested in evaluating on
their own apart from the NEs.  However, I'm agreeing with the previous
proposition because there are other modifiers that are interesting to
evaluate apart from NEs, and we should just keep things consistent.
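
A minimal sketch of keeping both representations side by side, in plain Java
with hypothetical class names (EntityMention, NegationModifier) and a boolean
"negated" flag chosen for this example; it is not the
SHARPKnowtatorXMLReader code. The modifier spans stay in the list so they can
be evaluated on their own, and the value is also copied onto the NE where a
target exists.

```java
import java.util.ArrayList;
import java.util.List;

class ModifierToEntitySketch {
    // Hypothetical named-entity mention with a negation value.
    static class EntityMention {
        final int begin;
        final int end;
        boolean negated = false;
        EntityMention(int begin, int end) { this.begin = begin; this.end = end; }
    }

    // Hypothetical negation modifier: a span plus an optional target NE.
    static class NegationModifier {
        final int begin;
        final int end;
        final EntityMention target; // null when the modifier stands alone
        NegationModifier(int begin, int end, EntityMention target) {
            this.begin = begin;
            this.end = end;
            this.target = target;
        }
    }

    // Copies the negation value onto each targeted NE and returns how many
    // NEs were updated. The modifier list itself is left untouched.
    static int applyNegations(List<NegationModifier> modifiers) {
        int applied = 0;
        for (NegationModifier m : modifiers) {
            if (m.target != null) {
                m.target.negated = true;
                applied++;
            }
        }
        return applied;
    }

    public static void main(String[] args) {
        EntityMention fever = new EntityMention(3, 8);
        List<NegationModifier> modifiers = new ArrayList<>();
        modifiers.add(new NegationModifier(0, 2, fever)); // modifies the NE
        modifiers.add(new NegationModifier(20, 23, null)); // standalone span
        System.out.println(applyNegations(modifiers));
        System.out.println(fever.negated);
    }
}
```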

stephen

