Posted to user@uima.apache.org by "Kline, Larry D" <La...@USONCOLOGY.COM> on 2013/02/28 17:04:54 UTC

Re: ConceptMapper dictionary entry variant

I'm finally getting around to this.  I think your 1st suggestion would work
except that my task is complicated by the fact that I am stemming the
dictionary.  If I understand the ConceptMapper process correctly I
believe the dictionary is stemmed when it is loaded into memory.  Thus
the matched text is stemmed.  What I need is the original unstemmed
text.  It sounds like I would have to modify the part of ConceptMapper
that does the stemming, or could I do something in the dictionary so it
would keep the stemmed and non-stemmed variants?

 

My dictionary looks like this:

<synonym>
      <token canonical="Primary malignant neoplasm of parotid gland (disorder)"
             ConceptId="372004005"
             BrokerConceptId="ABjfrwDvkAhN1p++2I5XRA"
             G2Id="17001"
             CodingSystem="SNOMED_CT"
             Category="ProblemDef"
             SemClass="Problem"
             POS="NN">
            <variant base="Primary malignant parotid gland Ca" IsPreferredTerm="false" POS="NN"/>
            <variant base="Primary malignant parotid gland malignancy" IsPreferredTerm="false" POS="NN"/>
            <variant base="Primary cancer of parotid gland" IsPreferredTerm="false" POS="NN"/>
            <variant base="Parotid gland cancer" IsPreferredTerm="false" POS="NN"/>
            ...
      </token>
      ...
</synonym>
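
One possible shape for the "do something in the dictionary" idea, as a minimal sketch only: each variant carries its own unstemmed text in an extra attribute. It assumes that stemming on load is applied only to the base text used for matching and not to other attribute values, which has not been checked against the ConceptMapper source. The attribute name OrigText is made up for illustration, and most of the token attributes are left out for brevity.

```xml
<synonym>
      <token canonical="Primary malignant neoplasm of parotid gland (disorder)"
             ConceptId="372004005" POS="NN">
            <!-- OrigText is a hypothetical attribute: a verbatim copy of base,
                 kept so the unstemmed wording can be mapped into the annotation -->
            <variant base="Primary cancer of parotid gland"
                     OrigText="Primary cancer of parotid gland"
                     IsPreferredTerm="false" POS="NN"/>
            <variant base="Parotid gland cancer"
                     OrigText="Parotid gland cancer"
                     IsPreferredTerm="false" POS="NN"/>
      </token>
</synonym>
```

Such an attribute would still have to be mapped to a String feature of the result annotation through the "AttributeList" and "FeatureList" parameters mentioned in the quoted reply.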

 

Thanks

Larry

 

> There are a few ways this can be done, depending on your needs. 

> 

> 1. Set the "ResultingAnnotationMatchedTextFeature" parameter in the
> descriptor. This should be the name of a String feature in your
> result annotation (e.g., DictTerm is used in the distributed example).
> In the resulting annotation instance, it will be filled with the actual
> text that was matched in the input document.

> 

> 2. Set the "MatchedTokensFeatureName" parameter in the descriptor. This
> should be the name of an FSArray feature in your result annotation,
> and will be filled with the tokens matched in the input document.

> 

> 3. Add some other indicator into your dictionary (e.g., a simple
> unique identifier for each variant, or perhaps an id for the parent
> and one for the variant--depends on your needs). Then use the fact that
> you can map any attribute in the dictionary into the resulting
> annotation by setting up the parameters "AttributeList" and
> "FeatureList".

> 

> Let me know if this makes sense, or if you need more information.
> Looking back at the documentation, I can see that things really
> need to be clarified a lot--sorry!
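
For concreteness, the parameters named in the three suggestions above might be set in the annotator descriptor roughly as follows. This is a sketch, not a tested configuration: the parameter names are the ones quoted in the message, while the feature names (matchedText, matchedTokens, conceptId, variantId) and the VariantId dictionary attribute are placeholders, and the corresponding feature declarations in the type system are assumed to exist.

```xml
<configurationParameterSettings>
  <!-- Suggestion 1: String feature that receives the matched document text -->
  <nameValuePair>
    <name>ResultingAnnotationMatchedTextFeature</name>
    <value><string>matchedText</string></value>
  </nameValuePair>
  <!-- Suggestion 2: FSArray feature that receives the matched tokens -->
  <nameValuePair>
    <name>MatchedTokensFeatureName</name>
    <value><string>matchedTokens</string></value>
  </nameValuePair>
  <!-- Suggestion 3: dictionary attributes to copy (VariantId is hypothetical) -->
  <nameValuePair>
    <name>AttributeList</name>
    <value>
      <array>
        <string>ConceptId</string>
        <string>VariantId</string>
      </array>
    </value>
  </nameValuePair>
  <!-- Annotation features that receive those attributes (placeholder names) -->
  <nameValuePair>
    <name>FeatureList</name>
    <value>
      <array>
        <string>conceptId</string>
        <string>variantId</string>
      </array>
    </value>
  </nameValuePair>
</configurationParameterSettings>
```

The two lists are assumed to be positional, with the n-th attribute copied into the n-th feature, so they would need to stay the same length and in the same order.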

> 

> 

> 

> On Jan 4, 2013, at 3:06 PM, "Kline, Larry D" <La...@USONCOLOGY.COM> wrote:

> 

> > Is there some way in ConceptMapper to propagate the actual variant
> > text that matched the document text into the annotation that is
> > created for the match? I don't even see the variant information in
> > the DictionaryResource.DictEntry class, just the base concept
> > information.

> > 


 

 

The contents of this electronic mail message and any attachments are confidential, possibly privileged and intended for the addressee(s) only.
Only the addressee(s) may read, disseminate, retain or otherwise use this message. If received in error, please immediately inform the sender and then delete this message without disclosing its contents to anyone.