Posted to user@uima.apache.org by Greg Holmberg <ho...@comcast.net> on 2011/09/16 19:00:41 UTC
Bug in TikaAnnotator/desc/MarkupAnnotatorTypeSystem.xml ?
In TikaAnnotator/desc/MarkupAnnotatorTypeSystem.xml, the MarkupAnnotation type is defined to have a feature named "attributes" of range-type FSArray and element-type FSArray.
In a small sample of XMI output, I see MarkupAnnotations with "attributes" values referencing objects of type AttributeFS, not FSArray. For example:
<tika:MarkupAnnotation xmi:id="97" sofa="61" begin="33" end="52" attributes="110" name="a" qualifiedName="a" uri="http://www.w3.org/1999/xhtml" />
<tika:AttributeFS xmi:id="110" localName="href" qualifiedName="href" uri="" value="/Title?0091209" />
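The reference from attributes="110" to the element with xmi:id="110" can be checked mechanically. Below is a minimal sketch in Python, not part of TikaAnnotator. It wraps the two elements above in a root element and reports the type of each object that "attributes" points to. The two namespace URIs and the helper names (local_name, referenced_types) are assumptions made for illustration, since the sample shows only the prefixes. The sketch also assumes that multiple array elements would appear as whitespace-separated ids.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs: the sample in this thread shows only the prefixes.
XMI_NS = "http://www.omg.org/XMI"
TIKA_NS = "http:///org/apache/uima/tika.ecore"

# The two elements quoted above, wrapped in a root that declares the prefixes.
SAMPLE = f"""<xmi:XMI xmlns:xmi="{XMI_NS}" xmlns:tika="{TIKA_NS}">
<tika:MarkupAnnotation xmi:id="97" sofa="61" begin="33" end="52" attributes="110" name="a" qualifiedName="a" uri="http://www.w3.org/1999/xhtml" />
<tika:AttributeFS xmi:id="110" localName="href" qualifiedName="href" uri="" value="/Title?0091209" />
</xmi:XMI>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def referenced_types(xmi_text, feature="attributes"):
    """Map each MarkupAnnotation xmi:id to the types its feature refers to."""
    root = ET.fromstring(xmi_text)
    id_attr = f"{{{XMI_NS}}}id"
    # Index every top-level element by its xmi:id.
    by_id = {el.get(id_attr): el for el in root if el.get(id_attr) is not None}
    result = {}
    for el in root:
        if local_name(el.tag) != "MarkupAnnotation":
            continue
        # Assumption: several references are separated by whitespace.
        refs = el.get(feature, "").split()
        result[el.get(id_attr)] = [
            local_name(by_id[r].tag) if r in by_id else None for r in refs
        ]
    return result


print(referenced_types(SAMPLE))
```

For the sample above, annotation 97 resolves to a single AttributeFS, which matches the observation that the array elements are AttributeFS objects rather than nested FSArrays.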
Shouldn't the element-type of the "attributes" feature of the MarkupAnnotation type be AttributeFS, not FSArray? That is:
<typeDescription>
  <name>org.apache.uima.tika.MarkupAnnotation</name>
  <description/>
  <supertypeName>uima.tcas.Annotation</supertypeName>
  <features>
    <featureDescription>
      <name>attributes</name>
      <description/>
      <rangeTypeName>uima.cas.FSArray</rangeTypeName>
      <elementType>org.apache.uima.tika.AttributeFS</elementType>
    </featureDescription>
Thanks,
Greg
Re: Bug in TikaAnnotator/desc/MarkupAnnotatorTypeSystem.xml ?
Posted by Marshall Schor <ms...@schor.com>.
Yes, this seems like a bug.
The range of the "attributes" feature should be an FSArray, each element of
which should be an instance of AttributeFS.
Can you create a Jira bug report for this, and maybe a patch?
-Marshall (trying to get others to contribute :-) )