Posted to user@uima.apache.org by Greg Holmberg <ho...@comcast.net> on 2011/09/16 19:00:41 UTC

Bug in TikaAnnotator/desc/MarkupAnnotatorTypeSystem.xml ?


In TikaAnnotator/desc/MarkupAnnotatorTypeSystem.xml, the MarkupAnnotation type is defined to have a feature named "attributes" of range-type FSArray and element-type FSArray. 

In a small sample of XMI output, I see MarkupAnnotations with "attributes" values referencing objects of type AttributeFS, not FSArray.  For example: 

  <tika:MarkupAnnotation xmi:id="97" sofa="61" begin="33" end="52" attributes="110" name="a" qualifiedName="a" uri="http://www.w3.org/1999/xhtml" /> 

  <tika:AttributeFS xmi:id="110" localName="href" qualifiedName="href" uri="" value="/Title?0091209" /> 

  
Shouldn't the element-type of the "attributes" feature of the MarkupAnnotation type be AttributeFS, not FSArray? 

    <typeDescription>
      <name>org.apache.uima.tika.MarkupAnnotation</name>
      <description/>
      <supertypeName>uima.tcas.Annotation</supertypeName>
      <features>
        <featureDescription>
          <name>attributes</name>
          <description/>
          <rangeTypeName>uima.cas.FSArray</rangeTypeName>
          <elementType>org.apache.uima.tika.AttributeFS</elementType>
        </featureDescription>
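
For contrast, here is a sketch of the feature as the first paragraph describes it, with both the range type and the element type set to FSArray. It is reconstructed from that description, not copied from the file, so details of the real descriptor may differ:

```xml
<featureDescription>
  <name>attributes</name>
  <description/>
  <rangeTypeName>uima.cas.FSArray</rangeTypeName>
  <!-- Reconstructed from the description: the element type names the array type itself -->
  <elementType>uima.cas.FSArray</elementType>
</featureDescription>
```

The only difference from the version above is the value of elementType.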


Thanks,

Greg

Re: Bug in TikaAnnotator/desc/MarkupAnnotatorTypeSystem.xml ?

Posted by Marshall Schor <ms...@schor.com>.
Yes, this seems like a bug.

The range of the "attributes" feature should be an FSArray, each element of
which should be an instance of AttributeFS.
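
As a rough sketch of what that element type might look like, the fragment below declares AttributeFS with the four attributes visible in the XMI sample (localName, qualifiedName, uri, value). The supertype uima.cas.TOP and the uima.cas.String ranges are assumptions inferred from that sample, which carries no sofa, begin, or end; they are not taken from the actual descriptor:

```xml
<typeDescription>
  <name>org.apache.uima.tika.AttributeFS</name>
  <description/>
  <!-- Assumed supertype: the XMI sample shows no annotation offsets on AttributeFS -->
  <supertypeName>uima.cas.TOP</supertypeName>
  <features>
    <!-- Feature names come from the XMI sample; String ranges are assumed -->
    <featureDescription>
      <name>localName</name>
      <description/>
      <rangeTypeName>uima.cas.String</rangeTypeName>
    </featureDescription>
    <featureDescription>
      <name>qualifiedName</name>
      <description/>
      <rangeTypeName>uima.cas.String</rangeTypeName>
    </featureDescription>
    <featureDescription>
      <name>uri</name>
      <description/>
      <rangeTypeName>uima.cas.String</rangeTypeName>
    </featureDescription>
    <featureDescription>
      <name>value</name>
      <description/>
      <rangeTypeName>uima.cas.String</rangeTypeName>
    </featureDescription>
  </features>
</typeDescription>
```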

Can you create a Jira bug report for this, and maybe a patch?

-Marshall  (trying to get others to contribute :-) )
