Posted to dev@uima.apache.org by Richard Eckart de Castilho <re...@apache.org> on 2017/05/12 16:32:10 UTC

Extracting a type system definition from an XMI file

Hi all,

do we have code somewhere that tries to reverse-engineer
a type system description given an XMI file?

Cheers,

-- Richard

Re: Extracting a type system definition from an XMI file

Posted by Richard Eckart de Castilho <re...@apache.org>.
Thanks Marshall, José and Markus for your feedback / code!

You're right: the XMI is not sufficiently self-describing.
The ambiguity between numeric and referential features is
a critical obstacle to inferring a type system from an XMI
file, at least in my scenario.

Best,

-- Richard

Re: Extracting a type system definition from an XMI file

Posted by mak28ma <ma...@uni-wuerzburg.de>.
Hi,

A colleague once tried that with the attached code. It obviously cannot
recreate the hierarchy, but for some easy cases it's fine.

Best regards

Markus Krug



On 15.05.2017 at 19:02, José Tomás Atria wrote:
> I think Marshall is right, in the sense that you could recover "a" type
> system that would allow for the observed features found in a serialized
> cas, but that resulting type system would be pretty useless, as it would
> have none of the type and hierarchy constraints of the original.
>
> But also, it would probably be full of errors. AFAIK, there's no way to
> distinguish whether an int found in the value of a feature is an actual
> int value or an xmi:id reference to another annotation in the XMI. For
> example, <cas:Annotation someIntValue="1675" someOtherAnnotation="1765">
> offers no way of distinguishing between the literal int value 1675 of the
> "someIntValue" feature and the xmi:id reference value of the
> "someOtherAnnotation" feature.
>
> In brief: this sounds like trying to infer a valid schema from a set of
> conforming instances, which I remember to be a futile effort.


Re: Extracting a type system definition from an XMI file

Posted by José Tomás Atria <jt...@gmail.com>.
I think Marshall is right, in the sense that you could recover "a" type
system that would allow for the observed features found in a serialized
cas, but that resulting type system would be pretty useless, as it would
have none of the type and hierarchy constraints of the original.

But also, it would probably be full of errors. AFAIK, there's no way to
distinguish whether an int found in the value of a feature is an actual
int value or an xmi:id reference to another annotation in the XMI. For
example, <cas:Annotation someIntValue="1675" someOtherAnnotation="1765">
offers no way of distinguishing between the literal int value 1675 of the
"someIntValue" feature and the xmi:id reference value of the
"someOtherAnnotation" feature.

In brief: this sounds like trying to infer a valid schema from a set of
conforming instances, which I remember to be a futile effort.
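
To make the ambiguity concrete, here is a minimal Python sketch (not code
from this thread). It parses a small, made-up XMI fragment and checks every
integer-valued attribute against the xmi:id values present in the document.
The sample document, the "ex" namespace, the feature names, and the function
name classify_features are all assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

XMI_ID = "{http://www.omg.org/XMI}id"

# Hypothetical document: element and feature names are invented for this sketch.
SAMPLE = """<xmi:XMI xmlns:xmi="http://www.omg.org/XMI"
    xmlns:ex="http:///org/example.ecore" xmi:version="2.0">
  <ex:Token xmi:id="1675" begin="0" end="5"/>
  <ex:Token xmi:id="1765" begin="6" end="11"/>
  <ex:Relation xmi:id="1800" someIntValue="1675" someOtherAnnotation="1765"/>
</xmi:XMI>"""


def is_int(text):
    """Return True if the attribute value parses as an integer."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def classify_features(xmi_text):
    """Label each (type, feature) pair by what its values could be."""
    root = ET.fromstring(xmi_text)
    ids = {el.get(XMI_ID) for el in root if el.get(XMI_ID) is not None}
    result = {}
    for el in root:
        local_type = el.tag.rsplit("}", 1)[-1]
        for name, value in el.attrib.items():
            if name.startswith("{"):
                continue  # skip namespaced attributes such as xmi:id
            if not is_int(value):
                kind = "not an int"
            elif value in ids:
                kind = "int or reference"  # the value matches an existing xmi:id
            else:
                kind = "int"
            result[(local_type, name)] = kind
    return result


for key, kind in sorted(classify_features(SAMPLE).items()):
    print(key, kind)
```

Both features of the Relation element are labelled "int or reference", so
even comparing values against the known xmi:ids does not tell the literal
int apart from the reference.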

On Mon, May 15, 2017 at 9:59 AM Marshall Schor <ms...@schor.com> wrote:

> Hi,
>
> The xmi file would contain just a set of "examples" of the type system,
> right?
>
> And there would be nothing there that would indicate the type hierarchy, I
> think, although one might be able to heuristically guess at a possible
> hierarchy, if there were instances of types that were members of various
> levels of the hierarchy, for instance if there was a type
>
> foo  with features foof1 (e.g. string) and foof2 (ref to type bar)
>
> superfoo with features just being foof1 (of same type as foof1 in foo)
>
> Then you might be able to make a guess about the hierarchy...
>
>
> You might mean, instead, to come up with some type system that would "fit"
> the types in the xmi, with no need to have those be the actual type system.
> Even that may be difficult, because an xmi instance doesn't describe the
> data type of its feature values, and the encoding of a feature value is
> ambiguous with respect to the type system, I think. For instance, a feature
> that references another feature structure is encoded as an integer value
> (the xmi:id of the target).
>
> -Marshall

sent from a phone. please excuse terseness and tpyos.

enviado desde un teléfono. por favor disculpe la parquedad y los erroers.

Re: Extracting a type system definition from an XMI file

Posted by Marshall Schor <ms...@schor.com>.
Hi,

The xmi file would contain just a set of "examples" of the type system, right?

And there would be nothing there that would indicate the type hierarchy, I
think, although one might be able to heuristically guess at a possible
hierarchy, if there were instances of types that were members of various levels
of the hierarchy, for instance if there was a type

foo  with features foof1 (e.g. string) and foof2 (ref to type bar)

superfoo with features just being foof1 (of same type as foof1 in foo)

Then you might be able to make a guess about the hierarchy...
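
A rough Python sketch of that guess, under the assumption that the feature
names seen per type have already been collected into sets: a type whose
feature set is a proper subset of another type's feature set is proposed as
a possible supertype. The function name guess_supertypes and the input
layout are hypothetical, and the result is only a heuristic, since unrelated
types can share feature names.

```python
def guess_supertypes(features_by_type):
    """Map each type to a guessed supertype (or None) by feature-set inclusion."""
    guesses = {}
    for child, child_feats in features_by_type.items():
        # Candidates are types whose features are a proper subset of the child's.
        candidates = [
            parent
            for parent, parent_feats in features_by_type.items()
            if parent != child and parent_feats < child_feats
        ]
        # Prefer the candidate with the most features, i.e. the closest match.
        guesses[child] = max(
            candidates, key=lambda p: len(features_by_type[p]), default=None
        )
    return guesses


# The foo / superfoo case from the example above.
observed = {"foo": {"foof1", "foof2"}, "superfoo": {"foof1"}}
print(guess_supertypes(observed))
```

For the foo / superfoo case this proposes superfoo as the parent of foo and
no parent for superfoo. It says nothing about types that never occur in the
file.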


You might mean, instead, to come up with some type system that would "fit" the
types in the xmi, with no need to have those be the actual type system.  Even
that may be difficult, because an xmi instance doesn't describe the data
type of its feature values, and the encoding of a feature value is ambiguous
with respect to the type system, I think. For instance, a feature that
references another feature structure is encoded as an integer value (the
xmi:id of the target).
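
As a sketch of how far such a "fitting" type system can get, the Python
below collects only what the file shows: the type names and the feature
names seen on each type. It assumes the usual XMI namespace convention in
which a URI such as http:///org/example.ecore stands for the package
org.example; the sample document and the names type_name and
observed_features are made up for illustration. It deliberately records no
range types and no supertypes, and it ignores features serialized as child
elements.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical document used only for this sketch.
SAMPLE = """<xmi:XMI xmlns:xmi="http://www.omg.org/XMI"
    xmlns:ex="http:///org/example.ecore" xmi:version="2.0">
  <ex:Token xmi:id="12" sofa="1" begin="0" end="5" pos="NN"/>
  <ex:Token xmi:id="20" sofa="1" begin="6" end="11"/>
</xmi:XMI>"""


def type_name(tag):
    """Turn '{http:///org/example.ecore}Token' into 'org.example.Token'."""
    if not tag.startswith("{"):
        return tag
    uri, local = tag[1:].split("}", 1)
    package = uri
    if package.startswith("http:///"):
        package = package[len("http:///"):]
    if package.endswith(".ecore"):
        package = package[: -len(".ecore")]
    return package.replace("/", ".") + "." + local


def observed_features(xmi_text):
    """Collect the feature names observed on each type in the document."""
    features = defaultdict(set)
    for el in ET.fromstring(xmi_text):
        for attr in el.attrib:
            if not attr.startswith("{"):  # skip xmi:id and other namespaced attrs
                features[type_name(el.tag)].add(attr)
    return dict(features)


print(observed_features(SAMPLE))
```

Note that sofa, begin and end are reported as features of the Token type
itself, because nothing in the file says they are inherited.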

-Marshall


On 5/12/2017 12:32 PM, Richard Eckart de Castilho wrote:
> Hi all,
>
> do we have code somewhere that tries to reverse-engineer
> a type system description given an XMI file?
>
> Cheers,
>
> -- Richard
>