Posted to dev@uima.apache.org by Pablo Duboue <pa...@gmail.com> on 2016/11/19 18:12:14 UTC

[UIMA 3.0] Typesystem / select

(Two smaller comments, this is my last email. Have a nice weekend!)


[UIMA 3.0] Typesystem

Page 3

PEAR support: multiple type definition errors. What about exact
duplicates? For example, if two PEARs ship the same OpenNLP JCasGen'ed
types.

I don't know what "committed" means. It seems an internal detail that
might be better introduced. The discussion regarding type system
sharing is unclear whether this is a problem with the old system or
the new system.


[UIMA 3.0] Select

I love the select mechanism. I wonder if we can have somebody comment
on whether its use is similar to other selects (like XSL and JQuery).

Some of the predicates seem a subset of what RuTA offers. Maybe it is
worth extending the list so that as many predicates as possible are
shared with RuTA?
That will also simplify the RuTA learning curve. (This might be better
off discussed on the issue tracker.)

Besides limit and nullOk in 3.3.2 I would add filterNull.

Re: [UIMA 3.0] Typesystem / select

Posted by Peter Klügl <pe...@averbis.com>.
>> [UIMA 3.0] Select
>>
>> I love the select mechanism. I wonder if we can have somebody comment
>> on whether its use is similar to other selects (like XSL and JQuery).
>>
>> Some of the predicates seem a subset of what RuTA offers. Maybe it is
>> worth extending the list so that as many predicates as possible are
>> shared with RuTA?
>> That will also simplify the RuTA learning curve. (This might be better
>> off discussed on the issue tracker.)
> Maybe - I don't think anyone's looked at this yet.
>

I had a similar thought. There are some plans to provide a Ruta
implementation that builds only on UIMA/uimaFIT implementations, meaning
that the RutaBasic stuff is removed. Thus, adding some additional
predicates that are useful for sequential matching would be great.

I am currently quite busy and have not found the time to catch up with
the current development.

I assume that an incremental process (have a first version, then see
what is missing for Ruta) is the most realistic. Are there some specific
predicates you were thinking about?

Best,

Peter

Re: [UIMA 3.0] Typesystem / select

Posted by Marshall Schor <ms...@schor.com>.

On 11/19/2016 1:12 PM, Pablo Duboue wrote:
> (Two smaller comments, this is my last email. Have a nice weekend!)
>
>
> [UIMA 3.0] Typesystem
>
> Page 3
>
> PEAR support: multiple type definition errors. What about exact
> duplicates? For example, if two PEARs ship the same OpenNLP JCasGen'ed
> types.
This is somewhat ambiguous, because PEARs are defined to have their own class
loading contexts, in which their locally defined versions of classes override
other definitions of a class higher in the hierarchy.  Note that this is the
opposite of normal Java classloaders, which delegate first to their parent. The
result is that any instance of class "Foo" made by PEAR 1 using its local
definition of that class would cause a ClassCastException if an attempt were
made to use that instance in PEAR 2, which also defines a "Foo" (because it
would be loaded by a different class loader).
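
The effect described above can be reproduced with plain Java, without UIMA.
The following is only a sketch under stated assumptions: PearLoaderSketch,
LocalFirstLoader and the nested Foo class are hypothetical stand-ins, not UIMA
classes, and the loader is a simplified local-first loader rather than the one
PEARs actually use. It also assumes the compiled class files are readable from
the classpath.

```java
import java.io.IOException;
import java.io.InputStream;

class PearLoaderSketch {

    /** Stand-in for a generated type that two PEARs both ship. */
    public static class Foo { }

    /** Hypothetical local-first loader: defines one class itself, delegates the rest. */
    static class LocalFirstLoader extends ClassLoader {
        private final String localName;
        private final byte[] localBytes;

        LocalFirstLoader(String localName, byte[] localBytes, ClassLoader parent) {
            super(parent);
            this.localName = localName;
            this.localBytes = localBytes;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(localName)) {
                return super.loadClass(name, resolve); // normal parent-first path
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    // The local definition wins over the parent's definition.
                    c = defineClass(name, localBytes, 0, localBytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    /** Reads the compiled bytes of a class from the classpath. */
    static byte[] classBytes(Class<?> c) throws IOException {
        String resource = c.getName().replace('.', '/') + ".class";
        try (InputStream in = c.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("class file not found: " + resource);
            }
            return in.readAllBytes();
        }
    }

    /** Loads Foo through a fresh local-first loader, roughly as a PEAR would. */
    static Class<?> loadFooInNewPear() throws Exception {
        ClassLoader parent = PearLoaderSketch.class.getClassLoader();
        ClassLoader pear = new LocalFirstLoader(Foo.class.getName(), classBytes(Foo.class), parent);
        return pear.loadClass(Foo.class.getName());
    }

    /** True if an instance made in "PEAR 1" cannot be cast to the Foo of "PEAR 2". */
    static boolean crossPearCastFails() throws Exception {
        Class<?> fooInPear1 = loadFooInNewPear();
        Class<?> fooInPear2 = loadFooInNewPear();
        Object instance = fooInPear1.getDeclaredConstructor().newInstance();
        try {
            fooInPear2.cast(instance);
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }
}
```

Both loaders define Foo from byte-for-byte identical class files, yet the JVM
treats the results as unrelated classes, which is why even exact duplicates
are not automatically interchangeable across PEARs.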

With some careful work, some accommodation might be possible for common,
well-understood use cases.  I think this would not be in the first
releases, though.
>
> I don't know what "committed" means. It seems an internal detail that
> might be better introduced. The discussion regarding type system
> sharing is unclear whether this is a problem with the old system or
> the new system.
Type systems, at the low level, have a life cycle where you create a type
system "manager", and then add types and features to that type system.  When
you're finished, you "commit" the type system.  At that time, a bunch of
calculations are done to allow high performance, and the type system is "locked
down" against further modifications.  That's what commit means.  For many users,
this is all hidden by other layers being used.

Because the type system is finalized/locked down at commit time, it's possible
to discover that the running UIMA instance already has an identical committed
type system; if so, that one is used instead.  For large scaleouts involving
hundreds or more instances of pipelines, this can amount to a significant
performance improvement.
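
The life cycle and the sharing step can be modelled in a few lines of plain
Java. This is a minimal sketch, not UIMA's implementation: TypeSystemSketch and
all of its methods are hypothetical names, types and features are reduced to
strings, and the "calculations" done at commit time are omitted.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class TypeSystemSketch {

    // Pool of committed type systems, keyed by their content (hypothetical structure).
    private static final Map<Map<String, Set<String>>, TypeSystemSketch> COMMITTED =
            new ConcurrentHashMap<>();

    // Type name -> feature names. Map and Set equality is content-based.
    private final Map<String, Set<String>> types = new LinkedHashMap<>();
    private boolean committed;

    void addType(String typeName) {
        checkOpen();
        types.putIfAbsent(typeName, new LinkedHashSet<>());
    }

    void addFeature(String typeName, String featureName) {
        checkOpen();
        Set<String> features = types.get(typeName);
        if (features == null) {
            throw new IllegalArgumentException("unknown type: " + typeName);
        }
        features.add(featureName);
    }

    /**
     * Locks this type system and returns the instance to use from now on:
     * an already committed identical one if it exists, otherwise this one.
     */
    TypeSystemSketch commit() {
        committed = true; // no mutation is possible after this point
        return COMMITTED.computeIfAbsent(types, key -> this);
    }

    boolean isCommitted() {
        return committed;
    }

    private void checkOpen() {
        if (committed) {
            throw new IllegalStateException("type system is committed");
        }
    }
}
```

In this model, two pipelines that build the same definitions independently end
up holding the same object after commit(), and any later addType or addFeature
call fails with an IllegalStateException.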
>
> [UIMA 3.0] Select
>
> I love the select mechanism. I wonder if we can have somebody comment
> on whether its use is similar to other selects (like XSL and JQuery).
>
> Some of the predicates seem a subset of what RuTA offers. Maybe it is
> worth extending the list so that as many predicates as possible are
> shared with RuTA?
> That will also simplify the RuTA learning curve. (This might be better
> off discussed on the issue tracker.)
Maybe - I don't think anyone's looked at this yet.
>
> Besides limit and nullOk in 3.3.2 I would add filterNull.
It is very easy to add arbitrary filters, because the results of a select
implement the Java 8 stream APIs.  So, you could filter out null values using:
   ... .filter( fs -> fs != null ) ...

Of course, many other filters are similarly trivial; for instance, to filter out
annotations whose span is too small (e.g. less than "epsilon", a "final" Java
int value presumably set earlier):

   ... . filter( fs -> (fs.end() - fs.begin()) > epsilon ) ...
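
For readers who want to try the two filters outside UIMA, here is a
self-contained sketch on an ordinary Java 8 stream. The Span class is a
hypothetical stand-in for an annotation, with begin() and end() accessors named
as in the fragments above. A plain List takes the place of the select result,
so nothing here is the actual select API.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class SelectFilterSketch {

    /** Hypothetical stand-in for an annotation with begin and end offsets. */
    static final class Span {
        private final int begin;
        private final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        int begin() {
            return begin;
        }

        int end() {
            return end;
        }
    }

    /** Drops nulls, then keeps only spans strictly longer than epsilon. */
    static List<Span> keepLongSpans(List<Span> selected, int epsilon) {
        return selected.stream()
                .filter(Objects::nonNull) // same effect as fs -> fs != null
                .filter(fs -> (fs.end() - fs.begin()) > epsilon)
                .collect(Collectors.toList());
    }
}
```

The null filter has to come first; otherwise the span-length lambda would
throw a NullPointerException on a null element.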
>