Posted to dev@uima.apache.org by Marshall Schor <ms...@schor.com> on 2017/04/30 13:29:18 UTC

Random thoughts about Ruta - performance and space

Hi Peter,

It sounds like one of the difficulties is in "seeing" what is taking up space,
and what kinds of things a person might want to consider to do refactoring.

One idea to think about: could there be some new kind of tracing or reporting
that could be added (conditionally, as it might take space and/or slow things
down) that a user having these kinds of problems could run with to get a useful
report on where to look?

Also, I'm wondering if there are any "findbugs" kinds of tooling (or compilation
mode) (that could be reasonably developed) that users could run against their
rules, to guide them toward suspicious constructs, etc.

Just random thoughts... -Marshall

On 4/30/2017 7:01 AM, Peter Klügl (JIRA) wrote:
> Peter Klügl created UIMA-5414:
> ---------------------------------
>
>              Summary: Ruta: config param for max amount of rule and rule element matches
>                  Key: UIMA-5414
>                  URL: https://issues.apache.org/jira/browse/UIMA-5414
>              Project: UIMA
>           Issue Type: New Feature
>           Components: Ruta
>     Affects Versions: 2.6.0ruta
>             Reporter: Peter Klügl
>
>
> Ruta: config param for max amount of rule and rule element matches. If exceeded, a runtime exception is thrown with the name of the script and the verbalization of the rule/rule element.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.15#6346)
>
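
The limit described in UIMA-5414 could be sketched roughly as follows. This is only an illustration under assumptions: the class `MatchLimitGuard`, the exception `MatchLimitExceededException`, and the constructor parameters are hypothetical names, not part of the Ruta API or its actual implementation.

```java
/** Hypothetical sketch of a per-rule match limit; NOT the Ruta implementation. */
public class MatchLimitGuard {

  /** Unchecked exception that carries the script name and the rule text. */
  public static final class MatchLimitExceededException extends RuntimeException {
    public MatchLimitExceededException(String script, String rule, int limit) {
      super("Rule in script '" + script + "' exceeded " + limit + " matches: " + rule);
    }
  }

  private final String scriptName; // name of the script containing the rule
  private final String ruleText;   // verbalization of the rule or rule element
  private final int maxMatches;    // assumed to come from a configuration parameter
  private int matches;

  public MatchLimitGuard(String scriptName, String ruleText, int maxMatches) {
    this.scriptName = scriptName;
    this.ruleText = ruleText;
    this.maxMatches = maxMatches;
  }

  /** Call once per new match; throws as soon as the limit is exceeded. */
  public void recordMatch() {
    matches++;
    if (matches > maxMatches) {
      throw new MatchLimitExceededException(scriptName, ruleText, maxMatches);
    }
  }

  public int matches() {
    return matches;
  }
}
```

Failing fast like this trades a complete result for an error message that points at the offending rule, which is the information that is hard to "see" otherwise.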


Re: Random thoughts about Ruta - performance and space

Posted by Peter Klügl <pe...@averbis.com>.
Hi,


On 30.04.2017 at 15:29, Marshall Schor wrote:
> Hi Peter,
>
> It sounds like one of the difficulties is in "seeing" what is taking up space,
> and what kinds of things a person might want to consider to do refactoring.
>
> One idea to think about: could there be some new kind of tracing or reporting
> that could be added (conditionally, as it might take space and/or slow things
> down) that a user having these kinds of problems could run with to get a useful
> report on where to look?

The language already has a kind of visitor pattern that records rule
execution and matches, along with the time each language element
needed. Right now, there are a few built-in visitors for debugging and
profiling, which can be activated through configuration parameters. I
just created a ticket to open this up for extensions so that everyone can
plug in their own logging or analysis.

Besides using an actual Java profiler like VisualVM or a commercial one,
these visitors can already be used to estimate the memory usage in terms
of rule matches. The problem is that if you are already running short on
memory, the visitor increases the usage roughly tenfold. A more lightweight
visitor would help here...
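
A more lightweight visitor might keep only counters per rule instead of the match objects themselves. The following is a minimal sketch under that assumption; `MatchCountingVisitor`, `ruleApplied`, and `Stats` are hypothetical names and do not correspond to the existing Ruta visitor interface.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a counting-only visitor; NOT part of the Ruta API. */
public class MatchCountingVisitor {

  /** Per-rule totals: three longs instead of a list of stored matches. */
  public static final class Stats {
    long applications;
    long matches;
    long nanos;

    public long applications() { return applications; }
    public long matches() { return matches; }
    public long nanos() { return nanos; }
  }

  // Keyed by the rule text; insertion order is kept for a readable report.
  private final Map<String, Stats> statsByRule = new LinkedHashMap<>();

  /** Assumed to be called once after each rule application. */
  public void ruleApplied(String ruleText, int matchCount, long elapsedNanos) {
    Stats s = statsByRule.computeIfAbsent(ruleText, k -> new Stats());
    s.applications++;
    s.matches += matchCount;
    s.nanos += elapsedNanos;
  }

  public Stats statsFor(String ruleText) {
    return statsByRule.get(ruleText);
  }

  /** One line per rule: totals only, no references to annotations. */
  public String report() {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, Stats> e : statsByRule.entrySet()) {
      Stats s = e.getValue();
      sb.append(String.format("%d matches, %d applications, %d ms: %s%n",
          s.matches, s.applications, s.nanos / 1_000_000, e.getKey()));
    }
    return sb.toString();
  }
}
```

Because nothing but numbers is retained, the memory overhead grows with the number of rules rather than with the number of matches.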

>
> Also, I'm wondering if there are any "findbugs" kinds of tooling (or compilation
> mode) (that could be reasonably developed) that users could run against their
> rules, to guide them toward suspicious constructs, etc.

Hmm... I haven't thought about it yet. There may be some data-driven
patterns that can be applied automatically, for example a ratio of time
spent vs. matched annotations vs. created annotations. There are definitely
some specific patterns like the ANY+ thing. I probably need to see more
rules written by other people in order to identify them...
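
Such a ratio check could look roughly like the sketch below. Everything here is an assumption for illustration: the class `RuleHeuristics`, the `RuleStats` fields, and both thresholds are made up, and real thresholds would have to be derived from actual rule sets.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical "findbugs"-style check over per-rule statistics; NOT a Ruta feature. */
public class RuleHeuristics {

  /** Totals for one rule, assumed to be collected by some profiling visitor. */
  public static final class RuleStats {
    final String rule;
    final long millis;   // time spent in the rule
    final long matched;  // number of rule matches
    final long created;  // number of annotations the rule created

    public RuleStats(String rule, long millis, long matched, long created) {
      this.rule = rule;
      this.millis = millis;
      this.matched = matched;
      this.created = created;
    }
  }

  /** Returns one warning per rule and exceeded ratio. */
  public static List<String> suspicious(List<RuleStats> all,
      double maxMatchedPerCreated, double maxMillisPerMatch) {
    List<String> warnings = new ArrayList<>();
    for (RuleStats s : all) {
      // Guard against division by zero for rules that matched or created nothing.
      double matchedPerCreated = (double) s.matched / Math.max(1L, s.created);
      double millisPerMatch = (double) s.millis / Math.max(1L, s.matched);
      if (matchedPerCreated > maxMatchedPerCreated) {
        warnings.add("many matches per created annotation: " + s.rule);
      }
      if (millisPerMatch > maxMillisPerMatch) {
        warnings.add("slow per match: " + s.rule);
      }
    }
    return warnings;
  }
}
```

A rule that produces a very large number of matches but only a handful of annotations would be flagged by the first ratio, which is the shape one would expect from patterns like ANY+.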

(I still think that there is a flaw in the implementation, and that
there is still some room for improvement just by thinking outside (my)
box a bit. I have to check when the GC cleans up the rule element matches.)

Best,

Peter
