Posted to dev@uima.apache.org by "Marshall Schor (JIRA)" <ui...@incubator.apache.org> on 2009/08/30 05:25:34 UTC

[jira] Updated: (UIMA-1089) Space/Time tradeoffs in the CAS

     [ https://issues.apache.org/jira/browse/UIMA-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marshall Schor updated UIMA-1089:
---------------------------------

    Affects Version/s: 2.3

defer beyond 2.3.0

> Space/Time tradeoffs in the CAS
> -------------------------------
>
>                 Key: UIMA-1089
>                 URL: https://issues.apache.org/jira/browse/UIMA-1089
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>    Affects Versions: 2.2.2, 2.3
>            Reporter: Marshall Schor
>            Priority: Minor
>
> Investigate and implement optimizations that trade time (spent running the optimization, under user control) for space.  One such optimization is string sharing.  Detecting sharing opportunities requires additional computation and temporary storage, but results in space savings.  For instance, a common annotation might assign short strings like "noun" to a "part-of-speech" feature.  When processing a large document, there may be a large number of such string-valued features, with values drawn from a small pool of allowable values.  The CAS's string storage could be optimized to share string references in this case, at the cost of temporarily building a hash table of the unique strings and using it to identify sharing possibilities.  A new API call to perform this optimization would confine its time and space overhead to just those users and times where it makes sense.
> An alternative would be to automatically figure this out for some selected kinds of optimizations, but I'm not sure that could be done without impacting finely-tuned systems negatively.
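
The string-sharing idea described above can be illustrated with a minimal Java sketch. This is not UIMA code and not a proposed API: the class name StringSharingSketch, the method shareStrings, and the plain String[] standing in for the CAS's string storage are all assumptions made for illustration. It shows only the general technique: build a temporary hash table of unique strings, then point every equal string at the first instance seen.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names exist in the UIMA API.
final class StringSharingSketch {

    /**
     * Rewrites the array in place so that equal strings share one instance.
     * The String[] is an assumed stand-in for the CAS's string storage.
     * Returns the number of slots redirected to an earlier instance.
     */
    static int shareStrings(String[] values) {
        // Temporary table of unique strings; it becomes garbage once we return,
        // so the extra space is only paid while the optimization runs.
        Map<String, String> canonical = new HashMap<>();
        int redirected = 0;
        for (int i = 0; i < values.length; i++) {
            String s = values[i];
            if (s == null) {
                continue; // unset feature values are left alone
            }
            // putIfAbsent returns the previously stored instance, or null
            // if this is the first time an equal string has been seen.
            String first = canonical.putIfAbsent(s, s);
            if (first != null && first != s) {
                values[i] = first; // share the earlier instance
                redirected++;
            }
        }
        return redirected;
    }

    public static void main(String[] args) {
        // new String(...) forces distinct instances, as separately
        // created feature values would be.
        String[] values = {
            new String("noun"), new String("verb"),
            new String("noun"), null, new String("noun")
        };
        int redirected = shareStrings(values);
        System.out.println("slots redirected: " + redirected);
        System.out.println("instances shared: " + (values[0] == values[2]));
    }
}
```

Because the work happens only when shareStrings is called, callers that do not benefit pay nothing, which is the motivation given above for an explicit API call rather than automatic detection.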
