You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@uima.apache.org by "Marshall Schor (JIRA)" <de...@uima.apache.org> on 2014/11/21 19:35:33 UTC
[jira] [Resolved] (UIMA-4003) add alternative int - int maps, and
int sets for better space/time performance
[ https://issues.apache.org/jira/browse/UIMA-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marshall Schor resolved UIMA-4003.
----------------------------------
Resolution: Fixed
Switched ListUtils to use PositiveIntSet
> add alternative int - int maps, and int sets for better space/time performance
> ------------------------------------------------------------------------------
>
> Key: UIMA-4003
> URL: https://issues.apache.org/jira/browse/UIMA-4003
> Project: UIMA
> Issue Type: Improvement
> Components: Core Java Framework
> Affects Versions: 2.6.0SDK
> Reporter: Marshall Schor
> Assignee: Marshall Schor
> Priority: Minor
> Fix For: 2.7.0SDK
>
>
> int - int maps and int sets are implemented in UIMA as either red-black trees (size = 5 words (4 words for sets) + 1 bit per item, search time = log 2 size (binary search), insert /removal can cause rebalancing tree), or as intVectors (like ArrayList<Integers> but doesn't wrap ints as Integers).
> For int - int maps, add a hash version (loses key "ordering"), which takes 3 - 6 words per item (avg 4.5 words - slightly smaller), and has O(1) performance (based on existing JCasHashMap impl, but without concurrency support).
> For int sets, add 2 styles: one based on the same hash map (space = 1.5 to 3 words per item vs 4 words), the other based on BitSets. The BitSet one would be faster, would have a conventional key ordering, and would be smaller, if the max int stored < 128 * number of items stored. (This is often the case in uses where it is used to track Feature Structure addresses).
> Finally, add one adaptable int set that is usually is implemented as a BitSet, but if the items being stored are such that the size would be significantly larger than the IntHashSet implementation, switch to that alternative representation.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)