Posted to dev@uima.apache.org by "Marshall Schor (JIRA)" <de...@uima.apache.org> on 2012/08/27 16:29:07 UTC

[jira] [Updated] (UIMA-2460) Binary deserialization inefficient

     [ https://issues.apache.org/jira/browse/UIMA-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marshall Schor updated UIMA-2460:
---------------------------------

    Affects Version/s: 2.4.0SDK
    
> Binary deserialization inefficient
> ----------------------------------
>
>                 Key: UIMA-2460
>                 URL: https://issues.apache.org/jira/browse/UIMA-2460
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>    Affects Versions: 2.4.0SDK
>            Reporter: Marshall Schor
>            Assignee: Marshall Schor
>            Priority: Minor
>             Fix For: 2.4.1SDK
>
>
> The CAS binary deserialization code can be made (much) more space efficient.  Currently, the char data used by the strings is read into a single char array; each string is represented as an offset into this array plus a length; and new Java strings are created using new String(chararray, offset, length).  This works, but it allocates a new char array for each string and copies the characters from the original array, so every string object ends up with its own char array object.
> The alternative is to reuse the original char array object, and not allocate any other char array objects.  This can be done by:
> * making a temporary string from the entire char array object, and then
> * making the new strings using tempString.substring(offset, offset + length)
> For 1000 strings, this will save 999 char array object overheads (probably about 16 bytes each).
> Additional space savings are possible by reusing the same string object for equal strings.
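
The three approaches described above can be sketched roughly as follows. This is an illustration only, not the UIMA code: the class name StringTableSketch, the method names, and the offsets/lengths arrays are hypothetical. Note also that whether substring() shares the backing char array depends on the JDK; it did on JDKs before 7u6, while later releases copy the characters, in which case only the deduplication step would still save space.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names and data layout are assumptions, not UIMA's.
final class StringTableSketch {

    // Approach described as "current": every String gets its own copied char array.
    static String[] decodeByCopy(char[] chars, int[] offsets, int[] lengths) {
        String[] out = new String[offsets.length];
        for (int i = 0; i < offsets.length; i++) {
            out[i] = new String(chars, offsets[i], lengths[i]);
        }
        return out;
    }

    // Proposed alternative: build one temporary String, then take substrings.
    // On JDKs where substring() shares the backing array, no per-string
    // char array is allocated.
    static String[] decodeBySubstring(char[] chars, int[] offsets, int[] lengths) {
        String whole = new String(chars); // single copy of all char data
        String[] out = new String[offsets.length];
        for (int i = 0; i < offsets.length; i++) {
            out[i] = whole.substring(offsets[i], offsets[i] + lengths[i]);
        }
        return out;
    }

    // Additional saving: equal strings are mapped to one shared String object.
    static String[] decodeDeduplicated(char[] chars, int[] offsets, int[] lengths) {
        String whole = new String(chars);
        Map<String, String> seen = new HashMap<String, String>();
        String[] out = new String[offsets.length];
        for (int i = 0; i < offsets.length; i++) {
            String s = whole.substring(offsets[i], offsets[i] + lengths[i]);
            String canonical = seen.get(s);
            if (canonical == null) {
                seen.put(s, s); // first occurrence becomes the canonical instance
                canonical = s;
            }
            out[i] = canonical;
        }
        return out;
    }

    public static void main(String[] args) {
        char[] chars = "tokensentencetoken".toCharArray();
        int[] offsets = {0, 5, 13};
        int[] lengths = {5, 8, 5};
        String[] strings = decodeDeduplicated(chars, offsets, lengths);
        System.out.println(strings[0] + " " + strings[1] + " " + strings[2]);
    }
}
```

All three methods return equal string contents; they differ only in how many char array and String objects stay reachable afterwards. The map in decodeDeduplicated is needed only during deserialization and can be discarded once the strings have been created.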

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira