Posted to dev@uima.apache.org by "Benjamin Segal (JIRA)" <de...@uima.apache.org> on 2012/05/09 00:27:48 UTC

[jira] [Commented] (UIMA-2385) Improve XmiCasDeserializer performance

    [ https://issues.apache.org/jira/browse/UIMA-2385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270896#comment-13270896 ] 

Benjamin Segal commented on UIMA-2385:
--------------------------------------

Upon further analysis, the poor performance is due to (slowly) expanding the CAS heaps.  If the XMI CAS contains hundreds of thousands to millions of 8-, 16-, or 64-bit feature values that are added roughly one (or a few) at a time through the XMI deserialization, then there is a high CAS expansion cost with the current implementation.

The existing byte/short/long heap expansion algorithm is defined in CommonAuxHeap.  The idea is that the heap grows exponentially until a threshold (DEFAULT_HEAP_MULT_LIMIT) is reached, after which point exponential growth is replaced with linear growth.  All that sounds reasonable, *but* DEFAULT_HEAP_MULT_LIMIT is defined to be only 1024.  Once the array grows to 1024, the algorithm only expands by 1024 entries at a time.
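
To make the behavior concrete, here is a minimal sketch of the exponential-then-linear growth just described (illustrative names and starting size, not the actual CommonAuxHeap source):

{code:java}
// Illustrative sketch only -- not the actual CommonAuxHeap code.
// Capacity doubles while it is below the multiplication limit,
// then grows by fixed limit-sized increments.
final class AuxHeapGrowthSketch {

  static final int DEFAULT_HEAP_MULT_LIMIT = 1024;  // the current default

  /** Smallest capacity >= needed under the exponential-then-linear policy. */
  static int newCapacity(int currentCapacity, int needed) {
    int cap = currentCapacity;
    while (cap < needed) {
      cap = (cap < DEFAULT_HEAP_MULT_LIMIT) ? cap * 2 : cap + DEFAULT_HEAP_MULT_LIMIT;
    }
    return cap;
  }

  public static void main(String[] args) {
    // Past 1024 entries, every further 1024 additions forces another
    // reallocation (and full copy) of the backing array.
    System.out.println(newCapacity(1024, 1_500_000));  // ~1,460 linear steps later
  }
}
{code}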

In our case, the XMI deserialization adds only a few feature values at a time, *slowly* expanding the heap out to 1.5 million entries over 18 seconds (for 5 million values it's over 2 minutes; the worse-than-linear scaling is consistent with each 1024-entry expansion recopying the entire array).

Furthermore, the byte/short/long heap seeding values (dealing with CAS expansion) are not currently configurable or exposed through the CAS (CASImpl) or CAS creation utility. 

I think, minimally, we should consider increasing DEFAULT_HEAP_MULT_LIMIT to something a bit larger to allow for quicker expansion.  The regular (32-bit) Heap uses a default size of 500,000.  For what it's worth, I changed DEFAULT_HEAP_MULT_LIMIT to 512K and cut the deserialization time by two-thirds or more (from 18 seconds down to 6 seconds, and from 140 seconds down to 20 seconds).  From a memory footprint perspective, that would allow exponential expansion up to 4 MB (512K entries * 8 bytes for the long heap), and then linear expansion from there, which seems reasonable.
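
For a rough sense of why the limit matters, here is a back-of-the-envelope sketch (not UIMA code) that counts how many array elements get recopied while a heap grows to 1.5 million entries under the two limits:

{code:java}
// Back-of-the-envelope sketch (not UIMA code): counts how many elements are
// recopied by successive reallocations while a heap grows to `target` entries
// under the exponential-then-linear policy with a given multLimit.
public class CopyCostSketch {

  static long copiedElements(int target, int multLimit) {
    long copied = 0;
    int cap = 16;                       // illustrative starting capacity
    while (cap < target) {
      copied += cap;                    // each expansion copies the old array
      cap = (cap < multLimit) ? cap * 2 : cap + multLimit;
    }
    return copied;
  }

  public static void main(String[] args) {
    // multLimit = 1024: on the order of a billion element copies.
    System.out.println(copiedElements(1_500_000, 1024));
    // multLimit = 512K: only a couple of million element copies in total.
    System.out.println(copiedElements(1_500_000, 512 * 1024));
  }
}
{code}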

We could also consider exposing DEFAULT_HEAP_MULT_LIMIT as a "settable" property (something analogous to CAS_INITIAL_HEAP_SIZE), and allow the CAS creation utility to honor the requested limit.
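
If that direction is taken, usage could look something like the existing performance-tuning mechanism.  The sketch below uses the real CAS_INITIAL_HEAP_SIZE key as the model; the aux-heap key shown in the comment is purely hypothetical and does not exist today:

{code:java}
import java.util.Properties;

import org.apache.uima.UIMAFramework;
import org.apache.uima.cas.CAS;
import org.apache.uima.resource.ResourceInitializationException;
import org.apache.uima.resource.metadata.TypeSystemDescription;
import org.apache.uima.util.CasCreationUtils;

public class CasTuningSketch {

  /** Creates a CAS with performance-tuning settings supplied up front. */
  public static CAS createTunedCas(TypeSystemDescription tsd)
      throws ResourceInitializationException {
    Properties perf = new Properties();
    // Existing, supported key: initial size of the main (32-bit) heap.
    perf.setProperty(UIMAFramework.CAS_INITIAL_HEAP_SIZE, "500000");
    // A key for the aux-heap multiplication limit would be analogous, e.g.:
    //   perf.setProperty("cas_aux_heap_mult_limit", "524288");
    // -- hypothetical name, NOT part of the current framework.
    return CasCreationUtils.createCas(tsd, null, null, perf);
  }
}
{code}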

It should also be noted that binary CAS deserialization does *not* suffer this same fate: the total heap sizes are self-described within the binary blob, so they are known before the (heap) arrays are allocated.

It's also worth noting that this overhead may or may not be problematic, depending on the use case.
                
> Improve XmiCasDeserializer performance
> --------------------------------------
>
>                 Key: UIMA-2385
>                 URL: https://issues.apache.org/jira/browse/UIMA-2385
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>    Affects Versions: 2.4.0SDK
>            Reporter: Adam Lally
>            Assignee: Adam Lally
>
> I profiled an expensive CAS deserialization and found that 46% of the time was in CASImpl.ll_getFSForRef (the method that creates a FeatureStructure Java object for a CAS FS).  All those calls were coming from deserializing arrays (of which this particular CAS has many).
> It is unnecessary to create FeatureStructure Java objects here.  For non-array FSs, XmiCasDeserializer uses low-level CAS APIs in order to avoid this overhead.  But for arrays, it currently does not use the low-level APIs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira