Posted to issues@arrow.apache.org by "Antoine Pitrou (JIRA)" <ji...@apache.org> on 2019/06/03 12:15:01 UTC

[jira] [Updated] (ARROW-5259) [Java] Add option for ValueVector to allocate buffers with actual size

     [ https://issues.apache.org/jira/browse/ARROW-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antoine Pitrou updated ARROW-5259:
----------------------------------
    Summary: [Java] Add option for ValueVector to allocate buffers with actual size  (was: Add option for ValueVector to allocate buffers with actual size)

> [Java] Add option for ValueVector to allocate buffers with actual size
> ----------------------------------------------------------------------
>
>                 Key: ARROW-5259
>                 URL: https://issues.apache.org/jira/browse/ARROW-5259
>             Project: Apache Arrow
>          Issue Type: Wish
>          Components: Java
>            Reporter: Ji Liu
>            Assignee: Ji Liu
>            Priority: Minor
>
> Currently, _BaseValueVector#computeCombinedBufferSize_ calculates the buffer size from _valueCount_ and _typeWidth_ and then allocates memory for dataBuffer and validityBuffer. However, it allocates more memory than is actually needed unless the size is already a power of 2, because it calls _BaseAllocator.nextPowerOfTwo(bufferSize)_.
> For example, an IntVector with valueCount = 1025 allocates 8192 bytes of buffers, almost double what the data actually requires. As a result, there can be enough memory for the actual data, yet the allocation throws an OOM once the size is rounded up to the next power of 2. I think this problem is entirely avoidable.
> Is it feasible to add an option for ValueVector to allocate buffers with the actual size, rather than rounding up to the next power of 2, in order to reduce memory allocation?
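
The arithmetic behind the IntVector example can be sketched in plain Java. This is a simplified stand-alone model, not Arrow's actual code: the class BufferSizeSketch, its method names, and the 8-byte rounding of each buffer are assumptions made for illustration. Only the inputs (valueCount, typeWidth) and the final rounding to the next power of 2 come from the description above.

```java
// Hypothetical model of the sizing described in the issue; not the Arrow implementation.
class BufferSizeSketch {

    // Round a byte count up to a multiple of 8 (assumed alignment, for illustration).
    static long roundUp8(long size) {
        return (size + 7) / 8 * 8;
    }

    // Smallest power of two that is >= size (returns size if it already is one).
    static long nextPowerOfTwo(long size) {
        long highest = Long.highestOneBit(size);
        return highest == size ? size : highest << 1;
    }

    // Bytes actually needed: data buffer plus one validity bit per value.
    static long actualSize(int valueCount, int typeWidth) {
        long dataBytes = (long) valueCount * typeWidth;
        long validityBytes = (valueCount + 7) / 8;
        return roundUp8(dataBytes) + roundUp8(validityBytes);
    }

    // Bytes requested once the combined size is rounded to the next power of 2.
    static long paddedSize(int valueCount, int typeWidth) {
        return nextPowerOfTwo(actualSize(valueCount, typeWidth));
    }

    public static void main(String[] args) {
        int valueCount = 1025;
        int typeWidth = 4; // a 32-bit int occupies 4 bytes
        System.out.println("actual: " + actualSize(valueCount, typeWidth));
        System.out.println("padded: " + paddedSize(valueCount, typeWidth));
    }
}
```

Under these assumptions, 1025 four-byte values need 4240 bytes (4104 for data, 136 for validity), while the power-of-2 rounding requests 8192 bytes, which matches the near-doubling reported in the issue.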



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)