Posted to issues@arrow.apache.org by "Uwe L. Korn (JIRA)" <ji...@apache.org> on 2017/01/08 15:36:58 UTC
[jira] [Commented] (ARROW-464) C++: More intelligent array growing
[ https://issues.apache.org/jira/browse/ARROW-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15809567#comment-15809567 ]
Uwe L. Korn commented on ARROW-464:
-----------------------------------
I implemented the suggested changes locally and measured a performance drop of 20%: the additional reallocations remain quite expensive, even with the in-place resizing provided by {{jemalloc}}. In our case, the overhead of calling {{nallocx}} is also larger than what we save by fully using the allocated space (wastage is always <4KiB). The optimizations mentioned above therefore still sound good to me in theory but do not seem to bring a practical benefit.
Closing this as it is not a pressing issue. We can tackle this once someone has a use case.
> C++: More intelligent array growing
> -----------------------------------
>
> Key: ARROW-464
> URL: https://issues.apache.org/jira/browse/ARROW-464
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Uwe L. Korn
> Assignee: Uwe L. Korn
>
> Three things to consider:
> * Instead of always growing the memory to twice the size in the Builders, we should consider 1.5 as a growth factor, as this leads to less memory wastage. As the allocation cost is mostly linear in the number of newly requested pages, this shouldn't have a noticeable impact. Because memory below the size of a single page (i.e. 4KiB) cannot be expanded in place, we should keep the factor of 2 there.
> * In the case of {{jemalloc}}, we can use the function {{nallocx}} to ask the allocator for the size it would actually allocate for a requested size, use that size as the new capacity for the Builder/Buffer/..., and not trigger a reallocation as long as the output of {{nallocx}} doesn't change.
> * If not using {{jemalloc}}, we should be careful about the cost of minor allocation size changes. In these cases it would be preferable to also have an implementation along the lines of {{nallocx}} that does some rounding to avoid unnecessary reallocations.
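The three points above can be sketched as a single capacity policy. This is only an illustration under stated assumptions, not Arrow's actual Builder code: the names kPageSize, kSmallGranule, RoundUpToMultiple, UsableSize and NextCapacity are hypothetical, and the rounding in UsableSize is a simple stand-in for what {{nallocx}} would report, not jemalloc's real size classes.

```cpp
#include <cstddef>

// Hypothetical constants for illustration; neither is taken from Arrow.
constexpr std::size_t kPageSize = 4096;    // 4KiB page, as in the issue text
constexpr std::size_t kSmallGranule = 64;  // assumed rounding step below one page

// Round n up to the next multiple of `multiple` (multiple must be non-zero).
constexpr std::size_t RoundUpToMultiple(std::size_t n, std::size_t multiple) {
  return (n + multiple - 1) / multiple * multiple;
}

// Stand-in for an nallocx-like query: the size a hypothetical allocator would
// really hand out for `requested` bytes. Sub-page requests are rounded to
// kSmallGranule, larger requests to whole pages.
constexpr std::size_t UsableSize(std::size_t requested) {
  return RoundUpToMultiple(requested,
                           requested < kPageSize ? kSmallGranule : kPageSize);
}

// Next capacity for a buffer that currently holds `current` bytes and must
// hold at least `required` bytes. Below one page the factor of 2 is kept;
// from one page upwards the buffer grows by a factor of 1.5.
constexpr std::size_t NextCapacity(std::size_t current, std::size_t required) {
  std::size_t grown =
      current < kPageSize ? current * 2 : current + current / 2;
  if (grown < required) {
    grown = required;  // a single large append can exceed the growth factor
  }
  // Claim everything the allocator would give us anyway, so that later small
  // size changes do not trigger another reallocation.
  return UsableSize(grown);
}
```

A caller would reallocate only when the required size exceeds the stored capacity, and then request NextCapacity(capacity, required) bytes. With jemalloc available, UsableSize could presumably be replaced by a call to {{nallocx}}, although the comment above reports that this call cost more than it saved in practice.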
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)