Posted to issues@arrow.apache.org by "Philipp Moritz (JIRA)" <ji...@apache.org> on 2019/01/31 01:46:00 UTC

[jira] [Resolved] (ARROW-4422) [Plasma] Enforce memory limit in plasma, rather than relying on dlmalloc_set_footprint_limit

     [ https://issues.apache.org/jira/browse/ARROW-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philipp Moritz resolved ARROW-4422.
-----------------------------------
    Resolution: Fixed

Issue resolved by pull request 3526
[https://github.com/apache/arrow/pull/3526]

> [Plasma] Enforce memory limit in plasma, rather than relying on dlmalloc_set_footprint_limit
> --------------------------------------------------------------------------------------------
>
>                 Key: ARROW-4422
>                 URL: https://issues.apache.org/jira/browse/ARROW-4422
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Plasma (C++)
>    Affects Versions: 0.12.0
>            Reporter: Anurag Khandelwal
>            Assignee: Anurag Khandelwal
>            Priority: Minor
>             Fix For: 0.13.0
>
>
> Currently, Plasma relies on dlmalloc_set_footprint_limit to limit the memory utilization for Plasma Store. This is restrictive because:
>  * It restricts Plasma to dlmalloc, which supports limiting memory footprint, as opposed to other, potentially more performant malloc implementations (e.g., jemalloc)
>  * dlmalloc_set_footprint_limit does not guarantee that the limit it sets is the amount of _usable_ memory. As such, we might trigger evictions well before reaching this limit, e.g., due to fragmentation or metadata overheads.
> To overcome this, we can impose the memory limit in Plasma itself by tracking the number of bytes allocated and freed through malloc and free calls. Once the allocated total reaches the set limit, we fail any subsequent allocations (i.e., return NULL from malloc). This means Plasma is no longer tied to dlmalloc, and it also gives more accurate tracking of memory allocation/capacity.
> Caveat: We will need to make sure that the mmapped files live on a file system that is somewhat larger (how much depends on the malloc implementation) than the Plasma memory limit, to account for the extra memory required due to fragmentation/metadata overheads.
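
For illustration only, the byte-tracking approach described above could look roughly like the sketch below. It is not taken from pull request 3526; the class name TrackingAllocator and its methods are hypothetical, it wraps plain std::malloc/std::free rather than any particular malloc implementation, and it is not thread-safe. It assumes the caller passes the allocation size back on free, which avoids storing a per-block header.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical wrapper (not the Plasma API): counts the bytes handed out
// and refuses any request that would push the total past a fixed limit.
class TrackingAllocator {
 public:
  explicit TrackingAllocator(std::size_t limit) : limit_(limit), allocated_(0) {}

  // Returns nullptr if the request would exceed the limit, or if the
  // underlying malloc itself fails.
  void* Allocate(std::size_t bytes) {
    // allocated_ never exceeds limit_, so the subtraction cannot underflow.
    if (bytes > limit_ - allocated_) {
      return nullptr;
    }
    void* p = std::malloc(bytes);
    if (p != nullptr) {
      allocated_ += bytes;
    }
    return p;
  }

  // The caller supplies the size that was originally requested.
  void Free(void* p, std::size_t bytes) {
    if (p == nullptr) {
      return;
    }
    std::free(p);
    allocated_ -= bytes;
  }

  std::size_t allocated() const { return allocated_; }
  std::size_t limit() const { return limit_; }

 private:
  std::size_t limit_;      // maximum number of requested bytes outstanding
  std::size_t allocated_;  // requested bytes currently outstanding
};
```

Note that the counter only reflects requested bytes. The underlying allocator may consume more than that because of fragmentation and metadata, which is why the caveat above asks for headroom on the backing file system.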



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)