You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Lari Hotari (JIRA)" <ji...@apache.org> on 2015/04/21 23:24:00 UTC

[jira] [Commented] (HADOOP-7154) Should set MALLOC_ARENA_MAX in hadoop-config.sh

    [ https://issues.apache.org/jira/browse/HADOOP-7154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505792#comment-14505792 ] 

Lari Hotari commented on HADOOP-7154:
-------------------------------------

There might be other environment settings that should be tuned besides MALLOC_ARENA_MAX.

The [mallopt man page|http://man7.org/linux/man-pages/man3/mallopt.3.html] ("man mallopt") contains an important notice about dynamic mmap threshold in glibc malloc.

{quote}
Note: Nowadays, glibc uses a dynamic mmap threshold by
              default.  The initial value of the threshold is 128*1024, but
              when blocks larger than the current threshold and less than or
              equal to DEFAULT_MMAP_THRESHOLD_MAX are freed, the threshold
              is adjusted upward to the size of the freed block.  When
              dynamic mmap thresholding is in effect, the threshold for
              trimming the heap is also dynamically adjusted to be twice the
              dynamic mmap threshold.  Dynamic adjustment of the mmap
              threshold is disabled if any of the M_TRIM_THRESHOLD,
              M_TOP_PAD, M_MMAP_THRESHOLD, or M_MMAP_MAX parameters is set.
{quote}

More information https://sourceware.org/bugzilla/show_bug.cgi?id=11044 .

I'd suggest disabling dynamic mmap threshold by setting these environment variables (besided MALLOC_ARENA_MAX):
{code}
# tune glibc memory allocation, optimize for low fragmentation
# limit the number of arenas
export MALLOC_ARENA_MAX=4
# disable dynamic mmap threshold, see M_MMAP_THRESHOLD in "man mallopt"
export MALLOC_MMAP_THRESHOLD_=131072
export MALLOC_TRIM_THRESHOLD_=131072
export MALLOC_TOP_PAD_=131072
export MALLOC_MMAP_MAX_=65536
{code}

That would prevent memory fragmentation by keeping using mmap for memory allocations that are over 128K. (the default before dynamic mmap behaviour was introduced)

> Should set MALLOC_ARENA_MAX in hadoop-config.sh
> -----------------------------------------------
>
>                 Key: HADOOP-7154
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7154
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: scripts
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>             Fix For: 1.0.4, 0.22.0
>
>         Attachments: hadoop-7154.txt
>
>
> New versions of glibc present in RHEL6 include a new arena allocator design. In several clusters we've seen this new allocator cause huge amounts of virtual memory to be used, since when multiple threads perform allocations, they each get their own memory arena. On a 64-bit system, these arenas are 64M mappings, and the maximum number of arenas is 8 times the number of cores. We've observed a DN process using 14GB of vmem for only 300M of resident set. This causes all kinds of nasty issues for obvious reasons.
> Setting MALLOC_ARENA_MAX to a low number will restrict the number of memory arenas and bound the virtual memory, with no noticeable downside in performance - we've been recommending MALLOC_ARENA_MAX=4. We should set this in hadoop-env.sh to avoid this issue as RHEL6 becomes more and more common.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)