Posted to common-issues@hadoop.apache.org by "Sean Chow (Jira)" <ji...@apache.org> on 2020/08/15 10:26:00 UTC

[jira] [Commented] (HADOOP-17209) ErasureCode native library memory leak

    [ https://issues.apache.org/jira/browse/HADOOP-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17178225#comment-17178225 ] 

Sean Chow commented on HADOOP-17209:
------------------------------------

To dig this out, I've enabled JVM {{Native Memory Tracking}} and found that it's a JNI (Java Native Interface) call issue:

 
{code:java}
[0x00007f2a7b5e2ea5] jni_GetIntArrayElements+0x2b5
 (malloc=2444372KB +225563KB #433115199 +31527004)
[0x00007f2a7b5e2ea5] jni_GetIntArrayElements+0x2b5
[0x00007f2a4c7c35b9] getOutputs+0x79
 (malloc=2457719KB +227373KB #434823615 +31758718)
[0x00007f2a7b5e2ea5] jni_GetIntArrayElements+0x2b5
[0x00007f2a4c7c34c9] getInputs+0x79
 (malloc=8479302KB +618477KB #434823615 +31758718)
{code}
 

The attached file datanode.202137.detail_diff.5.txt is the output of {{jcmd 202137 VM.native_memory detail.diff}}, tracking about 24 hours of growth from the baseline (each {{+...KB}} figure is the growth since the baseline, and {{#}} is the allocation count).
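
For anyone wanting to reproduce the tracking, these are the standard JDK flag and jcmd subcommands (202137 is the datanode pid from the attachment):
{code}
# Start the JVM with detailed native memory tracking enabled:
-XX:NativeMemoryTracking=detail

# Record a baseline, then diff against it about 24 hours later:
jcmd 202137 VM.native_memory baseline
jcmd 202137 VM.native_memory detail.diff
{code}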

 

Okay, it's clear now. Memory obtained via the JNI function {{GetIntArrayElements}} has to be released manually, according to [https://www.eg.bucknell.edu/~mead/Java-tutorial/native1.1/implementing/array.html]
{quote}
Similar to the Get<type>ArrayElements functions, the JNI provides a set of Release<type>ArrayElements functions. Do not forget to call the appropriate Release<type>ArrayElements function, such as ReleaseIntArrayElements. If you forget to make this call, the array stays pinned for an extended period of time. Or, the Java Virtual Machine is unable to reclaim the memory used to store the nonmovable copy of the array.
{quote}
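
A minimal JNI sketch of that pairing (the class and method names here are hypothetical, purely to illustrate the rule):
{code:c}
#include <jni.h>

/* Hypothetical native method that sums a Java int[]. Every
 * successful GetIntArrayElements must be paired with a
 * ReleaseIntArrayElements, or the native copy/pin is leaked. */
JNIEXPORT jlong JNICALL
Java_Example_sumInts(JNIEnv *env, jclass clazz, jintArray arr)
{
    jlong sum = 0;
    jsize len = (*env)->GetArrayLength(env, arr);
    jint *elems = (*env)->GetIntArrayElements(env, arr, NULL);
    if (elems == NULL) {
        return 0; /* OutOfMemoryError is already pending */
    }
    for (jsize i = 0; i < len; i++) {
        sum += elems[i];
    }
    /* JNI_ABORT frees the buffer without copying data back,
     * which is fine because the array was only read. */
    (*env)->ReleaseIntArrayElements(env, arr, elems, JNI_ABORT);
    return sum;
}
{code}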
But I didn't find any call to {{ReleaseIntArrayElements}} in the erasure coding native code.

I have a patch running on my datanode. If it works fine, I'll attach the patch soon.
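
To show the shape of the change, here is a simplified stand-in for a {{getInputs}}-style helper (variable names and structure are illustrative, not the actual Hadoop source):
{code:c}
#include <jni.h>

/* Illustrative sketch of the leak pattern and its fix: the
 * offsets array is fetched with GetIntArrayElements, so it must
 * be released before returning. */
static void getInputsSketch(JNIEnv *env, jobjectArray inputs,
                            jintArray inputOffsets,
                            unsigned char **destInputs, int num)
{
    jint *offsets = (*env)->GetIntArrayElements(env, inputOffsets, NULL);
    if (offsets == NULL) {
        return; /* OutOfMemoryError is already pending */
    }
    for (int i = 0; i < num; i++) {
        jobject buf = (*env)->GetObjectArrayElement(env, inputs, i);
        destInputs[i] = (buf == NULL) ? NULL :
            (unsigned char *)(*env)->GetDirectBufferAddress(env, buf)
                + offsets[i];
    }
    /* The missing call that NMT pointed at: JNI_ABORT is enough
     * because the offsets are only read, never written back. */
    (*env)->ReleaseIntArrayElements(env, inputOffsets, offsets, JNI_ABORT);
}
{code}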

> ErasureCode native library memory leak
> --------------------------------------
>
>                 Key: HADOOP-17209
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17209
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: native
>    Affects Versions: 3.3.0, 3.2.1, 3.1.3
>            Reporter: Sean Chow
>            Assignee: Sean Chow
>            Priority: Major
>         Attachments: image-2020-08-15-18-25-48-830.png
>
>
> We run both {{apache-hadoop-3.1.3}} and {{CDH-6.1.1-1.cdh6.1.1.p0.875250}} HDFS in production, and both show the datanode's memory growing well beyond the {{-Xmx}} value.
> These are the JVM options:
>  
> {code:java}
> -Dproc_datanode -Dhdfs.audit.logger=INFO,RFAAUDIT -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true -Xms8589934592 -Xmx8589934592 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:+HeapDumpOnOutOfMemoryError ...{code}
>  
> The max JVM heap size is 8 GB, but the datanode's RSS is 48 GB:
> {code:java}
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 
> 226044 hdfs 20 0 50.6g 48g 4780 S 90.5 77.0 14728:27 /usr/java/jdk1.8.0_162/bin/java -Dproc_datanode{code}
> !image-2020-08-15-17-45-27-363.png!
> !image-2020-08-15-17-50-48-598.png!
> This excessive memory usage makes the machine unresponsive (when swap is enabled) or triggers the oom-killer.
>  


