Posted to common-issues@hadoop.apache.org by "Tak Lon (Stephen) Wu (JIRA)" <ji...@apache.org> on 2018/11/07 18:44:00 UTC

[jira] [Commented] (HADOOP-14176) distcp reports beyond physical memory limits on 2.X

    [ https://issues.apache.org/jira/browse/HADOOP-14176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16678621#comment-16678621 ] 

Tak Lon (Stephen) Wu commented on HADOOP-14176:
-----------------------------------------------

Any update on this? I recently ran into a case where, on an instance with very large memory, a container went "beyond virtual memory limits". IMO, if we did not ship these settings in `distcp-default.xml` and relied only on the system defaults, this issue would not happen and operators would not need to guess what is happening behind the scenes.

> distcp reports beyond physical memory limits on 2.X
> ---------------------------------------------------
>
>                 Key: HADOOP-14176
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14176
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: tools/distcp
>    Affects Versions: 2.9.0
>            Reporter: Fei Hui
>            Assignee: Fei Hui
>            Priority: Major
>         Attachments: HADOOP-14176-branch-2.001.patch, HADOOP-14176-branch-2.002.patch, HADOOP-14176-branch-2.003.patch, HADOOP-14176-branch-2.004.patch
>
>
> When I run distcp, I get errors such as the following:
> {quote}
> 17/02/21 15:31:18 INFO mapreduce.Job: Task Id : attempt_1487645941615_0037_m_000003_0, Status : FAILED
> Container [pid=24661,containerID=container_1487645941615_0037_01_000005] is running beyond physical memory limits. Current usage: 1.1 GB of 1 GB physical memory used; 4.0 GB of 5 GB virtual memory used. Killing container.
> Dump of the process-tree for container_1487645941615_0037_01_000005 :
>         |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>         |- 24661 24659 24661 24661 (bash) 0 0 108650496 301 /bin/bash -c /usr/lib/jvm/java/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  -Xmx2120m -Djava.io.tmpdir=/mnt/disk4/yarn/usercache/hadoop/appcache/application_1487645941615_0037/container_1487645941615_0037_01_000005/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_000005 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.1.208 44048 attempt_1487645941615_0037_m_000003_0 5 1>/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_000005/stdout 2>/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_000005/stderr
>         |- 24665 24661 24661 24661 (java) 1766 336 4235558912 280699 /usr/lib/jvm/java/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx2120m -Djava.io.tmpdir=/mnt/disk4/yarn/usercache/hadoop/appcache/application_1487645941615_0037/container_1487645941615_0037_01_000005/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/mnt/disk2/log/hadoop-yarn/containers/application_1487645941615_0037/container_1487645941615_0037_01_000005 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog org.apache.hadoop.mapred.YarnChild 192.168.1.208 44048 attempt_1487645941615_0037_m_000003_0 5
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
> {quote}
> Digging into the code, I found that this happens because the distcp configuration overrides mapred-site.xml:
> {code}
>     <property>
>         <name>mapred.job.map.memory.mb</name>
>         <value>1024</value>
>     </property>
>     <property>
>         <name>mapred.job.reduce.memory.mb</name>
>         <value>1024</value>
>     </property>
> {code}
> When mapreduce.map.java.opts and mapreduce.map.memory.mb are set in mapred-default.xml and their values are larger than those set in distcp-default.xml, this error may occur.
> We should remove those two settings from distcp-default.xml.
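
To make the mismatch described above concrete, here is a minimal Python sketch that compares a JVM heap setting against a container memory limit. It assumes the heap comes from an `-Xmx` flag like the one in the quoted process dump (`-Xmx2120m`) and that the container limit is the 1024 MB from the quoted distcp-default.xml fragment. The helper names `parse_xmx_mb` and `heap_exceeds_container` are hypothetical and are not part of Hadoop.

```python
import re


def parse_xmx_mb(java_opts):
    """Return the -Xmx value found in a JVM option string, in MB, or None."""
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)", java_opts)
    if match is None:
        return None
    value = int(match.group(1))
    unit = match.group(2).lower()
    # With no suffix, the JVM interprets the number as bytes.
    factor = {"": 1 / (1024 * 1024), "k": 1 / 1024, "m": 1, "g": 1024}[unit]
    return value * factor


def heap_exceeds_container(java_opts, container_mb):
    """True if the configured max heap alone is larger than the container limit."""
    xmx_mb = parse_xmx_mb(java_opts)
    return xmx_mb is not None and xmx_mb > container_mb


# Values taken from the quoted log (-Xmx2120m) and config fragment (1024 MB).
print(heap_exceeds_container("-Xmx2120m", 1024))
```

With these values the check reports a mismatch: the JVM is allowed a heap roughly twice the size of the container it runs in, so the container can be killed once actual usage crosses the limit. This is only an illustration of the arithmetic, not a description of how YARN itself performs the check.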


