Posted to hdfs-user@hadoop.apache.org by Jakub Stransky <st...@gmail.com> on 2014/09/10 15:28:37 UTC

running beyond virtual memory limits

Hello,

I am getting the following error when running on a 500 MB dataset
compressed in the Avro data format.

Container [pid=22961,containerID=container_1409834588043_0080_01_000010] is
running beyond virtual memory limits. Current usage: 636.6 MB of 1 GB
physical memory used; 2.1 GB of 2.1 GB virtual memory used.
Killing container. Dump of the process-tree for
container_1409834588043_0080_01_000010 :
|- PID    PPID  PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 22961  16896 22961  22961  (bash)    0                      0
        9424896           312                 /bin/bash -c
/usr/java/default/bin/java -Djava.net.preferIPv4Stack=true
-Dhadoop.metrics.log.level=WARN -Xmx768m
-Djava.io.tmpdir=/home/hadoop/yarn/local/usercache/jobsubmit/appcache/application_1409834588043_0080/container_1409834588043_0080_01_000010/tmp
-Dlog4j.configuration=container-log4j.properties
-Dyarn.app.container.log.dir=/home/hadoop/yarn/logs/application_1409834588043_0080/container_1409834588043_0080_01_000010
-Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
org.apache.hadoop.mapred.YarnChild 153.87.47.116 47184
attempt_1409834588043_0080_r_000000_0 10
1>/home/hadoop/yarn/logs/application_1409834588043_0080/container_1409834588043_0080_01_000010/stdout
2>/home/hadoop/yarn/logs/application_1409834588043_0080/container_1409834588043_0080_01_000010/stderr
|- 22970 22961 22961 22961 (java) 24692 1165 2256662528 162659
/usr/java/default/bin/java -Djava.net.preferIPv4Stack=true
-Dhadoop.metrics.log.level=WARN -Xmx768m
-Djava.io.tmpdir=/home/hadoop/yarn/local/usercache/jobsubmit/appcache/application_1409834588043_0080/container_1409834588043_0080_01_000010/tmp
-Dlog4j.configuration=container-log4j.properties
-Dyarn.app.container.log.dir=/home/hadoop/yarn/logs/application_1409834588043_0080/container_1409834588043_0080_01_000010
-Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
org.apache.hadoop.mapred.YarnChild 153.87.47.116 47184
attempt_1409834588043_0080_r_000000_0 10 Container killed on request. Exit
code is 143

I have read a lot about Hadoop YARN memory settings, but it seems I am
missing something basic in my understanding of how YARN and MR2 work.
I have a pretty small testing cluster of 5 machines (2 NN and 3 DN) with
the following parameters set:

# hadoop - yarn-site.xml
yarn.nodemanager.resource.memory-mb  : 2048
yarn.scheduler.minimum-allocation-mb : 256
yarn.scheduler.maximum-allocation-mb : 2048

# hadoop - mapred-site.xml
mapreduce.map.memory.mb              : 768
mapreduce.map.java.opts              : -Xmx512m
mapreduce.reduce.memory.mb           : 1024
mapreduce.reduce.java.opts           : -Xmx768m
mapreduce.task.io.sort.mb            : 100
yarn.app.mapreduce.am.resource.mb    : 1024
yarn.app.mapreduce.am.command-opts   : -Xmx768m

I understand the arithmetic behind these parameters, but what I do not
understand is: do the containers need to grow with the size of the
dataset, e.g. by setting mapreduce.map.memory.mb and
mapreduce.map.java.opts on a per-job basis? My reducer doesn't cache any
data; it is simply in -> out, categorizing records into multiple outputs
via AvroMultipleOutputs as follows:

    @Override
    public void reduce(Text key, Iterable<AvroValue<PosData>> values, Context context)
            throws IOException, InterruptedException {
        try {
            log.info("Processing key {}", key.toString());
            final StoreIdDob storeIdDob = separateKey(key);

            log.info("Processing DOB {}, StoreId {}", storeIdDob.getDob(), storeIdDob.getStoreId());
            int size = 0;

            Output out;
            String path;

            if (storeIdDob.getDob() != null && isValidDOB(storeIdDob.getDob())
                    && storeIdDob.getStoreId() != null && !storeIdDob.getStoreId().isEmpty()) {
                // reasonable data
                if (isHistoricalDOB(storeIdDob.getDob())) {
                    out = Output.HISTORY;
                } else {
                    out = Output.ACTUAL;
                }
                path = out.getKey() + "/" + storeIdDob.getDob() + "/" + storeIdDob.getStoreId();
            } else {
                // error data
                out = Output.ERROR;
                path = out.getKey() + "/" + "part";
            }

            // write every value for this key to the chosen named output / base path
            for (AvroValue<PosData> posData : values) {
                amos.write(out.getKey(), new AvroKey<PosData>(posData.datum()), null, path);
            }

        } catch (Exception e) {
            log.error("Error on reducer ", e);
            //TODO audit log :-)
        }
    }
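
For context, amos above is assumed to be an
org.apache.avro.mapreduce.AvroMultipleOutputs field on the reducer; the
post does not show how it is created. A minimal sketch of that wiring,
under that assumption:

    // Assumption: amos is an AvroMultipleOutputs field of this reducer; the named
    // outputs written above would have to be registered on the Job in the driver
    // (e.g. via AvroMultipleOutputs.addNamedOutput) for amos.write(...) to work.
    private AvroMultipleOutputs amos;

    @Override
    protected void setup(Context context) {
        amos = new AvroMultipleOutputs(context);
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Flushes and closes all record writers opened through amos.write(...);
        // without this, the per-path output files may be left incomplete.
        amos.close();
    }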

Do I need to grow the container size with the size of the dataset? That
seems odd to me; I expected that this is exactly what MapReduce takes
care of. Or am I missing some setting that decides the size of the data
chunks?
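
If per-job tuning does turn out to be the answer, one way to do it is to
override just these properties in the job driver. A minimal sketch,
assuming a hypothetical driver class (the class name, job name and the
concrete values below are illustrative, not taken from the configuration
above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Hypothetical driver: per-job overrides of the cluster-wide defaults, so only
// this job asks YARN for bigger reduce containers.
public class PosDataDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Container size requested from YARN for each reduce task, in MB.
        conf.set("mapreduce.reduce.memory.mb", "1536");
        // JVM heap inside that container; keep it comfortably below the container size.
        conf.set("mapreduce.reduce.java.opts", "-Xmx1152m");
        Job job = Job.getInstance(conf, "pos-data-split");
        // ... set jar, mapper, reducer, input/output formats and paths as usual ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

If the driver goes through ToolRunner, the same properties can also be
passed on the command line with -D at submission time, without
recompiling.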

Thx
Jakub

Re: Hadoop Smoke Test: TERASORT

Posted by Rich Haase <rd...@gmail.com>.
You can set the number of reducers used in any Hadoop job from the
command line by using -Dmapred.reduce.tasks=XX.

e.g.  hadoop jar hadoop-mapreduce-examples.jar terasort
-Dmapred.reduce.tasks=10  /terasort-input /terasort-output
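
The same thing can also be done inside the job driver (a sketch,
assuming a Job instance named job is already being configured):

    // Same effect as -Dmapred.reduce.tasks=10 on the command line; the
    // non-deprecated MR2 name of that property is mapreduce.job.reduces.
    job.setNumReduceTasks(10);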

Hadoop Smoke Test: TERASORT

Posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com>.
Hi,

I am trying the smoke test for Hadoop (2.4.1). About “terasort”: below is my test command. The Map part completed very fast because it was split into many subtasks, but the Reduce part takes a very long time and there is only 1 running Reduce task. Is there a way to speed up the reduce phase by splitting the large reduce job into many smaller ones and running them across the cluster like the Map part?


bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar  terasort /tmp/teragenout /tmp/terasortout


Job ID                    Name       State     Maps Total   Maps Completed   Reduce Total   Reduce Completed
job_1409876705457_0002    TeraSort   RUNNING   22352        22352            1              0


Regards
Arthur