Posted to mapreduce-user@hadoop.apache.org by pr...@nokia.com on 2011/02/10 22:40:35 UTC

Hadoop on physical machine Vs Cloud

Hello all,
I have been using Hadoop on physical machines for some time now. Recently I tried to run the same Hadoop jobs on the Rackspace cloud, but I have not yet been successful.
My input file has 150M transactions, and all Hadoop jobs finish in less than 90 minutes on a 4-node, 4 GB Hadoop cluster on physical machines. On the cloud, I am using a 4-node cluster of 8 GB servers, and I keep getting the following failures. I am wondering whether Hadoop on the cloud behaves any differently than on physical machines.

Do I need to use different parameters on the cloud than on the physical machine?

Praveen

FAILED  java.io.IOException: Spill failed
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:860)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:541)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at com.nokia.relevancy.util.mahout.DownloadTransactionBuilderJob$DownloadTransactionBuilderMapper.map(DownloadTransactionBuilderJob.java:56)
        at com.nokia.relevancy.util.mahout.DownloadTransactionBuilderJob$DownloadTransactionBuilderMapper.map(DownloadTransactionBuilderJob.java:34)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.io.IOException: Cannot run program "bash": java.io.IOException: error=12, Cannot allocate memory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:149)
        at org.apache.hadoop.util.Shell.run(Shell.java:134)
        at org.apache.hadoop.fs.DF.getAvailable(DF.java:73)
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:329)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
        at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)
Caused by: java.io.IOException: java.io.IOException: error=12, Cannot allocate memory
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
        at java.lang.ProcessImpl.start(ProcessImpl.java:65)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
        ... 9 more
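
Reading the trace bottom-up: the spill thread asks LocalDirAllocator for a spill file, which calls DF.getAvailable, which shells out through ProcessBuilder, and the launch of "bash" fails with error=12 (ENOMEM). A minimal Java sketch of that kind of call is below. The class name ForkProbe, the method tryStart, and the exact df command line are assumptions made for illustration; this is not Hadoop's own code. It only shows that the IOException surfaces at start(), before the child command has run at all.

```java
import java.io.IOException;
import java.io.InputStream;

public class ForkProbe {

    /**
     * Starts the command and drains its output.
     * Returns "ok" if the process could be launched, or "failed: <reason>"
     * if the launch itself threw (for example error=12 when the kernel
     * cannot allocate memory for the new process).
     */
    public static String tryStart(String... command) {
        try {
            Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
            try (InputStream in = p.getInputStream()) {
                byte[] buf = new byte[4096];
                while (in.read(buf) != -1) {
                    // discard output; only the launch itself is of interest here
                }
            }
            p.waitFor();
            return "ok";
        } catch (IOException e) {
            // ProcessBuilder.start() reports launch failures as IOException
            return "failed: " + e.getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "failed: interrupted";
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the disk-space check seen in the trace.
        System.out.println(tryStart("bash", "-c", "exec df -k /tmp 2>/dev/null"));
    }
}
```

On a node with free memory this should simply launch df. On a node whose memory is already committed to task JVMs, the same call can fail the way the trace shows, even though df itself needs very little memory.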




RE: Hadoop on physical machine Vs Cloud

Posted by pr...@nokia.com.
I got this working when I bumped up the memory on the cloud to 8GB instead of 4GB. I guess with 4GB it was running out of resources.

Praveen
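
A rough way to see why more RAM would help is to add up the heap the node may have to commit when every task slot is busy. The sketch below is only back-of-the-envelope arithmetic. The slot counts, the 1024 MB child heap, and the 1000 MB daemon heaps are hypothetical values, not settings reported in this thread, and a real JVM uses more memory than its -Xmx heap.

```java
public class MemoryBudget {

    /** Worst-case committed heap in MB: every task slot busy, plus the Hadoop daemons. */
    public static long committedMb(int mapSlots, int reduceSlots, long childHeapMb,
                                   int daemons, long daemonHeapMb) {
        return (long) (mapSlots + reduceSlots) * childHeapMb + (long) daemons * daemonHeapMb;
    }

    /** MB left over for the OS, page cache, and forked helpers such as "bash". */
    public static long headroomMb(long nodeRamMb, long committedMb) {
        return nodeRamMb - committedMb;
    }

    public static void main(String[] args) {
        // Hypothetical settings: 2 map + 2 reduce slots, -Xmx1024m per child task,
        // and two daemons (DataNode, TaskTracker) at 1000 MB each.
        long committed = committedMb(2, 2, 1024, 2, 1000);
        for (long ramMb : new long[] {4096, 8192}) {
            System.out.println(ramMb + " MB node: headroom " + headroomMb(ramMb, committed) + " MB");
        }
        // prints:
        // 4096 MB node: headroom -2000 MB
        // 8192 MB node: headroom 2096 MB
    }
}
```

With numbers like these, a 4 GB node is oversubscribed while an 8 GB node has room to spare, which would be consistent with the fix described above. If adding RAM is not an option, lowering the number of task slots per node or the child heap size (mapred.tasktracker.map.tasks.maximum, mapred.tasktracker.reduce.tasks.maximum, mapred.child.java.opts) shrinks the same sum.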
________________________________
From: ext praveen.peddi@nokia.com [praveen.peddi@nokia.com]
Sent: Thursday, February 10, 2011 4:40 PM
To: common-user@hadoop.apache.org; mapreduce-user@hadoop.apache.org
Subject: Hadoop on physical machine Vs Cloud



