Posted to mapreduce-user@hadoop.apache.org by "S.L" <si...@gmail.com> on 2014/03/01 01:07:04 UTC

Hadoop "Spill Failed" Exception on an EC2 instance with 420 GB of instance storage

Hi All,


I am using Hadoop 2.3.0, installed as a single-node cluster
(pseudo-distributed mode) on a CentOS 6.4 Amazon EC2 instance with 420 GB of
instance storage and 7.5 GB of RAM. My understanding is that the "Spill
failed" exception only occurs when the node runs out of disk space. However,
after running map/reduce tasks for only a short time (nowhere near 420 GB of
data), I get the exception shown below.

I should mention that I moved the Hadoop installation on this node from an
8 GB EBS volume (where I had originally installed it) to a 420 GB instance
store volume. I changed the $HADOOP_HOME environment variable and the other
properties to point to the instance store volume accordingly, so Hadoop 2.3.0
is now completely contained on the 420 GB drive.
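
One way to double-check which "other properties" were actually changed is to
read them back out of the *-site.xml files. The sketch below is only an
illustration: the three property names are the ones that, as far as I
understand the Hadoop 2.x defaults, decide where task-local files go
(hadoop.tmp.dir defaults to a directory under /tmp, and the other two default
to subdirectories of it), and the sample XML with its /mnt/hadoop-tmp path is
made up, not taken from this node.

```python
import xml.etree.ElementTree as ET

# Property names assumed to control where task-local files are written.
LOCAL_DIR_KEYS = (
    "hadoop.tmp.dir",
    "yarn.nodemanager.local-dirs",
    "mapreduce.cluster.local.dir",
)


def read_properties(xml_text, keys=LOCAL_DIR_KEYS):
    """Return {name: value} for the requested keys found in a *-site.xml document."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        if name in keys:
            found[name] = (prop.findtext("value") or "").strip()
    return found


def missing_keys(found, keys=LOCAL_DIR_KEYS):
    """List the keys that are not set explicitly and so fall back to defaults."""
    return [k for k in keys if k not in found]


# Hypothetical config content; /mnt/hadoop-tmp is an invented instance-store path.
SAMPLE = """<configuration>
  <property><name>hadoop.tmp.dir</name><value>/mnt/hadoop-tmp</value></property>
  <property><name>fs.defaultFS</name><value>hdfs://localhost:9000</value></property>
</configuration>"""

if __name__ == "__main__":
    found = read_properties(SAMPLE)
    print("explicitly set:", found)
    print("left at defaults:", missing_keys(found))
```

Any key reported as left at its default would still resolve relative to
hadoop.tmp.dir, so that value is the first one worth confirming.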

However, I still see the following exception. Can you please let me know
whether anything besides disk space can cause the "Spill failed" exception?

    2014-02-28 15:35:07,630 ERROR [IPC Server handler 12 on 58189] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1393591821307_0013_m_000000_0 - exited :
    java.io.IOException: Spill failed
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.checkSpillException(MapTask.java:1533)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1442)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:437)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Unknown Source)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
    Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for attempt_1393591821307_0013_m_000000_0_spill_26.out
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:402)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
        at org.apache.hadoop.mapred.YarnOutputFiles.getSpillFileForWrite(YarnOutputFiles.java:159)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1564)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$900(MapTask.java:853)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1503)


    2014-02-28 15:35:07,604 WARN [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.IOException: Spill failed
    2014-02-28 15:35:07,605 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: Spill failed
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.checkSpillException(MapTask.java:1533)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1442)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:437)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Unknown Source)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
    Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for attempt_1393591821307_0013_m_000000_0_spill_26.out
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:402)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
        at org.apache.hadoop.mapred.YarnOutputFiles.getSpillFileForWrite(YarnOutputFiles.java:159)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1564)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$900(MapTask.java:853)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1503)
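
The "Caused by" line in the trace above says that LocalDirAllocator could not
find a valid local directory for the spill file, which suggests the check
failed on the configured local directories rather than on the 420 GB volume
as a whole. A rough way to inspect a candidate directory from the same user
account is sketched below. It is only an approximation of that kind of check
(exists, writable, enough free space), not Hadoop's actual logic, and the
function name and the idea of passing a minimum free size are my own
assumptions.

```python
import os


def check_local_dir(path, min_free_bytes):
    """Return a list of problems that would make `path` unusable for spill files.

    An empty list means the directory exists, is writable by the current
    user, and has at least `min_free_bytes` available on its filesystem.
    """
    if not os.path.isdir(path):
        return ["missing or not a directory"]

    problems = []
    # Write and search (execute) permission are both needed to create files.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append("not writable by current user")

    # f_bavail counts blocks available to unprivileged users on this filesystem.
    st = os.statvfs(path)
    free = st.f_bavail * st.f_frsize
    if free < min_free_bytes:
        problems.append("only %d bytes free" % free)
    return problems


if __name__ == "__main__":
    # /tmp is just an example; substitute the directories the node is configured with.
    for candidate in ["/tmp"]:
        print(candidate, check_local_dir(candidate, 1024 * 1024 * 1024) or "looks usable")
```

Running this against each configured local directory shows which filesystem
each one really lives on, which may differ from where $HADOOP_HOME points.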