Posted to common-dev@hadoop.apache.org by "Bryan A. P. Pendleton" <bp...@geekdom.net> on 2007/01/08 21:03:23 UTC

Problems with running out of file handles?

I've recently added a bunch of big new machines to my cluster.

However, it seems their default limit on the number of open files per user is
lower. Though I know I can raise it, I thought the "spilling" code was supposed
to prevent running out of open files? Are spills not always closed after
they've been written, or is there some other source of growth in the number of
open output files?
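(For anyone checking their own nodes: a quick sketch of reading the per-process descriptor limit, using only the Python standard library; assumes a POSIX system.)

```python
# Print the open-file limits a task process actually sees.
# On the machines described above, the soft limit would be 1024.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft limit:", soft)
print("hard limit:", hard)
```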

The machines default to 1024 open files per user, and have been running
out of file descriptors (and erroring out the tasks run on them) at
spill253.out. For instance:

java.io.FileNotFoundException: /state/partition1/hadoop/mapred/local/task_0001_m_000077_0/spill253.out (Too many open files)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:106)
    at org.apache.hadoop.fs.LocalFileSystem$LocalFSFileInputStream.<init>(LocalFileSystem.java:92)
    at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:143)
    at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:52)
    at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:279)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:262)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:475)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:191)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1347)


Before anyone asks: yes, my map tasks have *huge* outputs (~1Tb / 90 tasks
~= 10Gb each).

The problem appears to be the mergeParts step, which looks like it opens
*all* spill files when it's merging. Wasn't this code supposed to reduce
file-descriptor pressure? Seems like, in this case, if I had only 30
reducers, I'd have been capped at 30 file descriptors per map task.
Now, it appears to need outputsize/spillsize file descriptors, which *could*
be >> 30.
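To put rough numbers on that (the ~40Mb spill size below is an assumption picked to match dying around spill253, not something from my config):

```python
# Rough file-descriptor arithmetic for the merge, using assumed sizes.
output_per_map_mb = 10 * 1024   # ~10Gb of map output per task, as above
spill_size_mb = 40              # assumed spill size; depends on buffer config

num_spills = output_per_map_mb // spill_size_mb
print("spills per map:", num_spills)   # 256 -- consistent with failing near spill253

# mergeParts today opens one descriptor per spill file:
fds_merge_opens = num_spills
# whereas a reducer-bounded scheme would need only:
fds_if_capped = 30
```

A few map tasks like this sharing one node easily blows through a 1024-per-user limit.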

-- 
Bryan A. P. Pendleton
Ph: (877) geek-1-bp

RE: Problems with running out of file handles?

Posted by Devaraj Das <dd...@yahoo-inc.com>.
During the merge of the spills on the Maps, yes, all spills are opened. This
is because each spill has data for all the Reduces (in your case, each spill
has 30 parts). The merge happens across the spills, and a final output file
is generated.
But the above can be fixed. Thanks to you, I noticed in the code that there
is an unnecessary "open" done for the spill files (the same spill is opened
again later on). In the current code, the merge factor, which controls the
number of spills open at once, is hardcoded to the number of spills that the
Map made. This can be fixed by not hardcoding it, and instead using the
configured value - io.sort.factor.
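The idea of bounding each merge pass by io.sort.factor, rather than opening every spill at once, can be sketched like this (plain Python over in-memory lists standing in for sorted spill files; this is an illustration of the pass structure, not the actual MapTask code):

```python
import heapq

def factor_limited_merge(runs, factor):
    """Merge sorted runs, holding at most `factor` of them open per group.

    `runs` stands in for on-disk spill files; a real implementation
    streams records from disk, but the multi-pass structure is the same.
    """
    runs = list(runs)
    while len(runs) > 1:
        next_runs = []
        # Merge runs in groups of `factor`; each group needs at most
        # `factor` inputs open at a time, regardless of spill count.
        for i in range(0, len(runs), factor):
            group = runs[i:i + factor]
            next_runs.append(list(heapq.merge(*group)))
        runs = next_runs
    return runs[0] if runs else []

# 256 tiny "spills", merged with at most 10 open at once:
spills = [sorted([i, i + 256, i + 512]) for i in range(256)]
merged = factor_limited_merge(spills, factor=10)
assert merged == sorted(sum(spills, []))
```

The trade-off is extra passes over the data when the spill count exceeds the factor, in exchange for a bounded descriptor count.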

Will submit a patch soon. Thanks!

> -----Original Message-----
> From: bpendleton@gmail.com [mailto:bpendleton@gmail.com] On Behalf Of
> Bryan A. P. Pendleton
> Sent: Tuesday, January 09, 2007 1:33 AM
> To: hadoop-dev@lucene.apache.org
> Subject: Problems with running out of file handles?