Posted to user@nutch.apache.org by Shailendra Mudgal <mu...@gmail.com> on 2007/07/17 08:36:28 UTC

"Too many open files" error after running a number of jobs

Hi,
We have upgraded our code to nutch-0.9 with hadoop-0.12.2-core.jar. After
running about 50 Nutch jobs (inject/generate/fetch/parse etc.) we start
getting "Too many open files" errors on our cluster. We are using Linux
boxes with kernel 2.6.9, and the open-files limit is the default 1024 on
these machines. I have read several threads on the nutch-user and
hadoop-user mailing lists, and the only fix I found was to increase the
limit with ulimit. Is there any solution to this problem at the code
level? BTW, the value of io.sort.factor is 8 in our hadoop-site.xml.

Does anybody have any idea in this regard? Any help will be appreciated.

Here is the stack trace of the error:

Error initializing task_0055_m_000000_3:
java.io.FileNotFoundException: /data/nutch/clean-hadoop5/tmp/hadoop-nutch/mapred/local/taskTracker/jobcache/job_0055/job.xml (Too many open files)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:179)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:152)
        at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:191)
        at org.apache.hadoop.fs.ChecksumFileSystem$FSOutputSummer.<init>(ChecksumFileSystem.java:363)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:438)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:346)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:253)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:84)
        at org.apache.hadoop.fs.ChecksumFileSystem.copyToLocalFile(ChecksumFileSystem.java:577)
        at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:766)
        at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:352)
        at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:863)
        at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:531)
        at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:899)
        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1589)



Regards,
Shailendra

Re: "Too many open files" error after running a number of jobs

Posted by Raghu Angadi <ra...@yahoo-inc.com>.
The stack trace is on the client, not on the datanode. If it is on Linux,
you can check /proc/<pid>/fd to see which fds are still open. Usually 1024
should be plenty for the client (and even for a datanode).
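A quick sketch of that /proc check on Linux. The pid below is the current
shell's own pid as a stand-in; in practice you would substitute the pid of
the TaskTracker or DataNode (e.g. obtained from jps or pgrep):

```shell
#!/bin/sh
# Count and inspect open file descriptors for a process via /proc.
# PID here is this shell's own pid, purely as a runnable stand-in;
# point it at the suspect Hadoop daemon instead.
PID=$$
echo "pid $PID has $(ls /proc/"$PID"/fd | wc -l) open fds"
# Show what the descriptors point to, most common targets first,
# to spot a leak (e.g. hundreds of fds on the same file or socket):
ls -l /proc/"$PID"/fd | awk 'NR>1 {print $NF}' | sort | uniq -c | sort -rn | head
```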

Raghu.
Andrzej Bialecki wrote:
> Shailendra Mudgal wrote:
>> Hi,
>> We have upgraded our code to nutch-0.9 with hadoop-0.12.2-core.jar. After
>> running about 50 Nutch jobs (inject/generate/fetch/parse etc.) we start
>> getting "Too many open files" errors on our cluster. We are using Linux
>> boxes with kernel 2.6.9, and the open-files limit is the default 1024 on
>> these machines. I have read several threads on the nutch-user and
>> hadoop-user mailing lists, and the only fix I found was to increase the
>> limit with ulimit. Is there any solution to this problem at the code
>> level? BTW, the value of io.sort.factor is 8 in our hadoop-site.xml.
>>
>> Does anybody have any idea in this regard? Any help will be appreciated.
> 
> Apparently datanodes that perform intensive IO operations need a higher 
> limit. Try increasing this number to 16k or so.
> 
> 


Re: "Too many open files" error after running a number of jobs

Posted by Andrzej Bialecki <ab...@getopt.org>.
Shailendra Mudgal wrote:
> Hi,
> We have upgraded our code to nutch-0.9 with hadoop-0.12.2-core.jar. After
> running about 50 Nutch jobs (inject/generate/fetch/parse etc.) we start
> getting "Too many open files" errors on our cluster. We are using Linux
> boxes with kernel 2.6.9, and the open-files limit is the default 1024 on
> these machines. I have read several threads on the nutch-user and
> hadoop-user mailing lists, and the only fix I found was to increase the
> limit with ulimit. Is there any solution to this problem at the code
> level? BTW, the value of io.sort.factor is 8 in our hadoop-site.xml.
> 
> Does anybody have any idea in this regard? Any help will be appreciated.

Apparently datanodes that perform intensive IO operations need a higher 
limit. Try increasing this number to 16k or so.
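A sketch of how that limit might be checked and raised on a Linux box of
this era; the 16384 value and the "nutch" user name below are illustrative,
not prescribed by the thread:

```shell
#!/bin/sh
# Show the current per-process open-file limit for this shell:
ulimit -n
# Raising it for the session (the soft limit can only be raised up to
# the hard limit, so this is typically done as root before launching
# the Hadoop daemons):
#   ulimit -n 16384
# For a persistent, per-user change, entries like these in
# /etc/security/limits.conf (applied by pam_limits at login):
#   nutch  soft  nofile  16384
#   nutch  hard  nofile  16384
```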


-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

