Posted to common-user@hadoop.apache.org by Stas Oskin <st...@gmail.com> on 2009/08/02 13:30:31 UTC

Re: "Too many open files" error, which gets resolved after some time

Hi.

I'd like to raise this issue once again, just to clarify a point.

If I have only one thread writing to HDFS, the number of fd's should be 4,
resulting from:

1) input
2) output
3) epoll
4) stream itself

And these 4 fds should be cleared out after 10 seconds.

Is this correct?

Thanks in advance for the information!
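For concreteness, here is a minimal sketch of the single-writer case above;
the path, the default Configuration, and the Linux-only /proc/self/fd trick
are illustrative assumptions, not details confirmed in this thread:

import java.io.File;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SingleWriterFdCheck {
  // Counts this process's open fds via /proc/self/fd (Linux only).
  private static int openFds() {
    String[] fds = new File("/proc/self/fd").list();
    return fds == null ? -1 : fds.length;
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration(); // assumes fs.default.name points at the cluster
    FileSystem fs = FileSystem.get(conf);

    System.out.println("fds before write: " + openFds());
    FSDataOutputStream out = fs.create(new Path("/tmp/fd-test.bin")); // hypothetical path
    out.write(new byte[64 * 1024]);
    System.out.println("fds while stream is open: " + openFds());
    out.close();                              // the main socket is released on close

    Thread.sleep(15 * 1000);                  // the epoll/pipe fds reportedly linger ~10 sec
    System.out.println("fds after 15 sec: " + openFds());
  }
}

Comparing the three counts should make the per-stream overhead and the
~10-second cleanup visible directly.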

2009/6/24 Stas Oskin <st...@gmail.com>

> Hi.
>
> So if I open one stream, it should be 4?
>
>
>
> 2009/6/23 Raghu Angadi <ra...@yahoo-inc.com>
>
>>
>> how many threads do you have? Number of active threads is very important.
>> Normally,
>>
>> #fds = (3 * #threads_blocked_on_io) + #streams
>>
>> 12 per stream is certainly way off.
>>
>> Raghu.
>>
>>
>> Stas Oskin wrote:
>>
>>> Hi.
>>>
>>> In my case it was actually ~ 12 fd's per stream, which included pipes and
>>> epolls.
>>>
>>> Could it be that HDFS opens 3 x 3 (input - output - epoll) fd's per
>>> thread, which makes it close to the number I mentioned? Or is it always
>>> 3 at most per thread / stream?
>>>
>>> Up to 10 sec looks like the correct number; it does seem to get freed
>>> around this time.
>>>
>>> Regards.
>>>
>>> 2009/6/23 Raghu Angadi <ra...@yahoo-inc.com>
>>>
>>>  To be more accurate, once you have HADOOP-4346,
>>>>
>>>> fds for epoll and pipes = 3 * threads blocked on Hadoop I/O
>>>>
>>>> Unless you have hundreds of threads at a time, you should not see
>>>> hundreds
>>>> of these. These fds stay up to 10sec even after the
>>>> threads exit.
>>>>
>>>> I am a bit confused about your exact situation. Please check the number
>>>> of threads if you are still facing the problem.
>>>>
>>>> Raghu.
>>>>
>>>>
>>>> Raghu Angadi wrote:
>>>>
>>>>  since you have HADOOP-4346, you should not have excessive epoll/pipe
>>>>> fds
>>>>> open. First of all do you still have the problem? If yes, how many
>>>>> hadoop
>>>>> streams do you have at a time?
>>>>>
>>>>> System.gc() won't help if you have HADOOP-4346.
>>>>>
>>>>> Raghu.
>>>>>
>>>>>  Thanks for your opinion!
>>>>>
>>>>>> 2009/6/22 Stas Oskin <st...@gmail.com>
>>>>>>
>>>>>>  Ok, seems this issue is already patched in the Hadoop distro I'm
>>>>>> using
>>>>>>
>>>>>>> (Cloudera).
>>>>>>>
>>>>>>> Any idea if I still should call GC manually/periodically to clean out
>>>>>>> all
>>>>>>> the stale pipes / epolls?
>>>>>>>
>>>>>>> 2009/6/22 Steve Loughran <st...@apache.org>
>>>>>>>
>>>>>>>  Stas Oskin wrote:
>>>>>>>
>>>>>>>>  Hi.
>>>>>>>>
>>>>>>>>  So what would be the recommended approach to pre-0.20.x series?
>>>>>>>>>
>>>>>>>>> To ensure each file is used by only one thread, and that it is then
>>>>>>>>> safe to close the handle in that thread?
>>>>>>>>>
>>>>>>>>> Regards.
>>>>>>>>>
>>>>>>>>>  good question - I'm not sure. For anything you get with
>>>>>>>>>
>>>>>>>> FileSystem.get(),
>>>>>>>> it's now dangerous to close, so try just setting the reference to
>>>>>>>> null and hoping that GC will do the finalize() when needed
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>
>>

Re: "Too many open files" error, which gets resolved after some time

Posted by Stas Oskin <st...@gmail.com>.
Hi Raghu.

Thanks for the clarification and for explaining the potential issue.

> It is not just the fds; applications that hit fd limits hit thread
> limits as well. Obviously Hadoop cannot sustain this as the range of
> applications increases. It will be fixed one way or the other.
>

Can you please clarify the thread limit matter?

AFAIK it only happens if the allocated stack is too large and we are talking
about thousands of threads (a possible solution is described here:
http://candrews.integralblue.com/2009/01/preventing-outofmemoryerror-native-thread/
).

So how is it tied to fd's?

Thanks.
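A small monitoring sketch may help relate the two; the /proc/self/fd and
/proc/self/status reads below are Linux-only assumptions added for
illustration, not anything from this thread:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.lang.management.ManagementFactory;

public class FdThreadMonitor {
  public static void main(String[] args) throws Exception {
    while (true) {
      // Process-wide fd count, as the kernel sees it.
      String[] fds = new File("/proc/self/fd").list();
      int fdCount = fds == null ? -1 : fds.length;

      // Live JVM threads; each thread blocked on Hadoop I/O costs ~3 extra fds.
      int jvmThreads = ManagementFactory.getThreadMXBean().getThreadCount();

      // Kernel's view of the thread count, in case non-JVM threads matter.
      int nativeThreads = -1;
      BufferedReader r = new BufferedReader(new FileReader("/proc/self/status"));
      String line;
      while ((line = r.readLine()) != null) {
        if (line.startsWith("Threads:")) {
          nativeThreads = Integer.parseInt(line.substring("Threads:".length()).trim());
        }
      }
      r.close();

      System.out.println("fds=" + fdCount + " jvmThreads=" + jvmThreads
          + " nativeThreads=" + nativeThreads);
      Thread.sleep(5000);
    }
  }
}

Watching both numbers while streams are opened and closed shows how the fd
count tracks the threads blocked on I/O plus the open streams.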

Re: "Too many open files" error, which gets resolved after some time

Posted by Raghu Angadi <ra...@yahoo-inc.com>.
Stas Oskin wrote:
> Hi.
> 
> Thanks for the explanation.
> 
> Just to clarify, does the extra thread waiting on writes happen in the
> multi-threaded case as well?
>
> Meaning if I have 10 writing threads, for example, would it actually be 70
> fd's?

unfortunately, yes.

There are different proposals to fix this: async I/O in Hadoop, RPCs
for data transfers.

It is not just the fds; applications that hit fd limits hit thread
limits as well. Obviously Hadoop cannot sustain this as the range of
applications increases. It will be fixed one way or the other.

Raghu.

> Regards.
> 
> 2009/8/3 Raghu Angadi <ra...@yahoo-inc.com>
> 
>> For writes, there is an extra thread waiting on I/O, so it would be 3 more
>> fds. To simplify the earlier equation, on the client side:
>>
>> for writes :  max fds (for io bound load) = 7 * #write_streams
>> for reads  :  max fds (for io bound load) = 4 * #read_streams
>>
>> The main socket is cleared as soon as you close the stream.
>> The rest of the fds stay for 10 sec (they get reused if you open more
>> streams meanwhile).
>>
>> I hope this is clear enough.
>>
>>
>> Raghu.
>>
>> Stas Oskin wrote:
>>
>>> Hi.
>>>
>>> I'd like to raise this issue once again, just to clarify a point.
>>>
>>> If I have only one thread writing to HDFS, the number of fd's should be 4,
>>> resulting from:
>>>
>>> 1) input
>>> 2) output
>>> 3) epoll
>>> 4) stream itself
>>>
>>> And these 4 fds should be cleared out after 10 seconds.
>>>
>>> Is this correct?
>>>
>>> Thanks in advance for the information!
>>>
>>> 2009/6/24 Stas Oskin <st...@gmail.com>
>>>
>>>  Hi.
>>>> So if I open one stream, it should be 4?
>>>>
>>>>
>>>>
>>>> 2009/6/23 Raghu Angadi <ra...@yahoo-inc.com>
>>>>
>>>>  how many threads do you have? Number of active threads is very
>>>>> important.
>>>>> Normally,
>>>>>
>>>>> #fds = (3 * #threads_blocked_on_io) + #streams
>>>>>
>>>>> 12 per stream is certainly way off.
>>>>>
>>>>> Raghu.
>>>>>
>>>>>
>>>>> Stas Oskin wrote:
>>>>>
>>>>>  Hi.
>>>>>> In my case it was actually ~ 12 fd's per stream, which included pipes
>>>>>> and
>>>>>> epolls.
>>>>>>
>>>>>> Could it be that HDFS opens 3 x 3 (input - output - epoll) fd's per
>>>>>> thread, which makes it close to the number I mentioned? Or is it always
>>>>>> 3 at most per thread / stream?
>>>>>>
>>>>>> Up to 10 sec looks like the correct number; it does seem to get freed
>>>>>> around this time.
>>>>>>
>>>>>> Regards.
>>>>>>
>>>>>> 2009/6/23 Raghu Angadi <ra...@yahoo-inc.com>
>>>>>>
>>>>>>  To be more accurate, once you have HADOOP-4346,
>>>>>>
>>>>>>> fds for epoll and pipes = 3 * threads blocked on Hadoop I/O
>>>>>>>
>>>>>>> Unless you have hundreds of threads at a time, you should not see
>>>>>>> hundreds
>>>>>>> of these. These fds stay up to 10sec even after the
>>>>>>> threads exit.
>>>>>>>
>>>>>>> I am a bit confused about your exact situation. Please check the
>>>>>>> number of threads if you are still facing the problem.
>>>>>>>
>>>>>>> Raghu.
>>>>>>>
>>>>>>>
>>>>>>> Raghu Angadi wrote:
>>>>>>>
>>>>>>>  since you have HADOOP-4346, you should not have excessive epoll/pipe
>>>>>>>
>>>>>>>> fds
>>>>>>>> open. First of all do you still have the problem? If yes, how many
>>>>>>>> hadoop
>>>>>>>> streams do you have at a time?
>>>>>>>>
>>>>>>>> System.gc() won't help if you have HADOOP-4346.
>>>>>>>>
>>>>>>>> Raghu.
>>>>>>>>
>>>>>>>>  Thanks for your opinion!
>>>>>>>>
>>>>>>>>  2009/6/22 Stas Oskin <st...@gmail.com>
>>>>>>>>>  Ok, seems this issue is already patched in the Hadoop distro I'm
>>>>>>>>> using
>>>>>>>>>
>>>>>>>>>  (Cloudera).
>>>>>>>>>> Any idea if I still should call GC manually/periodically to clean
>>>>>>>>>> out
>>>>>>>>>> all
>>>>>>>>>> the stale pipes / epolls?
>>>>>>>>>>
>>>>>>>>>> 2009/6/22 Steve Loughran <st...@apache.org>
>>>>>>>>>>
>>>>>>>>>>  Stas Oskin wrote:
>>>>>>>>>>
>>>>>>>>>>   Hi.
>>>>>>>>>>>  So what would be the recommended approach to pre-0.20.x series?
>>>>>>>>>>>
>>>>>>>>>>>> To ensure each file is used by only one thread, and that it is
>>>>>>>>>>>> then safe to close the handle in that thread?
>>>>>>>>>>>>
>>>>>>>>>>>> Regards.
>>>>>>>>>>>>
>>>>>>>>>>>>  good question - I'm not sure. For anything you get with
>>>>>>>>>>>>  FileSystem.get(),
>>>>>>>>>>> it's now dangerous to close, so try just setting the reference to
>>>>>>>>>>> null and hoping that GC will do the finalize() when needed
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
> 


Re: "Too many open files" error, which gets resolved after some time

Posted by Stas Oskin <st...@gmail.com>.
Hi.

Thanks for the explanation.

Just to clarify, does the extra thread waiting on writes happen in the
multi-threaded case as well?

Meaning if I have 10 writing threads, for example, would it actually be 70
fd's?

Regards.

2009/8/3 Raghu Angadi <ra...@yahoo-inc.com>

> For writes, there is an extra thread waiting on I/O, so it would be 3 more
> fds. To simplify the earlier equation, on the client side:
>
> for writes :  max fds (for io bound load) = 7 * #write_streams
> for reads  :  max fds (for io bound load) = 4 * #read_streams
>
> The main socket is cleared as soon as you close the stream.
> The rest of the fds stay for 10 sec (they get reused if you open more
> streams meanwhile).
>
> I hope this is clear enough.
>
>
> Raghu.
>
> Stas Oskin wrote:
>
>> Hi.
>>
>> I'd like to raise this issue once again, just to clarify a point.
>>
>> If I have only one thread writing to HDFS, the number of fd's should be 4,
>> resulting from:
>>
>> 1) input
>> 2) output
>> 3) epoll
>> 4) stream itself
>>
>> And these 4 fds should be cleared out after 10 seconds.
>>
>> Is this correct?
>>
>> Thanks in advance for the information!
>>
>> 2009/6/24 Stas Oskin <st...@gmail.com>
>>
>>  Hi.
>>>
>>> So if I open one stream, it should be 4?
>>>
>>>
>>>
>>> 2009/6/23 Raghu Angadi <ra...@yahoo-inc.com>
>>>
>>>  how many threads do you have? Number of active threads is very
>>>> important.
>>>> Normally,
>>>>
>>>> #fds = (3 * #threads_blocked_on_io) + #streams
>>>>
>>>> 12 per stream is certainly way off.
>>>>
>>>> Raghu.
>>>>
>>>>
>>>> Stas Oskin wrote:
>>>>
>>>>  Hi.
>>>>>
>>>>> In my case it was actually ~ 12 fd's per stream, which included pipes
>>>>> and
>>>>> epolls.
>>>>>
>>>>> Could it be that HDFS opens 3 x 3 (input - output - epoll) fd's per
>>>>> thread, which makes it close to the number I mentioned? Or is it always
>>>>> 3 at most per thread / stream?
>>>>>
>>>>> Up to 10 sec looks like the correct number; it does seem to get freed
>>>>> around this time.
>>>>>
>>>>> Regards.
>>>>>
>>>>> 2009/6/23 Raghu Angadi <ra...@yahoo-inc.com>
>>>>>
>>>>>  To be more accurate, once you have HADOOP-4346,
>>>>>
>>>>>> fds for epoll and pipes = 3 * threads blocked on Hadoop I/O
>>>>>>
>>>>>> Unless you have hundreds of threads at a time, you should not see
>>>>>> hundreds
>>>>>> of these. These fds stay up to 10sec even after the
>>>>>> threads exit.
>>>>>>
>>>>>> I am a bit confused about your exact situation. Please check the
>>>>>> number of threads if you are still facing the problem.
>>>>>>
>>>>>> Raghu.
>>>>>>
>>>>>>
>>>>>> Raghu Angadi wrote:
>>>>>>
>>>>>>  since you have HADOOP-4346, you should not have excessive epoll/pipe
>>>>>>
>>>>>>> fds
>>>>>>> open. First of all do you still have the problem? If yes, how many
>>>>>>> hadoop
>>>>>>> streams do you have at a time?
>>>>>>>
>>>>>>> System.gc() won't help if you have HADOOP-4346.
>>>>>>>
>>>>>>> Raghu.
>>>>>>>
>>>>>>>  Thanks for your opinion!
>>>>>>>
>>>>>>>  2009/6/22 Stas Oskin <st...@gmail.com>
>>>>>>>>
>>>>>>>>  Ok, seems this issue is already patched in the Hadoop distro I'm
>>>>>>>> using
>>>>>>>>
>>>>>>>>  (Cloudera).
>>>>>>>>>
>>>>>>>>> Any idea if I still should call GC manually/periodically to clean
>>>>>>>>> out
>>>>>>>>> all
>>>>>>>>> the stale pipes / epolls?
>>>>>>>>>
>>>>>>>>> 2009/6/22 Steve Loughran <st...@apache.org>
>>>>>>>>>
>>>>>>>>>  Stas Oskin wrote:
>>>>>>>>>
>>>>>>>>>   Hi.
>>>>>>>>>>
>>>>>>>>>>  So what would be the recommended approach to pre-0.20.x series?
>>>>>>>>>>
>>>>>>>>>>> To ensure each file is used by only one thread, and that it is
>>>>>>>>>>> then safe to close the handle in that thread?
>>>>>>>>>>>
>>>>>>>>>>> Regards.
>>>>>>>>>>>
>>>>>>>>>>>  good question - I'm not sure. For anything you get with
>>>>>>>>>>>  FileSystem.get(),
>>>>>>>>>> it's now dangerous to close, so try just setting the reference to
>>>>>>>>>> null and hoping that GC will do the finalize() when needed
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>
>

Re: "Too many open files" error, which gets resolved after some time

Posted by Raghu Angadi <ra...@yahoo-inc.com>.
For writes, there is an extra thread waiting on I/O, so it would be 3
more fds. To simplify the earlier equation, on the client side:

for writes :  max fds (for io bound load) = 7 * #write_streams
for reads  :  max fds (for io bound load) = 4 * #read_streams

The main socket is cleared as soon as you close the stream.
The rest of the fds stay for 10 sec (they get reused if you open more
streams meanwhile).

I hope this is clear enough.

Raghu.
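Given the bounds above, a client can keep itself under the process fd limit
by capping concurrent write streams. The following is only an illustrative
sketch; the limit of 128 streams (sized against an assumed ulimit of 1024)
and all names are assumptions, not anything prescribed in this thread:

import java.io.IOException;
import java.util.concurrent.Semaphore;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BoundedHdfsWriter {
  // Assumed fd budget: 7 fds per write stream, so 128 streams ~ 896 fds,
  // which stays under a ulimit -n of 1024 with headroom for everything else.
  private static final int MAX_OPEN_WRITE_STREAMS = 128;

  private final Semaphore streamPermits = new Semaphore(MAX_OPEN_WRITE_STREAMS);
  private final FileSystem fs;

  public BoundedHdfsWriter(Configuration conf) throws IOException {
    this.fs = FileSystem.get(conf);
  }

  public void write(Path path, byte[] data) throws IOException, InterruptedException {
    streamPermits.acquire();            // block callers instead of exhausting fds
    try {
      FSDataOutputStream out = fs.create(path);
      try {
        out.write(data);
      } finally {
        out.close();                    // frees the main socket immediately
      }
    } finally {
      // The epoll/pipe fds may linger ~10 sec after close and get reused,
      // so the budget above should leave some slack for that window.
      streamPermits.release();
    }
  }
}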

Stas Oskin wrote:
> Hi.
> 
> I'd like to raise this issue once again, just to clarify a point.
> 
> If I have only one thread writing to HDFS, the number of fd's should be 4,
> resulting from:
> 
> 1) input
> 2) output
> 3) epoll
> 4) stream itself
> 
> And these 4 fds should be cleared out after 10 seconds.
> 
> Is this correct?
> 
> Thanks in advance for the information!
> 
> 2009/6/24 Stas Oskin <st...@gmail.com>
> 
>> Hi.
>>
>> So if I open one stream, it should be 4?
>>
>>
>>
>> 2009/6/23 Raghu Angadi <ra...@yahoo-inc.com>
>>
>>> how many threads do you have? Number of active threads is very important.
>>> Normally,
>>>
>>> #fds = (3 * #threads_blocked_on_io) + #streams
>>>
>>> 12 per stream is certainly way off.
>>>
>>> Raghu.
>>>
>>>
>>> Stas Oskin wrote:
>>>
>>>> Hi.
>>>>
>>>> In my case it was actually ~ 12 fd's per stream, which included pipes and
>>>> epolls.
>>>>
>>>> Could it be that HDFS opens 3 x 3 (input - output - epoll) fd's per
>>>> thread, which makes it close to the number I mentioned? Or is it always
>>>> 3 at most per thread / stream?
>>>>
>>>> Up to 10 sec looks like the correct number; it does seem to get freed
>>>> around this time.
>>>>
>>>> Regards.
>>>>
>>>> 2009/6/23 Raghu Angadi <ra...@yahoo-inc.com>
>>>>
>>>>  To be more accurate, once you have HADOOP-4346,
>>>>> fds for epoll and pipes = 3 * threads blocked on Hadoop I/O
>>>>>
>>>>> Unless you have hundreds of threads at a time, you should not see
>>>>> hundreds
>>>>> of these. These fds stay up to 10sec even after the
>>>>> threads exit.
>>>>>
>>>>> I am a bit confused about your exact situation. Please check the
>>>>> number of threads if you are still facing the problem.
>>>>>
>>>>> Raghu.
>>>>>
>>>>>
>>>>> Raghu Angadi wrote:
>>>>>
>>>>>  since you have HADOOP-4346, you should not have excessive epoll/pipe
>>>>>> fds
>>>>>> open. First of all do you still have the problem? If yes, how many
>>>>>> hadoop
>>>>>> streams do you have at a time?
>>>>>>
>>>>>> System.gc() won't help if you have HADOOP-4346.
>>>>>>
>>>>>> Raghu.
>>>>>>
>>>>>>  Thanks for your opinion!
>>>>>>
>>>>>>> 2009/6/22 Stas Oskin <st...@gmail.com>
>>>>>>>
>>>>>>>  Ok, seems this issue is already patched in the Hadoop distro I'm
>>>>>>> using
>>>>>>>
>>>>>>>> (Cloudera).
>>>>>>>>
>>>>>>>> Any idea if I still should call GC manually/periodically to clean out
>>>>>>>> all
>>>>>>>> the stale pipes / epolls?
>>>>>>>>
>>>>>>>> 2009/6/22 Steve Loughran <st...@apache.org>
>>>>>>>>
>>>>>>>>  Stas Oskin wrote:
>>>>>>>>
>>>>>>>>>  Hi.
>>>>>>>>>
>>>>>>>>>  So what would be the recommended approach to pre-0.20.x series?
>>>>>>>>>> To ensure each file is used by only one thread, and that it is
>>>>>>>>>> then safe to close the handle in that thread?
>>>>>>>>>>
>>>>>>>>>> Regards.
>>>>>>>>>>
>>>>>>>>>>  good question - I'm not sure. For anything you get with
>>>>>>>>> FileSystem.get(),
>>>>>>>>> it's now dangerous to close, so try just setting the reference to
>>>>>>>>> null and hoping that GC will do the finalize() when needed
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>