Posted to user@hadoop.apache.org by John Lilley <jo...@redpoint.net> on 2014/01/26 18:32:37 UTC
HDFS open file limit
I have an application that wants to open a large set of files in HDFS simultaneously. Are there hard or practical limits to what can be opened at once by a single process? By the entire cluster in aggregate?
Thanks
John
RE: HDFS open file limit
Posted by John Lilley <jo...@redpoint.net>.
What exception would I expect to get if this limit was exceeded?
john
From: Harsh J [mailto:harsh@cloudera.com]
Sent: Monday, January 27, 2014 8:12 AM
To: <us...@hadoop.apache.org>
Subject: Re: HDFS open file limit
Hi John,
There is a concurrent-connection limit on the DataNodes (DNs), set by default to 4k parallel threaded connections for reading or writing blocks. It can be raised via configuration, but the default usually suffices even for fairly large operations, since the replicas help spread the read load around.
Beyond this you will mostly just run into configurable OS limitations.
On Jan 26, 2014 11:03 PM, "John Lilley" <jo...@redpoint.net> wrote:
I have an application that wants to open a large set of files in HDFS simultaneously. Are there hard or practical limits to what can be opened at once by a single process? By the entire cluster in aggregate?
Thanks
John
Re: HDFS open file limit
Posted by Harsh J <ha...@cloudera.com>.
Hi John,
There is a concurrent-connection limit on the DataNodes (DNs), set by default
to 4k parallel threaded connections for reading or writing blocks. It can be
raised via configuration, but the default usually suffices even for fairly
large operations, since the replicas help spread the read load around.
Beyond this you will mostly just run into configurable OS limitations.
On Jan 26, 2014 11:03 PM, "John Lilley" <jo...@redpoint.net> wrote:
> I have an application that wants to open a large set of files in HDFS
> simultaneously. Are there hard or practical limits to what can be opened
> at once by a single process? By the entire cluster in aggregate?
>
> Thanks
>
> John
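
The limit described in this reply can be checked against a DataNode's configuration. The following is only a minimal sketch: it assumes the setting is the Hadoop 2.x property `dfs.datanode.max.transfer.threads` with a default of 4096 (the thread itself only says "4k"), and the sample XML is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: property name and default as listed in Hadoop 2.x hdfs-default.xml;
# the mailing list reply only mentions "a default of 4k".
PROPERTY = "dfs.datanode.max.transfer.threads"
DEFAULT_LIMIT = 4096


def max_transfer_threads(hdfs_site_xml: str) -> int:
    """Return the configured DataNode transfer-thread limit, or the default."""
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == PROPERTY:
            value = prop.findtext("value", default=str(DEFAULT_LIMIT))
            return int(value.strip())
    return DEFAULT_LIMIT


# Hypothetical hdfs-site.xml content that raises the limit to 8192.
SAMPLE = """
<configuration>
  <property>
    <name>dfs.datanode.max.transfer.threads</name>
    <value>8192</value>
  </property>
</configuration>
"""

print(max_transfer_threads(SAMPLE))
```

Since each block being read or written holds one DataNode transfer thread, a client with many files open can use up this budget on the nodes that host the blocks.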
Re: HDFS open file limit
Posted by sudhakara st <su...@gmail.com>.
There is no open file limitation for HDFS. The 'Too many open files' limit
comes from the OS. Increase the *system-wide maximum number of open files
and the per-user/group/process file descriptor limits.*
On Mon, Jan 27, 2014 at 1:52 AM, Bertrand Dechoux <de...@gmail.com> wrote:
> At least for each machine, there is the *ulimit* that needs to be verified.
>
> Regards
>
> Bertrand
>
> Bertrand Dechoux
>
>
> On Sun, Jan 26, 2014 at 6:32 PM, John Lilley <jo...@redpoint.net> wrote:
>
>> I have an application that wants to open a large set of files in HDFS
>> simultaneously. Are there hard or practical limits to what can be opened
>> at once by a single process? By the entire cluster in aggregate?
>>
>> Thanks
>>
>> John
--
Regards,
...Sudhakara.st
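
The two OS limits mentioned in this reply can be read from the client process itself. This is a rough sketch, assuming a Unix-like host; the `/proc/sys/fs/file-max` path is Linux-specific, and the helper names are made up for illustration.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) per-process file descriptor limits."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def system_wide_limit(path="/proc/sys/fs/file-max"):
    """Return the Linux system-wide open file cap, or None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


soft, hard = open_file_limits()
print("per-process soft limit:", soft)
print("per-process hard limit:", hard)
print("system-wide limit:", system_wide_limit())
```

The soft limit is the one a process reaches first. It can be raised up to the hard limit without extra privileges, which is what `ulimit -n` does in a shell.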
Re: HDFS open file limit
Posted by Bertrand Dechoux <de...@gmail.com>.
At least for each machine, there is the *ulimit* that needs to be verified.
Regards
Bertrand
Bertrand Dechoux
On Sun, Jan 26, 2014 at 6:32 PM, John Lilley <jo...@redpoint.net> wrote:
> I have an application that wants to open a large set of files in HDFS
> simultaneously. Are there hard or practical limits to what can be opened
> at once by a single process? By the entire cluster in aggregate?
>
> Thanks
>
> John