Posted to hdfs-dev@hadoop.apache.org by Thanh Do <th...@cs.wisc.edu> on 2010/10/11 22:15:35 UTC

Reason to store 64 block file in a sub directory?

Hi all,

Can anyone explain to me why HDFS has a policy
of storing 64 block files in a single subdirectory?
If the number of block files increases,
it simply creates another subdir and puts the block files there.

Thanks
Thanh

Re: Reason to store 64 block file in a sub directory?

Posted by Thanh Do <th...@cs.wisc.edu>.
Thanks guys,

That makes more sense to me now.

On Mon, Oct 11, 2010 at 4:59 PM, Dhruba Borthakur <dh...@gmail.com> wrote:

> The number is just an ad hoc number. The policy is not to put too many block
> files in the same directory, because some local filesystems behave badly if
> the number of files in the same directory exceeds a certain value.
>
> -dhruba
>
>
> On Mon, Oct 11, 2010 at 1:15 PM, Thanh Do <th...@cs.wisc.edu> wrote:
>
> > Hi all,
> >
> > Can anyone explain to me why HDFS has a policy
> > of storing 64 block files in a single subdirectory?
> > If the number of block files increases,
> > it simply creates another subdir and puts the block files there.
> >
> > Thanks
> > Thanh
> >
>
>
>
> --
> Connect to me at http://www.facebook.com/dhruba
>

Re: Reason to store 64 block file in a sub directory?

Posted by Dhruba Borthakur <dh...@gmail.com>.
The number is just an ad hoc number. The policy is not to put too many block
files in the same directory, because some local filesystems behave badly if
the number of files in the same directory exceeds a certain value.
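
A minimal sketch of that policy, for illustration only. It assumes a flat
layout in which overflow goes to sibling directories named subdir0, subdir1,
and so on; the helper name place_block and the constant MAX_FILES_PER_DIR are
hypothetical, and this is not the DataNode's actual code, which may nest
subdirectories differently.

```python
import os
import tempfile

MAX_FILES_PER_DIR = 64  # the ad hoc cap discussed in this thread


def count_files(directory: str) -> int:
    """Count regular files (not subdirectories) directly inside directory."""
    return sum(
        1 for entry in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, entry))
    )


def place_block(root: str, block_name: str, cap: int = MAX_FILES_PER_DIR) -> str:
    """Create an empty block file, spilling into root/subdirN once a directory is full.

    Hypothetical helper: tries root first, then subdir0, subdir1, ...
    and uses the first directory holding fewer than `cap` files.
    """
    current = root
    n = 0
    while count_files(current) >= cap:
        current = os.path.join(root, f"subdir{n}")
        os.makedirs(current, exist_ok=True)
        n += 1
    path = os.path.join(current, block_name)
    open(path, "w").close()
    return path


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    for i in range(130):
        place_block(root, f"blk_{i}")
    # The first 64 files land in root, the next 64 in subdir0, the rest in subdir1.
    print(sorted(os.listdir(root))[-2:])
```

No directory ever holds more than the cap of block files, which is the only
property the policy needs; the exact value of the cap is a tuning choice.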

-dhruba


On Mon, Oct 11, 2010 at 1:15 PM, Thanh Do <th...@cs.wisc.edu> wrote:

> Hi all,
>
> Can anyone explain to me why HDFS has a policy
> of storing 64 block files in a single subdirectory?
> If the number of block files increases,
> it simply creates another subdir and puts the block files there.
>
> Thanks
> Thanh
>



-- 
Connect to me at http://www.facebook.com/dhruba

Re: Reason to store 64 block file in a sub directory?

Posted by Todd Lipcon <to...@cloudera.com>.
If I recall correctly, ext3 has O(n) performance for lookup of a
directory entry. So, having thousands of files in a directory is bad
for performance. Additionally, there's a max of 31998 files in a
directory, so you have to split into subdirs eventually.
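
To make the cost argument concrete, here is a toy model, assuming lookup
really is a linear scan of directory entries as recalled above. The function
names and the flat root-plus-subdirs layout are assumptions for illustration;
the model ignores caching and any directory indexing the filesystem may have.

```python
import math


def flat_lookup_bound(n_files: int) -> int:
    """Worst-case entries scanned when all files sit in one directory."""
    return n_files


def split_lookup_bound(n_files: int, cap: int = 64) -> int:
    """Upper bound on entries scanned when each directory holds at most cap files.

    Assumed layout: the root holds up to `cap` files plus one entry per
    overflow subdirectory, and each subdirectory holds up to `cap` files.
    """
    if n_files <= cap:
        return n_files
    n_subdirs = math.ceil((n_files - cap) / cap)
    root_entries = cap + n_subdirs
    # Scan the root to find the subdirectory, then scan that subdirectory.
    return root_entries + cap


if __name__ == "__main__":
    for n in (64, 1_000, 10_000):
        print(n, flat_lookup_bound(n), split_lookup_bound(n))
```

Under this model, 10,000 block files cost up to 10,000 entry comparisons in a
single directory but at most 284 when split with a cap of 64.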

-Todd

On Mon, Oct 11, 2010 at 8:15 PM, Thanh Do <th...@cs.wisc.edu> wrote:
> Hi all,
>
> Can anyone explain to me why HDFS has a policy
> of storing 64 block files in a single subdirectory?
> If the number of block files increases,
> it simply creates another subdir and puts the block files there.
>
> Thanks
> Thanh
>



-- 
Todd Lipcon
Software Engineer, Cloudera
