Posted to mapreduce-user@hadoop.apache.org by YouPeng Yang <yy...@gmail.com> on 2013/05/09 17:42:54 UTC

issues with decreasing the default.block.size

hi All

     I am going to set up a new Hadoop environment. Because there are
lots of small files, I would like to change the default.block.size to
16MB rather than merging the small files into larger ones (e.g. using
SequenceFiles).
    I want to ask: are there any bad influences or issues?

Regards
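
For reference, a minimal hdfs-site.xml sketch of such a change (a hedged
example: the standard property is dfs.blocksize on Hadoop 2.x and
dfs.block.size on 1.x; "default.block.size" is not a standard key):

    <property>
      <name>dfs.blocksize</name>
      <value>16777216</value> <!-- 16 MB = 16 * 1024 * 1024 bytes -->
    </property>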

Re: issues with decreasing the default.block.size

Posted by Ted Dunning <td...@maprtech.com>.
The block size controls lots of things in Hadoop.

It affects read parallelism, scalability, block allocation, and other
aspects of operation, either directly or indirectly.
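
As a rough, hedged illustration of the scalability side: every block is
tracked in NameNode heap (a commonly cited rule of thumb is on the order
of 150 bytes per block object), so for the same data a smaller block size
means many more objects to track and many more, smaller splits to schedule:

    1 TB / 16 MB  = 65,536 blocks per TB of data
    1 TB / 128 MB =  8,192 blocks per TB of data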


On Sun, May 12, 2013 at 10:38 AM, shashwat shriparv <
dwivedishashwat@gmail.com> wrote:

> The block size is used for allocation, not for on-disk storage.
>
> Thanks & Regards
>
> ∞
> Shashwat Shriparv
>
>
>
> On Fri, May 10, 2013 at 8:54 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> Thanks. I failed to add: It should be okay to do if those cases are
>> true and the cluster seems under-utilized right now.
>>
>> On Fri, May 10, 2013 at 8:29 PM, yypvsxf19870706
>> <yy...@gmail.com> wrote:
>> > Hi Harsh
>> >
>> > Yep.
>> >
>> > Regards
>> >
>> > Sent from my iPhone
>> >
>> > On 2013-5-10, at 13:27, Harsh J <ha...@cloudera.com> wrote:
>> >
>> >> Are you looking to decrease it to get more parallel map tasks out of
>> >> the small files? Are you currently CPU bound on processing these small
>> >> files?
>> >>
>> >> On Thu, May 9, 2013 at 9:12 PM, YouPeng Yang <yypvsxf19870706@gmail.com> wrote:
>> >>> hi All
>> >>>
>> >>>     I am going to set up a new Hadoop environment. Because there are
>> >>> lots of small files, I would like to change the default.block.size to
>> >>> 16MB rather than merging the small files into larger ones (e.g. using
>> >>> SequenceFiles).
>> >>>    I want to ask: are there any bad influences or issues?
>> >>>
>> >>> Regards
>> >>
>> >>
>> >>
>> >> --
>> >> Harsh J
>>
>>
>>
>> --
>> Harsh J
>>
>
>

Re: issues with decreasing the default.block.size

Posted by shashwat shriparv <dw...@gmail.com>.
The block size is used for allocation, not for on-disk storage: a file smaller than one block only consumes as much disk space as the data it actually holds.
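
A quick way to see this (a hedged sketch against the Java FileSystem API;
the path and sizes are only examples): write a 1 MB file with a 16 MB
block size and compare the file length with the reported block size.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockSizeProbe {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path p = new Path("/tmp/one-mb-probe");  // hypothetical path
        // create(path, overwrite, bufferSize, replication, blockSize)
        try (FSDataOutputStream out =
                 fs.create(p, true, 4096, (short) 3, 16L * 1024 * 1024)) {
          out.write(new byte[1024 * 1024]);      // 1 MB of data
        }
        FileStatus st = fs.getFileStatus(p);
        // len is ~1 MB even though blockSize reports 16 MB; HDFS does not
        // pre-allocate a full block of disk for a smaller file.
        System.out.println("len=" + st.getLen()
            + " blockSize=" + st.getBlockSize());
      }
    }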

Thanks & Regards

∞
Shashwat Shriparv



On Fri, May 10, 2013 at 8:54 PM, Harsh J <ha...@cloudera.com> wrote:

> Thanks. I failed to add: It should be okay to do if those cases are
> true and the cluster seems under-utilized right now.
>
> On Fri, May 10, 2013 at 8:29 PM, yypvsxf19870706
> <yy...@gmail.com> wrote:
> > Hi Harsh
> >
> > Yep.
> >
> > Regards
> >
> > Sent from my iPhone
> >
> > On 2013-5-10, at 13:27, Harsh J <ha...@cloudera.com> wrote:
> >
> >> Are you looking to decrease it to get more parallel map tasks out of
> >> the small files? Are you currently CPU bound on processing these small
> >> files?
> >>
> >> On Thu, May 9, 2013 at 9:12 PM, YouPeng Yang <yy...@gmail.com> wrote:
> >>> hi All
> >>>
> >>>     I am going to set up a new Hadoop environment. Because there are
> >>> lots of small files, I would like to change the default.block.size to
> >>> 16MB rather than merging the small files into larger ones (e.g. using
> >>> SequenceFiles).
> >>>    I want to ask: are there any bad influences or issues?
> >>>
> >>> Regards
> >>
> >>
> >>
> >> --
> >> Harsh J
>
>
>
> --
> Harsh J
>

Re: issues with decreasing the default.block.size

Posted by Harsh J <ha...@cloudera.com>.
Thanks. I failed to add: It should be okay to do if those cases are
true and the cluster seems under-utilized right now.

On Fri, May 10, 2013 at 8:29 PM, yypvsxf19870706
<yy...@gmail.com> wrote:
> Hi Harsh
>
> Yep.
>
> Regards
>
> Sent from my iPhone
>
> On 2013-5-10, at 13:27, Harsh J <ha...@cloudera.com> wrote:
>
>> Are you looking to decrease it to get more parallel map tasks out of
>> the small files? Are you currently CPU bound on processing these small
>> files?
>>
>> On Thu, May 9, 2013 at 9:12 PM, YouPeng Yang <yy...@gmail.com> wrote:
>>> hi All
>>>
>>>     I am going to set up a new Hadoop environment. Because there are
>>> lots of small files, I would like to change the default.block.size to
>>> 16MB rather than merging the small files into larger ones (e.g. using
>>> SequenceFiles).
>>>    I want to ask: are there any bad influences or issues?
>>>
>>> Regards
>>
>>
>>
>> --
>> Harsh J



-- 
Harsh J

Re: issues with decreasing the default.block.size

Posted by yypvsxf19870706 <yy...@gmail.com>.
Hi Harsh

Yep.

Regards

Sent from my iPhone

On 2013-5-10, at 13:27, Harsh J <ha...@cloudera.com> wrote:

> Are you looking to decrease it to get more parallel map tasks out of
> the small files? Are you currently CPU bound on processing these small
> files?
> 
> On Thu, May 9, 2013 at 9:12 PM, YouPeng Yang <yy...@gmail.com> wrote:
>> hi All
>>
>>     I am going to set up a new Hadoop environment. Because there are
>> lots of small files, I would like to change the default.block.size to
>> 16MB rather than merging the small files into larger ones (e.g. using
>> SequenceFiles).
>>    I want to ask: are there any bad influences or issues?
>> 
>> Regards
> 
> 
> 
> -- 
> Harsh J

Re: issues with decreasing the default.block.size

Posted by Harsh J <ha...@cloudera.com>.
Are you looking to decrease it to get more parallel map tasks out of
the small files? Are you currently CPU bound on processing these small
files?
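
(For context on the parallelism point, a hedged note: in the newer
org.apache.hadoop.mapreduce FileInputFormat the split size is computed as

    splitSize = max(minSplitSize, min(maxSplitSize, blockSize))

and a file smaller than one block still yields exactly one split, so
shrinking the block size below the typical file size does not by itself
produce more map tasks for those small files.)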

On Thu, May 9, 2013 at 9:12 PM, YouPeng Yang <yy...@gmail.com> wrote:
> hi All
>
>      I am going to set up a new Hadoop environment. Because there are
> lots of small files, I would like to change the default.block.size to
> 16MB rather than merging the small files into larger ones (e.g. using
> SequenceFiles).
>     I want to ask: are there any bad influences or issues?
>
> Regards
>



-- 
Harsh J

RE: issues with decreasing the default.block.size

Posted by "S, Manoj" <ma...@intel.com>.
http://search-hadoop.com/m/pF9001VX6SH/default.block.size&subj=Re+about+block+size
http://search-hadoop.com/m/HItS5IClD21/block+size&subj=Newbie+question+on+block+size+calculation

http://www.bodhtree.com/blog/2012/09/28/hadoop-how-to-manage-huge-numbers-of-small-files-in-hdfs/

http://wiki.apache.org/hadoop/HowManyMapsAndReduces

Thanks,
Manoj

From: YouPeng Yang [mailto:yypvsxf19870706@gmail.com]
Sent: Thursday, May 09, 2013 9:13 PM
To: user@hadoop.apache.org
Subject: issues with decreasing the default.block.size


hi All

     I am going to set up a new Hadoop environment. Because there are lots of
small files, I would like to change the default.block.size to 16MB rather than
merging the small files into larger ones (e.g. using SequenceFiles).
    I want to ask: are there any bad influences or issues?

Regards
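
Following the small-files write-ups linked above, a minimal hedged sketch
of the SequenceFile merging approach the original post mentions (input
directory and output path are hypothetical; file name as key, raw bytes
as value):

    import java.io.File;
    import java.nio.file.Files;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class PackSmallFiles {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path out = new Path("/user/hadoop/packed.seq");            // hypothetical output
        File[] inputs = new File("/data/small-files").listFiles(); // hypothetical input dir
        if (inputs == null) return;
        SequenceFile.Writer writer =
            SequenceFile.createWriter(fs, conf, out, Text.class, BytesWritable.class);
        try {
          for (File f : inputs) {
            if (!f.isFile()) continue;
            byte[] bytes = Files.readAllBytes(f.toPath());  // files are small by premise
            writer.append(new Text(f.getName()), new BytesWritable(bytes));
          }
        } finally {
          writer.close();
        }
      }
    }

A job reading the packed file back (e.g. via SequenceFileInputFormat) then
hands each mapper many small records instead of one tiny file apiece.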

