Posted to mapreduce-user@hadoop.apache.org by YouPeng Yang <yy...@gmail.com> on 2013/05/09 17:42:54 UTC
issues with decrease the default.block.size
hi ALL
I am going to set up a new Hadoop environment. Because there are
lots of small files, I would like to change the default.block.size
to 16MB
rather than merging the files into larger ones (e.g.
using SequenceFiles).
Are there any bad effects or issues with doing this?
Regards
Re: issues with decrease the default.block.size
Posted by Ted Dunning <td...@maprtech.com>.
The block size controls lots of things in Hadoop.
It affects read parallelism, scalability, block allocation and other
aspects of operations either directly or indirectly.
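As a rough illustration of the scalability point, the sketch below counts the blocks the NameNode would have to track for the same data at two block sizes. The 150-bytes-per-object figure is a commonly quoted rule of thumb rather than a measured value, and the dataset and function names are hypothetical.

```python
MB = 1024 * 1024

def block_count(file_sizes, block_size):
    """Blocks needed when each file is split into block_size chunks.

    A file never shares a block with another file, so every non-empty
    file needs at least one block. -(-a // b) is integer ceiling division.
    """
    return sum(-(-size // block_size) for size in file_sizes if size > 0)

def namenode_bytes(file_sizes, block_size, bytes_per_object=150):
    """Very rough NameNode heap estimate: one object per file plus one per block.

    bytes_per_object=150 is an assumed rule-of-thumb value.
    """
    return (len(file_sizes) + block_count(file_sizes, block_size)) * bytes_per_object

# Hypothetical dataset: 10,000 files of 1 GB each.
large_files = [1024 * MB] * 10_000

for block_mb in (128, 16):
    blocks = block_count(large_files, block_mb * MB)
    heap_mb = namenode_bytes(large_files, block_mb * MB) / MB
    print(f"{block_mb} MB blocks: {blocks} blocks, ~{heap_mb:.1f} MB of NameNode heap")
```

Under these assumptions, going from 128 MB to 16 MB blocks multiplies the block count for large files by eight, which is the kind of indirect cost to weigh before lowering the default.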
On Sun, May 12, 2013 at 10:38 AM, shashwat shriparv <
dwivedishashwat@gmail.com> wrote:
> The block size is for allocation not storage on the disk.
Re: issues with decrease the default.block.size
Posted by shashwat shriparv <dw...@gmail.com>.
The block size is for allocation not storage on the disk.
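A small sketch of what that means in practice: a file occupies only its actual length on the DataNodes (times the replication factor), however large the configured block size is. The function name and the replication factor of 3 are assumptions for illustration.

```python
MB = 1024 * 1024

def raw_disk_usage(file_size, replication=3):
    """Bytes consumed on DataNode disks for one file.

    The block size does not appear here: a block that is only partly
    filled is stored at the data's actual length.
    Checksum and metadata overhead is ignored in this sketch.
    """
    return file_size * replication

one_mb_file = 1 * MB
print(raw_disk_usage(one_mb_file) // MB, "MB on disk, whether blocks are 16 MB or 128 MB")
```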
*Thanks & Regards *
∞
Shashwat Shriparv
On Fri, May 10, 2013 at 8:54 PM, Harsh J <ha...@cloudera.com> wrote:
> Thanks. I failed to add: It should be okay to do if those cases are
> true and the cluster seems under-utilized right now.
Re: issues with decrease the default.block.size
Posted by Harsh J <ha...@cloudera.com>.
Thanks. I failed to add: It should be okay to do if those cases are
true and the cluster seems under-utilized right now.
On Fri, May 10, 2013 at 8:29 PM, yypvsxf19870706
<yy...@gmail.com> wrote:
> Hi harsh
>
> Yep.
--
Harsh J
Re: issues with decrease the default.block.size
Posted by yypvsxf19870706 <yy...@gmail.com>.
Hi harsh
Yep.
Regards
Sent from my iPhone
On 2013-5-10, at 13:27, Harsh J <ha...@cloudera.com> wrote:
> Are you looking to decrease it to get more parallel map tasks out of
> the small files? Are you currently CPU bound on processing these small
> files?
Re: issues with decrease the default.block.size
Posted by Harsh J <ha...@cloudera.com>.
Are you looking to decrease it to get more parallel map tasks out of
the small files? Are you currently CPU bound on processing these small
files?
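The first question can be checked with simple arithmetic. The sketch below assumes the common case of roughly one map task per HDFS block of each input file (ignoring split-size settings and combining input formats); the file sizes are made up.

```python
MB = 1024 * 1024

def map_tasks(file_sizes, block_size):
    """Approximate map task count: one per block, at least one per non-empty file."""
    return sum(-(-size // block_size) for size in file_sizes if size > 0)

small_files = [4 * MB] * 1000      # hypothetical: every file is below 16 MB
medium_files = [100 * MB] * 1000   # hypothetical: files between 16 MB and 128 MB

for name, files in (("4 MB files", small_files), ("100 MB files", medium_files)):
    print(name, "->", map_tasks(files, 128 * MB), "tasks at 128 MB,",
          map_tasks(files, 16 * MB), "tasks at 16 MB")
```

Files already smaller than the new block size produce one task each either way, so under this assumption lowering the block size only adds parallelism for files larger than 16 MB.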
On Thu, May 9, 2013 at 9:12 PM, YouPeng Yang <yy...@gmail.com> wrote:
> hi ALL
>
> I am going to set up a new Hadoop environment. Because there are
> lots of small files, I would like to change the default.block.size to
> 16MB
> rather than merging the files into larger ones (e.g.
> using SequenceFiles).
> Are there any bad effects or issues with doing this?
>
> Regards
>
--
Harsh J
RE: issues with decrease the default.block.size
Posted by "S, Manoj" <ma...@intel.com>.
http://search-hadoop.com/m/pF9001VX6SH/default.block.size&subj=Re+about+block+size
http://search-hadoop.com/m/HItS5IClD21/block+size&subj=Newbie+question+on+block+size+calculation
http://www.bodhtree.com/blog/2012/09/28/hadoop-how-to-manage-huge-numbers-of-small-files-in-hdfs/
http://wiki.apache.org/hadoop/HowManyMapsAndReduces
Thanks,
Manoj
From: YouPeng Yang [mailto:yypvsxf19870706@gmail.com]
Sent: Thursday, May 09, 2013 9:13 PM
To: user@hadoop.apache.org
Subject: issues with decrease the default.block.size
hi ALL
I am going to set up a new Hadoop environment. Because there are lots of small files, I would like to change the default.block.size to 16MB
rather than merging the files into larger ones (e.g. using SequenceFiles).
Are there any bad effects or issues with doing this?
Regards